[airflow] Registering Airflow as a service on an Ubuntu server

This post creates Airflow daemons on Ubuntu 22.04 using systemd. On Linux, you can define your own services and have them start when the system boots. Below, we create and run services for the Airflow webserver and scheduler.

Creating the service file

Run the command below to create the service file.

sudo vi /etc/systemd/system/airflow-webserver.service

Then fill in the service file as follows.

[Unit]
Description=Airflow webserver daemon
# Start after the network and MySQL are up
After=network.target mysql.service
Wants=mysql.service

[Service]
# systemd does not expand $PATH here, so the directories are listed explicitly
Environment="PATH=/home/<user_name>/airflow_env/bin:/home/<user_name>/airflow/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
User=<user_name>
Group=<group_name>
Type=simple
ExecStart=/home/<user_name>/airflow_env/bin/airflow webserver
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Notes on the configuration above

- Description: A description of the service.

- After: Starts this unit only after the listed units have started. Airflow needs the network and MySQL to be ready before it starts, so both are listed here. You can add any other services it depends on. After only sets the ordering; Wants=mysql.service is what asks systemd to start MySQL as well.

- Environment: Sets the environment variables the service needs. Values are written as [NAME]=[VALUE], for example PATH=[PATH value] HIVE_HOME=/opt/apache-hive-3.1.2-bin. systemd does not expand shell variables such as $PATH in this setting, so write out the full value.

- User: The user account the service runs as.

- ExecStart: The command to run, or the path to a script, given as an absolute path. In my case Airflow was installed in an Anaconda environment and I wanted to change the port, so I used the following command.

/root/anaconda3/bin/airflow webserver --port 5555

- Restart: Controls when systemd restarts the service. With on-failure, the service is restarted whenever the process exits abnormally (non-zero exit code, killed by a signal, or timed out).

- RestartSec: How long to wait before restarting. The systemd default is 100 ms; setting this value changes the wait time.
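To make the placeholders concrete, here is a minimal sketch that writes the unit file from a few shell variables. The names SVC_USER, SVC_GROUP, VENV, and OUT_DIR, the default user airflow, and the trimmed PATH value are all assumptions for illustration and do not come from the original setup. By default it writes to a scratch directory so it can be tried without touching /etc/systemd/system.

```shell
#!/bin/sh
# Sketch: write airflow-webserver.service with the placeholders filled in.
# SVC_USER, SVC_GROUP, VENV and OUT_DIR are illustrative names (assumptions).
set -eu
SVC_USER="${SVC_USER:-airflow}"
SVC_GROUP="${SVC_GROUP:-airflow}"
VENV="${VENV:-/home/$SVC_USER/airflow_env}"
# Scratch directory by default; set OUT_DIR=/etc/systemd/system (as root) to install.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
UNIT="$OUT_DIR/airflow-webserver.service"

# Unquoted EOF: the shell expands $VENV etc. while writing the file,
# so the finished unit contains only literal paths (systemd itself
# would not expand them).
cat > "$UNIT" <<EOF
[Unit]
Description=Airflow webserver daemon
After=network.target mysql.service
Wants=mysql.service

[Service]
Environment="PATH=$VENV/bin:/usr/local/bin:/usr/bin:/bin"
User=$SVC_USER
Group=$SVC_GROUP
Type=simple
ExecStart=$VENV/bin/airflow webserver
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $UNIT"
```

Generating the file this way keeps the user name and virtualenv path in one place, which helps when the same values are reused for the scheduler unit.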

Once the file is written as above, start the service.

Loading, enabling, and starting the service

Load the service

sudo systemctl daemon-reload

Enable the service

sudo systemctl enable airflow-webserver.service

Start the service

sudo systemctl start airflow-webserver.service
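The three steps above can also be wrapped in a small helper, sketched here. The names run, register_unit, and DRY_RUN are hypothetical; with DRY_RUN=1 the commands are only printed, which is a convenient way to review them first.

```shell
# Sketch: reload, enable, and start a unit in one call.
# run, register_unit and DRY_RUN are illustrative names (assumptions).

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop at the first step that fails.
register_unit() {
  run sudo systemctl daemon-reload &&
  run sudo systemctl enable "$1" &&
  run sudo systemctl start "$1"
}
```

For example, `DRY_RUN=1 register_unit airflow-webserver.service` prints the three commands without running them.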

Check the service status after starting it

sudo systemctl status airflow-webserver.service

If the service started successfully, the status output shows it as active (running).

Together with enabled on the Loaded line, this means the service is registered to start at system boot.
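The same check can be scripted with `systemctl is-active` and `systemctl is-enabled`, which print the state as a single word. The sketch below is one way to do it; check_unit and the SYSTEMCTL override are hypothetical names added for illustration.

```shell
# Sketch: succeed only if a unit is both running and enabled at boot.
# check_unit and SYSTEMCTL are illustrative names (assumptions).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

check_unit() {
  unit="$1"
  # Both subcommands exit non-zero for other states, hence "|| true".
  active=$($SYSTEMCTL is-active "$unit" 2>/dev/null || true)
  enabled=$($SYSTEMCTL is-enabled "$unit" 2>/dev/null || true)
  echo "$unit: active=$active enabled=$enabled"
  [ "$active" = "active" ] && [ "$enabled" = "enabled" ]
}
```

Calling `check_unit airflow-webserver.service` prints the two states and returns a non-zero exit code if either one is not as expected.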

The service for the Airflow scheduler is created the same way (for example, as airflow-scheduler.service); just replace webserver with scheduler in ExecStart.
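As a rough sketch, the scheduler unit can be derived from the webserver unit with sed. The helper name make_scheduler_unit is hypothetical, and the sketch assumes the unit has single Description= and ExecStart= lines like the one shown earlier. It also drops anything after webserver on the ExecStart line, on the assumption that such arguments (for example --port) are webserver-specific.

```shell
# Sketch: derive a scheduler unit from an existing webserver unit.
# make_scheduler_unit is an illustrative helper name (assumption).
make_scheduler_unit() {
  src="$1"   # path to airflow-webserver.service
  dst="$2"   # path for the new airflow-scheduler.service
  # Rewrite only the Description= and ExecStart= lines; arguments that
  # follow "webserver" (such as --port) are removed.
  sed -e '/^Description=/s/webserver/scheduler/' \
      -e '/^ExecStart=/s/ webserver.*$/ scheduler/' \
      "$src" > "$dst"
}
```

It could be called as `make_scheduler_unit airflow-webserver.service airflow-scheduler.service` in a working directory, with the result then copied to /etc/systemd/system and loaded, enabled, and started like the webserver service.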


Reference

https://www.upsolver.com/blog/airflow-as-a-daemon