---
tags:
  - project/sriphat
  - airflow
  - workflow
  - etl
created: 2026-05-07
status: active
folder: 05-airflow
---

# Apache Airflow (05-airflow)

> **Docker Compose:** `05-airflow/docker-compose.yaml`
> **Env File:** `05-airflow/.env`
> **Version:** Apache Airflow 3.1.5

## Overview

Apache Airflow is used for workflow orchestration:
- Runs DAGs (Directed Acyclic Graphs) on a schedule
- Processes Excel/CSV files from Finance
- Orchestrates ETL pipelines
- Integrates with the API Service

**Executor:** CeleryExecutor (uses Redis as the broker)

---

## Services

| Container | Role | Port |
|-----------|------|------|
| `airflow-apiserver` | REST API + Web UI | `8200:8080` |
| `airflow-scheduler` | DAG scheduling | internal |
| `airflow-dag-processor` | DAG file parsing | internal |
| `airflow-worker` | Task execution (Celery) | internal |
| `airflow-triggerer` | Deferred task triggering | internal |
| `airflow-init` | Database migration (one-time) | n/a |
| `airflow-cli` | CLI tool (debug profile) | n/a |
| `flower` | Celery monitoring (optional) | `5555:5555` |

---

## Architecture

```
              ┌─────────────────┐
              │    airflow-     │
              │    apiserver    │ ← Web UI + REST API (port 8200)
              │   (port 8080)   │
              └────────┬────────┘
                       │
       ┌───────────────┼───────────────┐
       │               │               │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│  airflow-   │ │  airflow-   │ │  airflow-   │
│  scheduler  │ │  dag-       │ │  triggerer  │
│             │ │  processor  │ │             │
└──────┬──────┘ └─────────────┘ └─────────────┘
       │
       ▼  (Celery tasks via Redis)
┌──────────────┐
│   airflow-   │
│   worker     │ ← runs the actual tasks
└──────────────┘
       │
       ▼
┌──────────────┐
│  PostgreSQL  │ (Airflow metadata DB)
│  Redis       │ (Celery broker)
└──────────────┘
```

---

## Database Configuration

Airflow uses PostgreSQL on the Infra server:

```bash
# Connection string
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://${AIRFLOW_DB_USER}:${AIRFLOW_DB_PASSWD}@${AIRFLOW_DB_HOST}:${AIRFLOW_DB_PORT}/${AIRFLOW_DB_NAME}
AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://${AIRFLOW_DB_USER}:${AIRFLOW_DB_PASSWD}@${AIRFLOW_DB_HOST}:${AIRFLOW_DB_PORT}/${AIRFLOW_DB_NAME}

# Redis broker
AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
```

---

## Volume Mounts

```
05-airflow/
├── dags/      → /opt/airflow/dags      (DAG files)
├── logs/      → /opt/airflow/logs      (Task logs)
├── config/    → /opt/airflow/config    (airflow.cfg)
│   └── airflow.cfg
└── plugins/   → /opt/airflow/plugins   (Custom plugins)
```

---

## Web UI

**URL:** `http://localhost:8200` or `https://ai.sriphat.com/airflow`

```bash
# Config
AIRFLOW__WEBSERVER__BASE_URL=https://ai.sriphat.com/airflow
AIRFLOW__WEBSERVER__WEB_SERVER_PORT=8080
```

Default credentials (if unchanged):
- Username: `airflow`
- Password: `airflow`

---

## Existing DAGs

| DAG ID | Role | Triggered by |
|--------|------|--------------|
| `process_finance_excel` | Processes Finance Excel files | API Service |

---

## Airflow Configuration (airflow.cfg)

**Path:** `05-airflow/config/airflow.cfg`

Key settings:

```ini
[core]
executor = CeleryExecutor
load_examples = False
dags_are_paused_at_creation = True

[webserver]
base_url = https://ai.sriphat.com/airflow

[execution_api]
execution_api_server_url = http://airflow-apiserver:8080/execution/
```

---

## Environment Variables

```bash
# Airflow image
AIRFLOW_IMAGE_NAME=apache/airflow:3.1.5

# Database
AIRFLOW_DB_USER=
AIRFLOW_DB_PASSWD=
AIRFLOW_DB_HOST=
AIRFLOW_DB_PORT=5432
AIRFLOW_DB_NAME=airflow

# Security
AIRFLOW__CORE__FERNET_KEY=

# Admin user
_AIRFLOW_WWW_USER_USERNAME=airflow
_AIRFLOW_WWW_USER_PASSWORD=

# Optional pip packages
_PIP_ADDITIONAL_REQUIREMENTS=
```

---

## Deploy Commands

```bash
cd 05-airflow

# Initialize (first time only)
docker compose up airflow-init

# Start all services
docker compose up -d

# View logs
docker logs airflow-apiserver -f
docker logs airflow-scheduler -f
docker logs airflow-worker -f

# Run Celery Flower monitoring
docker compose --profile flower up -d

# Scale workers (add more workers)
docker compose up -d --scale airflow-worker=3
```

---

## System Requirements

Airflow requires at least the following resources:
- **RAM:** ≥ 4 GB
- **CPU:** ≥ 2 cores
- **Disk:** ≥ 10 GB

---

## Ingestion Layer (04-ingestion / Airbyte)

> **Note:** `04-ingestion/docker-compose.yml` is currently **entirely commented out**.
> Airbyte is deployed separately (via `abctl` or standalone).

### Airbyte as Planned

| Source | Data type |
|--------|-----------|
| SQL Server (HIS) | Patient data, OPD |
| Oracle (Lab) | Laboratory test results |
| REST API | External data |
| Excel/CSV | Finance, reports |

**Destination:** PostgreSQL `raw_data` schema
**Port:** `8030` (once deployed)

---

## Related

- [[00-Project-Overview]]
- [[01-Infrastructure]]
- [[03-API-Service]]
- [[08-Operations-Runbook]]
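
---

## Preflight Check (Sketch)

The Environment Variables section lists several entries with no value (`AIRFLOW_DB_USER`, `AIRFLOW_DB_PASSWD`, `AIRFLOW_DB_HOST`, `AIRFLOW__CORE__FERNET_KEY`, `_AIRFLOW_WWW_USER_PASSWORD`). These presumably need to be filled in before `airflow-init` runs. The following is a minimal sketch of a check for that, not part of the repository: the function names `missing_vars` and `build_db_url` are hypothetical, and the fallback port and database name are simply the values shown in that section.

```shell
#!/bin/sh
# Hypothetical preflight helpers; only the variable names come from this note.

# Variables shown without a value in the Environment Variables section.
required_vars="AIRFLOW_DB_USER AIRFLOW_DB_PASSWD AIRFLOW_DB_HOST AIRFLOW__CORE__FERNET_KEY _AIRFLOW_WWW_USER_PASSWORD"

# Print the name of every required variable that is unset or empty,
# one per line. Returns 1 if any are missing, 0 otherwise.
missing_vars() {
  status=0
  for name in $required_vars; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Assemble a URL in the same shape as AIRFLOW__DATABASE__SQL_ALCHEMY_CONN.
# Port and database name fall back to 5432 and airflow when unset.
build_db_url() {
  printf 'postgresql+psycopg2://%s:%s@%s:%s/%s\n' \
    "$AIRFLOW_DB_USER" "$AIRFLOW_DB_PASSWD" "$AIRFLOW_DB_HOST" \
    "${AIRFLOW_DB_PORT:-5432}" "${AIRFLOW_DB_NAME:-airflow}"
}
```

One possible use, assuming the values from `05-airflow/.env` have already been exported into the current shell: run `missing_vars || exit 1` before `docker compose up airflow-init`, so an incomplete env file fails early with the names of the missing variables. `build_db_url` only prints the assembled string for inspection and does not attempt a database connection.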