feat: MinIO integration — bucket finance, API service upload, Nginx routing
- 01-infra/nginx-configs: add MinIO /minio/ and /minio-console/ location blocks (port 9000 S3 API, port 9001 Console UI, path stripping via rewrite)
- 03-apiservice: integrate MinIO minio-python SDK for file upload
- requirements.txt: add minio==7.2.11
- app/core/config.py: add MINIO_ENDPOINT, ACCESS_KEY, SECRET_KEY, BUCKET_FINANCE, USE_SSL
- app/services/minio_client.py: new, with upload_file(), get_presigned_url(), delete_file()
- app/routes/pages.py: replace local /data/uploads/ write with MinIO upload to finance bucket
- docker-compose.yml: pass MinIO env vars to container
- .env.example: document MinIO vars
- 07-minio/.env.example: add MINIO_SVC_ACCESS_KEY/SECRET_KEY section
- 07-minio/README.md: add Python minio SDK and Airflow DAG usage guide
- CLAUDE.md: project context (servers, SSH, paths, service distribution)
- document-obsidiant/: initial Obsidian docs for all services
@@ -0,0 +1,237 @@
---
tags:
  - project/sriphat
  - airflow
  - workflow
  - etl
created: 2026-05-07
status: active
folder: 05-airflow
---

# Apache Airflow (05-airflow)

> **Docker Compose:** `05-airflow/docker-compose.yaml`
> **Env File:** `05-airflow/.env`
> **Version:** Apache Airflow 3.1.5

## Overview

Apache Airflow handles workflow orchestration:
- Runs DAGs (Directed Acyclic Graphs) on a schedule
- Processes Excel/CSV files from Finance
- Orchestrates ETL pipelines
- Integrates with the API Service

**Executor:** CeleryExecutor (uses Redis as the broker)

---

## Services

| Container | Role | Port (host:container) |
|-----------|------|-----------------------|
| `airflow-apiserver` | REST API + Web UI | `8200:8080` |
| `airflow-scheduler` | DAG scheduling | internal |
| `airflow-dag-processor` | DAG file parsing | internal |
| `airflow-worker` | Task execution (Celery) | internal |
| `airflow-triggerer` | Deferred task triggering | internal |
| `airflow-init` | Database migration (one-time) | n/a |
| `airflow-cli` | CLI tool (debug profile) | n/a |
| `flower` | Celery monitoring (optional) | `5555:5555` |

---

## Architecture

```
              ┌─────────────────┐
              │    airflow-     │
              │    apiserver    │  ← Web UI + REST API (port 8200)
              │   (port 8080)   │
              └────────┬────────┘
                       │
       ┌───────────────┼───────────────┐
       │               │               │
┌──────▼──────┐ ┌──────▼──────┐  ┌─────▼──────┐
│  airflow-   │ │  airflow-   │  │  airflow-  │
│  scheduler  │ │    dag-     │  │  triggerer │
│             │ │  processor  │  │            │
└──────┬──────┘ └─────────────┘  └────────────┘
       │
       ▼ (Celery tasks via Redis)
┌──────────────┐
│   airflow-   │
│    worker    │  ← runs the actual tasks
└──────────────┘
       │
       ▼
┌──────────────┐
│  PostgreSQL  │  (Airflow metadata DB)
│    Redis     │  (Celery broker)
└──────────────┘
```

---

## Database Configuration

Airflow uses PostgreSQL on the Infra server:

```bash
# Metadata database connection string (must be a single line)
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://${AIRFLOW_DB_USER}:${AIRFLOW_DB_PASSWD}@${AIRFLOW_DB_HOST}:${AIRFLOW_DB_PORT}/${AIRFLOW_DB_NAME}

# Celery result backend (same PostgreSQL database)
AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://${AIRFLOW_DB_USER}:${AIRFLOW_DB_PASSWD}@${AIRFLOW_DB_HOST}:${AIRFLOW_DB_PORT}/${AIRFLOW_DB_NAME}

# Redis broker
AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
```

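
The two connection strings above share the same `user:password@host:port/name` part and differ only in their scheme prefix. The sketch below shows one way to assemble them from the `AIRFLOW_DB_*` variables; the helper name `build_airflow_db_urls` is hypothetical and not part of this project. It also percent-encodes the user and password, on the assumption that a password may contain characters such as `@` or `/` that would otherwise break the URL.

```python
import os
from urllib.parse import quote_plus


def build_airflow_db_urls(env=os.environ):
    """Return (sql_alchemy_conn, celery_result_backend) built from AIRFLOW_DB_* variables.

    Hypothetical helper for illustration; the defaults for port and database
    name mirror the values shown in the Environment Variables section.
    """
    # Percent-encode credentials so special characters cannot break the URL.
    user = quote_plus(env["AIRFLOW_DB_USER"])
    password = quote_plus(env["AIRFLOW_DB_PASSWD"])
    host = env["AIRFLOW_DB_HOST"]
    port = env.get("AIRFLOW_DB_PORT", "5432")
    name = env.get("AIRFLOW_DB_NAME", "airflow")

    location = f"{user}:{password}@{host}:{port}/{name}"
    # Same database, two scheme prefixes: SQLAlchemy for metadata, Celery for results.
    return f"postgresql+psycopg2://{location}", f"db+postgresql://{location}"
```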

---

## Volume Mounts

```
05-airflow/
├── dags/       → /opt/airflow/dags       (DAG files)
├── logs/       → /opt/airflow/logs       (Task logs)
├── config/     → /opt/airflow/config     (airflow.cfg)
│   └── airflow.cfg
└── plugins/    → /opt/airflow/plugins    (Custom plugins)
```

---

## Web UI

**URL:** `http://localhost:8200` or `https://ai.sriphat.com/airflow`

```bash
# Config
AIRFLOW__WEBSERVER__BASE_URL=https://ai.sriphat.com/airflow
AIRFLOW__WEBSERVER__WEB_SERVER_PORT=8080
```

Default credentials (if not changed):
- Username: `airflow`
- Password: `airflow`

---

## Existing DAGs

| DAG ID | Role | Triggered by |
|--------|------|--------------|
| `process_finance_excel` | Processes Finance Excel files | API Service |

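
The table says `process_finance_excel` is triggered from the API Service, but this note does not show how. As a rough sketch only, the snippet below builds (without sending) an HTTP request that would start a DAG run. The `/api/v2/dags/{dag_id}/dagRuns` path and the bearer-token header are assumptions based on the Airflow 3 REST API, and the helper name, the token value, and the `object_key` field in `conf` are hypothetical. A reverse-proxy prefix such as `/airflow` may also need to be added to the base URL.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_trigger_request(base_url, dag_id, token, conf=None):
    """Build (but do not send) a POST request asking Airflow to start a DAG run.

    Assumptions: the Airflow 3 REST API path and bearer-token auth; the real
    API Service in this project may use a different client or auth flow.
    """
    url = f"{base_url.rstrip('/')}/api/v2/dags/{dag_id}/dagRuns"
    payload = {
        # Timestamp for the run, in UTC ISO 8601 format.
        "logical_date": datetime.now(timezone.utc).isoformat(),
        # Free-form parameters handed to the DAG run.
        "conf": conf or {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Usage sketch: pass the request to urllib.request.urlopen(...) to send it.
request = build_trigger_request(
    "http://airflow-apiserver:8080",
    "process_finance_excel",
    token="<access-token>",
    conf={"object_key": "example.xlsx"},  # hypothetical parameter name
)
```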

---

## Airflow Configuration (airflow.cfg)

**Path:** `05-airflow/config/airflow.cfg`

Key settings:
```ini
[core]
executor = CeleryExecutor
load_examples = False
dags_are_paused_at_creation = True

[webserver]
base_url = https://ai.sriphat.com/airflow

[execution_api]
execution_api_server_url = http://airflow-apiserver:8080/execution/
```

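
Airflow also reads every `airflow.cfg` option from an environment variable named `AIRFLOW__{SECTION}__{KEY}`, and the environment variable takes precedence over the file. That is why this note mixes `airflow.cfg` entries with `AIRFLOW__...` variables. The sketch below, using a hypothetical helper `to_env_overrides`, lists the variable name that corresponds to each option in a config file, which may help when checking whether a compose file overrides a setting.

```python
import configparser


def to_env_overrides(cfg_text):
    """Map each airflow.cfg option to its AIRFLOW__{SECTION}__{KEY} variable name.

    Hypothetical helper for illustration; it only derives names and does not
    read or modify the real environment.
    """
    # Disable interpolation so '%' characters in values are kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    overrides = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            overrides[f"AIRFLOW__{section.upper()}__{key.upper()}"] = value
    return overrides


sample = """
[core]
executor = CeleryExecutor
load_examples = False
"""
for name, value in to_env_overrides(sample).items():
    print(f"{name}={value}")
```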

---

## Environment Variables

```bash
# Airflow image
AIRFLOW_IMAGE_NAME=apache/airflow:3.1.5

# Database
AIRFLOW_DB_USER=<user>
AIRFLOW_DB_PASSWD=<password>
AIRFLOW_DB_HOST=<postgres-host>
AIRFLOW_DB_PORT=5432
AIRFLOW_DB_NAME=airflow

# Security
AIRFLOW__CORE__FERNET_KEY=<fernet-key>

# Admin user
_AIRFLOW_WWW_USER_USERNAME=airflow
_AIRFLOW_WWW_USER_PASSWORD=<password>

# Optional pip packages
_PIP_ADDITIONAL_REQUIREMENTS=
```

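
The `<fernet-key>` placeholder needs a real value before deployment. A Fernet key is 32 random bytes encoded as URL-safe base64, so a key can be produced with the standard library alone, as sketched below. The function name is hypothetical; if the `cryptography` package is installed, `Fernet.generate_key()` produces a key in the same format.

```python
import base64
import os


def generate_fernet_key():
    """Return a new Fernet key: 32 random bytes, URL-safe base64-encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


# Paste the printed value into 05-airflow/.env and keep it secret.
print(f"AIRFLOW__CORE__FERNET_KEY={generate_fernet_key()}")
```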

---

## Deploy Commands

```bash
cd 05-airflow

# Initialize (first time only)
docker compose up airflow-init

# Start all services
docker compose up -d

# View logs
docker logs airflow-apiserver -f
docker logs airflow-scheduler -f
docker logs airflow-worker -f

# Run Celery Flower monitoring
docker compose --profile flower up -d

# Scale workers (add more workers)
docker compose up -d --scale airflow-worker=3
```

---

## System Requirements

Airflow needs at least the following resources:
- **RAM:** ≥ 4 GB
- **CPU:** ≥ 2 cores
- **Disk:** ≥ 10 GB

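
A host can be checked against these minimums before running `docker compose up`. The sketch below is one possible pre-flight check, not something shipped with this project: `check_resources` and `probe_host` are hypothetical names, and the RAM probe relies on `os.sysconf`, which is assumed to be available because the target hosts run Linux.

```python
import os
import shutil

GIB = 1024 ** 3


def check_resources(ram_gb, cpus, disk_gb, min_ram_gb=4, min_cpus=2, min_disk_gb=10):
    """Return one warning string per resource that is below the documented minimum."""
    warnings = []
    if ram_gb < min_ram_gb:
        warnings.append(f"RAM: {ram_gb:.1f} GB available, at least {min_ram_gb} GB needed")
    if cpus < min_cpus:
        warnings.append(f"CPU: {cpus} cores available, at least {min_cpus} needed")
    if disk_gb < min_disk_gb:
        warnings.append(f"Disk: {disk_gb:.1f} GB free, at least {min_disk_gb} GB needed")
    return warnings


def probe_host(path="/"):
    """Measure total RAM, CPU count, and free disk space (RAM probe is Linux-only)."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_disk_bytes = shutil.disk_usage(path).free
    return ram_bytes / GIB, os.cpu_count() or 1, free_disk_bytes / GIB


if __name__ == "__main__":
    for warning in check_resources(*probe_host()):
        print("WARNING:", warning)
```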

---

## Ingestion Layer (04-ingestion / Airbyte)

> **Note:** `04-ingestion/docker-compose.yml` is currently **entirely commented out**.
> Airbyte is deployed separately (via `abctl` or standalone).

### Airbyte sources specified in the plan

| Source | Data type |
|--------|-----------|
| SQL Server (HIS) | Patient data, OPD |
| Oracle (Lab) | Laboratory test results |
| REST API | External data |
| Excel/CSV | Finance, reports |

**Destination:** PostgreSQL `raw_data` schema

**Port:** `8030` (once deployed)

---

## Related

- [[00-Project-Overview]]
- [[01-Infrastructure]]
- [[03-API-Service]]
- [[08-Operations-Runbook]]