sriphat-dataplatform/04-ingestion/ARCHITECTURE.md
2026-03-02 21:58:51 +07:00

# Airbyte Network Architecture
## Overview
Airbyte deployment uses the **existing** Nginx Proxy Manager from `01-infra`. No additional nginx is needed in `04-ingestion`.
## Network Flow
```
Internet (HTTPS)
    │
    ▼
Nginx Proxy Manager (01-infra)
    - Container: nginx-proxy-manager
    - Ports: 80, 443, 8021 (admin)
    - Network: shared_data_network
    │
    ▼
airbyte-proxy (deployed by abctl)
    - Container: airbyte-proxy
    - Internal Port: 8000
    - External Port: 8030 (mapped)
    - Network: shared_data_network
    │
    ▼
Airbyte Services
    - airbyte-server
    - airbyte-worker
    - airbyte-webapp
    - airbyte-temporal
    - etc.
```
## Access Methods
### 1. Production (via Domain)
```
https://ai.sriphat.com/airbyte
    │
    ▼
Nginx Proxy Manager (01-infra)
    │
    ▼
airbyte-proxy:8000 (internal)
    │
    ▼
Airbyte Services
```
### 2. Local/Development
```
http://localhost:8030
    │
    ▼
airbyte-proxy (host port 8030 mapped to container port 8000)
    │
    ▼
Airbyte Services
```
### 3. Direct IP Access
```
http://[SERVER_IP]:8030
    │
    ▼
airbyte-proxy (host port 8030 mapped to container port 8000)
    │
    ▼
Airbyte Services
```
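The three access paths above can be spot-checked from a shell. The following is a minimal sketch, assuming `curl` is installed on the machine running the check. The helper names `airbyte_url` and `check_airbyte` are hypothetical and are not part of abctl or this repository.

```bash
# Map an access method to the URL documented above.
# Usage: airbyte_url production | local | ip <SERVER_IP>
airbyte_url() {
  case "${1:-}" in
    production) echo "https://ai.sriphat.com/airbyte" ;;
    local)      echo "http://localhost:8030" ;;
    ip)
      # Direct IP access needs the server address as a second argument.
      [ -n "${2:-}" ] || { echo "usage: airbyte_url ip <SERVER_IP>" >&2; return 2; }
      echo "http://$2:8030" ;;
    *) echo "unknown access method: ${1:-}" >&2; return 2 ;;
  esac
}

# Print the HTTP status code returned for an access method.
# curl reports 000 when no connection could be made.
check_airbyte() {
  url=$(airbyte_url "$@") || return
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
  echo "$url -> HTTP $code"
}
```

For example, `check_airbyte local` probes the port mapping on the host, while `check_airbyte production` goes through Nginx Proxy Manager. A status of 401 or a redirect on the production path may simply mean that authentication is enabled on the proxy host.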
## Components
### 01-infra (Shared Infrastructure)
- **Nginx Proxy Manager**: External reverse proxy
  - Handles SSL/TLS termination
  - Routes traffic to backend services
  - Manages authentication (OAuth2/Basic Auth)
  - Domain: ai.sriphat.com
- **PostgreSQL**: Shared database
  - Databases: `airbyte`, `temporal`, `temporal_visibility`
  - Used by Airbyte for metadata storage
- **Keycloak**: Identity provider (optional)
  - Can be integrated via OAuth2 Proxy
  - Provides SSO for all services
### 04-ingestion (Airbyte)
- **airbyte-proxy**: Internal nginx (deployed by abctl)
  - Routes between Airbyte microservices
  - Not the public entry point; external traffic goes through Nginx Proxy Manager
  - Listens on port 8000 inside the network; published on host port 8030 for optional direct access
- **Airbyte Services**: Deployed by abctl
  - All services connect to `shared_data_network`
  - Communicate with PostgreSQL and each other
## Network Configuration
### shared_data_network
All services connect to this Docker network:
- nginx-proxy-manager (01-infra)
- postgres (01-infra)
- keycloak (01-infra)
- airbyte-proxy (04-ingestion)
- airbyte-server (04-ingestion)
- airbyte-worker (04-ingestion)
- airbyte-webapp (04-ingestion)
- airbyte-temporal (04-ingestion)
- etc.
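Network membership can be compared against the list above. This is a sketch only: the container names are copied from this document and may differ in an actual deployment, Keycloak is optional, and the helper names `missing_containers` and `attached_containers` are hypothetical.

```bash
# Containers this document expects on shared_data_network ("etc." omitted).
EXPECTED="nginx-proxy-manager postgres keycloak airbyte-proxy airbyte-server airbyte-worker airbyte-webapp airbyte-temporal"

# Print every name from the first list that is absent from the second.
# Both arguments are space-separated lists of container names.
missing_containers() {
  for name in $1; do
    case " $2 " in
      *" $name "*) ;;        # present on the network
      *) echo "$name" ;;     # not attached
    esac
  done
}

# List the containers currently attached to the network, space-separated.
attached_containers() {
  docker network inspect shared_data_network \
    --format '{{range .Containers}}{{.Name}} {{end}}'
}
```

Running `missing_containers "$EXPECTED" "$(attached_containers)"` prints one line per expected container that is not attached; no output means every listed container was found.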
### Port Mappings
**External Ports:**
- 80, 443: Nginx Proxy Manager (HTTP, HTTPS)
- 8021: Nginx Proxy Manager Admin UI
- 8030: Airbyte (direct access, optional)
- 5435: PostgreSQL (external access)
**Internal Ports:**
- 8000: airbyte-proxy (accessed by Nginx Proxy Manager)
- 5432: postgres (internal network only)
- 8080: keycloak (internal network only)
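
The 8000-to-8030 mapping for airbyte-proxy can be confirmed with `docker port`. A minimal sketch, assuming the container is named `airbyte-proxy` as described in this document; `host_port` and `proxy_mapping_ok` are hypothetical helper names.

```bash
# Extract the host port from one line of `docker port` output,
# e.g. "0.0.0.0:8030" or "[::]:8030" gives "8030".
host_port() {
  echo "${1##*:}"
}

# Succeed if container port 8000 of airbyte-proxy is published on host port 8030.
# Only the first binding line is inspected.
proxy_mapping_ok() {
  binding=$(docker port airbyte-proxy 8000 | head -n 1) || return 1
  [ "$(host_port "$binding")" = "8030" ]
}
```

A call such as `proxy_mapping_ok && echo "mapping ok"` prints nothing if the port is not published or the container is not running.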
## Why No Additional Nginx?
1. **abctl deploys airbyte-proxy**: This is Airbyte's internal nginx for routing between microservices
2. **Nginx Proxy Manager exists**: Already running in `01-infra` for external access
3. **Shared network**: Both can communicate via `shared_data_network`
4. **Single point of entry**: Nginx Proxy Manager handles all external traffic
## Configuration Steps
1. **Deploy Infrastructure** (01-infra)
   ```bash
   # Run from the repository root
   cd 01-infra
   docker compose --env-file ../.env.global up -d
   ```
2. **Deploy Airbyte** (04-ingestion)
   ```bash
   # Run from the repository root
   cd 04-ingestion
   bash setup-airbyte.sh
   ```
   - This deploys airbyte-proxy automatically
   - Connects to shared_data_network
   - Uses shared PostgreSQL
3. **Configure Nginx Proxy Manager**
   - Add proxy host for `ai.sriphat.com`
   - Forward to `airbyte-proxy:8000`
   - Enable SSL
   - Add authentication (optional)
## Security Layers
1. **SSL/TLS**: Nginx Proxy Manager (Let's Encrypt)
2. **Authentication**: OAuth2 Proxy + Keycloak OR Basic Auth
3. **Network Isolation**: Docker network (shared_data_network)
4. **Firewall**: Only expose necessary ports
## Troubleshooting
### Cannot access via domain
- Check Nginx Proxy Manager is running
- Verify proxy host configuration
- Check DNS points to server
- Verify SSL certificate
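
The DNS check can be scripted. This sketch assumes a Linux host where `getent` is available and that the domain resolves directly to the server rather than to a CDN or another proxy in front of it; `resolves_to` and `check_dns` are hypothetical helper names.

```bash
# Succeed if any line of `getent hosts` output starts with the expected IP.
# $1: output of getent hosts, $2: expected IP address
resolves_to() {
  printf '%s\n' "$1" | awk -v ip="$2" '$1 == ip { found = 1 } END { exit !found }'
}

# Resolve the production domain and compare it with the server IP given as $1.
check_dns() {
  resolves_to "$(getent hosts ai.sriphat.com)" "$1"
}
```

Called as `check_dns [SERVER_IP]`, the function returns a non-zero status when the domain does not resolve or resolves to a different address.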
### Cannot access locally
- Check airbyte-proxy is running: `docker ps | grep airbyte-proxy`
- Verify port 8030 is mapped
- Check firewall allows port 8030
### Services cannot communicate
- Verify all containers are on `shared_data_network`
- Check network: `docker network inspect shared_data_network`
- Verify container names resolve (postgres, airbyte-proxy, etc.)