jigoong a587be08bd feat: MinIO integration — bucket finance, API service upload, Nginx routing
- 01-infra/nginx-configs: add MinIO /minio/ and /minio-console/ location blocks
  (port 9000 S3 API, port 9001 Console UI, path stripping via rewrite)
- 03-apiservice: integrate MinIO minio-python SDK for file upload
  - requirements.txt: add minio==7.2.11
  - app/core/config.py: add MINIO_ENDPOINT, ACCESS_KEY, SECRET_KEY, BUCKET_FINANCE, USE_SSL
  - app/services/minio_client.py: new — upload_file(), get_presigned_url(), delete_file()
  - app/routes/pages.py: replace local /data/uploads/ write with MinIO upload to finance bucket
  - docker-compose.yml: pass MinIO env vars to container
  - .env.example: document MinIO vars
- 07-minio/.env.example: add MINIO_SVC_ACCESS_KEY/SECRET_KEY section
- 07-minio/README.md: add Python minio SDK and Airflow DAG usage guide
- CLAUDE.md: project context (servers, SSH, paths, service distribution)
- document-obsidiant/: initial Obsidian docs for all services
2026-05-20 17:42:39 +07:00


# MinIO Object Storage Service
MinIO is a high-performance, S3-compatible object storage system. This setup provides persistent storage, HTTPS access through an Nginx reverse proxy, and Keycloak SSO integration.
## 🎯 Overview
**MinIO Features:**
- **S3-Compatible API** - Works with AWS S3 SDKs and tools
- **High Performance** - Optimized for large-scale data workloads
- **Distributed Storage** - Supports multi-node deployment
- **Web Console** - User-friendly web interface
- **Encryption** - Server-side and client-side encryption
- **Versioning** - Object versioning support
- **Lifecycle Management** - Automatic data retention policies
**This Setup Includes:**
- Docker Compose configuration
- Persistent storage with volume mounts
- HTTPS access via Nginx reverse proxy
- Keycloak SSO integration (OpenID Connect)
- Health checks and monitoring
## 📋 Prerequisites
- Docker and Docker Compose installed
- Network: `shared_data_network` created
- Nginx reverse proxy configured
- Keycloak instance running (for SSO)
- Server: 192.168.100.9
## 🚀 Quick Start
### **Step 1: Configure Environment**
```bash
cd 07-minio
# Copy example environment file
cp .env.example .env
# Edit .env with your settings
nano .env
```
**Required Configuration:**
```bash
# MinIO Credentials
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=your-secure-password-here
# Keycloak Integration
MINIO_IDENTITY_OPENID_CLIENT_SECRET=your-keycloak-client-secret
```
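Before starting the container, it can help to confirm that the required keys are actually filled in. The following is a minimal sketch, not part of this repository: `parse_env` and `find_problems` are hypothetical helpers, and the rule that a value starting with `your-` is an unedited placeholder is an assumption based on the example values above.

```python
from pathlib import Path

# Keys named in the "Required Configuration" block above.
REQUIRED_KEYS = (
    "MINIO_ROOT_USER",
    "MINIO_ROOT_PASSWORD",
    "MINIO_IDENTITY_OPENID_CLIENT_SECRET",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(env: dict[str, str]) -> list[str]:
    """Return required keys that are missing, empty, or still placeholders."""
    problems = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "")
        # Assumption: example values such as "your-secure-password-here"
        # mean the file has not been edited yet.
        if not value or value.startswith("your-"):
            problems.append(key)
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        print(find_problems(parse_env(env_file.read_text())) or "ok")
```

Run it from `07-minio/` after editing `.env`; an empty result means every required key has a non-placeholder value.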
### **Step 2: Create Data Directory**
```bash
# Create persistent storage directory
mkdir -p data
# Set permissions
chmod 755 data
```
### **Step 3: Start MinIO**
```bash
# Start service
docker compose up -d
# Check status
docker compose ps
# View logs
docker logs minio -f
```
### **Step 4: Configure Nginx Reverse Proxy**
Add the configuration from `nginx-minio.conf` to your Nginx Proxy Manager:
1. Go to Nginx Proxy Manager UI
2. Create/Edit Proxy Host for `ai.sriphat.com`
3. Add MinIO configuration to "Custom Nginx Configuration"
4. Save and test
### **Step 5: Set Up Keycloak Integration**
Follow the detailed guide in `KEYCLOAK_INTEGRATION.md`:
1. Create MinIO client in Keycloak
2. Configure client scopes and mappers
3. Add policy attributes to users
4. Update MinIO environment variables
5. Restart MinIO service
## 🌐 Access URLs
**MinIO Console (Web UI):**
```
https://ai.sriphat.com/minio-console
```
**MinIO API (S3 Compatible):**
```
https://ai.sriphat.com/minio
```
**Direct Access (Development):**
```
http://192.168.100.9:9001 (Console)
http://192.168.100.9:9000 (API)
```
## 🔑 Authentication
### **Option 1: Root Credentials (Default)**
Log in with the root credentials from `.env`:
- **Username**: Value of `MINIO_ROOT_USER`
- **Password**: Value of `MINIO_ROOT_PASSWORD`
### **Option 2: Keycloak SSO (Recommended)**
1. Click "Login with SSO" on MinIO Console
2. Authenticate with Keycloak
3. Access granted based on policy mapping
See `KEYCLOAK_INTEGRATION.md` for setup instructions.
## 📦 Using MinIO
### **Web Console**
1. Access: `https://ai.sriphat.com/minio-console`
2. Log in with credentials or SSO
3. Create buckets, upload files, manage access
### **MinIO Client (mc)**
```bash
# Install mc
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/
# Configure alias
mc alias set myminio https://ai.sriphat.com/minio minioadmin your-password
# List buckets
mc ls myminio
# Create bucket
mc mb myminio/my-bucket
# Upload file
mc cp myfile.txt myminio/my-bucket/
# Download file
mc cp myminio/my-bucket/myfile.txt ./
# List objects
mc ls myminio/my-bucket
# Remove object
mc rm myminio/my-bucket/myfile.txt
```
### **Python SDK (minio, recommended for the Sriphat Platform)**
Internal services use the `minio` package (the official MinIO Python SDK) instead of boto3:
```python
import io
from datetime import timedelta
from minio import Minio
# Connection: a service on another server uses the internal IP
client = Minio(
    endpoint="192.168.100.9:9000",  # internal IP, not the public URL
    access_key="sp_service_ac",
    secret_key="<MINIO_SVC_SECRET_KEY>",
    secure=False,  # plain HTTP inside the network
)
# Upload a file
with open("report.xlsx", "rb") as f:
    data = f.read()
client.put_object(
    bucket_name="finance",
    object_name="finance/20260520_report.xlsx",
    data=io.BytesIO(data),
    length=len(data),
    content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)
# Download a file (get_object returns an HTTP response object)
response = client.get_object("finance", "finance/20260520_report.xlsx")
data = response.read()
response.close()
response.release_conn()
# Download straight to a local file
client.fget_object("finance", "finance/20260520_report.xlsx", "/tmp/report.xlsx")
# List objects in the bucket
for obj in client.list_objects("finance", prefix="finance/", recursive=True):
    print(obj.object_name, obj.size)
# Generate a presigned URL (lets external users download the file; valid for 1 hour)
url = client.presigned_get_object("finance", "finance/20260520_report.xlsx", expires=timedelta(hours=1))
print(url)
```
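The object keys in this guide follow a `finance/<YYYYMMDD>_<HHMMSS>_<filename>` pattern, for example `finance/20260520_123000_report.xlsx`. A minimal sketch of how such a key could be built is shown here; `build_object_key` is a hypothetical helper, and stripping directory parts from the uploaded filename is an assumption, not something the API Service is confirmed to do.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_object_key(filename: str, uploaded_at: datetime, prefix: str = "finance") -> str:
    """Build a timestamped object key such as finance/20260520_123000_report.xlsx."""
    # Keep only the final path component so a client-supplied name like
    # "../../x.xlsx" or "C:\\tmp\\x.xlsx" cannot inject extra key segments.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if not safe_name:
        raise ValueError("filename has no usable name component")
    stamp = uploaded_at.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}/{stamp}_{safe_name}"


if __name__ == "__main__":
    print(build_object_key("report.xlsx", datetime(2026, 5, 20, 12, 30, 0)))
```

The returned string can be passed as `object_name` to `put_object()` and later as the `filepath` value in the DAG trigger conf.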
---
### **Airflow DAG: reading files from the MinIO finance bucket**
Airflow runs on server .9 (the same server as MinIO), so it can reach MinIO at the container name `minio:9000` or at `192.168.100.9:9000`:
```python
import io
from datetime import datetime
import pandas as pd
from airflow.decorators import dag, task
from airflow.models import Variable
from minio import Minio
# A fixed start_date replaces the deprecated airflow.utils.dates.days_ago helper
@dag(schedule=None, start_date=datetime(2026, 1, 1), catchup=False)
def process_finance_excel():
    @task
    def get_filepath(**context):
        # Read the object key from the DAG trigger conf
        conf = context["dag_run"].conf or {}
        return conf.get("filepath", "")
    @task
    def download_and_process(filepath: str):
        """
        filepath is the MinIO object key, e.g. "finance/20260520_123000_report.xlsx",
        sent by the API Service through the DAG trigger conf.
        """
        client = Minio(
            endpoint="minio:9000",  # container name on shared_data_network
            access_key="sp_service_ac",  # same service account as the API
            # Stored in Airflow Variables; Jinja templates are not rendered
            # inside a task body, so read the Variable directly
            secret_key=Variable.get("MINIO_SVC_SECRET_KEY"),
            secure=False,
        )
        # Download the file from MinIO
        response = client.get_object(bucket_name="finance", object_name=filepath)
        file_bytes = response.read()
        response.close()
        response.release_conn()
        # Process with pandas
        df = pd.read_excel(io.BytesIO(file_bytes))
        print(f"Loaded {len(df)} rows from {filepath}")
        # ... process data ...
        return {"rows": len(df), "filepath": filepath}
    fp = get_filepath()
    download_and_process(fp)
process_finance_excel()
```
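The DAG expects the object key in the trigger conf under `filepath`. The sketch below shows one way a caller could build that trigger request. It assumes Airflow 2.x with the stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`) and basic authentication enabled; the base URL, user name, and password are hypothetical placeholders, and this is not necessarily how the API Service triggers the DAG.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url: str, dag_id: str, filepath: str,
                          username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a DAG-run trigger request carrying the object key."""
    body = json.dumps({"conf": {"filepath": filepath}}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    request = build_trigger_request(
        "http://airflow:8080", "process_finance_excel",
        "finance/20260520_123000_report.xlsx", "airflow_user", "airflow_password",
    )
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen()` would send the request; the sketch stops short of that so it can be inspected without a running Airflow instance.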
**Airflow Connection setup (optional, for S3Hook)**
To use `S3Hook` or Airflow operators, create this connection:
```
Connection ID : minio_s3
Connection Type: Amazon Web Services
Extra (JSON) : {"endpoint_url": "http://minio:9000", "region_name": "ap-southeast-1"}
Login : sp_service_ac
Password : <MINIO_SVC_SECRET_KEY>
```
```python
import io
import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
# Uses the minio_s3 connection defined above
hook = S3Hook(aws_conn_id="minio_s3")
obj = hook.get_key(key="finance/20260520_report.xlsx", bucket_name="finance")
data = obj.get()["Body"].read()
df = pd.read_excel(io.BytesIO(data))
```
**Airflow Variables to create:**
| Key | Value |
|-----|-------|
| `MINIO_ENDPOINT` | `192.168.100.9:9000` |
| `MINIO_SVC_SECRET_KEY` | (see `07-minio/.env`) |
---
### **Python SDK (boto3)**
```python
import boto3
from botocore.client import Config
# Configure S3 client
s3 = boto3.client(
    's3',
    endpoint_url='https://ai.sriphat.com/minio',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='your-password',
    config=Config(signature_version='s3v4'),
    region_name='ap-southeast-1'
)
# List buckets
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])
# Upload file
s3.upload_file('myfile.txt', 'my-bucket', 'myfile.txt')
# Download file
s3.download_file('my-bucket', 'myfile.txt', 'downloaded.txt')
# List objects
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'])
```
### **AWS CLI**
```bash
# Configure AWS CLI
aws configure set aws_access_key_id minioadmin
aws configure set aws_secret_access_key your-password
aws configure set region ap-southeast-1
# List buckets
aws --endpoint-url https://ai.sriphat.com/minio s3 ls
# Create bucket
aws --endpoint-url https://ai.sriphat.com/minio s3 mb s3://my-bucket
# Upload file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp myfile.txt s3://my-bucket/
# Download file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp s3://my-bucket/myfile.txt ./
# Sync directory
aws --endpoint-url https://ai.sriphat.com/minio s3 sync ./mydir s3://my-bucket/mydir/
```
## 🔧 Configuration
### **Environment Variables**
| Variable | Description | Default |
|----------|-------------|---------|
| `MINIO_ROOT_USER` | Root username | minioadmin |
| `MINIO_ROOT_PASSWORD` | Root password | - |
| `MINIO_API_PORT` | API port | 9000 |
| `MINIO_CONSOLE_PORT` | Console port | 9001 |
| `MINIO_SERVER_URL` | API endpoint URL | - |
| `MINIO_BROWSER_REDIRECT_URL` | Console URL | - |
| `MINIO_REGION` | Default region | ap-southeast-1 |
### **Keycloak Integration**
| Variable | Description |
|----------|-------------|
| `MINIO_IDENTITY_OPENID_CONFIG_URL` | Keycloak OIDC config URL |
| `MINIO_IDENTITY_OPENID_CLIENT_ID` | Client ID in Keycloak |
| `MINIO_IDENTITY_OPENID_CLIENT_SECRET` | Client secret |
| `MINIO_IDENTITY_OPENID_CLAIM_NAME` | Policy claim name |
| `MINIO_IDENTITY_OPENID_SCOPES` | OIDC scopes |
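When SSO login succeeds but access is denied, it can be useful to look at which policies the token actually carries. The following is a rough sketch for debugging only: it decodes the JWT payload without verifying the signature, and it assumes the policy claim is named `policy` (the value configured in `MINIO_IDENTITY_OPENID_CLAIM_NAME` may differ) and holds either a comma-separated string or a list. Both function names are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped Base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def policies_from_token(token: str, claim_name: str = "policy") -> list[str]:
    """Return the policy names found in the given claim, or an empty list."""
    value = decode_jwt_payload(token).get(claim_name)
    if value is None:
        return []
    if isinstance(value, str):
        return [name.strip() for name in value.split(",") if name.strip()]
    return [str(name) for name in value]
```

An empty result suggests the Keycloak mapper is not adding the claim to the token, which is worth checking before changing anything on the MinIO side.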
### **Storage**
**Persistent Data:**
```
07-minio/data/ # Object storage data
07-minio/certs/ # SSL certificates (optional)
```
**Volume Mounts:**
```yaml
volumes:
- ./data:/data # Storage data
- ./certs:/root/.minio/certs:ro # SSL certs
```
## 🔒 Security
### **1. Strong Passwords**
```bash
# Generate strong password
openssl rand -base64 32
# Update .env
MINIO_ROOT_PASSWORD=generated-password-here
```
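Where `openssl` is not available, the same kind of value can be produced with Python's standard library. This is a small sketch of an equivalent, with `generate_password` as a hypothetical helper name: 32 random bytes from the operating system, Base64-encoded.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return num_bytes of OS-level randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    print(f"MINIO_ROOT_PASSWORD={generate_password()}")
```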
### **2. Network Security**
```bash
# Firewall rules (if needed)
sudo ufw allow from 192.168.100.0/24 to any port 9000
sudo ufw allow from 192.168.100.0/24 to any port 9001
```
### **3. HTTPS Only**
- Always use HTTPS in production
- Configure SSL certificates in Nginx
- Set `MINIO_SERVER_URL` and `MINIO_BROWSER_REDIRECT_URL` to HTTPS
### **4. Access Policies**
```bash
# Create read-only policy
mc admin policy create myminio readonly-policy readonly-policy.json
# Assign policy to user
mc admin policy attach myminio readonly-policy --user=username
```
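The `readonly-policy.json` file referenced above is not included in this guide. As an illustration only, the sketch below generates one plausible version using the standard S3 policy document format; the bucket name `finance` and the exact set of actions are assumptions, so review them against your own access requirements.

```python
import json


def readonly_policy(bucket: str) -> dict:
    """Return a policy document allowing list and read access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    with open("readonly-policy.json", "w", encoding="utf-8") as handle:
        json.dump(readonly_policy("finance"), handle, indent=2)
```

Running the script writes `readonly-policy.json` to the current directory, ready for the `mc admin policy create` command above.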
### **5. Bucket Policies**
```bash
# Set bucket policy (public read)
mc anonymous set download myminio/public-bucket
# Set bucket policy (private)
mc anonymous set none myminio/private-bucket
```
## 📊 Monitoring
### **Health Check**
```bash
# Check MinIO health
curl -k https://ai.sriphat.com/minio/health/live
# Check from container
docker exec minio curl -f http://localhost:9000/minio/health/live
```
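The same liveness probe can be scripted, for example from a monitoring job. This is a minimal sketch using only the standard library; `is_live` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without a running server.

```python
import urllib.request


def is_live(base_url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> bool:
    """Return True when GET <base_url>/minio/health/live answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/minio/health/live"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError both derive from OSError, so this
        # covers refused connections, timeouts, and non-2xx responses.
        return False


if __name__ == "__main__":
    print(is_live("http://192.168.100.9:9000"))
```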
### **Logs**
```bash
# View logs
docker logs minio -f
# View last 100 lines
docker logs minio --tail 100
# Export logs
docker logs minio > minio.log
```
### **Metrics**
```bash
# View server info
mc admin info myminio
# View server stats
mc admin prometheus metrics myminio
```
### **Disk Usage**
```bash
# Check disk usage
mc admin info myminio
# Check bucket size
mc du myminio/my-bucket
```
## 🐛 Troubleshooting
### **Issue: Cannot access MinIO Console**
**Check:**
```bash
# Verify container is running
docker ps | grep minio
# Check logs
docker logs minio
# Test direct access
curl http://192.168.100.9:9001
```
**Solution:**
- Ensure container is running: `docker compose up -d`
- Check firewall rules
- Verify Nginx configuration
### **Issue: SSO login not working**
**Check:**
```bash
# Verify Keycloak config
docker exec minio printenv | grep MINIO_IDENTITY_OPENID
# Test Keycloak connectivity
docker exec minio curl -k https://ai.sriphat.com/keycloak/realms/sriphat/.well-known/openid-configuration
```
**Solution:**
- Verify all Keycloak environment variables are set
- Check client secret is correct
- Ensure redirect URIs match in Keycloak
- See `KEYCLOAK_INTEGRATION.md` for detailed troubleshooting
### **Issue: Upload fails**
**Check:**
```bash
# Check disk space
df -h
# Check permissions
ls -la data/
```
**Solution:**
- Ensure sufficient disk space
- Check directory permissions: `chmod 755 data/`
- Increase `client_max_body_size` in Nginx
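The disk-space check can also run as a guard before large uploads. A small sketch follows, assuming a caller-chosen threshold; `has_free_space` is a hypothetical helper and the 1 GiB figure in the usage line is only an example.

```python
import shutil


def has_free_space(path: str, min_free_bytes: int, usage=shutil.disk_usage) -> bool:
    """True when the filesystem holding `path` has at least min_free_bytes free."""
    return usage(path).free >= min_free_bytes


if __name__ == "__main__":
    # Example threshold: 1 GiB free on the filesystem holding the current directory.
    print(has_free_space(".", 1024 ** 3))
```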
### **Issue: S3 API connection refused**
**Check:**
```bash
# Test API endpoint
curl -k https://ai.sriphat.com/minio/
# Test direct connection
curl http://192.168.100.9:9000/
```
**Solution:**
- Verify `MINIO_SERVER_URL` is set correctly
- Check Nginx proxy configuration
- Ensure port 9000 is accessible
## 🔄 Maintenance
### **Backup**
```bash
# Backup data directory
tar -czf minio-backup-$(date +%Y%m%d).tar.gz data/
# Backup to remote location
rsync -avz data/ user@backup-server:/backups/minio/
```
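For scheduled backups, the `tar` step can be expressed in Python. The sketch below produces the same `minio-backup-YYYYMMDD.tar.gz` naming as the command above; `backup_data_dir` is a hypothetical helper, and the source and destination paths are placeholders.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_data_dir(data_dir: Path, dest_dir: Path, today: date) -> Path:
    """Archive data_dir into dest_dir/minio-backup-YYYYMMDD.tar.gz and return the path."""
    archive = dest_dir / f"minio-backup-{today:%Y%m%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps the top-level folder name (e.g. "data/") inside the
        # archive, so extracting it recreates the directory in place.
        tar.add(data_dir, arcname=data_dir.name)
    return archive


if __name__ == "__main__":
    if Path("data").is_dir():
        print(backup_data_dir(Path("data"), Path("."), date.today()))
```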
### **Update MinIO**
```bash
# Pull latest image
docker compose pull
# Restart with new image
docker compose up -d
# Verify version
docker exec minio minio --version
```
### **Restore**
```bash
# Stop MinIO
docker compose down
# Restore data
tar -xzf minio-backup-20260325.tar.gz
# Start MinIO
docker compose up -d
```
## 📚 Documentation
- **MinIO Official Docs**: https://min.io/docs/minio/linux/
- **S3 API Reference**: https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Keycloak Integration**: See `KEYCLOAK_INTEGRATION.md`
- **Nginx Configuration**: See `nginx-minio.conf`
## 🎯 Use Cases
### **1. Data Lake Storage**
- Store raw data files (CSV, JSON, Parquet)
- Integrate with Spark, Pandas, Dask
- Version control for datasets
### **2. Backup Storage**
- Database backups
- Application backups
- Log archival
### **3. Media Storage**
- Images, videos, documents
- CDN integration
- Static website hosting
### **4. ML/AI Workflows**
- Model storage
- Training data storage
- Experiment artifacts
### **5. Application Storage**
- User uploads
- Generated reports
- Temporary files
## 🎉 Summary
**What You Have:**
- ✅ MinIO object storage service
- ✅ Persistent storage with volume mounts
- ✅ HTTPS access via Nginx reverse proxy
- ✅ Keycloak SSO integration ready
- ✅ S3-compatible API
- ✅ Web console for management
- ✅ Health checks and monitoring
**Access:**
- Console: `https://ai.sriphat.com/minio-console`
- API: `https://ai.sriphat.com/minio`
**Next Steps:**
1. Configure `.env` file
2. Start MinIO: `docker compose up -d`
3. Configure Nginx reverse proxy
4. Set up Keycloak integration (optional)
5. Create buckets and start using!
For detailed Keycloak SSO setup, see `KEYCLOAK_INTEGRATION.md` 🚀