# MinIO Object Storage Service

MinIO is a high-performance, S3-compatible object storage system. This setup includes persistent storage, HTTPS access via Nginx reverse proxy, and Keycloak SSO integration.

## 🎯 Overview

**MinIO Features:**
- **S3-Compatible API** - Works with AWS S3 SDKs and tools
- **High Performance** - Optimized for large-scale data workloads
- **Distributed Storage** - Supports multi-node deployment
- **Web Console** - User-friendly web interface
- **Encryption** - Server-side and client-side encryption
- **Versioning** - Object versioning support
- **Lifecycle Management** - Automatic data retention policies

**This Setup Includes:**
- Docker Compose configuration
- Persistent storage with volume mounts
- HTTPS access via Nginx reverse proxy
- Keycloak SSO integration (OpenID Connect)
- Health checks and monitoring

## 📋 Prerequisites

- Docker and Docker Compose installed
- Network: `shared_data_network` created
- Nginx reverse proxy configured
- Keycloak instance running (for SSO)
- Server: 192.168.100.9

## 🚀 Quick Start

### **Step 1: Configure Environment**

```bash
cd 07-minio

# Copy example environment file
cp .env.example .env

# Edit .env with your settings
nano .env
```

**Required Configuration:**
```bash
# MinIO Credentials
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=your-secure-password-here

# Keycloak Integration
MINIO_IDENTITY_OPENID_CLIENT_SECRET=your-keycloak-client-secret
```

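
Before starting the container it can help to sanity-check the values in `.env`. The sketch below is one possible check, assuming MinIO's documented minimums (root user of at least 3 characters, root password of at least 8). `parse_env` and `check_minio_env` are hypothetical helper names and are not part of this repository.

```python
from __future__ import annotations

# Placeholder values copied from the example configuration above.
PLACEHOLDERS = {"your-secure-password-here", "your-keycloak-client-secret"}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_minio_env(values: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    problems: list[str] = []
    user = values.get("MINIO_ROOT_USER", "")
    password = values.get("MINIO_ROOT_PASSWORD", "")
    secret = values.get("MINIO_IDENTITY_OPENID_CLIENT_SECRET", "")
    if len(user) < 3:
        problems.append("MINIO_ROOT_USER must be at least 3 characters")
    if len(password) < 8:
        problems.append("MINIO_ROOT_PASSWORD must be at least 8 characters")
    if password in PLACEHOLDERS:
        problems.append("MINIO_ROOT_PASSWORD still holds the example value")
    if not secret or secret in PLACEHOLDERS:
        problems.append("MINIO_IDENTITY_OPENID_CLIENT_SECRET is not set")
    return problems
```

Reading the file and calling `check_minio_env(parse_env(text))` before `docker compose up -d` would surface placeholder values that were never replaced.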
### **Step 2: Create Data Directory**

```bash
# Create persistent storage directory
mkdir -p data

# Set permissions
chmod 755 data
```

### **Step 3: Start MinIO**

```bash
# Start service
docker compose up -d

# Check status
docker compose ps

# View logs
docker logs minio -f
```

### **Step 4: Configure Nginx Reverse Proxy**

Add the configuration from `nginx-minio.conf` to your Nginx Proxy Manager:

1. Go to Nginx Proxy Manager UI
2. Create/Edit Proxy Host for `ai.sriphat.com`
3. Add MinIO configuration to "Custom Nginx Configuration"
4. Save and test

### **Step 5: Set Up Keycloak Integration**

Follow the detailed guide in `KEYCLOAK_INTEGRATION.md`:

1. Create MinIO client in Keycloak
2. Configure client scopes and mappers
3. Add policy attributes to users
4. Update MinIO environment variables
5. Restart MinIO service

## 🌐 Access URLs

**MinIO Console (Web UI):**
```
https://ai.sriphat.com/minio-console
```

**MinIO API (S3 Compatible):**
```
https://ai.sriphat.com/minio
```

**Direct Access (Development):**
```
http://192.168.100.9:9001  (Console)
http://192.168.100.9:9000  (API)
```

## 🔑 Authentication

### **Option 1: Root Credentials (Default)**

Login with root credentials from `.env`:
- **Username**: Value of `MINIO_ROOT_USER`
- **Password**: Value of `MINIO_ROOT_PASSWORD`

### **Option 2: Keycloak SSO (Recommended)**

1. Click "Login with SSO" on MinIO Console
2. Authenticate with Keycloak
3. Access granted based on policy mapping

See `KEYCLOAK_INTEGRATION.md` for setup instructions.

## 📦 Using MinIO

### **Web Console**

1. Access: `https://ai.sriphat.com/minio-console`
2. Login with credentials or SSO
3. Create buckets, upload files, manage access

### **MinIO Client (mc)**

```bash
# Install mc
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/

# Configure alias
mc alias set myminio https://ai.sriphat.com/minio minioadmin your-password

# List buckets
mc ls myminio

# Create bucket
mc mb myminio/my-bucket

# Upload file
mc cp myfile.txt myminio/my-bucket/

# Download file
mc cp myminio/my-bucket/myfile.txt ./

# List objects
mc ls myminio/my-bucket

# Remove object
mc rm myminio/my-bucket/myfile.txt
```

### **Python SDK (minio, recommended for the Sriphat Platform)**

For internal services, use the `minio` package (the official MinIO Python SDK) instead of boto3:

```python
import io
from datetime import timedelta

from minio import Minio

# Connection: use the internal IP when calling from a service on another server
client = Minio(
    endpoint="192.168.100.9:9000",  # internal IP, not the public URL
    access_key="sp_service_ac",
    secret_key="<MINIO_SVC_SECRET_KEY>",
    secure=False,  # plain HTTP inside the internal network
)

# Upload file
with open("report.xlsx", "rb") as f:
    data = f.read()
client.put_object(
    bucket_name="finance",
    object_name="finance/20260520_report.xlsx",
    data=io.BytesIO(data),
    length=len(data),
    content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)

# Download file (get_object returns an HTTP response object)
response = client.get_object("finance", "finance/20260520_report.xlsx")
try:
    data = response.read()
finally:
    # Always release the connection, even if read() fails
    response.close()
    response.release_conn()

# Download straight to a local file
client.fget_object("finance", "finance/20260520_report.xlsx", "/tmp/report.xlsx")

# List objects in the bucket
for obj in client.list_objects("finance", prefix="finance/", recursive=True):
    print(obj.object_name, obj.size)

# Generate a presigned URL (lets external clients download; valid for 1 hour)
url = client.presigned_get_object(
    "finance", "finance/20260520_report.xlsx", expires=timedelta(hours=1)
)
print(url)
```

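
The object keys in these examples follow a `finance/<timestamp>_<filename>` pattern. A minimal sketch of how such a key could be built is shown below. `build_object_key` is a hypothetical helper, and the exact naming rule used by the API service is an assumption inferred from the example keys.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_object_key(filename: str, uploaded_at: datetime, prefix: str = "finance") -> str:
    """Return '<prefix>/<YYYYmmdd_HHMMSS>_<basename>' for an uploaded file."""
    # Keep only the base name so client-supplied paths cannot escape the prefix.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    stamp = uploaded_at.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}/{stamp}_{safe_name}"
```

Prefixing the timestamp keeps keys unique per upload and makes `list_objects(..., prefix="finance/")` return them in roughly chronological order.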
---

### **Airflow DAG: Read Files from the MinIO finance Bucket**

Airflow runs on server .9 (the same server as MinIO), so it can reach MinIO by container name at `minio:9000` or at `192.168.100.9:9000`:

```python
import io

import pandas as pd
from airflow.decorators import dag, task
from airflow.models import Variable
from airflow.utils.dates import days_ago
from minio import Minio


@dag(schedule=None, start_date=days_ago(1), catchup=False)
def process_finance_excel():

    @task
    def download_and_process(filepath: str, **context):
        """
        filepath = MinIO object key, e.g. "finance/20260520_123000_report.xlsx",
        sent by the API Service through the DAG trigger conf.
        """
        client = Minio(
            endpoint="minio:9000",       # container name on shared_data_network
            access_key="sp_service_ac",  # same service account as the API
            # Jinja templates are not rendered inside task code, so read the
            # secret from Airflow Variables explicitly.
            secret_key=Variable.get("MINIO_SVC_SECRET_KEY"),
            secure=False,
        )

        # Download the file from MinIO
        response = client.get_object(bucket_name="finance", object_name=filepath)
        try:
            file_bytes = response.read()
        finally:
            response.close()
            response.release_conn()

        # Process with pandas
        df = pd.read_excel(io.BytesIO(file_bytes))
        print(f"Loaded {len(df)} rows from {filepath}")

        # ... process data ...
        return {"rows": len(df), "filepath": filepath}

    @task
    def get_filepath(**context):
        # Read the object key passed in through the DAG trigger conf
        conf = context["dag_run"].conf or {}
        return conf.get("filepath", "")

    fp = get_filepath()
    download_and_process(fp)


process_finance_excel()
```

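
The DAG expects the object key in the trigger `conf`. As a rough sketch of the sending side, the helper below builds the request for the Airflow 2.x stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). `build_dag_run_request` is a hypothetical name, the base URL is an assumption about where the Airflow webserver is reachable, and authentication is deliberately left out.

```python
from __future__ import annotations

import json


def build_dag_run_request(airflow_base_url: str, dag_id: str, object_key: str) -> tuple[str, bytes]:
    """Return (url, json_body) for triggering a DAG run with a filepath conf."""
    url = f"{airflow_base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # The DAG reads this value via context["dag_run"].conf["filepath"].
    body = json.dumps({"conf": {"filepath": object_key}}).encode("utf-8")
    return url, body
```

The returned pair can be sent with any HTTP client using `Content-Type: application/json` and whatever authentication the Airflow deployment requires.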
**Configure an Airflow Connection (optional, for S3Hook)**

If you prefer to use `S3Hook` or the Airflow S3 operators:
```
Connection ID  : minio_s3
Connection Type: Amazon Web Services
Extra (JSON)   : {"endpoint_url": "http://minio:9000", "region_name": "ap-southeast-1"}
Login          : sp_service_ac
Password       : <MINIO_SVC_SECRET_KEY>
```

```python
import io

import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

hook = S3Hook(aws_conn_id="minio_s3")
obj = hook.get_key(key="finance/20260520_report.xlsx", bucket_name="finance")
data = obj.get()["Body"].read()
df = pd.read_excel(io.BytesIO(data))
```

**Airflow Variables to create:**

| Key | Value |
|-----|-------|
| `MINIO_ENDPOINT` | `192.168.100.9:9000` |
| `MINIO_SVC_SECRET_KEY` | (see `07-minio/.env`) |

---

### **Python SDK (boto3)**

```python
import boto3
from botocore.client import Config

# Configure S3 client
s3 = boto3.client(
    's3',
    endpoint_url='https://ai.sriphat.com/minio',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='your-password',
    config=Config(signature_version='s3v4'),
    region_name='ap-southeast-1'
)

# List buckets
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])

# Upload file
s3.upload_file('myfile.txt', 'my-bucket', 'myfile.txt')

# Download file
s3.download_file('my-bucket', 'myfile.txt', 'downloaded.txt')

# List objects
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'])
```

### **AWS CLI**

```bash
# Configure AWS CLI
aws configure set aws_access_key_id minioadmin
aws configure set aws_secret_access_key your-password
aws configure set region ap-southeast-1

# List buckets
aws --endpoint-url https://ai.sriphat.com/minio s3 ls

# Create bucket
aws --endpoint-url https://ai.sriphat.com/minio s3 mb s3://my-bucket

# Upload file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp myfile.txt s3://my-bucket/

# Download file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp s3://my-bucket/myfile.txt ./

# Sync directory
aws --endpoint-url https://ai.sriphat.com/minio s3 sync ./mydir s3://my-bucket/mydir/
```

## 🔧 Configuration

### **Environment Variables**

| Variable | Description | Default |
|----------|-------------|---------|
| `MINIO_ROOT_USER` | Root username | minioadmin |
| `MINIO_ROOT_PASSWORD` | Root password | - |
| `MINIO_API_PORT` | API port | 9000 |
| `MINIO_CONSOLE_PORT` | Console port | 9001 |
| `MINIO_SERVER_URL` | API endpoint URL | - |
| `MINIO_BROWSER_REDIRECT_URL` | Console URL | - |
| `MINIO_REGION` | Default region | ap-southeast-1 |

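
If client code needs the same values, the defaults in the table could be mirrored roughly as follows. This is only a sketch: `MinioSettings` and `load_settings` are illustrative names, and it assumes the listed defaults are what applies when a variable is unset.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MinioSettings:
    root_user: str
    api_port: int
    console_port: int
    region: str
    server_url: Optional[str]


def load_settings(env: Mapping[str, str]) -> MinioSettings:
    """Build settings from an environment mapping, falling back to the table defaults."""
    return MinioSettings(
        root_user=env.get("MINIO_ROOT_USER", "minioadmin"),
        api_port=int(env.get("MINIO_API_PORT", "9000")),
        console_port=int(env.get("MINIO_CONSOLE_PORT", "9001")),
        region=env.get("MINIO_REGION", "ap-southeast-1"),
        server_url=env.get("MINIO_SERVER_URL"),  # no default in the table
    )
```

Passing `os.environ` as `env` reads the live process environment; passing a plain dict keeps the function easy to test.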
### **Keycloak Integration**

| Variable | Description |
|----------|-------------|
| `MINIO_IDENTITY_OPENID_CONFIG_URL` | Keycloak OIDC config URL |
| `MINIO_IDENTITY_OPENID_CLIENT_ID` | Client ID in Keycloak |
| `MINIO_IDENTITY_OPENID_CLIENT_SECRET` | Client secret |
| `MINIO_IDENTITY_OPENID_CLAIM_NAME` | Policy claim name |
| `MINIO_IDENTITY_OPENID_SCOPES` | OIDC scopes |

### **Storage**

**Persistent Data:**
```
07-minio/data/          # Object storage data
07-minio/certs/         # SSL certificates (optional)
```

**Volume Mounts:**
```yaml
volumes:
  - ./data:/data                      # Storage data
  - ./certs:/root/.minio/certs:ro     # SSL certs
```

## 🔒 Security

### **1. Strong Passwords**

```bash
# Generate strong password
openssl rand -base64 32

# Update .env
MINIO_ROOT_PASSWORD=generated-password-here
```

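
Where `openssl` is not available, a roughly equivalent password can be produced with the Python standard library. The sketch below mirrors `openssl rand -base64 32` (32 random bytes, Base64-encoded). `generate_password` is an illustrative name.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

With the default of 32 bytes the result is 44 characters long, the same length `openssl rand -base64 32` prints.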
### **2. Network Security**

```bash
# Firewall rules (if needed)
sudo ufw allow from 192.168.100.0/24 to any port 9000
sudo ufw allow from 192.168.100.0/24 to any port 9001
```

### **3. HTTPS Only**

- Always use HTTPS in production
- Configure SSL certificates in Nginx
- Set `MINIO_SERVER_URL` and `MINIO_BROWSER_REDIRECT_URL` to HTTPS

### **4. Access Policies**

```bash
# Create read-only policy
mc admin policy create myminio readonly-policy readonly-policy.json

# Assign policy to user
mc admin policy attach myminio readonly-policy --user=username
```

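
The `readonly-policy.json` file referenced above is not shown in this README. One way to generate a plausible version is sketched below, assuming "read-only" means listing a single bucket and downloading its objects. The bucket name `finance` is borrowed from the SDK examples, and `readonly_policy` is a hypothetical helper.

```python
import json


def readonly_policy(bucket: str) -> dict:
    """Return an S3-style IAM policy that allows listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy_json = json.dumps(readonly_policy("finance"), indent=2)
```

Writing `policy_json` to `readonly-policy.json` gives a file that can be passed to the `mc admin policy create` command.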
### **5. Bucket Policies**

```bash
# Set bucket policy (public read)
mc anonymous set download myminio/public-bucket

# Set bucket policy (private)
mc anonymous set none myminio/private-bucket
```

## 📊 Monitoring

### **Health Check**

```bash
# Check MinIO health
curl -k https://ai.sriphat.com/minio/health/live

# Check from container
docker exec minio curl -f http://localhost:9000/minio/health/live
```

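
The same liveness probe can be scripted, for example from a monitoring job. The sketch below uses only the standard library and treats any connection error or non-200 response as "not live". `health_url` and `is_live` are illustrative names, and the base URL passed in is whatever address reaches the MinIO API directly.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Append MinIO's liveness path to an API base URL."""
    return base_url.rstrip("/") + "/minio/health/live"


def is_live(base_url: str, timeout: float = 3.0) -> bool:
    """Return True only if the liveness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False
```

For example, `is_live("http://192.168.100.9:9000")` checks the direct API port from inside the network.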
### **Logs**

```bash
# View logs
docker logs minio -f

# View last 100 lines
docker logs minio --tail 100

# Export logs
docker logs minio > minio.log
```

### **Metrics**

```bash
# View server info
mc admin info myminio

# View server stats
mc admin prometheus metrics myminio
```

### **Disk Usage**

```bash
# Check disk usage
mc admin info myminio

# Check bucket size
mc du myminio/my-bucket
```

## 🐛 Troubleshooting

### **Issue: Cannot access MinIO Console**

**Check:**
```bash
# Verify container is running
docker ps | grep minio

# Check logs
docker logs minio

# Test direct access
curl http://192.168.100.9:9001
```

**Solution:**
- Ensure container is running: `docker compose up -d`
- Check firewall rules
- Verify Nginx configuration

### **Issue: SSO login not working**

**Check:**
```bash
# Verify Keycloak config
docker exec minio printenv | grep MINIO_IDENTITY_OPENID

# Test Keycloak connectivity
docker exec minio curl -k https://ai.sriphat.com/keycloak/realms/sriphat/.well-known/openid-configuration
```

**Solution:**
- Verify all Keycloak environment variables are set
- Check client secret is correct
- Ensure redirect URIs match in Keycloak
- See `KEYCLOAK_INTEGRATION.md` for detailed troubleshooting

### **Issue: Upload fails**

**Check:**
```bash
# Check disk space
df -h

# Check permissions
ls -la data/
```

**Solution:**
- Ensure sufficient disk space
- Check directory permissions: `chmod 755 data/`
- Increase `client_max_body_size` in Nginx

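
The disk-space check can also be done programmatically before a large upload. This is a small sketch using the standard library. `has_free_space` is a hypothetical helper, and the threshold is whatever margin suits the deployment.

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least required_bytes free."""
    usage = shutil.disk_usage(path)
    return usage.free >= required_bytes
```

Calling it with the `data/` directory and the size of the pending upload gives an early warning before MinIO rejects the write.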
### **Issue: S3 API connection refused**

**Check:**
```bash
# Test API endpoint
curl -k https://ai.sriphat.com/minio/

# Test direct connection
curl http://192.168.100.9:9000/
```

**Solution:**
- Verify `MINIO_SERVER_URL` is set correctly
- Check Nginx proxy configuration
- Ensure port 9000 is accessible

## 🔄 Maintenance

### **Backup**

```bash
# Backup data directory
tar -czf minio-backup-$(date +%Y%m%d).tar.gz data/

# Backup to remote location
rsync -avz data/ user@backup-server:/backups/minio/
```

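
For scheduled backups the same archive can be produced from Python. The sketch below follows the `minio-backup-YYYYMMDD.tar.gz` naming used in the `tar` command. `backup_name` and `create_backup` are illustrative names, and the sketch assumes writes to MinIO are paused while the archive is created.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_name(day: date) -> str:
    """Return the archive file name for a given day."""
    return f"minio-backup-{day:%Y%m%d}.tar.gz"


def create_backup(data_dir: Path, dest_dir: Path, day: date) -> Path:
    """Write a gzip-compressed tar of data_dir into dest_dir and return its path."""
    archive = dest_dir / backup_name(day)
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries under the directory's own name, as `tar -czf ... data/` does.
        tar.add(data_dir, arcname=data_dir.name)
    return archive
```

The resulting archive unpacks to a `data/` directory in the current working directory.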
### **Update MinIO**

```bash
# Pull latest image
docker compose pull

# Restart with new image
docker compose up -d

# Verify version
docker exec minio minio --version
```

### **Restore**

```bash
# Stop MinIO
docker compose down

# Restore data
tar -xzf minio-backup-20260325.tar.gz

# Start MinIO
docker compose up -d
```

## 📚 Documentation

- **MinIO Official Docs**: https://min.io/docs/minio/linux/
- **S3 API Reference**: https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Keycloak Integration**: See `KEYCLOAK_INTEGRATION.md`
- **Nginx Configuration**: See `nginx-minio.conf`

## 🎯 Use Cases

### **1. Data Lake Storage**
- Store raw data files (CSV, JSON, Parquet)
- Integrate with Spark, Pandas, Dask
- Version control for datasets

### **2. Backup Storage**
- Database backups
- Application backups
- Log archival

### **3. Media Storage**
- Images, videos, documents
- CDN integration
- Static website hosting

### **4. ML/AI Workflows**
- Model storage
- Training data storage
- Experiment artifacts

### **5. Application Storage**
- User uploads
- Generated reports
- Temporary files

## 🎉 Summary

**What You Have:**
- ✅ MinIO object storage service
- ✅ Persistent storage with volume mounts
- ✅ HTTPS access via Nginx reverse proxy
- ✅ Keycloak SSO integration ready
- ✅ S3-compatible API
- ✅ Web console for management
- ✅ Health checks and monitoring

**Access:**
- Console: `https://ai.sriphat.com/minio-console`
- API: `https://ai.sriphat.com/minio`

**Next Steps:**
1. Configure `.env` file
2. Start MinIO: `docker compose up -d`
3. Configure Nginx reverse proxy
4. Set up Keycloak integration (optional)
5. Create buckets and start using!

For detailed Keycloak SSO setup, see `KEYCLOAK_INTEGRATION.md` 🚀