# MinIO Object Storage Service

MinIO is a high-performance, S3-compatible object storage system. This setup includes persistent storage, HTTPS access via an Nginx reverse proxy, and Keycloak SSO integration.

## 🎯 Overview

**MinIO Features:**

- **S3-Compatible API** - Works with AWS S3 SDKs and tools
- **High Performance** - Optimized for large-scale data workloads
- **Distributed Storage** - Supports multi-node deployment
- **Web Console** - User-friendly web interface
- **Encryption** - Server-side and client-side encryption
- **Versioning** - Object versioning support
- **Lifecycle Management** - Automatic data retention policies

**This Setup Includes:**

- Docker Compose configuration
- Persistent storage with volume mounts
- HTTPS access via Nginx reverse proxy
- Keycloak SSO integration (OpenID Connect)
- Health checks and monitoring

## 📋 Prerequisites

- Docker and Docker Compose installed
- Network: `shared_data_network` created
- Nginx reverse proxy configured
- Keycloak instance running (for SSO)
- Server: 192.168.100.9

## 🚀 Quick Start

### **Step 1: Configure Environment**

```bash
cd 07-minio

# Copy example environment file
cp .env.example .env

# Edit .env with your settings
nano .env
```

**Required Configuration:**

```bash
# MinIO Credentials
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=your-secure-password-here

# Keycloak Integration
MINIO_IDENTITY_OPENID_CLIENT_SECRET=your-keycloak-client-secret
```

### **Step 2: Create Data Directory**

```bash
# Create persistent storage directory
mkdir -p data

# Set permissions
chmod 755 data
```

### **Step 3: Start MinIO**

```bash
# Start service
docker compose up -d

# Check status
docker compose ps

# View logs
docker logs minio -f
```

### **Step 4: Configure Nginx Reverse Proxy**

Add the configuration from `nginx-minio.conf` to your Nginx Proxy Manager:

1. Go to the Nginx Proxy Manager UI
2. Create or edit the Proxy Host for `ai.sriphat.com`
3. Add the MinIO configuration to "Custom Nginx Configuration"
4. Save and test

### **Step 5: Setup Keycloak Integration**

Follow the detailed guide in `KEYCLOAK_INTEGRATION.md`:

1. Create the MinIO client in Keycloak
2. Configure client scopes and mappers
3. Add policy attributes to users
4. Update the MinIO environment variables
5. Restart the MinIO service

## 🌐 Access URLs

**MinIO Console (Web UI):**

```
https://ai.sriphat.com/minio-console
```

**MinIO API (S3 Compatible):**

```
https://ai.sriphat.com/minio
```

**Direct Access (Development):**

```
http://192.168.100.9:9001  (Console)
http://192.168.100.9:9000  (API)
```

## 🔑 Authentication

### **Option 1: Root Credentials (Default)**

Log in with the root credentials from `.env`:

- **Username**: Value of `MINIO_ROOT_USER`
- **Password**: Value of `MINIO_ROOT_PASSWORD`

### **Option 2: Keycloak SSO (Recommended)**

1. Click "Login with SSO" on the MinIO Console
2. Authenticate with Keycloak
3. Access is granted based on the policy mapping

See `KEYCLOAK_INTEGRATION.md` for setup instructions.

## 📦 Using MinIO

### **Web Console**

1. Access: `https://ai.sriphat.com/minio-console`
2. Log in with credentials or SSO
3. Create buckets, upload files, manage access

### **MinIO Client (mc)**

```bash
# Install mc
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/

# Configure alias
mc alias set myminio https://ai.sriphat.com/minio minioadmin your-password

# List buckets
mc ls myminio

# Create bucket
mc mb myminio/my-bucket

# Upload file
mc cp myfile.txt myminio/my-bucket/

# Download file
mc cp myminio/my-bucket/myfile.txt ./

# List objects
mc ls myminio/my-bucket

# Remove object
mc rm myminio/my-bucket/myfile.txt
```

### **Python SDK (minio, recommended for the Sriphat Platform)**

Use the `minio` package (the official MinIO Python SDK) instead of boto3 for internal services:

```python
import io
from datetime import timedelta

from minio import Minio

# Connection: use the internal IP when calling from a service on another server
client = Minio(
    endpoint="192.168.100.9:9000",  # internal IP, not the public URL
    access_key="sp_service_ac",
    secret_key="",  # left blank here; supply the service account secret (see `07-minio/.env`)
    secure=False,  # plain HTTP inside the internal network
)

# Upload file
with open("report.xlsx", "rb") as f:
    data = f.read()

client.put_object(
    bucket_name="finance",
    object_name="finance/20260520_report.xlsx",
    data=io.BytesIO(data),
    length=len(data),
    content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)

# Download file (returns an HTTPResponse); always release the connection
response = client.get_object("finance", "finance/20260520_report.xlsx")
try:
    data = response.read()
finally:
    response.close()
    response.release_conn()

# Download straight to a local file
client.fget_object("finance", "finance/20260520_report.xlsx", "/tmp/report.xlsx")

# List objects in the bucket
for obj in client.list_objects("finance", prefix="finance/", recursive=True):
    print(obj.object_name, obj.size)

# Generate a presigned URL (lets external users download; valid for 1 hour)
url = client.presigned_get_object("finance", "finance/report.xlsx", expires=timedelta(hours=1))
print(url)
```

---

### **Airflow DAG: read a file from the MinIO finance bucket**

Airflow runs on server .9 (the same server as MinIO), so use the container name `minio:9000` or `192.168.100.9:9000`:

```python
import io

import pandas as pd
from airflow.decorators import dag, task
from airflow.models import Variable
from airflow.utils.dates import days_ago
from minio import Minio


@dag(schedule=None, start_date=days_ago(1), catchup=False)
def process_finance_excel():

    @task
    def download_and_process(filepath: str, **context):
        """
        filepath is the MinIO object key, e.g. "finance/20260520_123000_report.xlsx".
        It is passed in by the API Service through the DAG trigger conf.
        """
        client = Minio(
            endpoint="minio:9000",       # container name on shared_data_network
            access_key="sp_service_ac",  # same service account as the API
            # Stored in Airflow Variables; read at run time because Jinja
            # templates are not rendered inside a task function body
            secret_key=Variable.get("MINIO_SVC_SECRET_KEY"),
            secure=False,
        )

        # Download the file from MinIO
        response = client.get_object(bucket_name="finance", object_name=filepath)
        try:
            file_bytes = response.read()
        finally:
            response.close()
            response.release_conn()

        # Process with pandas
        df = pd.read_excel(io.BytesIO(file_bytes))
        print(f"Loaded {len(df)} rows from {filepath}")
        # ... process data ...
        return {"rows": len(df), "filepath": filepath}

    @task
    def get_filepath(**context):
        conf = context["dag_run"].conf or {}
        return conf.get("filepath", "")

    fp = get_filepath()
    download_and_process(fp)


process_finance_excel()
```

**Configure an Airflow Connection (optional, for S3Hook)**

If you want to use `S3Hook` or Airflow Operators:

```
Connection ID  : minio_s3
Connection Type: Amazon Web Services
Extra (JSON)   : {"endpoint_url": "http://minio:9000", "region_name": "ap-southeast-1"}
Login          : sp_service_ac
Password       :
```

```python
import io

import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

hook = S3Hook(aws_conn_id="minio_s3")
obj = hook.get_key(key="finance/20260520_report.xlsx", bucket_name="finance")
data = obj.get()["Body"].read()
df = pd.read_excel(io.BytesIO(data))
```

**Airflow Variables to create:**

| Key | Value |
|-----|-------|
| `MINIO_ENDPOINT` | `192.168.100.9:9000` |
| `MINIO_SVC_SECRET_KEY` | (see `07-minio/.env`) |

---

### **Python SDK (boto3)**

```python
import boto3
from botocore.client import Config

# Configure S3 client
s3 = boto3.client(
    's3',
    endpoint_url='https://ai.sriphat.com/minio',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='your-password',
    config=Config(signature_version='s3v4'),
    region_name='ap-southeast-1'
)

# List buckets
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])

# Upload file
s3.upload_file('myfile.txt', 'my-bucket', 'myfile.txt')

# Download file
s3.download_file('my-bucket', 'myfile.txt', 'downloaded.txt')

# List objects
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'])
```

### **AWS CLI**

```bash
# Configure AWS CLI
aws configure set aws_access_key_id minioadmin
aws configure set aws_secret_access_key your-password
aws configure set region ap-southeast-1

# List buckets
aws --endpoint-url https://ai.sriphat.com/minio s3 ls

# Create bucket
aws --endpoint-url https://ai.sriphat.com/minio s3 mb s3://my-bucket

# Upload file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp myfile.txt s3://my-bucket/

# Download file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp s3://my-bucket/myfile.txt ./

# Sync directory
aws --endpoint-url https://ai.sriphat.com/minio s3 sync ./mydir s3://my-bucket/mydir/
```

## 🔧 Configuration

### **Environment Variables**

| Variable | Description | Default |
|----------|-------------|---------|
| `MINIO_ROOT_USER` | Root username | minioadmin |
| `MINIO_ROOT_PASSWORD` | Root password | - |
| `MINIO_API_PORT` | API port | 9000 |
| `MINIO_CONSOLE_PORT` | Console port | 9001 |
| `MINIO_SERVER_URL` | API endpoint URL | - |
| `MINIO_BROWSER_REDIRECT_URL` | Console URL | - |
| `MINIO_REGION` | Default region | ap-southeast-1 |

### **Keycloak Integration**

| Variable | Description |
|----------|-------------|
| `MINIO_IDENTITY_OPENID_CONFIG_URL` | Keycloak OIDC config URL |
| `MINIO_IDENTITY_OPENID_CLIENT_ID` | Client ID in Keycloak |
| `MINIO_IDENTITY_OPENID_CLIENT_SECRET` | Client secret |
| `MINIO_IDENTITY_OPENID_CLAIM_NAME` | Policy claim name |
| `MINIO_IDENTITY_OPENID_SCOPES` | OIDC scopes |

### **Storage**

**Persistent Data:**

```
07-minio/data/     # Object storage data
07-minio/certs/    # SSL certificates (optional)
```

**Volume Mounts:**

```yaml
volumes:
  - ./data:/data                    # Storage data
  - ./certs:/root/.minio/certs:ro   # SSL certs
```

## 🔒 Security

### **1. Strong Passwords**

```bash
# Generate strong password
openssl rand -base64 32

# Update .env
MINIO_ROOT_PASSWORD=generated-password-here
```

### **2. Network Security**

```bash
# Firewall rules (if needed)
sudo ufw allow from 192.168.100.0/24 to any port 9000
sudo ufw allow from 192.168.100.0/24 to any port 9001
```

### **3. HTTPS Only**

- Always use HTTPS in production
- Configure SSL certificates in Nginx
- Set `MINIO_SERVER_URL` and `MINIO_BROWSER_REDIRECT_URL` to HTTPS URLs

### **4. Access Policies**

```bash
# Create read-only policy
mc admin policy create myminio readonly-policy readonly-policy.json

# Assign policy to user
mc admin policy attach myminio readonly-policy --user=username
```

### **5. Bucket Policies**

```bash
# Set bucket policy (public read)
mc anonymous set download myminio/public-bucket

# Set bucket policy (private)
mc anonymous set none myminio/private-bucket
```

## 📊 Monitoring

### **Health Check**

```bash
# Check MinIO health
curl -k https://ai.sriphat.com/minio/health/live

# Check from container
docker exec minio curl -f http://localhost:9000/minio/health/live
```

### **Logs**

```bash
# View logs
docker logs minio -f

# View last 100 lines
docker logs minio --tail 100

# Export logs
docker logs minio > minio.log
```

### **Metrics**

```bash
# View server info
mc admin info myminio

# View server stats
mc admin prometheus metrics myminio
```

### **Disk Usage**

```bash
# Check disk usage
mc admin info myminio

# Check bucket size
mc du myminio/my-bucket
```

## 🐛 Troubleshooting

### **Issue: Cannot access MinIO Console**

**Check:**

```bash
# Verify container is running
docker ps | grep minio

# Check logs
docker logs minio

# Test direct access
curl http://192.168.100.9:9001
```

**Solution:**

- Ensure the container is running: `docker compose up -d`
- Check firewall rules
- Verify the Nginx configuration

### **Issue: SSO login not working**

**Check:**

```bash
# Verify Keycloak config
docker exec minio printenv | grep MINIO_IDENTITY_OPENID

# Test Keycloak connectivity
docker exec minio curl -k https://ai.sriphat.com/keycloak/realms/sriphat/.well-known/openid-configuration
```

**Solution:**

- Verify all Keycloak environment variables are set
- Check that the client secret is correct
- Ensure the redirect URIs match in Keycloak
- See `KEYCLOAK_INTEGRATION.md` for detailed troubleshooting

### **Issue: Upload fails**

**Check:**

```bash
# Check disk space
df -h

# Check permissions
ls -la data/
```

**Solution:**

- Ensure sufficient disk space
- Check directory permissions: `chmod 755 data/`
- Increase `client_max_body_size` in Nginx

### **Issue: S3 API connection refused**

**Check:**

```bash
# Test API endpoint
curl -k https://ai.sriphat.com/minio/

# Test direct connection
curl http://192.168.100.9:9000/
```

**Solution:**

- Verify `MINIO_SERVER_URL` is set correctly
- Check the Nginx proxy configuration
- Ensure port 9000 is accessible

## 🔄 Maintenance

### **Backup**

```bash
# Backup data directory
tar -czf minio-backup-$(date +%Y%m%d).tar.gz data/

# Backup to remote location
rsync -avz data/ user@backup-server:/backups/minio/
```

### **Update MinIO**

```bash
# Pull latest image
docker compose pull

# Restart with new image
docker compose up -d

# Verify version
docker exec minio minio --version
```

### **Restore**

```bash
# Stop MinIO
docker compose down

# Restore data
tar -xzf minio-backup-20260325.tar.gz

# Start MinIO
docker compose up -d
```

## 📚 Documentation

- **MinIO Official Docs**: https://min.io/docs/minio/linux/
- **S3 API Reference**: https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Keycloak Integration**: See `KEYCLOAK_INTEGRATION.md`
- **Nginx Configuration**: See `nginx-minio.conf`

## 🎯 Use Cases

### **1. Data Lake Storage**

- Store raw data files (CSV, JSON, Parquet)
- Integrate with Spark, Pandas, Dask
- Version control for datasets

### **2. Backup Storage**

- Database backups
- Application backups
- Log archival

### **3. Media Storage**

- Images, videos, documents
- CDN integration
- Static website hosting

### **4. ML/AI Workflows**

- Model storage
- Training data storage
- Experiment artifacts

### **5. Application Storage**

- User uploads
- Generated reports
- Temporary files

## 🎉 Summary

**What You Have:**

- ✅ MinIO object storage service
- ✅ Persistent storage with volume mounts
- ✅ HTTPS access via Nginx reverse proxy
- ✅ Keycloak SSO integration ready
- ✅ S3-compatible API
- ✅ Web console for management
- ✅ Health checks and monitoring

**Access:**

- Console: `https://ai.sriphat.com/minio-console`
- API: `https://ai.sriphat.com/minio`

**Next Steps:**

1. Configure the `.env` file
2. Start MinIO: `docker compose up -d`
3. Setup Keycloak integration (optional)
4. Configure the Nginx reverse proxy
5. Create buckets and start using!

For detailed Keycloak SSO setup, see `KEYCLOAK_INTEGRATION.md` 🚀
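
### **Example: Building Object Keys**

The Airflow example in this README refers to object keys such as `finance/20260520_123000_report.xlsx`. The sketch below shows one way a service could produce keys in that shape before uploading. It assumes the naming scheme is `<prefix>/<YYYYMMDD>_<HHMMSS>_<filename>`, which is inferred from that single example; the helper name `build_object_key` is hypothetical and not part of this repository.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_object_key(prefix: str, filename: str, when: datetime) -> str:
    """Return '<prefix>/<YYYYMMDD>_<HHMMSS>_<filename>' (assumed naming scheme)."""
    # Keep only the final path component so a local directory never leaks into the key
    name = PurePosixPath(filename).name
    # Timestamp in the same layout as the key shown in the Airflow docstring
    stamp = when.strftime("%Y%m%d_%H%M%S")
    # Strip stray slashes so the key never starts with '/' or contains '//'
    return f"{prefix.strip('/')}/{stamp}_{name}"


if __name__ == "__main__":
    key = build_object_key("finance", "report.xlsx", datetime(2026, 5, 20, 12, 30, 0))
    print(key)  # finance/20260520_123000_report.xlsx
```

The resulting string can be passed as `object_name` to `put_object`, or as the `filepath` value in the DAG trigger conf.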