
MinIO Object Storage Service

MinIO is a high-performance, S3-compatible object storage system. This setup includes persistent storage, HTTPS access via Nginx reverse proxy, and Keycloak SSO integration.

🎯 Overview

MinIO Features:

  • S3-Compatible API - Works with AWS S3 SDKs and tools
  • High Performance - Optimized for large-scale data workloads
  • Distributed Storage - Supports multi-node deployment
  • Web Console - User-friendly web interface
  • Encryption - Server-side and client-side encryption
  • Versioning - Object versioning support
  • Lifecycle Management - Automatic data retention policies

This Setup Includes:

  • Docker Compose configuration
  • Persistent storage with volume mounts
  • HTTPS access via Nginx reverse proxy
  • Keycloak SSO integration (OpenID Connect)
  • Health checks and monitoring

📋 Prerequisites

  • Docker and Docker Compose installed
  • Network: shared_data_network created
  • Nginx reverse proxy configured
  • Keycloak instance running (for SSO)
  • Server: 192.168.100.9

🚀 Quick Start

Step 1: Configure Environment

cd 07-minio

# Copy example environment file
cp .env.example .env

# Edit .env with your settings
nano .env

Required Configuration:

# MinIO Credentials
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=your-secure-password-here

# Keycloak Integration
MINIO_IDENTITY_OPENID_CLIENT_SECRET=your-keycloak-client-secret
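
Before starting the container, it can help to confirm that the required values above are set and no longer hold the example placeholders. The following is a minimal sketch, not part of this repository: the helper names (`parse_env`, `find_problems`) are hypothetical, and the check that placeholder values start with `your-` is an assumption based on the example values shown above.

```python
from pathlib import Path

# Keys taken from the "Required Configuration" list above.
REQUIRED_KEYS = (
    "MINIO_ROOT_USER",
    "MINIO_ROOT_PASSWORD",
    "MINIO_IDENTITY_OPENID_CLIENT_SECRET",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(values: dict) -> list:
    """Return readable problems; an empty list means the file looks usable."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith("your-"):
            # Assumption: example placeholders start with "your-".
            problems.append(f"{key} still holds the example placeholder")
    return problems


def check_env_file(path: str = ".env") -> list:
    """Convenience wrapper that reads the file from disk."""
    return find_problems(parse_env(Path(path).read_text(encoding="utf-8")))
```

Running `check_env_file()` from inside 07-minio returns a list of problems to fix, or an empty list when all three keys are set.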

Step 2: Create Data Directory

# Create persistent storage directory
mkdir -p data

# Set permissions
chmod 755 data

Step 3: Start MinIO

# Start service
docker compose up -d

# Check status
docker compose ps

# View logs
docker logs minio -f

Step 4: Configure Nginx Reverse Proxy

Add the configuration from nginx-minio.conf to your Nginx Proxy Manager:

  1. Go to Nginx Proxy Manager UI
  2. Create/Edit Proxy Host for ai.sriphat.com
  3. Add MinIO configuration to "Custom Nginx Configuration"
  4. Save and test

Step 5: Setup Keycloak Integration

Follow the detailed guide in KEYCLOAK_INTEGRATION.md:

  1. Create MinIO client in Keycloak
  2. Configure client scopes and mappers
  3. Add policy attributes to users
  4. Update MinIO environment variables
  5. Restart MinIO service

🌐 Access URLs

MinIO Console (Web UI):

https://ai.sriphat.com/minio-console

MinIO API (S3 Compatible):

https://ai.sriphat.com/minio

Direct Access (Development):

http://192.168.100.9:9001  (Console)
http://192.168.100.9:9000  (API)

🔑 Authentication

Option 1: Root Credentials (Default)

Login with root credentials from .env:

  • Username: Value of MINIO_ROOT_USER
  • Password: Value of MINIO_ROOT_PASSWORD

Option 2: Keycloak SSO

  1. Click "Login with SSO" on MinIO Console
  2. Authenticate with Keycloak
  3. Access is granted based on policy mapping

See KEYCLOAK_INTEGRATION.md for setup instructions.

📦 Using MinIO

Web Console

  1. Access: https://ai.sriphat.com/minio-console
  2. Login with credentials or SSO
  3. Create buckets, upload files, manage access

MinIO Client (mc)

# Install mc
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/

# Configure alias
mc alias set myminio https://ai.sriphat.com/minio minioadmin your-password

# List buckets
mc ls myminio

# Create bucket
mc mb myminio/my-bucket

# Upload file
mc cp myfile.txt myminio/my-bucket/

# Download file
mc cp myminio/my-bucket/myfile.txt ./

# List objects
mc ls myminio/my-bucket

# Remove object
mc rm myminio/my-bucket/myfile.txt

Python SDK (minio, recommended for the Sriphat Platform)

Use the minio package (the official MinIO Python SDK) instead of boto3 for internal services:

from datetime import timedelta
import io

from minio import Minio

# Connection: services on other servers use the internal IP
client = Minio(
    endpoint="192.168.100.9:9000",   # internal IP, not the public URL
    access_key="sp_service_ac",
    secret_key="<MINIO_SVC_SECRET_KEY>",
    secure=False,                     # plain HTTP inside the network
)

# Upload a file
with open("report.xlsx", "rb") as f:
    data = f.read()
client.put_object(
    bucket_name="finance",
    object_name="finance/20260520_report.xlsx",
    data=io.BytesIO(data),
    length=len(data),
    content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)

# Download a file (get_object returns an HTTP response object)
response = client.get_object("finance", "finance/20260520_report.xlsx")
try:
    data = response.read()
finally:
    # Always release the connection, even if the read fails
    response.close()
    response.release_conn()

# Download straight to a local file
client.fget_object("finance", "finance/20260520_report.xlsx", "/tmp/report.xlsx")

# List objects in the bucket
for obj in client.list_objects("finance", prefix="finance/", recursive=True):
    print(obj.object_name, obj.size)

# Generate a presigned URL (lets an external client download; valid for 1 hour).
# The URL embeds the endpoint configured above, so the downloader must be able to reach it.
url = client.presigned_get_object("finance", "finance/20260520_report.xlsx", expires=timedelta(hours=1))
print(url)
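
The object keys used in this guide carry a timestamp prefix (the Airflow docstring in the next section shows `finance/20260520_123000_report.xlsx`). As an illustration only, a key of that shape could be built as below. The helper name `build_object_key` is hypothetical, and the exact naming rule used by the API Service is an assumption inferred from those examples.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_object_key(prefix: str, filename: str, now: datetime) -> str:
    """Build '<prefix>/<YYYYMMDD_HHMMSS>_<filename>' for an uploaded file."""
    # Keep only the final path component so a client-supplied name
    # cannot introduce extra folders into the key.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return f"{prefix.strip('/')}/{stamp}_{safe_name}"


key = build_object_key("finance", "report.xlsx", datetime(2026, 5, 20, 12, 30, 0))
print(key)  # finance/20260520_123000_report.xlsx
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the helper, keeps the function easy to test.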

Airflow DAG: reading a file from the MinIO finance bucket

Airflow runs on server .9 (the same server as MinIO), so connect through the container name minio:9000 or through 192.168.100.9:9000:

import io

import pandas as pd
from airflow.decorators import dag, task
from airflow.models import Variable
from airflow.utils.dates import days_ago
from minio import Minio

@dag(schedule=None, start_date=days_ago(1), catchup=False)
def process_finance_excel():

    @task
    def download_and_process(filepath: str, **context):
        """
        filepath is the MinIO object key, e.g. "finance/20260520_123000_report.xlsx".
        The API Service sends it through the DAG trigger conf.
        """
        client = Minio(
            endpoint="minio:9000",        # container name on shared_data_network
            access_key="sp_service_ac",   # same service account as the API
            # Stored in Airflow Variables. Jinja templates are not rendered inside
            # a task body, so read the value with Variable.get() at run time.
            secret_key=Variable.get("MINIO_SVC_SECRET_KEY"),
            secure=False,
        )

        # Download the file from MinIO
        response = client.get_object(bucket_name="finance", object_name=filepath)
        file_bytes = response.read()
        response.close()
        response.release_conn()

        # Process with pandas
        df = pd.read_excel(io.BytesIO(file_bytes))
        print(f"Loaded {len(df)} rows from {filepath}")

        # ... process data ...
        return {"rows": len(df), "filepath": filepath}

    @task
    def get_filepath(**context):
        conf = context["dag_run"].conf or {}
        return conf.get("filepath", "")

    fp = get_filepath()
    download_and_process(fp)

process_finance_excel()
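
The DAG above expects the object key in the trigger conf under `filepath`. One way a caller could start a run is the Airflow 2 stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). The sketch below only builds the request. The Airflow base URL, port, and the use of basic authentication are assumptions, not details taken from this repository, and basic auth works only if that auth backend is enabled in Airflow.

```python
import base64
import json
import urllib.request


def build_trigger_request(airflow_url, dag_id, filepath, username, password):
    """Build a POST request that triggers dag_id with conf={'filepath': ...}."""
    body = json.dumps({"conf": {"filepath": filepath}}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{airflow_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical values: adjust the URL and credentials to the real deployment.
request = build_trigger_request(
    "http://192.168.100.9:8080",
    "process_finance_excel",
    "finance/20260520_123000_report.xlsx",
    "airflow-user",
    "airflow-password",
)
# To send it: urllib.request.urlopen(request, timeout=10)
```

Separating request construction from sending keeps the payload easy to inspect before anything reaches the scheduler.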

Setting up an Airflow Connection (optional, for S3Hook)

If you want to use S3Hook or Airflow operators:

Connection ID : minio_s3
Connection Type: Amazon Web Services
Extra (JSON)  : {"endpoint_url": "http://minio:9000", "region_name": "ap-southeast-1"}
Login         : sp_service_ac
Password      : <MINIO_SVC_SECRET_KEY>

import io

import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

# Uses the minio_s3 connection defined above
hook = S3Hook(aws_conn_id="minio_s3")
obj = hook.get_key(key="finance/20260520_report.xlsx", bucket_name="finance")
data = obj.get()["Body"].read()
df = pd.read_excel(io.BytesIO(data))

Airflow Variables to create:

| Key | Value |
|-----|-------|
| MINIO_ENDPOINT | 192.168.100.9:9000 |
| MINIO_SVC_SECRET_KEY | (see 07-minio/.env) |

Python SDK (boto3)

import boto3
from botocore.client import Config

# Configure S3 client
s3 = boto3.client(
    's3',
    endpoint_url='https://ai.sriphat.com/minio',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='your-password',
    config=Config(signature_version='s3v4'),
    region_name='ap-southeast-1'
)

# List buckets
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])

# Upload file
s3.upload_file('myfile.txt', 'my-bucket', 'myfile.txt')

# Download file
s3.download_file('my-bucket', 'myfile.txt', 'downloaded.txt')

# List objects
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'])

AWS CLI

# Configure AWS CLI
aws configure set aws_access_key_id minioadmin
aws configure set aws_secret_access_key your-password
aws configure set region ap-southeast-1

# List buckets
aws --endpoint-url https://ai.sriphat.com/minio s3 ls

# Create bucket
aws --endpoint-url https://ai.sriphat.com/minio s3 mb s3://my-bucket

# Upload file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp myfile.txt s3://my-bucket/

# Download file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp s3://my-bucket/myfile.txt ./

# Sync directory
aws --endpoint-url https://ai.sriphat.com/minio s3 sync ./mydir s3://my-bucket/mydir/

🔧 Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| MINIO_ROOT_USER | Root username | minioadmin |
| MINIO_ROOT_PASSWORD | Root password | - |
| MINIO_API_PORT | API port | 9000 |
| MINIO_CONSOLE_PORT | Console port | 9001 |
| MINIO_SERVER_URL | API endpoint URL | - |
| MINIO_BROWSER_REDIRECT_URL | Console URL | - |
| MINIO_REGION | Default region | ap-southeast-1 |

Keycloak Integration

| Variable | Description |
|----------|-------------|
| MINIO_IDENTITY_OPENID_CONFIG_URL | Keycloak OIDC config URL |
| MINIO_IDENTITY_OPENID_CLIENT_ID | Client ID in Keycloak |
| MINIO_IDENTITY_OPENID_CLIENT_SECRET | Client secret |
| MINIO_IDENTITY_OPENID_CLAIM_NAME | Policy claim name |
| MINIO_IDENTITY_OPENID_SCOPES | OIDC scopes |

Storage

Persistent Data:

07-minio/data/          # Object storage data
07-minio/certs/         # SSL certificates (optional)

Volume Mounts:

volumes:
  - ./data:/data                        # Storage data
  - ./certs:/root/.minio/certs:ro      # SSL certs

🔒 Security

1. Strong Passwords

# Generate strong password
openssl rand -base64 32

# Update .env
MINIO_ROOT_PASSWORD=generated-password-here
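
Where openssl is not available, a password of the same shape can be produced with the Python standard library. This is a small sketch of an equivalent to `openssl rand -base64 32`, and the function name `generate_password` is hypothetical.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return num_bytes of OS-provided randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


password = generate_password()
print(password)  # 44 Base64 characters for the default 32 bytes
```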

2. Network Security

# Firewall rules (if needed)
sudo ufw allow from 192.168.100.0/24 to any port 9000
sudo ufw allow from 192.168.100.0/24 to any port 9001

3. HTTPS Only

  • Always use HTTPS in production
  • Configure SSL certificates in Nginx
  • Set MINIO_SERVER_URL and MINIO_BROWSER_REDIRECT_URL to HTTPS

4. Access Policies

# Create read-only policy
mc admin policy create myminio readonly-policy readonly-policy.json

# Assign policy to user
mc admin policy attach myminio readonly-policy --user=username
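
The contents of readonly-policy.json are not shown in this guide. As a sketch under that caveat, a read-only policy for a single bucket could be generated as follows. MinIO policies use the AWS IAM policy document format, while the bucket name `finance` and the chosen action list are assumptions to adapt.

```python
import json


def readonly_policy(bucket: str) -> dict:
    """Return an IAM-style policy that allows listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level reads apply to every key under the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy_json = json.dumps(readonly_policy("finance"), indent=2)
print(policy_json)
```

Saving the printed output as readonly-policy.json gives the file that the `mc admin policy create` command above expects.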

5. Bucket Policies

# Set bucket policy (public read)
mc anonymous set download myminio/public-bucket

# Set bucket policy (private)
mc anonymous set none myminio/private-bucket

📊 Monitoring

Health Check

# Check MinIO health
curl -k https://ai.sriphat.com/minio/health/live

# Check from container
docker exec minio curl -f http://localhost:9000/minio/health/live
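
The same liveness probe can be run from a Python service, for example before attempting an upload. This is a minimal sketch using only the standard library. The function names are hypothetical, and the `opener` parameter exists only so the check can be exercised without a running server.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Append MinIO's liveness path to a base URL such as http://host:9000."""
    return base_url.rstrip("/") + "/minio/health/live"


def is_live(base_url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True when the liveness endpoint answers with HTTP 200."""
    try:
        with opener(health_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP status.
        return False
```

For example, `is_live("http://192.168.100.9:9000")` checks the direct API port listed under Access URLs.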

Logs

# View logs
docker logs minio -f

# View last 100 lines
docker logs minio --tail 100

# Export logs
docker logs minio > minio.log

Metrics

# View server info
mc admin info myminio

# View server stats
mc admin prometheus metrics myminio

Disk Usage

# Check disk usage
mc admin info myminio

# Check bucket size
mc du myminio/my-bucket

🐛 Troubleshooting

Issue: Cannot access MinIO Console

Check:

# Verify container is running
docker ps | grep minio

# Check logs
docker logs minio

# Test direct access
curl http://192.168.100.9:9001

Solution:

  • Ensure container is running: docker compose up -d
  • Check firewall rules
  • Verify Nginx configuration

Issue: SSO login not working

Check:

# Verify Keycloak config
docker exec minio printenv | grep MINIO_IDENTITY_OPENID

# Test Keycloak connectivity
docker exec minio curl -k https://ai.sriphat.com/keycloak/realms/sriphat/.well-known/openid-configuration

Solution:

  • Verify all Keycloak environment variables are set
  • Check client secret is correct
  • Ensure redirect URIs match in Keycloak
  • See KEYCLOAK_INTEGRATION.md for detailed troubleshooting
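
When the connectivity test returns a document but SSO still fails, it can be worth checking that the discovery document contains the endpoints an OpenID Connect client needs. The sketch below assumes the Keycloak base URL and realm from the curl command above; the helper names are hypothetical.

```python
# Fields that an OpenID Connect discovery document is expected to provide.
REQUIRED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def discovery_url(keycloak_base: str, realm: str) -> str:
    """Build the realm's OpenID Connect discovery URL."""
    return f"{keycloak_base.rstrip('/')}/realms/{realm}/.well-known/openid-configuration"


def missing_fields(document: dict) -> list:
    """Return the required fields that are absent or empty in the document."""
    return [name for name in REQUIRED_FIELDS if not document.get(name)]


url = discovery_url("https://ai.sriphat.com/keycloak", "sriphat")
print(url)
```

Fetch the URL (for instance with the curl command above), parse the body with `json.loads`, and pass the result to `missing_fields`. An empty list means the basic endpoints are present.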

Issue: Upload fails

Check:

# Check disk space
df -h

# Check permissions
ls -la data/

Solution:

  • Ensure sufficient disk space
  • Check directory permissions: chmod 755 data/
  • Increase client_max_body_size in Nginx

Issue: S3 API connection refused

Check:

# Test API endpoint
curl -k https://ai.sriphat.com/minio/

# Test direct connection
curl http://192.168.100.9:9000/

Solution:

  • Verify MINIO_SERVER_URL is set correctly
  • Check Nginx proxy configuration
  • Ensure port 9000 is accessible

🔄 Maintenance

Backup

# Backup data directory
tar -czf minio-backup-$(date +%Y%m%d).tar.gz data/

# Backup to remote location
rsync -avz data/ user@backup-server:/backups/minio/
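
For scheduled backups driven from Python (a cron script or an Airflow task, say), the tar command above can be mirrored with the standard library. This is a sketch with a hypothetical function name. It follows the same `minio-backup-YYYYMMDD.tar.gz` naming and, like the tar command, copies whatever is on disk at that moment.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_data_dir(data_dir: Path, dest_dir: Path, today: date) -> Path:
    """Archive data_dir into dest_dir/minio-backup-YYYYMMDD.tar.gz."""
    archive = dest_dir / f"minio-backup-{today:%Y%m%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths relative ("data/...") so the restore step
        # can extract the archive from inside 07-minio.
        tar.add(data_dir, arcname=data_dir.name)
    return archive
```

Calling `backup_data_dir(Path("data"), Path("."), date.today())` from 07-minio produces an archive that the restore steps below can unpack unchanged.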

Update MinIO

# Pull latest image
docker compose pull

# Restart with new image
docker compose up -d

# Verify version
docker exec minio minio --version

Restore

# Stop MinIO
docker compose down

# Restore data
tar -xzf minio-backup-20260325.tar.gz

# Start MinIO
docker compose up -d

📚 Documentation

🎯 Use Cases

1. Data Lake Storage

  • Store raw data files (CSV, JSON, Parquet)
  • Integrate with Spark, Pandas, Dask
  • Version control for datasets

2. Backup Storage

  • Database backups
  • Application backups
  • Log archival

3. Media Storage

  • Images, videos, documents
  • CDN integration
  • Static website hosting

4. ML/AI Workflows

  • Model storage
  • Training data storage
  • Experiment artifacts

5. Application Storage

  • User uploads
  • Generated reports
  • Temporary files

🎉 Summary

What You Have:

  • MinIO object storage service
  • Persistent storage with volume mounts
  • HTTPS access via Nginx reverse proxy
  • Keycloak SSO integration ready
  • S3-compatible API
  • Web console for management
  • Health checks and monitoring

Access:

  • Console: https://ai.sriphat.com/minio-console
  • API: https://ai.sriphat.com/minio

Next Steps:

  1. Configure .env file
  2. Start MinIO: docker compose up -d
  3. Setup Keycloak integration (optional)
  4. Configure Nginx reverse proxy
  5. Create buckets and start using!

For detailed Keycloak SSO setup, see KEYCLOAK_INTEGRATION.md 🚀