update configuration docker setup for data platform
07-minio/README.md

# MinIO Object Storage Service

MinIO is a high-performance, S3-compatible object storage system. This setup includes persistent storage, HTTPS access via Nginx reverse proxy, and Keycloak SSO integration.

## 🎯 Overview

**MinIO Features:**
- **S3-Compatible API** - Works with AWS S3 SDKs and tools
- **High Performance** - Optimized for large-scale data workloads
- **Distributed Storage** - Supports multi-node deployment
- **Web Console** - User-friendly web interface
- **Encryption** - Server-side and client-side encryption
- **Versioning** - Object versioning support
- **Lifecycle Management** - Automatic data retention policies

**This Setup Includes:**
- Docker Compose configuration
- Persistent storage with volume mounts
- HTTPS access via Nginx reverse proxy
- Keycloak SSO integration (OpenID Connect)
- Health checks and monitoring

## 📋 Prerequisites

- Docker and Docker Compose installed
- Docker network `shared_data_network` created
- Nginx reverse proxy configured
- Keycloak instance running (for SSO)
- Server: 192.168.100.9

## 🚀 Quick Start

### **Step 1: Configure Environment**

```bash
cd 07-minio

# Copy example environment file
cp .env.example .env

# Edit .env with your settings
nano .env
```

**Required Configuration:**
```bash
# MinIO Credentials
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=your-secure-password-here

# Keycloak Integration
MINIO_IDENTITY_OPENID_CLIENT_SECRET=your-keycloak-client-secret
```
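
Before starting the container, it can help to confirm that `.env` defines the required keys and no longer contains the example placeholders. The following is a minimal sketch, assuming a plain `KEY=VALUE` file; the helper names `parse_env` and `find_problems` are hypothetical and not part of this repository, and the parser does not cover every quoting rule that Docker Compose accepts.

```python
from typing import Dict, List

# Keys taken from the "Required Configuration" block above.
REQUIRED_KEYS = (
    "MINIO_ROOT_USER",
    "MINIO_ROOT_PASSWORD",
    "MINIO_IDENTITY_OPENID_CLIENT_SECRET",
)

# Placeholder values from this README that should be replaced before use.
PLACEHOLDERS = {"your-secure-password-here", "your-keycloak-client-secret"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(env: Dict[str, str]) -> List[str]:
    """Return one message per required key that is missing or a placeholder."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value in PLACEHOLDERS:
            problems.append(f"{key} still uses the example placeholder")
    return problems
```

A typical call would be `find_problems(parse_env(open(".env").read()))`; an empty list means the three keys are set to non-placeholder values.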

### **Step 2: Create Data Directory**

```bash
# Create persistent storage directory
mkdir -p data

# Set permissions
chmod 755 data
```

### **Step 3: Start MinIO**

```bash
# Start service
docker compose up -d

# Check status
docker compose ps

# View logs
docker logs minio -f
```

### **Step 4: Configure Nginx Reverse Proxy**

Add the configuration from `nginx-minio.conf` to your Nginx Proxy Manager:

1. Go to Nginx Proxy Manager UI
2. Create/Edit Proxy Host for `ai.sriphat.com`
3. Add MinIO configuration to "Custom Nginx Configuration"
4. Save and test

### **Step 5: Setup Keycloak Integration**

Follow the detailed guide in `KEYCLOAK_INTEGRATION.md`:

1. Create MinIO client in Keycloak
2. Configure client scopes and mappers
3. Add policy attributes to users
4. Update MinIO environment variables
5. Restart MinIO service

## 🌐 Access URLs

**MinIO Console (Web UI):**
```
https://ai.sriphat.com/minio-console
```

**MinIO API (S3 Compatible):**
```
https://ai.sriphat.com/minio
```

**Direct Access (Development):**
```
http://192.168.100.9:9001 (Console)
http://192.168.100.9:9000 (API)
```

## 🔑 Authentication

### **Option 1: Root Credentials (Default)**

Login with root credentials from `.env`:
- **Username**: Value of `MINIO_ROOT_USER`
- **Password**: Value of `MINIO_ROOT_PASSWORD`

### **Option 2: Keycloak SSO (Recommended)**

1. Click "Login with SSO" on MinIO Console
2. Authenticate with Keycloak
3. Access granted based on policy mapping

See `KEYCLOAK_INTEGRATION.md` for setup instructions.

## 📦 Using MinIO

### **Web Console**

1. Access: `https://ai.sriphat.com/minio-console`
2. Login with credentials or SSO
3. Create buckets, upload files, manage access

### **MinIO Client (mc)**

```bash
# Install mc
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/

# Configure alias
mc alias set myminio https://ai.sriphat.com/minio minioadmin your-password

# List buckets
mc ls myminio

# Create bucket
mc mb myminio/my-bucket

# Upload file
mc cp myfile.txt myminio/my-bucket/

# Download file
mc cp myminio/my-bucket/myfile.txt ./

# List objects
mc ls myminio/my-bucket

# Remove object
mc rm myminio/my-bucket/myfile.txt
```

### **Python SDK (boto3)**

```python
import boto3
from botocore.client import Config

# Configure S3 client
s3 = boto3.client(
    's3',
    endpoint_url='https://ai.sriphat.com/minio',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='your-password',
    config=Config(signature_version='s3v4'),
    region_name='ap-southeast-1'
)

# List buckets
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])

# Upload file
s3.upload_file('myfile.txt', 'my-bucket', 'myfile.txt')

# Download file
s3.download_file('my-bucket', 'myfile.txt', 'downloaded.txt')

# List objects
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'])
```
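
The boto3 example above hard-codes the access key and secret. One possible alternative is to read them from the environment, reusing the variable names from `.env`. This is a sketch only: the helper `s3_client_kwargs` and the `MINIO_ENDPOINT_URL` variable are hypothetical names introduced for illustration, and the defaults simply mirror the values used elsewhere in this README.

```python
import os
from typing import Dict, Mapping, Optional


def s3_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for boto3.client('s3', ...) from environment values."""
    env = os.environ if env is None else env
    missing = [k for k in ("MINIO_ROOT_USER", "MINIO_ROOT_PASSWORD") if not env.get(k)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        # MINIO_ENDPOINT_URL is a hypothetical variable; the default is this README's API URL.
        "endpoint_url": env.get("MINIO_ENDPOINT_URL", "https://ai.sriphat.com/minio"),
        "aws_access_key_id": env["MINIO_ROOT_USER"],
        "aws_secret_access_key": env["MINIO_ROOT_PASSWORD"],
        "region_name": env.get("MINIO_REGION", "ap-southeast-1"),
    }


# Usage (requires boto3 to be installed):
#   import boto3
#   from botocore.client import Config
#   s3 = boto3.client("s3", config=Config(signature_version="s3v4"), **s3_client_kwargs())
```

Keeping the helper free of boto3 imports means it can be checked without network access; the client itself is created only at the call site.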

### **AWS CLI**

```bash
# Configure AWS CLI
aws configure set aws_access_key_id minioadmin
aws configure set aws_secret_access_key your-password
aws configure set region ap-southeast-1

# List buckets
aws --endpoint-url https://ai.sriphat.com/minio s3 ls

# Create bucket
aws --endpoint-url https://ai.sriphat.com/minio s3 mb s3://my-bucket

# Upload file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp myfile.txt s3://my-bucket/

# Download file
aws --endpoint-url https://ai.sriphat.com/minio s3 cp s3://my-bucket/myfile.txt ./

# Sync directory
aws --endpoint-url https://ai.sriphat.com/minio s3 sync ./mydir s3://my-bucket/mydir/
```

## 🔧 Configuration

### **Environment Variables**

| Variable | Description | Default |
|----------|-------------|---------|
| `MINIO_ROOT_USER` | Root username | minioadmin |
| `MINIO_ROOT_PASSWORD` | Root password | - |
| `MINIO_API_PORT` | API port | 9000 |
| `MINIO_CONSOLE_PORT` | Console port | 9001 |
| `MINIO_SERVER_URL` | API endpoint URL | - |
| `MINIO_BROWSER_REDIRECT_URL` | Console URL | - |
| `MINIO_REGION` | Default region | ap-southeast-1 |

### **Keycloak Integration**

| Variable | Description |
|----------|-------------|
| `MINIO_IDENTITY_OPENID_CONFIG_URL` | Keycloak OIDC config URL |
| `MINIO_IDENTITY_OPENID_CLIENT_ID` | Client ID in Keycloak |
| `MINIO_IDENTITY_OPENID_CLIENT_SECRET` | Client secret |
| `MINIO_IDENTITY_OPENID_CLAIM_NAME` | Policy claim name |
| `MINIO_IDENTITY_OPENID_SCOPES` | OIDC scopes |

### **Storage**

**Persistent Data:**
```
07-minio/data/          # Object storage data
07-minio/certs/         # SSL certificates (optional)
```

**Volume Mounts:**
```yaml
volumes:
  - ./data:/data                    # Storage data
  - ./certs:/root/.minio/certs:ro   # SSL certs
```

## 🔒 Security

### **1. Strong Passwords**

```bash
# Generate strong password
openssl rand -base64 32

# Update .env
MINIO_ROOT_PASSWORD=generated-password-here
```
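
Where `openssl` is not available, a roughly equivalent password can be produced with the Python standard library. This is a minimal sketch; `generate_password` is a hypothetical helper name. Like `openssl rand -base64 32`, it Base64-encodes 32 random bytes, so the result is 44 characters long and may contain `+`, `/`, and `=`.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return Base64 text encoding `num_bytes` cryptographically random bytes."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

The returned string can be pasted into `.env` as the value of `MINIO_ROOT_PASSWORD`.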

### **2. Network Security**

```bash
# Firewall rules (if needed)
sudo ufw allow from 192.168.100.0/24 to any port 9000
sudo ufw allow from 192.168.100.0/24 to any port 9001
```

### **3. HTTPS Only**

- Always use HTTPS in production
- Configure SSL certificates in Nginx
- Set `MINIO_SERVER_URL` and `MINIO_BROWSER_REDIRECT_URL` to HTTPS

### **4. Access Policies**

```bash
# Create read-only policy
mc admin policy create myminio readonly-policy readonly-policy.json

# Assign policy to user
mc admin policy attach myminio readonly-policy --user=username
```
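
The `readonly-policy.json` file referenced above is not shown in this README. As an illustration only, the sketch below writes one plausible version: an S3-style policy document that allows listing a single bucket and reading its objects. The helper names `readonly_policy` and `write_policy` and the bucket name are assumptions; adjust the actions and resources to match your own requirements.

```python
import json
from typing import Any, Dict


def readonly_policy(bucket: str) -> Dict[str, Any]:
    """Build a read-only policy document scoped to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level reads apply to every key inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def write_policy(path: str, bucket: str) -> None:
    """Write the policy as indented JSON so it can be passed to `mc admin policy create`."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(readonly_policy(bucket), fh, indent=2)
```

For example, `write_policy("readonly-policy.json", "my-bucket")` produces a file that the `mc admin policy create` command above can load.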

### **5. Bucket Policies**

```bash
# Set bucket policy (public read)
mc anonymous set download myminio/public-bucket

# Set bucket policy (private)
mc anonymous set none myminio/private-bucket
```

## 📊 Monitoring

### **Health Check**

```bash
# Check MinIO health
curl -k https://ai.sriphat.com/minio/health/live

# Check from container
docker exec minio curl -f http://localhost:9000/minio/health/live
```
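
The same liveness endpoint can be polled from a script, for instance to wait until MinIO is ready after `docker compose up -d`. The sketch below uses only the standard library; `is_live` and `wait_until_live` are hypothetical helper names, and the retry count and delay are arbitrary defaults rather than recommended values.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_live(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/minio/health/live answers with HTTP 200."""
    url = base_url.rstrip("/") + "/minio/health/live"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as "not live".
        return False


def wait_until_live(check: Callable[[], bool], attempts: int = 10, delay: float = 2.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A call such as `wait_until_live(lambda: is_live("http://192.168.100.9:9000"))` checks the direct API address listed under Access URLs.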

### **Logs**

```bash
# View logs
docker logs minio -f

# View last 100 lines
docker logs minio --tail 100

# Export logs
docker logs minio > minio.log
```

### **Metrics**

```bash
# View server info
mc admin info myminio

# View server stats
mc admin prometheus metrics myminio
```

### **Disk Usage**

```bash
# Check disk usage
mc admin info myminio

# Check bucket size
mc du myminio/my-bucket
```

## 🐛 Troubleshooting

### **Issue: Cannot access MinIO Console**

**Check:**
```bash
# Verify container is running
docker ps | grep minio

# Check logs
docker logs minio

# Test direct access
curl http://192.168.100.9:9001
```

**Solution:**
- Ensure container is running: `docker compose up -d`
- Check firewall rules
- Verify Nginx configuration

### **Issue: SSO login not working**

**Check:**
```bash
# Verify Keycloak config
docker exec minio printenv | grep MINIO_IDENTITY_OPENID

# Test Keycloak connectivity
docker exec minio curl -k https://ai.sriphat.com/keycloak/realms/sriphat/.well-known/openid-configuration
```

**Solution:**
- Verify all Keycloak environment variables are set
- Check client secret is correct
- Ensure redirect URIs match in Keycloak
- See `KEYCLOAK_INTEGRATION.md` for detailed troubleshooting
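
Once the connectivity test returns the discovery document, its contents can be sanity-checked as well. The sketch below assumes the JSON has already been parsed into a dictionary (for example with `json.loads`); the helper names are hypothetical, and the field list is only a subset of what the OpenID Connect Discovery specification defines. An issuer that does not match the configured URL is one possible cause of SSO failures behind a reverse proxy.

```python
from typing import Any, List, Mapping

# A subset of the fields an OIDC discovery document is expected to carry.
REQUIRED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")
DISCOVERY_SUFFIX = "/.well-known/openid-configuration"


def missing_discovery_fields(doc: Mapping[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in the document."""
    return [field for field in REQUIRED_FIELDS if not doc.get(field)]


def issuer_matches(doc: Mapping[str, Any], config_url: str) -> bool:
    """Check that `issuer` equals the config URL with the discovery suffix removed."""
    if not config_url.endswith(DISCOVERY_SUFFIX):
        return False
    return doc.get("issuer") == config_url[: -len(DISCOVERY_SUFFIX)]
```

Here `config_url` would be the value of `MINIO_IDENTITY_OPENID_CONFIG_URL`.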

### **Issue: Upload fails**

**Check:**
```bash
# Check disk space
df -h

# Check permissions
ls -la data/
```

**Solution:**
- Ensure sufficient disk space
- Check directory permissions: `chmod 755 data/`
- Increase `client_max_body_size` in Nginx
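
The disk-space check can also be scripted, for example as part of a periodic job. This is a minimal sketch using `shutil.disk_usage`; the function name `free_space_report` and the 5 GiB threshold are illustrative assumptions, not values taken from this setup.

```python
import shutil
from typing import Dict, Union


def free_space_report(path: str = "data", min_free_gib: float = 5.0) -> Dict[str, Union[float, bool]]:
    """Report free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    free_gib = usage.free / 1024**3
    percent_used = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "free_gib": round(free_gib, 2),
        "percent_used": round(percent_used, 1),
        "ok": free_gib >= min_free_gib,  # False when free space is below the threshold
    }
```

Running `free_space_report("data")` from the `07-minio` directory inspects the volume backing the MinIO data directory.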

### **Issue: S3 API connection refused**

**Check:**
```bash
# Test API endpoint
curl -k https://ai.sriphat.com/minio/

# Test direct connection
curl http://192.168.100.9:9000/
```

**Solution:**
- Verify `MINIO_SERVER_URL` is set correctly
- Check Nginx proxy configuration
- Ensure port 9000 is accessible

## 🔄 Maintenance

### **Backup**

```bash
# Backup data directory
tar -czf minio-backup-$(date +%Y%m%d).tar.gz data/

# Backup to remote location
rsync -avz data/ user@backup-server:/backups/minio/
```
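
The `tar` backup above can be reproduced in Python when it needs to run from a scheduler or be followed by a verification step. The sketch below is one possible approach using the standard `tarfile` module; `backup_data` and `verify_backup` are hypothetical helper names, and the archive name follows the `minio-backup-YYYYMMDD.tar.gz` pattern used in this section.

```python
import tarfile
from datetime import date
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def backup_data(data_dir: PathLike = "data", dest_dir: PathLike = ".",
                today: Optional[date] = None) -> Path:
    """Create minio-backup-YYYYMMDD.tar.gz in `dest_dir` from `data_dir`."""
    source = Path(data_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    stamp = (today or date.today()).strftime("%Y%m%d")
    archive = Path(dest_dir) / f"minio-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries under the directory's own name, as `tar -czf ... data/` does.
        tar.add(source, arcname=source.name)
    return archive


def verify_backup(archive: PathLike) -> int:
    """Reopen the archive and return the number of regular files it contains."""
    with tarfile.open(archive, "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())
```

Calling `verify_backup(backup_data())` from the `07-minio` directory writes today's archive and confirms it can be read back.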

### **Update MinIO**

```bash
# Pull latest image
docker compose pull

# Restart with new image
docker compose up -d

# Verify version
docker exec minio minio --version
```

### **Restore**

```bash
# Stop MinIO
docker compose down

# Restore data
tar -xzf minio-backup-20260325.tar.gz

# Start MinIO
docker compose up -d
```

## 📚 Documentation

- **MinIO Official Docs**: https://min.io/docs/minio/linux/
- **S3 API Reference**: https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Keycloak Integration**: See `KEYCLOAK_INTEGRATION.md`
- **Nginx Configuration**: See `nginx-minio.conf`

## 🎯 Use Cases

### **1. Data Lake Storage**
- Store raw data files (CSV, JSON, Parquet)
- Integrate with Spark, Pandas, Dask
- Version control for datasets

### **2. Backup Storage**
- Database backups
- Application backups
- Log archival

### **3. Media Storage**
- Images, videos, documents
- CDN integration
- Static website hosting

### **4. ML/AI Workflows**
- Model storage
- Training data storage
- Experiment artifacts

### **5. Application Storage**
- User uploads
- Generated reports
- Temporary files

## 🎉 Summary

**What You Have:**
- ✅ MinIO object storage service
- ✅ Persistent storage with volume mounts
- ✅ HTTPS access via Nginx reverse proxy
- ✅ Keycloak SSO integration ready
- ✅ S3-compatible API
- ✅ Web console for management
- ✅ Health checks and monitoring

**Access:**
- Console: `https://ai.sriphat.com/minio-console`
- API: `https://ai.sriphat.com/minio`

**Next Steps:**
1. Configure `.env` file
2. Start MinIO: `docker compose up -d`
3. Setup Keycloak integration (optional)
4. Configure Nginx reverse proxy
5. Create buckets and start using!

For detailed Keycloak SSO setup, see `KEYCLOAK_INTEGRATION.md` 🚀