6.2 KiB
Web Pages Documentation
Overview
The API service now includes web pages for easy access to all platform services and data management functionality.
Pages
1. Landing Page (/)
Main dashboard with navigation to all services:
- Supabase: PostgreSQL database with REST API
- API Docs: Interactive API documentation
- Airflow: Workflow orchestration
- Airbyte: Data integration and ETL
- Data Management: File upload and processing (with submenu)
- DBT: Data transformation
- Superset: Business intelligence and visualization
2. Finance Excel Upload (/data-management/finance)
Upload and process financial Excel files with the following features:
Upload Form:
- Drag-and-drop or click to select Excel files (.xlsx, .xls)
- Optional description field
- File validation and size display
Upload Processing:
- Files saved to
/data/uploads/with timestamp prefix - Unique filename generation to prevent conflicts
- Automatic Airflow job triggering (when configured)
Upload History:
- Real-time list of all uploaded files
- Status tracking: pending, processing, success, error
- Expandable accordion for logs and error details
- Auto-refresh every 10 seconds
API Endpoints
Page Routes
GET / - Landing page
GET /data-management/finance - Finance upload page
POST /data-management/finance/upload - Upload file endpoint
GET /data-management/finance/uploads - Get all uploads
GET /data-management/finance/uploads/{upload_id} - Get specific upload
Upload Endpoint
POST /data-management/finance/upload
Form Data:
file: Excel file (.xlsx or .xls)description: Optional description text
Response:
{
"success": true,
"message": "File 'example.xlsx' uploaded successfully",
"upload_id": "upload_20260306_174500",
"filename": "20260306_174500_example.xlsx"
}
Error Response:
{
"detail": "Invalid file type. Only .xlsx and .xls files are allowed."
}
File Storage
Uploaded files are stored in:
/data/uploads/YYYYMMDD_HHMMSS_original_filename.xlsx
Example:
/data/uploads/20260306_174530_finance_report.xlsx
Airflow Integration
The upload endpoint includes a placeholder for Airflow integration. To enable:
-
Configure Airflow endpoint in
app/routes/pages.py:airflow_url = "http://airflow-webserver:8080/api/v1/dags/{dag_id}/dagRuns" -
Set DAG ID for the finance processing job
-
Uncomment the Airflow trigger code in
trigger_airflow_job()function -
Configure authentication (username/password or API token)
Example Airflow Integration
async def trigger_airflow_job(filepath: str, upload_id: str) -> str:
import httpx
airflow_url = "http://airflow-webserver:8080/api/v1/dags/finance_processing/dagRuns"
headers = {"Content-Type": "application/json"}
auth = ("airflow", "airflow")
payload = {
"conf": {
"filepath": filepath,
"upload_id": upload_id
}
}
async with httpx.AsyncClient() as client:
response = await client.post(
airflow_url,
json=payload,
headers=headers,
auth=auth
)
response.raise_for_status()
result = response.json()
return result["dag_run_id"]
Upload Status Tracking
Upload records include:
id: Unique identifierfilename: Original filenamefilepath: Full path to saved fileuploaded_at: ISO timestampdescription: User-provided descriptionstatus: pending | processing | success | errorjob_id: Airflow DAG run ID (when triggered)logs: Error logs or processing information
Database Storage
Currently uses in-memory storage (uploads_db list). For production:
-
Create database table:
CREATE TABLE uploads ( id VARCHAR(100) PRIMARY KEY, filename VARCHAR(255) NOT NULL, filepath VARCHAR(500) NOT NULL, uploaded_at TIMESTAMP NOT NULL, description TEXT, status VARCHAR(20) NOT NULL, job_id VARCHAR(100), logs TEXT, created_at TIMESTAMP DEFAULT NOW() ); -
Replace in-memory storage with SQLAlchemy model
-
Add database queries in route handlers
Access URLs
- Local: http://localhost:8040/apiservice/
- Production: https://ai.sriphat.com/apiservice/
Security Considerations
- File validation: Only .xlsx and .xls files allowed
- Filename sanitization: Spaces replaced with underscores
- Unique filenames: Timestamp prefix prevents conflicts
- File size limits: Configure in nginx/FastAPI if needed
- Authentication: Add authentication middleware for production
Future Enhancements
- Database persistence: Replace in-memory storage
- File preview: Show Excel data preview before processing
- Batch upload: Support multiple file uploads
- Download processed files: Allow downloading results
- User management: Track uploads by user
- Email notifications: Notify on processing completion
- Scheduled jobs: Automatic periodic processing
- Data validation: Validate Excel structure before processing
Troubleshooting
Upload fails with 500 error
- Check
/data/uploads/directory exists and is writable - Verify volume mount in docker-compose.yml
- Check container logs:
docker logs apiservice
Files not persisting after container restart
- Ensure volume mount is configured:
./data/uploads:/data/uploads - Check file permissions on host directory
Airflow job not triggering
- Verify Airflow is accessible from container
- Check network connectivity:
docker network inspect shared_data_network - Verify Airflow credentials and DAG ID
- Check logs in upload history
Development
To add new data management pages:
- Create HTML template in
app/templates/ - Add route in
app/routes/pages.py - Update landing page menu in
index.html - Add submenu item if needed
Example:
@router.get("/data-management/hr", response_class=HTMLResponse)
async def hr_page(request: Request):
return templates.TemplateResponse(
"data_management_hr.html",
{"request": request, "root_path": settings.ROOT_PATH}
)