# GF Images Library

The `gf_images_lib` package is the core image processing and management system in GloFlow. It provides a complete solution for uploading, processing, storing, and serving images, with support for multiple storage backends.
## Overview

The images library handles the full lifecycle of images in the GloFlow system:

- Upload Management: Client-side direct uploads via presigned URLs
- Image Processing: Automatic thumbnail generation, format conversion, and transformations
- Storage: Flexible storage backend support (S3, local filesystem, IPFS)
- Jobs System: Asynchronous processing of image operations
- Flows: Organization of images into logical collections
- Classification: ML-based image classification capabilities
## Architecture

The library is organized into several key subsystems:

- `gf_images_core`: Core image data structures and operations
- `gf_images_storage`: Storage abstraction layer
- `gf_images_jobs_core`: Jobs manager for async processing
- `gf_images_service`: High-level service layer
- `gf_images_flows`: Image collections/flows management
- `gf_gif_lib`: GIF-specific handling
- `gf_image_editor`: Image editing capabilities
## HTTP Handlers

### Image Operations

#### `/images/d/<image_name>`

- Method: GET
- Auth: Not required
- Description: Resolve and redirect to the public URL of an image.

Query Parameters:
- `fname` - Flow name (default: "general")
#### `/v1/images/get`

- Method: GET
- Auth: Optional
- Description: Retrieve metadata and information about a specific image.

Query Parameters:
- `img_id` - GloFlow image ID
#### `/v1/images/classify`

- Method: POST
- Auth: Required
- Description: Classify one or more images using ML models.

Request Body:
#### `/v1/images/share`

- Method: POST
- Auth: Required
- Description: Share an image via email.

Request Body:
### Upload Workflow

The upload process is a two-step workflow designed for efficient direct-to-storage uploads.

#### Step 1: `/v1/images/upload_init`

- Method: GET
- Auth: Required
- Description: Initialize an upload and receive a presigned URL for direct upload to storage.

Query Parameters:
- `imgf` - Image format (e.g., "jpeg", "png")
- `imgn` - Image name (optional)
- `f` - Comma-separated flow names
- `ct` - Client type
Upload sequence:
1. Client calls this endpoint
2. Server generates a unique image ID
3. Server creates a presigned S3 URL (if using S3)
4. Client uploads directly to storage using the presigned URL
5. Client calls `upload_complete`
#### Step 2: `/v1/images/upload_complete`

- Method: POST
- Auth: Required
- Description: Notify the server that the upload is complete and trigger processing.

Query Parameters:
- `imgid` - Image ID from `upload_init`
- `f` - Comma-separated flow names

On completion:
- Image is validated
- Thumbnails are generated
- Image is added to specified flows
- Job ID is returned for status tracking
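The two-step workflow above can be sketched as a minimal Python client. The query parameter names (`imgf`, `imgn`, `f`, `ct`, `imgid`) come from the documentation; the JSON response field names (`presigned_url`, `image_id_str`) are assumptions made for illustration, not the documented response shape.

```python
import json
import urllib.parse
import urllib.request

def build_upload_init_url(base_url, img_format, img_name, flows, client_type):
    """Build the Step 1 URL; parameter names follow the docs above."""
    query = urllib.parse.urlencode({
        "imgf": img_format,
        "imgn": img_name,
        "f": ",".join(flows),
        "ct": client_type,
    })
    return f"{base_url}/v1/images/upload_init?{query}"

def upload_image(base_url, file_bytes, img_format, img_name, flows,
                 client_type="py_client"):
    """Run the two-step upload: init -> direct PUT to storage -> complete."""
    # Step 1: ask the server for an image ID and a presigned storage URL.
    init_url = build_upload_init_url(base_url, img_format, img_name,
                                     flows, client_type)
    with urllib.request.urlopen(init_url) as resp:
        init = json.load(resp)  # field names below are assumed, not documented

    # Step 2: upload the bytes straight to storage, bypassing the app server.
    put_req = urllib.request.Request(init["presigned_url"],
                                     data=file_bytes, method="PUT")
    urllib.request.urlopen(put_req).close()

    # Step 3: notify the server so processing (thumbnails, flows) can start;
    # the response carries a job ID usable with /images/jobs/status.
    complete_url = (f"{base_url}/v1/images/upload_complete?"
                    + urllib.parse.urlencode({"imgid": init["image_id_str"],
                                              "f": ",".join(flows)}))
    complete_req = urllib.request.Request(complete_url, method="POST")
    with urllib.request.urlopen(complete_req) as resp:
        return json.load(resp)
```

Because the image bytes travel directly to storage in Step 2, the application server only handles small control-plane requests.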
#### `/v1/images/upload_metrics`

- Method: POST
- Auth: Required
- Description: Report client-side upload metrics for monitoring.

Query Parameters:
- `imgid` - Image ID
- `ct` - Client type
### Jobs Management

#### `/images/jobs/start`

- Method: POST
- Auth: Not yet enforced (authentication is planned; fix in progress)
- Description: Start a new image processing job for external images.

Request Body:
#### `/images/jobs/status`

- Method: GET (SSE - Server-Sent Events)
- Auth: Not required
- Description: Stream real-time status updates for a running job.

Query Parameters:
- `images_job_id_str` - Job ID to monitor

Event types:
- `ok` - Processing step completed successfully
- `error` - Error occurred, with error details
- `completed` - Job finished
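Consuming this stream amounts to splitting the response into standard SSE `event:`/`data:` fields. A minimal parser sketch (standard SSE framing is assumed; the exact payload format of each event is not documented here):

```python
def parse_sse_events(stream_lines):
    """Parse Server-Sent Events lines into (event_type, data) pairs.

    Yields one pair per event; a blank line terminates each event,
    per the SSE wire format.
    """
    event_type, data_lines = None, []
    for line in stream_lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":  # blank line: dispatch the accumulated event
            if event_type or data_lines:
                yield event_type, "\n".join(data_lines)
            event_type, data_lines = None, []
```

A client would iterate these pairs until it sees a `completed` (or terminal `error`) event for the monitored job.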
### Flows Management

#### `/v1/images/flows/all`

- Method: GET
- Auth: Not required
- Description: Get a list of all available image flows.

Response:
#### `/v1/images/flows/add_img`

- Method: POST
- Auth: Required
- Description: Add an external image to one or more flows.

Request Body:
### Image Editor

#### `/images/editor/save`

- Method: POST
- Auth: Not required (authentication should be added)
- Description: Save an edited image.
- Use Case: Client-side image editing with server-side persistence.
### GIF Operations

#### `/images/gif/get_info`

- Method: GET
- Auth: Not required
- Description: Retrieve information about a GIF image.

Query Parameters:
- `orig_url` - Original URL of the GIF (option 1)
- `gfimg_id` - GloFlow image ID (option 2)
### Browser Client Processing

#### `/v1/images/c`

- Method: POST
- Auth: Not required
- Description: Receive processing results from browser-based distributed jobs.

Request Body:
### Health Check

#### `/v1/images/v1/healthz`

- Method: GET
- Auth: Not required
- Description: Health check endpoint for infrastructure monitoring.

Note: The path will be simplified to `/v1/images/healthz` in a future version.
## Storage System

The storage system provides a unified interface for multiple storage backends.

### Storage Types

#### Local Storage

Files are stored on the local filesystem.

Configuration:
- `ThumbsDirPathStr` - Directory for thumbnails
- `UploadsSourceDirPathStr` - Upload staging directory
- `UploadsTargetDirPathStr` - Final upload destination
- `ExternImagesDirPathStr` - External images cache
#### S3 Storage

AWS S3 cloud storage.

Features:
- Presigned URL generation for direct uploads
- Multi-bucket support for different image types
- Automatic public URL generation

Configuration:
- `ThumbsS3bucketNameStr` - S3 bucket for thumbnails
- `UploadsSourceS3bucketNameStr` - Upload staging bucket
- `UploadsTargetS3bucketNameStr` - Final upload bucket
- `ExternImagesS3bucketNameStr` - External images bucket
#### IPFS Storage

InterPlanetary File System for decentralized storage.

Features:
- Content-addressed storage
- Permanent, immutable image references
- Distributed hosting

Configuration:
- `IPFSnodeHostStr` - IPFS node connection
### Storage Operations

The storage layer provides these core operations:

- Get: Download an image from storage to the local filesystem
- Put: Upload an image from the local filesystem to storage
- Copy: Copy an image between storage locations/buckets
- GeneratePresignedURL: Create temporary upload/download URLs
### Multi-Storage Support

Images can be stored across multiple backends simultaneously:

- Primary storage for serving (typically S3)
- Backup storage (local or another cloud)
- Archival storage (IPFS for permanence)
## Jobs Manager

The jobs manager handles asynchronous image processing operations.

### Job Types

#### External Images Job

Process images from external URLs.

Operations:
- Fetch image from URL
- Validate format and size
- Generate thumbnails (multiple sizes)
- Extract metadata
- Store in configured storage
- Add to flows
- Update database
#### Uploaded Images Job

Process user-uploaded images.

Operations:
- Validate uploaded file
- Detect format
- Generate thumbnails
- Create multiple sizes/formats
- Move from staging to production storage
- Index in database
#### Local Images Job

Process images already on the local filesystem.

Use Case: Batch processing, migration, or background processing of local image sets.

#### Image Classification Job

Run ML models on images for automatic tagging.

Operations:
- Load image
- Preprocess for ML model
- Run classification model
- Extract top classes/tags
- Store classification results
### Job Lifecycle

1. Submission: Client submits job via API
2. Initialization: Job manager creates job record
3. Processing: Worker processes images
4. Updates: Status updates sent via SSE or events
5. Completion: Final status persisted to database
6. Cleanup: Temporary files removed
### Job Status Tracking

Jobs can be in these states:

- `running` - Currently processing
- `completed` - Successfully finished
- `failed` - Failed completely
- `failed_partial` - Some images failed, some succeeded
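To show how a batch job might derive these terminal statuses from per-image outcomes, here is a sketch (the real jobs manager's internals are not shown here; `process_image` is a hypothetical per-image handler):

```python
def run_images_job(image_urls, process_image):
    """Process a batch of images and derive the job's terminal status.

    An exception from process_image marks that one image as failed;
    one bad image must not abort the rest of the batch.
    """
    errors = {}
    for url in image_urls:
        try:
            process_image(url)
        except Exception as e:
            errors[url] = str(e)
    if not errors:
        return "completed", errors
    if len(errors) == len(image_urls):
        return "failed", errors
    return "failed_partial", errors  # mixed outcome: some ok, some failed
```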
### Job Updates

Real-time updates are delivered via Server-Sent Events:

- `ok` - Step completed successfully
- `error` - Error with details
- `completed` - Job finished
## Image Flows

Flows are logical collections of images organized by purpose or category.

### Flow Concepts

- Flow Name: Unique identifier (e.g., "gallery", "products", "avatars")
- Flow-to-S3-Bucket Mapping: Each flow can use a different S3 bucket
- Multi-Flow Images: Images can belong to multiple flows
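Flow-to-bucket routing can be sketched as a simple lookup over an image's flow names (the mapping and bucket names below are hypothetical, not GloFlow's actual configuration):

```python
def resolve_s3_bucket(flow_names, flow_to_bucket, default_bucket):
    """Pick the S3 bucket for an image based on its flows.

    Uses the first flow that has an explicit bucket mapping;
    falls back to the default bucket otherwise.
    """
    for name in flow_names:
        if name in flow_to_bucket:
            return flow_to_bucket[name]
    return default_bucket
```

Because an image can belong to multiple flows, a deterministic rule like "first mapped flow wins" keeps storage placement predictable.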
### Common Flow Patterns

- `general`: Default flow for miscellaneous images
- `gallery`: Public image galleries
- `products`: E-commerce product images
- `avatars`: User profile images
- `backgrounds`: Background/wallpaper images
### Flow Operations
- Add Image to Flow: Associate existing or new image with flow
- List All Flows: Get available flows
- Get Flow Images: Query images in a specific flow
- Flow-based Storage: Route images to appropriate S3 buckets
## Image Processing Pipeline

### Thumbnail Generation

Multiple thumbnail sizes are generated automatically:

- Small: Quick preview (typically 150x150)
- Medium: List/grid view (typically 400x400)
- Large: Detail view (typically 800x800)
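Aspect-ratio-preserving thumbnail sizing reduces to computing a single scale factor against the target bounding box; a sketch (the library's actual resizing code is not shown, and the "never upscale" choice is an assumption):

```python
def thumb_size(width, height, max_side):
    """Compute thumbnail dimensions fitting in a max_side x max_side box.

    Preserves aspect ratio and never upscales a smaller source image.
    """
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 1600x1200 source scaled to the "large" 800x800 box keeps its 4:3 ratio, yielding 800x600.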
### Image Transformations
- Format conversion (JPEG, PNG, WebP)
- Resizing with aspect ratio preservation
- Quality optimization
- Metadata extraction (EXIF, dimensions, format)
### Processing Plugins

An extensible plugin system allows custom processing:

- Custom filters
- Watermarking
- Face detection
- Custom ML models
## Best Practices

### Upload Workflow

- Use `upload_init`/`upload_complete`: This avoids server bandwidth bottlenecks
- Upload directly to S3: Use presigned URLs for client-to-S3 uploads
- Track job status: Monitor processing via SSE endpoint
- Handle failures: Implement retry logic for failed uploads
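One way to implement the "handle failures" advice is a small exponential-backoff retry helper around any upload step (the attempt count and delays below are illustrative defaults, not values from the library):

```python
import time

def retry(fn, attempts=4, base_delay_s=0.5, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on any exception.

    Re-raises the last exception once all attempts are exhausted.
    """
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            sleep(base_delay_s * (2 ** i))  # 0.5s, 1s, 2s, ...
```

A client would wrap the presigned-URL PUT (and the `upload_complete` call) in `retry`, since transient network errors are the common failure mode for direct-to-storage uploads.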
### Performance
- Use appropriate flows: Organize images for efficient storage
- Leverage thumbnails: Serve thumbnails for lists/grids
- CDN integration: Use CloudFront or similar for S3 buckets
- Batch operations: Use jobs for processing multiple images
### Security
- Authenticate uploads: All upload endpoints require auth
- Validate formats: Server validates image formats and sizes
- Rate limiting: Implement rate limits for upload endpoints
- Presigned URL expiration: URLs expire after short time window
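To illustrate why expiring presigned URLs are safe to hand to untrusted clients: the URL embeds an expiry timestamp and a signature over the path plus expiry, so a client can neither extend the window nor retarget the URL. A toy HMAC sketch of the idea (real S3 presigning uses AWS Signature Version 4; this is not GloFlow's implementation, and the secret is illustrative):

```python
import hashlib
import hmac
import time
import urllib.parse

SECRET = b"demo-secret"  # illustrative only; never hardcode real secrets

def presign(path, ttl_s, now=None):
    """Return path?expires=...&sig=... signed over 'path|expires'."""
    expires = int(now if now is not None else time.time()) + ttl_s
    sig = hmac.new(SECRET, f"{path}|{expires}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{path}?{urllib.parse.urlencode({'expires': expires, 'sig': sig})}"

def verify(url, now=None):
    """Check the signature and that the URL has not expired."""
    path, query = url.split("?", 1)
    params = dict(urllib.parse.parse_qsl(query))
    expires = int(params["expires"])
    expected = hmac.new(SECRET, f"{path}|{expires}".encode(),
                        hashlib.sha256).hexdigest()
    current = now if now is not None else time.time()
    return hmac.compare_digest(expected, params["sig"]) and current < expires
```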
## Configuration

### Required Configuration

### Storage Configuration
## Error Handling

Common error scenarios:

- Invalid image format: Server rejects non-image uploads
- File too large: Size limits enforced
- Storage failure: S3 upload/download errors
- Processing timeout: Long-running jobs may timeout
- Flow not found: Unknown flow name in request
## Metrics and Monitoring

### Available Metrics
- Upload success/failure rates
- Processing time per image
- Thumbnail generation time
- Storage operation latency
- Job queue depth
- Error rates by type
### Monitoring Endpoints

- `/v1/images/v1/healthz` - Service health
- Job status SSE streams - Real-time job monitoring
## Future Improvements

Planned enhancements noted in the code:

- Authentication: Add auth to image editor endpoints
- Bulk operations: `/images/d/bulk` for batch URL resolution
- Path simplification: Clean up the healthz endpoint path
- Event streaming: Move all clients to the new events system
- Flow arguments: Accept `flows_names` from HTTP args in jobs