A robust service designed to efficiently process thousands of store visit images, ensuring accurate management and streamlined operations for retail businesses. Additionally, I created a frontend interface for this service, which can be accessed at store-management-processor.fly.dev.
Retail Pulse requires a system to handle and process large volumes of images collected from various stores. This application addresses that need by:
- Receiving Jobs with Image URLs and Store IDs:
  - Accepts multiple concurrent jobs, each containing numerous images.
  - Jobs may take from a few minutes to an hour to complete.
- Processing Images:
  - Downloads each image and calculates its perimeter using the formula `2 * (Height + Width)`; a minimal sketch of this step appears after this list.
  - Introduces a random sleep time between 0.1 and 0.4 seconds to simulate GPU processing.
- Validating Store Information:
  - Cross-references each submitted `store_id` with the provided Store Master data, which includes `store_id`, `store_name`, and `area_code`.
- Job Tracking:
  - Provides APIs to submit jobs and retrieve their status and results.
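The per-image step above can be sketched with only standard-library packages. This is a minimal illustration, not the service's actual code; the function name `processImage` is made up here:

```go
package main

import (
	"fmt"
	"image"
	_ "image/jpeg" // register the JPEG decoder
	_ "image/png"  // register the PNG decoder
	"math/rand"
	"net/http"
	"time"
)

// processImage downloads one image, computes its perimeter as
// 2 * (Height + Width), and sleeps 0.1-0.4 s to simulate GPU work.
func processImage(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	// DecodeConfig reads only the header, which is enough for dimensions.
	cfg, _, err := image.DecodeConfig(resp.Body)
	if err != nil {
		return 0, err
	}
	perimeter := 2 * (cfg.Height + cfg.Width)

	// Simulated GPU processing delay: random duration in [100ms, 400ms).
	time.Sleep(time.Duration(100+rand.Intn(300)) * time.Millisecond)

	return perimeter, nil
}

func main() {
	p, err := processImage("https://www.gstatic.com/webp/gallery/2.jpg")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("perimeter:", p)
}
```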
- High Throughput Processing: Efficiently manages multiple jobs with thousands of images.
- Accurate Store Validation: Ensures all `store_id`s are validated against the master data.
- Simulated GPU Processing: Mimics real-world processing delays to provide realistic performance metrics.
- Comprehensive Job Tracking: Allows users to monitor the status and results of their submitted jobs.
- Store Master Data: Assumes the provided CSV file contains accurate and up-to-date store information (a loading and validation sketch follows this list).
- Image Processing: Focuses solely on perimeter calculation without additional image transformations.
- Simulated Processing Delay: The random sleep time is used to emulate GPU processing times.
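For context, the Store Master lookup could be loaded once at startup along these lines. The sketch assumes a header row, the column order `store_id`, `store_name`, `area_code`, and a hypothetical file name `store_master.csv`:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

// Store mirrors one row of the Store Master CSV.
type Store struct {
	StoreID   string
	StoreName string
	AreaCode  string
}

// loadStoreMaster reads the CSV into a map keyed by store_id so that a
// submitted store_id can be validated with a single lookup. It assumes a
// header row and the column order store_id, store_name, area_code.
func loadStoreMaster(path string) (map[string]Store, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	rows, err := csv.NewReader(f).ReadAll()
	if err != nil {
		return nil, err
	}

	stores := make(map[string]Store)
	for i, row := range rows {
		if i == 0 || len(row) < 3 {
			continue // skip the header and malformed rows
		}
		stores[row[0]] = Store{StoreID: row[0], StoreName: row[1], AreaCode: row[2]}
	}
	return stores, nil
}

func main() {
	stores, err := loadStoreMaster("store_master.csv") // hypothetical file name
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if _, ok := stores["RP00001"]; !ok {
		fmt.Println("store_id RP00001 not found in Store Master data")
	}
}
```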
The application is deployed and accessible at: [Store Management Processor System](https://store-management-processor.fly.dev)
- Submit Job
  - Endpoint: `POST /api/submit`
  - Description: Submits a new job for processing store visit images.
  - Request Body:

```json
{
"count": 2,
"visits": [
{
"store_id": "RP00001",
"image_url": [
"https://www.gstatic.com/webp/gallery/2.jpg",
"https://www.gstatic.com/webp/gallery/3.jpg"
],
"visit_time": "2024-03-15T10:00:00Z"
},
{
"store_id": "RP00002",
"image_url": [
"https://www.gstatic.com/webp/gallery/4.jpg"
],
"visit_time": "2024-03-15T11:00:00Z"
}
]
}
```

- Check Job Status
  - Endpoint: `GET /api/status?jobid=<job_id>`
  - Description: Retrieves the status and results of a submitted job.
  - Parameters:
    - `jobid`: The unique identifier of the job.
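For reference, the request body shown above maps naturally onto Go structs like the following. The type names, field layout, and the count-versus-visits check are assumptions for illustration, not the service's documented behaviour:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SubmitRequest mirrors the /api/submit payload shown above.
// Names are illustrative, not the service's actual types.
type SubmitRequest struct {
	Count  int     `json:"count"`
	Visits []Visit `json:"visits"`
}

// Visit is one store visit with its image URLs.
type Visit struct {
	StoreID   string    `json:"store_id"`
	ImageURLs []string  `json:"image_url"`
	VisitTime time.Time `json:"visit_time"`
}

func main() {
	payload := []byte(`{"count":1,"visits":[{"store_id":"RP00001","image_url":["https://www.gstatic.com/webp/gallery/2.jpg"],"visit_time":"2024-03-15T10:00:00Z"}]}`)

	var req SubmitRequest
	if err := json.Unmarshal(payload, &req); err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	// A sanity check a server might perform (an assumption, not a documented rule).
	if req.Count != len(req.Visits) {
		fmt.Println("count does not match the number of visits")
		return
	}
	fmt.Printf("decoded %d visit(s)\n", len(req.Visits))
}
```

The JSON tags above are taken directly from the sample payload; the Go-side names are only a plausible shape for them.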
- Go: Ensure Go is installed on your system.
- Docker: Required for containerized setup.
- Clone the Repository:

  ```bash
  git clone https://github.com/hitaarthh/store-management-processor
  cd store-management-processor
  ```

- Build and Run the Docker Container:

  ```bash
  docker build -t store-management-processor .
  docker run -p 8081:8081 store-management-processor
  ```

- Access the Application: Open `http://localhost:8081` in your browser.
- Clone the Repository:

  ```bash
  git clone https://github.com/hitaarthh/Store-Management-Processor
  cd store-management-processor
  ```

- Run the Application:

  ```bash
  go run main.go
  ```

- Access the Application: Open `http://localhost:8081` in your browser.
- Submit a Job: Use tools like Postman or curl to send a POST request to `/api/submit` with the required JSON body (a minimal Go client sketch follows this list).
- Check Job Status: Send a GET request to `/api/status?jobid=<job_id>` to retrieve the status of a submitted job.
- Error Handling: Test with invalid `store_id`s or malformed JSON to ensure proper error responses.
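The first two checks can also be scripted. Below is a minimal Go client that assumes a locally running instance on port 8081 and leaves the `<job_id>` placeholder for you to fill in from the submit response:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// baseURL points at a locally running instance; adjust for the deployment.
const baseURL = "http://localhost:8081"

func main() {
	body := []byte(`{
	  "count": 1,
	  "visits": [
	    {
	      "store_id": "RP00001",
	      "image_url": ["https://www.gstatic.com/webp/gallery/2.jpg"],
	      "visit_time": "2024-03-15T10:00:00Z"
	    }
	  ]
	}`)

	// Submit a job.
	resp, err := http.Post(baseURL+"/api/submit", "application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println("submit failed:", err)
		return
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println("submit response:", string(out))

	// Check a job's status; replace <job_id> with the ID returned above.
	status, err := http.Get(baseURL + "/api/status?jobid=<job_id>")
	if err != nil {
		fmt.Println("status check failed:", err)
		return
	}
	defer status.Body.Close()
	out, _ = io.ReadAll(status.Body)
	fmt.Println("status response:", string(out))
}
```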
- Operating System: macOS Monterey 12.0.1
- Text Editor/IDE: Visual Studio Code 1.62.3
- Programming Language: Go 1.17.3
- Libraries: Standard Go libraries (`net/http`, `encoding/json`, `image`, etc.)
- Tools: Docker 20.10.8, Fly.io CLI 0.0.250
Based on the Store Master schema and the original requirements, here are some future improvements that build on the existing functionality and data:
- Heat Map Visualization:
  - Create a live heat map that shows store density by area code using the `area_code` column from the store master data.
  - This can be used to identify high-density regions for better resource allocation.
- Performance Metrics Comparison:
  - Compare job completion times, error rates, and visit frequencies for stores within the same area code.
  - Highlight underperforming or high-potential regions.
- Route Optimization for Store Visits:
  - Implement an API that calculates the optimal route for visiting multiple stores in a single trip.
  - This can help field teams save time and fuel by organizing visits more efficiently.
- Visit Frequency Tracking:
  - Log visit timestamps and calculate metrics like the average number of visits per store or peak visit times.
  - This would help monitor store engagement trends over time.
- Image Processing Metrics:
  - Track metrics like success/failure rates of image downloads and processing times for each store.
  - Provide insights into which stores are consistently problematic.
- Custom Store Grouping:
  - Allow users to define and group stores based on criteria like store names (e.g., "PAN shops") or area codes.
  - Display aggregated data for these groups to identify patterns or outliers.
- Similar Store Recommendations:
  - Use string similarity algorithms to recommend stores with similar naming patterns (a rough sketch appears after this list).
  - Example: A search for "PAN Corner" could also return "Pan Paradise" and "The Pan Stop."
- Automatic Store Categorization:
  - Categorize stores (e.g., "Hotels," "Pan Shops," "Grocery Stores") using keywords from the `store_name` column.
  - This enables tailored processing rules for specific categories.
- Category-Specific Rules:
  - Implement rules like giving higher priority to certain store types (e.g., hotels) or customizing image validation criteria based on the category.
- Trends by Area Code:
  - Analyze area performance trends like increasing/decreasing visit frequency or average processing times.
  - Use this data to identify regions needing more attention or resources.
- Custom Reporting:
  - Generate detailed reports segmented by area codes, showing metrics like job completion rates, visit counts, and error rates.
- Load Balancing for Processing:
  - Dynamically distribute processing workloads based on store density in each area.
  - Regions with high store counts get assigned more resources to handle the load efficiently.
- Real-Time Error Reporting:
  - Enhance the `/api/status` endpoint to provide real-time updates about errors encountered during image processing.
  - Users can take immediate corrective actions instead of waiting for the job to fail.
- Live Job Monitoring:
  - Create a dashboard that displays live progress updates for ongoing jobs, showing the percentage of completed images and any encountered errors.
- Predictive Analytics:
  - Use historical data to predict job completion times, error probabilities, or store performance trends.
  - Example: Flag stores with historically high error rates for preemptive resolution.
- Anomaly Detection:
  - Implement machine learning models to detect anomalies in visit patterns, such as sudden drops in visit frequency or unusual processing times.
- Distributed Processing:
  - Migrate image processing to a distributed system to handle higher volumes of concurrent jobs.
  - Use tools like Kubernetes for scaling.
- Cloud Integration:
  - Integrate with cloud services for faster image processing and dynamic resource allocation.
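As a rough illustration of the "Similar Store Recommendations" idea, store names could be ranked by a normalized Levenshtein score over `store_name`. The sample names and the scoring choice below are purely illustrative:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// levenshtein returns the edit distance between two strings.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min(min(curr[j-1]+1, prev[j]+1), prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// similarity normalizes the edit distance into a 0-1 score (1 = identical).
func similarity(a, b string) float64 {
	a, b = strings.ToLower(a), strings.ToLower(b)
	maxLen := len([]rune(a))
	if l := len([]rune(b)); l > maxLen {
		maxLen = l
	}
	if maxLen == 0 {
		return 1
	}
	return 1 - float64(levenshtein(a, b))/float64(maxLen)
}

// similarStores ranks store names by their similarity to the query.
func similarStores(query string, names []string) []string {
	sort.Slice(names, func(i, j int) bool {
		return similarity(query, names[i]) > similarity(query, names[j])
	})
	return names
}

func main() {
	names := []string{"Pan Paradise", "The Pan Stop", "Hotel Sunrise", "Grocery Hub"}
	fmt.Println(similarStores("PAN Corner", names))
}
```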

