This project provides a REST API to extract text from images using Azure Computer Vision OCR. Users can upload images to this FastAPI-based API and receive the extracted text in JSON format.
- Support for JPEG and PNG formats: The API accepts images in JPEG and PNG formats.
- Automatic format conversion: Images with unsupported modes (e.g., RGBA) are automatically converted to RGB.
- Resizing large images: Very large images are automatically resized for optimal processing.
- Pillow integration: Uses the Pillow library for image processing.
- Azure OCR integration: Leverages Azure Computer Vision OCR for fast and accurate text extraction.
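The conversion and resizing features listed above can be sketched with Pillow roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `prepare_image` and the `MAX_SIDE` limit are assumptions, since the real threshold is not stated here.

```python
import io

from PIL import Image

# Hypothetical limit for illustration; the project's real threshold is not stated.
MAX_SIDE = 4000


def prepare_image(data: bytes, max_side: int = MAX_SIDE) -> bytes:
    """Return PNG bytes in RGB mode, no larger than max_side on either side."""
    image = Image.open(io.BytesIO(data))

    # Modes such as RGBA or P are converted to plain RGB.
    # Note that convert("RGB") simply discards any alpha channel.
    if image.mode != "RGB":
        image = image.convert("RGB")

    # thumbnail() shrinks in place, keeps the aspect ratio, and never enlarges.
    image.thumbnail((max_side, max_side))

    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()
```

Passing a smaller `max_side` makes the behavior easy to check on small test images.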
- Python 3.8 or above
- Azure Computer Vision subscription key and endpoint
- Python dependencies listed in the requirements.txt file.
- Clone the repository:
  git clone https://github.com/GunalHincal/azure-ocr-api.git
  cd azure-ocr-api
- Install dependencies:
  pip install -r requirements.txt
- Set up the configuration:
  - Create a config.py file. This file stores the Azure credentials needed to connect to the Azure service.
  - Add the following to config.py:
    AZURE_ENDPOINT = "https://<your-endpoint>.cognitiveservices.azure.com/"
    AZURE_KEY = "<your-subscription-key>"
- Run the API:
  uvicorn main:app --reload
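For the configuration step, one possible alternative to hardcoding credentials in config.py is to read them from environment variables. The sketch below is an assumption, not part of the repository: the helper `load_azure_settings` is hypothetical, and the variable names simply mirror the names used in config.py.

```python
import os


def load_azure_settings(env=None):
    """Return (endpoint, key) read from environment-style variables."""
    env = os.environ if env is None else env
    endpoint = env.get("AZURE_ENDPOINT", "").strip()
    key = env.get("AZURE_KEY", "").strip()

    # Fail early with a clear message instead of at the first OCR request.
    missing = [
        name
        for name, value in (("AZURE_ENDPOINT", endpoint), ("AZURE_KEY", key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing Azure settings: " + ", ".join(missing))

    # Normalize to a single trailing slash, matching the config.py example.
    return endpoint.rstrip("/") + "/", key
```

With this approach, config.py could contain `AZURE_ENDPOINT, AZURE_KEY = load_azure_settings()`, which keeps secrets out of version control.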
Here are the available API endpoints:
- Root endpoint:
  - URL: /
  - Method: GET
  - Description: Checks if the API is running.
- Text extraction endpoint:
  - URL: /extract-text/
  - Method: POST
  - Body: multipart/form-data
  - Description: Upload an image file under the file key.
  - Response: {"extracted_text": ["Extracted text line 1", "Extracted text line 2"]}
  - On Error: {"error": "Error message"}
  - Common Error Types:
    - "Invalid Image Format" if the uploaded image is not in JPEG or PNG format.
    - "Azure OCR API Error" if there are issues with the Azure service.
    - "Internal Server Error" for general server-side problems.
- Access the API documentation:
  Open the following URL in your browser: http://127.0.0.1:8000/docs
  You can test the API using the Swagger UI, which provides a graphical interface for testing endpoints.
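A client can tell the two documented response shapes apart by checking for the error key. The following is a small sketch of that idea; `OcrApiError` and `lines_from_response` are hypothetical names introduced for illustration.

```python
import json


class OcrApiError(Exception):
    """Raised when the API response carries an "error" field."""


def lines_from_response(raw):
    """Return the extracted text lines, or raise OcrApiError on an error body."""
    payload = json.loads(raw)
    if "error" in payload:
        raise OcrApiError(payload["error"])
    return payload["extracted_text"]
```

For example, `"\n".join(lines_from_response(body))` turns a successful response into a single block of text.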
- Start the API server:
  uvicorn main:app --reload
- Open your browser and navigate to:
  http://127.0.0.1:8000/docs
- Locate the POST /extract-text/ endpoint in the Swagger UI.
- Click the Try it out button.
- Upload an image file under the file parameter.
- Click Execute to send the request.
- View the response in the Responses section, where the extracted text will be displayed as JSON.
Alternatively, you can test the API using Postman.
- Open Postman and create a new POST request.
- Set the request URL to:
  http://127.0.0.1:8000/extract-text/
- In the Body tab, select form-data and add the following:
  - Key: file (type: File)
  - Value: Upload your image file (e.g., example.png).
- Click Send to make the request.
If successful, the response will contain the extracted text in JSON format.
curl -X POST "http://127.0.0.1:8000/extract-text/" \
  -H "accept: application/json" \
  -F "file=@example.png"
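The same request can also be sent from Python. Below is a minimal sketch using only the standard library, assuming the server is running locally at the URL shown above. The helpers `build_multipart` and `extract_text` are hypothetical names, not part of the project.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "http://127.0.0.1:8000/extract-text/"


def build_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def extract_text(image_path, url=API_URL):
    """POST the image under the "file" key and return the decoded JSON body."""
    path = Path(image_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for non-2xx status codes.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `extract_text("example.png")` mirrors the curl command. Which HTTP status the API uses for error responses is not stated here, so callers may want to handle both an `HTTPError` and an `"error"` key in the returned JSON.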
You can use Docker to containerize and run the application. Here's a simple approach:
- Create a Dockerfile:
  FROM python:3.9-slim
  WORKDIR /app
  COPY requirements.txt .
  RUN pip install --no-cache-dir -r requirements.txt
  COPY . .
  CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
- Build the Docker image:
  docker build -t azure-ocr-api .
- Run the Docker container:
  docker run -p 8000:8000 azure-ocr-api
- Now, you can access the API at http://localhost:8000.
This project is live and can be accessed here: Render Deploy Link
- Unsupported Image Formats: Only JPEG and PNG formats are supported, and files must be 5 MB or less.
- Large Images: Very large images are resized automatically.
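These limits can be checked on the client before uploading. The sketch below is an illustration under stated assumptions: `check_upload` is a hypothetical helper, "5 MB" is interpreted as 5 × 1024 × 1024 bytes, and the format is detected from the file's leading signature bytes, which may differ from how the server validates uploads.

```python
# Assumption: "5 MB" is read as 5 * 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024

# Leading signature bytes of the two supported formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def check_upload(data: bytes) -> str:
    """Return "jpeg" or "png" if the bytes look uploadable, else raise ValueError."""
    if len(data) > MAX_BYTES:
        raise ValueError(f"File is {len(data)} bytes; the limit is {MAX_BYTES}")
    for name, signature in SIGNATURES.items():
        if data.startswith(signature):
            return name
    raise ValueError("Only JPEG and PNG images are supported")
```

Running this check first avoids a round trip to the server for files that would be rejected anyway.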
- Fork this repository.
- Create a new branch:
  git checkout -b feature/feature-name
- Make your changes and commit them:
  git commit -m "Added a new feature"
- Push your branch:
  git push origin feature/feature-name
- Open a Pull Request.
Stay connected and follow me for updates on my projects, insights, and tutorials:
- LinkedIn: Connect with me professionally to learn more about my work and collaborations.
- Medium: Check out my blog for articles on technology, data science, and more!
Feel free to reach out or follow for more updates! 😊 Have Fun!