A Node.js application for processing data using Large Language Models (LLMs).
## Prerequisites

- Node.js 22 or higher
- Docker
- An OpenAI API key
## Notes

- The application always spins up 4 worker threads.
- Chosen LLM model: `gpt-4o-mini`. This assumes that minor accuracy trade-offs in edge cases are acceptable in exchange for cost efficiency, which matters in a production environment where millions of entries would be processed. If that trade-off is not acceptable, switch the model in the code to `gpt-4o`.
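As a rough illustration of how a fixed pool of 4 workers might divide the input, here is a minimal sketch; the helper name, the round-robin strategy, and the worker wiring in the comment are assumptions, not the project's actual code:

```javascript
// Sketch: split entries into one chunk per worker (names are illustrative).
const WORKER_COUNT = 4;

function partition(entries, parts = WORKER_COUNT) {
  // Round-robin distribution keeps chunk sizes within one entry of each other.
  const chunks = Array.from({ length: parts }, () => []);
  entries.forEach((entry, i) => chunks[i % parts].push(entry));
  return chunks;
}

// Each chunk could then be handed to one worker thread, e.g. via
// `new Worker('./worker.js', { workerData: chunk })` from node:worker_threads.
```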
For a production environment it would be wise to upgrade the script to respect the rate limits reported in the response headers received from the OpenAI API.
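A minimal sketch of what that could look like: the `x-ratelimit-*` header names below are the ones OpenAI documents, while the duration parsing and backoff logic are illustrative assumptions:

```javascript
// Parse reset durations such as "85ms", "1s", or "6m0s" into milliseconds.
function parseReset(value) {
  let ms = 0;
  for (const [, num, unit] of value.matchAll(/(\d+(?:\.\d+)?)(ms|s|m|h)/g)) {
    const factor = { ms: 1, s: 1000, m: 60000, h: 3600000 }[unit];
    ms += parseFloat(num) * factor;
  }
  return ms;
}

// Return how long to pause before the next request; 0 if budget remains.
function backoffMs(headers) {
  const remaining = Number(headers['x-ratelimit-remaining-requests']);
  if (remaining > 0) return 0;
  return parseReset(headers['x-ratelimit-reset-requests'] ?? '1s');
}
```

The same pattern applies to the token-based headers (`x-ratelimit-remaining-tokens`, `x-ratelimit-reset-tokens`).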
## Setup

Launch Redis using Docker Compose:

```bash
docker compose up -d
```
Install dependencies and build the project:

```bash
npm install
npm run build
```
## Usage

Run the main application:

```bash
npm start
```
Generate custom test datasets:

```bash
# Generate data with specified number of entries per file
npm run generate-data -- -c <count>
```

Where `<count>` is the number of entries in each file (defaults to 1000).
## Configuration

The application requires proper environment configuration to work correctly. See `.env.tpl` for the required variables.
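One possible startup check that fails fast when variables are missing; the variable names below (including `REDIS_URL`) are assumptions, and `.env.tpl` remains the authoritative list:

```javascript
// Sketch: report which required environment variables are unset.
// The variable names here are assumptions; see .env.tpl for the real set.
function missingEnv(env, required = ['OPENAI_API_KEY', 'REDIS_URL']) {
  return required.filter((name) => !env[name]);
}

const missing = missingEnv(process.env);
if (missing.length > 0) {
  console.error(`Missing required environment variables: ${missing.join(', ')}`);
}
```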
## License

This project is licensed under the MIT License.

Author: Maciej Lisowski