AI APIs are great. You can experiment with them, integrate them into your apps, and probably do a bunch of other things I haven’t even started to think about. But there’s a catch: all the ones I found are either paid or have an incredibly low usage limit. So, if you have some AWS credits to burn, this might help you out. And if you don’t, ask for them!
This Terraform setup deploys an EC2 instance with Ollama and all the necessary infrastructure to use the Ollama API from the internet with a basic level of authentication. This is not a production-ready script, but it can be a good starting point.
- An AWS account with some funds to burn
- An AWS user with an appropriate permission policy to manage EC2 instances, VPCs, and DNS records
- The AWS CLI installed and set up
- The Terraform CLI installed and set up
- An SSH key generated WITHOUT A PASSPHRASE (see the example right after this list)
- (Optional) Using a custom domain with SSL requires a domain preconfigured in AWS, or NS records pointing to the corresponding AWS name servers
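If you still need to generate the key, something like this works; the file path is just an example:

```bash
# Generate a passphrase-less key pair (-N "") at an example path
ssh-keygen -t ed25519 -N "" -f ~/.ssh/ollama_ec2
```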
- Create your terraform.tfvars file from the example:
cp terraform.tfvars.example terraform.tfvars
- Update the AWS profile, instance names, and both SSH key paths to match yours.
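The exact variable names depend on the example file; this is only an illustrative sketch of what the result might look like:

```hcl
# terraform.tfvars — illustrative values; match the keys defined in terraform.tfvars.example
aws_profile          = "default"
instance_name        = "ollama-server"
ssh_public_key_path  = "~/.ssh/ollama_ec2.pub"
ssh_private_key_path = "~/.ssh/ollama_ec2"
```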
- Create your secret.tfvars file from the example:
cp secret.tfvars.example secret.tfvars
- Generate your API token and append it to secret.tfvars. The key must be enclosed in double quotes:
openssl rand -base64 32
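The resulting secret.tfvars should end up holding the token as a quoted string; the variable name below is an assumption, so match whatever secret.tfvars.example defines:

```hcl
# secret.tfvars — variable name is illustrative
api_token = "<generated_token>"
```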
- Initialize Terraform:
terraform init
From here, it’s all pretty straightforward. The only difference from other Terraform projects is that we’re adding the API key secret manually.
- List all the resources to be deployed:
terraform plan -var-file="secret.tfvars"
- Then deploy them:
terraform apply -var-file="secret.tfvars"
The deployment will output your instance's public IP. Save it for later.
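If you misplace it, and assuming the output is named something like instance_public_ip in the project's outputs, you can print it again at any time:

```bash
# Re-read a Terraform output without re-applying; the output name is an assumption
terraform output instance_public_ip
```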
- And at some point, destroy them:
terraform destroy -var-file="secret.tfvars"
If everything went well, you should be able to hit the Ollama API using your API key. Here’s a simple script to test connectivity and functionality from the terminal:
curl http://<instance_public_ip>/api/chat \
  -H "Authorization: Bearer <api_key>" \
  -d '{
    "model": "<selected_model>",
    "messages": [
      { "role": "user", "content": "why is the sky blue?" }
    ],
    "stream": false
  }'
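A successful non-streaming response should look roughly like this (timing and token-count fields omitted):

```json
{
  "model": "<selected_model>",
  "created_at": "2024-01-01T00:00:00Z",
  "message": { "role": "assistant", "content": "The sky appears blue because..." },
  "done": true
}
```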
If your requests aren’t working, you can log into the instance via SSH and check the Ollama and Nginx process statuses and configurations:
ssh -i <ssh_private_key_path> ec2-user@<instance_public_ip>
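Once inside, these are the usual suspects; the ollama unit name assumes the standard Linux install:

```bash
# Check both services at once (unit names assume the default Ollama and Nginx installs)
systemctl status ollama nginx

# Tail recent Ollama logs
journalctl -u ollama -n 50

# Validate the Nginx configuration
sudo nginx -t
```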
I'm using the OpenAI Node package to interact with the API because the configuration is quite simple:
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://<public_ip>/api',
  apiKey: '<api_key>',
});

async function main() {
  const chatCompletion = await openai.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'tinyllama',
  });
  console.log(chatCompletion.choices[0].message.content);
}

main();
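Assuming the snippet is saved as chat.mjs (or lives in a project with "type": "module" set), install the dependency and run it:

```bash
npm install openai
node chat.mjs
```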
Enjoy!
Still on the to-do list:

- User configuration
- Some proper key management would be nice