A lambda to apply a pre-trained algorithm to predict whether a given search string is in the form of a citation. This function will perform the following work:
- Receives a set of parameters (submitted to the lambda via POST)
- Loads a pickle file containing a pre-trained machine learning model.
- Submits the parameters to the model to generate a binary prediction.
- Returns the result of that prediction.
This lambda's operation is placed in context of our larger discovery ecosystem in the following diagram. The lambda is responsible for the shaded region.
sequenceDiagram
participant User
participant UI
participant Tacos
box PaleVioletRed Citation detector
participant Lambda
participant S3
end
User->>UI: "popcorn"
UI->>Tacos: "popcorn"
Tacos-->Tacos: Extract features from "popcorn"
Tacos-->Tacos: Load Lambda URL from Config Vars
Tacos->>Lambda: {"features": {...}}
Lambda-->Lambda: Load S3 address from ENV
Lambda-->Lambda: Load default model filename "knn" from ENV
Lambda-->>S3: Request "knn" model
S3-->>Lambda: pkl file
Lambda-->Lambda: Generate prediction
Lambda->>Tacos: {"prediction": false}
- To preview a list of available Makefile commands:
make help
- To install with dev dependencies:
make install
- To update dependencies:
make update
- To run unit tests:
make test
- To lint the repo:
make lint
Ensure that AWS SAM CLI is installed: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html
All following actions and command should be performed from the root of the project (i.e. same directory as the
Dockerfile
).
-
Create a JSON file for SAM that has the environment variables for the container.
- Copy
tests/sam/env.json.template
totests/sam/env.json
(which is git ignored) - Fill in missing sensitive env vars
NOTE: AWS credentials are automatically passed from the terminal context that runs
make sam-run
; they do not need to be explicitly set as env vars. - Copy
-
Build Docker image:
make sam-build
The following outlines how to run the Lambda SAM docker image as an HTTP endpoint, accepting requests and returning responses similar to a lambda behind an ALB, Function URL, or API Gateway.
-
Ensure AWS Dev credentials set in terminal and any other env vars in
tests/sam/env.json
are up-to-date.(TODO: Update once an AWS SSO role is created for this project.)
-
Run HTTP server:
make sam-http-run
This starts a server at http://localhost:3000
. Requests must include a path, e.g. /foo
, but are arbitrary insofar as
the lambda does not utilize them in request payload.
- In another terminal, perform an HTTP request via another
Makefile
command:
make sam-http-ping
Response should have an HTTP status of 200
and respond with:
{
"response": "pong"
}
While lambdas can be invoked via HTTP methods (ALB, Function URL, etc), they are also often invoked directly with an
event
payload. To do so with SAM, you do not need to first start an HTTP server with make sam-run
, you can
invoke the function image directly:
echo '{"action": "ping", "challenge_secret": "secret_phrase"}' | sam local invoke --env-vars tests/sam/env.json -e -
Response:
{"statusCode": 200, "statusDescription": "200 OK", "headers": {"Content-Type": "application/json"}, "isBase64Encoded":
false, "body": "{\"response\": \"pong\"}"}
As you can see from this response, the lambda is still returning a dictionary that would work for an HTTP response, but is actually just a dictionary with the required information.
It's unknown at this time if this lambda will get invoked via non-HTTP methods, but SAM will be helpful for testing and development if so.
CHALLENGE_SECRET=### Secret string that is passed as part of lambda invocation payload, and checked before running
WORKSPACE=### Set to `dev` for local development, this will be set to `stage` and `prod` in those environments by Terraform.
SENTRY_DSN=### If set to a valid Sentry DSN, enables Sentry exception monitoring. This is not needed for local development.
- mitlib-tf-workloads-tacos for the infrastructure that supports this application
- mitlib-tf-workloads-ecr for the ECR infrastructure for the container image