Commit 9306981

Merge pull request #3 from code-dot-org/openai-routes
Openai routes
2 parents c41cbe6 + 1be7f4c commit 9306981

25 files changed: +1190 -13 lines changed

.dockerignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
pybin
__pycache__
config.txt

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
config.txt
__pycache__

API.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# API

`GET /`: The root serves as a quick status check.

`GET /test`: Responds with a simple successful JSON response.

`GET /openai/models`: Reports information about the available OpenAI models.

`GET /test/openai`: Issues a small test prompt to OpenAI's ChatGPT.

* `model`: The model to use. Default: `gpt-3.5-turbo`
* `api-key`: The API key associated with the model.

`POST /assessment`: Issues a rubric assessment to the AI agent and waits for a response.

* `model`: The model to use. Default: `gpt-4`
* `api-key`: The API key associated with the model. Default: the configured key
* `code`: The code to assess. Required.
* `prompt`: The system prompt. Required.
* `rubric`: The rubric, as a CSV. Required.
* `examples`: Array of pairs of code (js) and openai response (tsv).
* `remove-comments`: When `1`, attempts to strip comments out of the code before assessment. Default: `0`
* `num-responses`: The number of responses to request from the AI model. The responses are voted on to produce the final answer. Default: `1`
* `num-passing-grades`: The number of grades to consider 'passing'. Default: `2` (pass/fail)
* `temperature`: The 'temperature' value for ChatGPT LLMs.

* **Response**: `application/json`: Data and metadata related to the response. The `data` is the list of key concepts, assessment values, and reasons. The `metadata` is the input to the AI and some usage information. `n` is the number of responses requested in the input. Example below.

```
{
  "metadata": {
    "time": 39.43,
    "student_id": 1553633,
    "usage": {
      "prompt_tokens": 454,
      "completion_tokens": 1886,
      "total_tokens": 2340
    },
    "request": {
      "model": "gpt-4",
      "temperature": 0.2,
      "messages": [ ... ],
      "n": 3
    }
  },
  "data": [
    {
      "Key Concept": "Program Development 2",
      "Observations": "The program uses whitespace good nami [... snipped for brevity ...]. The code is easily readable.",
      "Grade": "Extensive Evidence",
      "Reason": "The program code effectively uses whitespace, good naming conventions, indentation and comments to make the code easily readable."
    }, {
      "Key Concept": "Algorithms and Control Structures",
      "Observations": "Sprite interactions occur at lines 48-50 (player touches burger), 52 (sw[... snipped for brevity ...]",
      "Grade": "Extensive Evidence",
      "Reason": "The game includes multiple different interactions between sprites, responds to multiple types of user input (e.g. different arrow keys)."
    }
  ]
```

`(GET|POST) /test/assessment`: Issues a test rubric assessment to the AI agent and waits for a response.

* `model`: The model to use. Default: `gpt-4`
* `api-key`: The API key associated with the model. Default: the configured key
* `remove-comments`: When `1`, attempts to strip comments out of the code before assessment. Default: `0`
* `num-responses`: The number of responses to request from the AI model. The responses are voted on to produce the final answer. Default: `1`
* `num-passing-grades`: The number of grades to consider 'passing'. Default: `2` (pass/fail)
* `temperature`: The 'temperature' value for ChatGPT LLMs.

* **Response**: `application/json`: A set of data and metadata where `data` is a list of key concepts, assessment values, and reasons. See above.
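
The `num-responses` parameter says the responses are voted on, but API.md does not spell out the scheme. As a rough illustration only, here is a per-Key-Concept majority vote over the `Grade` field. The function name `vote_on_grades` and the tie-breaking rule are assumptions for this sketch and are not taken from the repository.

```python
from collections import Counter


# Hypothetical helper: the name and the tie-breaking rule are assumptions.
def vote_on_grades(responses):
    """Pick the most common grade for each Key Concept.

    `responses` is a list of responses; each response is a list of dicts
    shaped like the `data` entries in the API ("Key Concept", "Grade", ...).
    When two grades tie, the one seen first wins (arbitrary for this sketch).
    """
    grades_by_concept = {}
    for response in responses:
        for row in response:
            grades_by_concept.setdefault(row["Key Concept"], []).append(row["Grade"])

    return {
        concept: Counter(grades).most_common(1)[0][0]
        for concept, grades in grades_by_concept.items()
    }


if __name__ == "__main__":
    sample = [
        [{"Key Concept": "Variables", "Grade": "Extensive Evidence"}],
        [{"Key Concept": "Variables", "Grade": "Convincing Evidence"}],
        [{"Key Concept": "Variables", "Grade": "Extensive Evidence"}],
    ]
    print(vote_on_grades(sample))
```

A real implementation would probably also need to decide which `Observations` and `Reason` text to keep for the winning grade, which this sketch leaves out.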

Dockerfile

Lines changed: 7 additions & 3 deletions
@@ -1,9 +1,13 @@
 FROM python:3.11-slim
 
 WORKDIR /app
-COPY ./src /app
+COPY requirements.txt .
 
-RUN pip install Flask
+RUN pip install -r requirements.txt
+
+COPY ./test /app/test
+COPY ./lib /app/lib
+COPY ./src /app/src
 
 EXPOSE 5000
-CMD ["python", "app.py"]
+CMD ["waitress-serve", "--host=0.0.0.0", "--port=5000", "--call", "src:create_app"]

README.md

Lines changed: 36 additions & 1 deletion
@@ -11,12 +11,47 @@ To Do:
 * [ ] create [application cloudformation template](cicd/3-app/aiproxy/template.yml)
 * [ ] authentication
 
+## Configuration
+
+The configuration is done via environment variables stored in the `config.txt` file.
+
+For local development, copy the `config.txt.sample` file to `config.txt` as a
+starting point. Then set the `OPENAI_API_KEY` variable to a valid OpenAI API key to
+enable that service. When deploying the service, set that variable by whatever
+means suits the deployment.
+
+To control the amount of logging, use the `LOG_LEVEL` configuration parameter. Set it
+to `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`. The `DEBUG` setting is the
+most permissive and shows all logging text. The `CRITICAL` setting suppresses most
+logging. Most logging happens at `INFO`, which is the default setting.
+
 ## Local Development
 
-The Python app exists within "/src"
+All of our server code is written using [Flask](https://flask.palletsprojects.com/en/2.3.x/).
+
+The Flask web service exists within `/src`. The `__init__.py` is the
+entry point for the app. The other files provide the routes.
+
+Other related Python code that implements features is within `/lib`.
+
+To build the app, use `docker compose build`.
+You will need to rebuild when you change the source.
 
 To run the app locally, use `docker compose up` from the repo root.
 
+This will run a webserver accessible at <http://localhost:5000>.
+
+**Note**: You need to provide the API keys in the `config.txt` file
+before the service runs. See the above "Configuration" section.
+
+## API
+
+For information about the API, see the [API documentation](API.md).
+
+## Testing
+
+For information about testing the service, see the [Testing documentation](TESTING.md).
+
 ## CICD
 
 See [CICD Readme](./cicd/README.md)
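
The README's Configuration section describes `LOG_LEVEL` but this part of the diff does not show how the service applies it. A minimal sketch of one way to map that variable onto Python's standard `logging` levels follows. The helper name `resolve_log_level` and the fallback to `INFO` for unrecognized values are assumptions, not the repository's code.

```python
import logging
import os

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


# Hypothetical helper: not taken from the repository.
def resolve_log_level(environ=None):
    """Return the numeric logging level named by LOG_LEVEL.

    Falls back to INFO (the documented default) when the variable is
    missing, empty, or not one of the five documented names.
    """
    environ = os.environ if environ is None else environ
    name = environ.get("LOG_LEVEL", "INFO").strip().upper()
    if name not in VALID_LEVELS:
        name = "INFO"  # assumption: unknown values use the default
    return getattr(logging, name)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_log_level())
    logging.info("visible at INFO and DEBUG")
    logging.debug("visible only at DEBUG")
```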

TESTING.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# Testing

This assumes you have built and are running the container as described in the main `README.md`.
In that case, you have a running server on port 5000.

## Scripted

You can run the provided Ruby script to issue a request to the server from your native machine.
To do so, if you have a system Ruby installed, you can invoke this command (at the root
of this repository):

```
ruby test/assessment-test.rb
```

This will call the running server's `/assessment` route, building the appropriate `POST`
request from the test data in the `test` path of the repository.

## Curl

You can issue a "test" rubric assessment using hard-coded content that is found in the
`test/data` path by using the `/test/assessment` URL. Here, I'm using `curl` to show
me the headers and send a `POST` to that route (may take 30 to 50 seconds):

```
curl localhost:5000/test/assessment -i --header "Content-Type:multipart/form-data" --form "num-responses=3"
```

This gives me this response:

```
HTTP/1.1 200 OK
Content-Length: 10883
Content-Type: application/json
Date: Tue, 10 Oct 2023 21:00:48 GMT
Server: waitress

{"data":[{"Grade":"Extensive Evidence","Key Concept":"Program Development 2","Observations":"The program uses whitespace effectively (e.g. lines 18, 24, 30, 36, 42, 48, 54, 60). The program uses good naming conventions (e.g. \"player\", \"burger\", \"sword\", \"sword2\"). The program has good indentation (e.g. lines 20-23, 26-29, 32-35, 38-41, 44-47, 50-53, 56-59). The program has good comments (e.g. lines 1, 2, 18, 24, 30, 36, 42, 48, 54, 60). The code is easily readable.","Reason":"The program code effectively uses whitespace, good naming conventions, indentation and comments to make the code easily readable."},{"Grade":"Extensive Evidence","Key Concept":"Algorithms and Control Structures","Observations":"Sprite interactions occur at lines 50-53 (player touches burger) and lines 56-57 (sword and sword2 displace player). The program responds to user input at lines 32-35 (up key), lines 38-41 (left key), and lines 44-47 (right key).","Reason":"The game includes multiple different interactions between sprites, responds to multiple types of user input (e.g. different arrow keys)."},{"Grade":"Extensive Evidence","Key Concept":"Position and Movement","Observations":"The program generates movement at lines 14-15 (sword and sword2), line 20 (player falling), lines 32-35 (player moves up), lines 38-41 (player moves left), and lines 44-47 (player moves right). The movement involves acceleration at lines 20, 32-35, 38-41, and 44-47.","Reason":"Complex movement such as acceleration, moving in a curve, or jumping is included in multiple places in the program."},{"Grade":"Extensive Evidence","Key Concept":"Variables","Observations":"The program updates sprite properties inside the draw loop at lines 20 (player.velocityY), lines 32-35 (player.velocityY), lines 38-41 (player.velocityX), lines 44-47 (player.velocityX), and lines 50-53 (burger.x and burger.y). 
These updates affect the user's experience of playing the game by controlling the player's movement and the burger's position.","Reason":"The game includes multiple variables or sprite properties that are updated during the game and affect the user's experience of playing the game."}],"metadata":{"request":{"messages":[{"content":"You are a teaching assistant whose job is to assess a student program written in\njavascript based on several Key Concepts. For each Key Concept you will answer by\ngiving the highest grade which accurately describes the student's program:\nExtensive Evidence, Convincing Evidence, Limited Evidence, or No\nEvidence. You will also provide a reason explaining your grade for each\nKey Concept, citing examples from the code to support your decision when possible.\n\nThe student's code should contain a method called `draw()` which will be\nreferred to as the \"draw loop\". Any code outside of the draw loop will be run\nonce, then any code inside the draw loop will be run repeatedly, like this:\n```\n// student's code\n\nwhile (true) {\n draw();\n}\n```\n\nPlease keep in mind that acceleration occurs when the velocity of a sprite is changed incrementally within the draw loop, such as in these examples:\n* `sprite.velocityX += 0.2;`\n* `sprite.velocityY -= 1;`\n* `foo.velocityX = foo.velocityX + 5;`\n* `foo.velocityY = foo.velocityY - 10;`\n\nThe following examples do not count as acceleration, because they set the velocity to a specific value, rather than changing it incrementally:\n* `sprite.velocityX = 5;`\n* `sprite.velocityY = -10;`\n\nThe following does not count as acceleration, because it sets the velocity to a random value, rather than changing it incrementally:\n* `foo.velocityX = randomNumber(-5, 5);`\n\nThe student's code will access an API defined by Code.org's fork of the p5play\nlibrary. 
This API contains methods like createSprite(), background(), and drawSprites(),\nas well as sprite properties like x, y, velocityX and velocityY.\n\nIn order to help you evaluate the student's work, you will be given a rubric in\nCSV format. The first column provides the list of Key Concepts to evaluate,\nthe second column, Instructions, tells you what aspects of the code to consider\nwhen choosing a grade. the next four columns describe what it means for a program\nto be classified as each of the four possible grades.\n\nwhen choosing a grade for each Key Concept, please follow the following steps:\n1. follow the instructions in the Instructions column from the rubric to generate observations about the student's program. Include the result to the Observations column in your response.\n2. based on those observations, determine the highest grade which accurately describes the student's program. Write this result to the Grade column in your response.\n3. write a reason for your grade in the Reason column, citing evidence from the Observations column when possible.\n\nplease provide your evaluation formatted as a TSV table including a header row\nwith column names Key Concept, Observations, Grade, and Reason. There should be one\nnon-header row for each Key Concept.\n\nThe student's work should be evaluated based on what they have added beyond the\nstarter code that was provided to them. 
Here is the starter code:\n```\n// GAME SETUP\n// create player, target, and obstacles\nvar player = createSprite(200, 100);\nplayer.setAnimation(\"fly_bot\");\nplayer.scale = 0.8;\n\n\nfunction draw() {\n background(\"lightblue\");\n\n // FALLING\n\n // LOOPING\n\n\n // PLAYER CONTROLS\n // change the y velocity when the user clicks \"up\"\n\n // decrease the x velocity when user clicks \"left\"\n\n // increase the x velocity when the user clicks \"right\"\n\n // SPRITE INTERACTIONS\n // reset the coin when the player touches it\n\n // make the obstacles push the player\n\n\n // DRAW SPRITES\n drawSprites();\n\n // GAME OVER\n if (player.x < -50 || player.x > 450 || player.y < -50 || player.y > 450) {\n background(\"black\");\n textSize(50);\n fill(\"green\");\n text(\"Game Over!\", 50, 200);\n }\n\n}\n```\n\n\nRubric:\nKey Concept,Instructions,Extensive Evidence,Convincing Evidence,Limited Evidence,No Evidence\nProgram Development 2,(1) does the program effectively use whitespace? (2) does the program use good naming conventions? (3) does the program have good indentation? (4) does the program have good comments? (5) is the code easily readable?,\"The program code effectively uses whitespace, good naming conventions, indentation and comments to make the code easily readable.\",\"The program code makes use of whitespace, indentation, and comments.\",The program code has few comments and does not consistently use formatting such as whitespace and indentation.,The program code does not contain comments and is difficult to read.\nAlgorithms and Control Structures,\"(1) list the line number of each sprite interaction, and note the type of interaction. (2) list the line number of each place the program responds to user input, and note the type of user input (e.g. which key or mouse event).\",\"The game includes multiple different interactions between sprites, responds to multiple types of user input (e.g. 
different arrow keys).\",The game includes at least one type of sprite interaction and responds to user input.,\"The game responds to user input through a conditional, but has no sprite interactions.\",The game includes no conditionals.\nPosition and Movement,\"list the line numbers of each place the program generates movement, and note whether the movement involves acceleration, keeping in mind that acceleration is incremental change to velocity (e.g. `sprite.velocityX = sprite.velocityX + 1` or `sprite.velocityY -= 1`).\",\"Complex movement such as acceleration, moving in a curve, or jumping is included in multiple places in the program.\",\"The program includes some complex movement, such as jumping, acceleration, or moving in a curve.\",\"The program does not include complex movement such as jumping, acceleration or moving in a curve. However, the program does include simple independent movement, such as a straight line, rotation or bouncing.\",\"There is no movement in the program, other than direct user control.\"\nVariables,\"(1) list the line number of every place a variable (including sprite properties) is updated inside the draw loop (2) for each variable or sprite property, describe whether it affects the user's experience of playing the game.\",The game includes multiple variables or sprite properties that are updated during the game and affect the user's experience of playing the game.,The game includes at least one variable or sprite property that is updated during the game and affects the user's experience of playing the game.,There is at least one variable or sprite property updated in the program.,\"There are no variables or sprite properties, or they are not updated.\"\n","role":"system"},{"content":"// GAME SETUP\n// create player, target, and obstacles\nvar player = createSprite(200, 100);\nplayer.setAnimation(\"player\");\nplayer.scale = 0.8;\nvar burger = 
createSprite(randomNumber(0,400),randomNumber(0,400));\nburger.setAnimation(\"burger\");\nburger.scale = 0.2;\nburger.setCollider(\"circle\");\nvar sword = createSprite(-50, randomNumber(0, 400));\nsword.setAnimation(\"sword\");\nsword.scale = 0.5;\nvar sword2 = createSprite(randomNumber(0, 400), -50);\nsword2.setAnimation(\"sword2\");\nsword2.scale = 0.5;\nsword.velocityX = 3;\nsword2.velocityY = 3;\n\n\nfunction draw() {\n background(\"green\");\n \n // FALLING\n player.velocityY+=0.7;\n \n // LOOPING\n if (sword.x>425){\n sword.y=-50;\n sword.x=randomNumber(0, 400);\n }\n if (sword2.x>425){\n sword2.y=-50;\n sword2.x=randomNumber(0, 400);\n }\n \n // PLAYER CONTROLS\n // change the y velocity when the user clicks \"up\"\n if (keyDown(\"up\")) {\n player.velocityY-=1.5;\n }\n \n // decrease the x velocity when user clicks \"left\"\n if (keyDown(\"LEFT\")) {\n player.velocityX-=0.1;\n \n }\n // increase the x velocity when the user clicks \"right\"\n if (keyDown(\"RIGHT\")) {\n player.velocityX+=0.1;\n \n }\n // SPRITE INTERACTIONS\n // reset the coin when the player touches it\n if (player.isTouching(burger)) {\n burger.x=randomNumber(0, 400);\n burger.y=randomNumber(0,400);\n }\n \n // make the obstacles push the player\n sword.displace(player);\n sword2.displace(player);\n // DRAW SPRITES\n drawSprites();\n \n // GAME OVER\n if (player.x < -50 || player.x > 450 || player.y < -50 || player.y > 450) {\n background(\"black\");\n textSize(50);\n fill(\"blue\");\n text(\"Game Over!\", 50, 200);\n }\n \n}\n","role":"user"}],"model":"gpt-4","n":1,"temperature":0.2},"student_id":"student","time":31.761487007141113,"usage":{"completion_tokens":504,"prompt_tokens":1897,"total_tokens":2401}}}
```

If you want a cleaner response, skip printing the headers and pipe the output through Python's JSON formatter:

```
curl localhost:5000/test/assessment --header "Content-Type:multipart/form-data" --form "num-responses=3" | python -m json.tool
```
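
For anyone who would rather script this check than use `curl`, here is a rough standard-library sketch of the same request. It assumes the route reads `num-responses` from form fields and accepts a URL-encoded body as well as the multipart body `curl --form` sends, which has not been verified against the service. The function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request


# Hypothetical helpers: the names are illustrative, not from the repository.
def build_test_assessment_request(base_url="http://localhost:5000", num_responses=3):
    """Build (but do not send) a POST to /test/assessment."""
    body = urllib.parse.urlencode({"num-responses": num_responses}).encode("ascii")
    return urllib.request.Request(
        f"{base_url}/test/assessment",
        data=body,
        method="POST",
        # Assumption: the server accepts URL-encoded form data here.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def summarize(response_json):
    """Reduce a response body to (key concept, grade) pairs."""
    return [(row["Key Concept"], row["Grade"]) for row in response_json["data"]]


def fetch_grades(request, timeout=120):
    """Send the request and return the summarized grades.

    The call may take 30 to 50 seconds, so the timeout is generous.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize(json.load(response))
```

With the container running, `fetch_grades(build_test_assessment_request())` should return one pair per Key Concept in the rubric.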

config.txt.sample

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
OPENAI_API_KEY=
LOG_LEVEL=INFO
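
Because the sample ships with an empty `OPENAI_API_KEY`, it can be handy to check a local `config.txt` before starting the container. The sketch below parses simple `KEY=VALUE` lines and reports required settings that are still blank. It is a simplified reader written for this example: it does not reproduce Docker Compose's full `env_file` parsing (quoting, interpolation), and both function names are hypothetical.

```python
# Hypothetical helpers for checking a local config.txt; not from the repository.
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(values, required=("OPENAI_API_KEY",)):
    """Return the required setting names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=\nLOG_LEVEL=INFO\n"
    print(missing_settings(parse_env_file(sample)))
```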

docker-compose.yml

Lines changed: 1 addition & 0 deletions
@@ -2,5 +2,6 @@ version: '3'
 services:
   python-proxy:
     build: .
+    env_file: "config.txt"
     ports:
       - "5000:5000"

lib/__init__.py

Whitespace-only changes.

lib/assessment/__init__.py

Whitespace-only changes.
