Create an obfuscation agent to protect sensitive student PII #11

@sxflynn

Description

In production, no real student names or other PII should be sent to third-party LLM API endpoints. We need functionality that detects real names, maps each one to a Faker-generated name and substitutes it in the prompt, then filters the LLM response and uses the same map to re-insert the real name into the locally rendered response.

Here is a simplified workflow of what happens now and what should happen.

Now

  1. User: "What homework does Johnny need to complete tonight?"
  2. Application: Search database for students with the first name Johnny -> returns Johnny Jones and all info
  3. Application: Search homework API for Johnny Jones's homework assignments -> Returns Math and Reading assignment
  4. Application: Prompt LLM to answer the question and provide all of Johnny Jones's context.
  5. LLM Server -> Johnny Jones has Math and Reading homework tonight

Proposed enhancement

  1. User: "What homework does Johnny need to complete tonight?"
  2. Application: Search database for students with the first name Johnny -> returns Johnny Jones and all info
  3. Application: Search homework API for Johnny Jones's homework assignments -> Returns Math and Reading assignment
  4. Application: Map Johnny Jones to Faker() created name Matthew McReed.
  5. Application: Prompt LLM to answer the question and provide all of Matthew McReed's context.
  6. LLM Server -> Matthew McReed has Math and Reading homework tonight
  7. Application: Replace Matthew McReed with Johnny Jones using the map -> Johnny Jones has Math and Reading homework tonight
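
To make steps 4 through 7 concrete, here is a minimal sketch of the name round trip. It assumes the real name is already known from the database lookup and hard-codes the alias; `swap_names` and both dictionaries are hypothetical names, and in the actual feature the alias would come from Faker rather than a literal.

```python
import re


def swap_names(text: str, mapping: dict[str, str]) -> str:
    """Replace every key of `mapping` that appears in `text` with its value."""
    if not mapping:
        return text
    # Try longer names first so a full name wins over a shorter overlapping one.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    # A single pass avoids re-replacing text that was just substituted.
    return pattern.sub(lambda match: mapping[match.group(0)], text)


# Step 4: hypothetical mapping (the alias would be generated by Faker).
real_to_fake = {"Johnny Jones": "Matthew McReed"}
fake_to_real = {fake: real for real, fake in real_to_fake.items()}

# Step 5: the prompt that leaves the application contains only the alias.
prompt = (
    "Johnny Jones has Math and Reading assignments. "
    "What homework does Johnny Jones need to complete tonight?"
)
safe_prompt = swap_names(prompt, real_to_fake)

# Step 6: stand-in for the LLM server's reply.
llm_response = "Matthew McReed has Math and Reading homework tonight"

# Step 7: restore the real name before rendering locally.
final = swap_names(llm_response, fake_to_real)
print(final)
```

This only handles exact full-name matches. First-name-only mentions, nicknames, and an LLM that rephrases the alias would each need extra handling.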

A middleware-style class needs to sit between the server retrieval logic and the LLM call. It should detect potential PII with an NLP model such as spaCy, map each piece of PII to a Faker-generated name before the prompt is sent, and then replace the Faker name with the original PII once the response is returned from the LLM server.
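
One possible shape for that class is sketched below. Everything here is an assumption for discussion, not a settled design: `ObfuscationMiddleware`, `detect_names`, `make_fake_name`, and `call_llm` are hypothetical names. The detector and name generator are injected as plain callables so the sketch runs without dependencies. In practice `detect_names` could wrap spaCy (collecting entities labeled `PERSON` from `doc.ents`) and `make_fake_name` could be `Faker().name`.

```python
from typing import Callable, Dict, Iterable


class ObfuscationMiddleware:
    """Hypothetical wrapper that hides detected names from the LLM call."""

    def __init__(
        self,
        detect_names: Callable[[str], Iterable[str]],  # e.g. a spaCy NER wrapper
        make_fake_name: Callable[[], str],             # e.g. Faker().name
        call_llm: Callable[[str], str],                # the real third-party call
    ) -> None:
        self.detect_names = detect_names
        self.make_fake_name = make_fake_name
        self.call_llm = call_llm

    def __call__(self, prompt: str) -> str:
        # Build a fresh real -> fake map for this request.
        mapping: Dict[str, str] = {}
        for real in self.detect_names(prompt):
            if real not in mapping:
                mapping[real] = self.make_fake_name()

        # Obfuscate: longer names first so full names are replaced intact.
        safe_prompt = prompt
        for real in sorted(mapping, key=len, reverse=True):
            safe_prompt = safe_prompt.replace(real, mapping[real])

        response = self.call_llm(safe_prompt)

        # Restore: swap each alias back to the real name for local rendering.
        for real, fake in mapping.items():
            response = response.replace(fake, real)
        return response


# Stubs standing in for spaCy, Faker, and the LLM endpoint.
def detect(text: str) -> list:
    return [name for name in ("Johnny Jones",) if name in text]


fakes = iter(["Matthew McReed"])
sent_prompts = []


def fake_llm(prompt: str) -> str:
    sent_prompts.append(prompt)  # record what would leave the application
    return "Matthew McReed has Math and Reading homework tonight"


guard = ObfuscationMiddleware(detect, lambda: next(fakes), fake_llm)
answer = guard("Johnny Jones has Math and Reading assignments. What is due tonight?")
print(answer)
```

Injecting the three callables keeps the class testable with stubs like the ones above. Open questions this sketch does not address include alias collisions (a generated name that matches a real student) and PII other than names.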
