Releases: navapbc/labs-decision-support-tool
v1.4.1
What's Changed
- chore: Update readmes with findings from local setup by @mhadfieldNavaPBC in #318
- fix: Improve response for out-of-scope questions by @yoomlam in #320
- feat: Add Gemini LLMs by @yoomlam in #322
- fix: Use singleton PostgresDBClient (Sqlalchemy engine) by @yoomlam in #321
- chore: Remove email alerts for app response time by @yoomlam in #323
New Contributors
- @mhadfieldNavaPBC made their first contribution in #318
Full Changelog: v1.4.0...v1.4.1
v1.4.0
What's Changed
- feat: Import Benefit Hub content from Contentful API by @KevinJBoyer in #292
- feat: Include feedback scores and comments in LiteralAI export by @yoomlam in #295
- feat: Add streaming endpoint with citations and traces by @fg-nava in #293
- docs: Add details on production deployments by @KevinJBoyer in #297
- feat: Add streaming to chainlit chat UI by @fg-nava in #298
- feat: Wrap SentenceTransformer to enable other embedding libraries by @KevinJBoyer in #299
- fix: Don't create LiteralAI data layer in test_chat_api by @yoomlam in #300
- fix: Improve referral link decision criteria in ImagineLaEngine by @fg-nava in #301
- feat: Support Cohere API for embeddings by @KevinJBoyer in #303
- fix: Check for Cohere models in app config by @KevinJBoyer in #304
- fix: Translate canned response and alert message to the same language as the user's question by @yoomlam in #302
- docs: PromptFoo getting started with Google Sheets by @fg-nava in #306
- feat: Setup Github Action workflow file for PromptFoo by @fg-nava in #307
- feat: Run refresh-ingestion in parallel by @yoomlam in #309
- fix: resolve Promptfoo exports to Google Sheet by @fg-nava in #308
- ci: Extract evaluation ID from share URL by @fg-nava in #310
- docs: Promptfoo evaluation instructions by @KevinJBoyer in #311
- fix: Ignore table formatting for CoveredCA data source by @yoomlam in #312
- ci: Add readability assessment to promptfoo GHA workflow by @fg-nava in #313
- feat: Add document-level recall to retrieval evaluation pipeline by @fg-nava in #314
- doc: Promptfoo evaluations instructions: don't use SSL by @yoomlam in #316
- perf: Minimize Promptfoo concurrency to 1 thread by @yoomlam in #317
Full Changelog: v1.3.0...v1.4.0
v1.3.0
What's Changed
- fix: Handle empty NavigableString in beautifulsoup 4.13 by @KevinJBoyer in #283
- feat: Merge contiguous cited subsections by @yoomlam in #282
- refactor: Use simplify_citation_numbers() consistently by @yoomlam in #285
- chore: Exclude out-of-date DPSS webpage Coverage_for_Immigrants by @yoomlam in #286
- chore: Update system prompts based on pilot user usage by @yoomlam in #287
- fix: Chainlit: replace 3 space list indents to 4 spaces by @yoomlam in #288
- fix: LA policy ingestion formatting change by @KevinJBoyer in #290
- feat: Add AWS Bedrock's Anthropic Claude 3.7 LLM by @yoomlam in #289
- fix: Ignore unknown citations and other related fixes by @yoomlam in #291
- docs: Tech spec for streaming responses feature by @fg-nava in #284
- fix: Preserve new lines in batch CSV processing by @KevinJBoyer in #294
Full Changelog: v1.2.0...v1.3.0
v1.2.0
Key changes include:
- Update logging to also log to Postgres in addition to the LiteralAI platform
- Auto tag users in the pilot via a cron job
What's Changed
- feat: Add timing logs for LLM operations and /query endpoint by @fg-nava in #270
- refactor: Add min_samples parameter to stratified sampling by @fg-nava in #262
- fix: LiteralAI exporter: ignore chainlit message by @yoomlam in #271
- feat: Log chats to DB and LiteralAI by @yoomlam in #265
- fix: Update connection URL for deployed environments by @KevinJBoyer in #275
- refactor: API: use ContextVar for db_session by @yoomlam in #272
- fix: Ensure the user.id is consistent across Chainlit data layers by @yoomlam in #274
- fix: Correctly URL-encode special characters in the database password by @KevinJBoyer in #276
- chore: Rename and reorder columns in LiteralAI export by @KevinJBoyer in #278
- refactor: Get header from first QA row by @KevinJBoyer in #279
- feat: Add DB connection pool for Chainlit data layer by @yoomlam in #280
- ci: Add cron: literalai_tagger by @yoomlam in #281
- feat: Update chatbot API to save to both our DB and LiteralAI by @yoomlam in #273
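The database-password fix in #276 addresses a common pitfall: characters such as @, :, / and # have structural meaning inside a connection URL, so a raw password containing them can be misparsed. The sketch below illustrates the general technique using only the Python standard library. It is not the project's actual code; the helper name build_db_url, the postgresql:// scheme, and the sample credentials are assumptions made for illustration.

```python
from urllib.parse import quote, unquote, urlsplit


def build_db_url(user: str, password: str, host: str, port: int, dbname: str) -> str:
    """Build a connection URL with percent-encoded credentials.

    Hypothetical helper for illustration; not taken from the repository.
    safe="" forces every reserved character (including "/") to be encoded.
    """
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"postgresql://{encoded_user}:{encoded_password}@{host}:{port}/{dbname}"


# Sample values are made up; the password deliberately contains URL delimiters.
url = build_db_url("app", "p@ss:w/rd#1", "localhost", 5432, "app")
parts = urlsplit(url)

# The host and port are still parsed correctly, and decoding the password
# component recovers the original string.
print(parts.hostname, parts.port)
print(unquote(parts.password))
```

Without the encoding step, the first @ in the password would be read as the end of the credentials, and the # would start a URL fragment.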
Full Changelog: v1.1.0...v1.2.0
v1.1.0
What's Changed
Key changes: update infra template version to 0.15.4, and update Chainlit version to 2.4.0
- docs: Updating README.md with steps using ingest-runner and refresh-ingestion by @fg-nava in #245
- feat: Back up and restore DB contents by @yoomlam in #243
- refactor: Fix evaluation test files with better test data using existing factory models by @fg-nava in #246
- feat: Add versioned storage system for QA pairs with tests by @fg-nava in #247
- chore: Bump DB max capacity by @KevinJBoyer in #248
- feat: Add attributes metadata to LiteralAI by @yoomlam in #249
- feat: Export QA pairs from LiteralAI logs by @yoomlam in #250
- feat: Add export LiteralAI logs capability to Chainlit web UI by @yoomlam in #251
- fix: Use 'app' schema for local DB by @yoomlam in #252
- fix: Update imagine_la scraping; improve refresh-ingestion.sh by @yoomlam in #253
- fix: LiteralAI exporter: handle threads without citation.source_dataset by @yoomlam in #254
- feat: Add utility to save LiteralAI logs for archiving by @yoomlam in #255
- feat: Utility to auto-tag LiteralAI threads based on user_id by @yoomlam in #257
- chore: Reorganize repo docs and add project description by @KevinJBoyer in #256
- chore: Update infra template to latest (1/?) by @KevinJBoyer in #259
- chore: Upgrade to Chainlit 2.4.0 by @yoomlam in #258
- fix: Fix URLs to source code by @KevinJBoyer in #264
- chore: Remove Tax prep referral link by @yoomlam in #268
- fix: Commit language files by @KevinJBoyer in #269
- chore: Update infra to v0.15.4 by @KevinJBoyer in #267
Full Changelog: v1.0.1...v1.1.0
v1.0.1
Bug fix release to allow requests to the API from the Imagine LA production environment
What's Changed
- fix: Allow CORS from prod environment by @KevinJBoyer in #244
Full Changelog: v1.0.0...v1.0.1
v1.0.0
Chat engines
This release contains the imagine-la chat engine for our initial pilot with Imagine LA in March 2025.
What's Changed
- fix: Show only two nested headers by @KevinJBoyer in #117
- perf: Use raw responses of chat context in chat history by @ccheng26 in #116
- feat: Add Markdown tree support utilities for hierarchical chunking by @yoomlam in #118
- feat: Add hierarchical chunking of markdown tree by @yoomlam in #119
- feat: Add support for retrieval in other languages by @KevinJBoyer in #120
- feat: Group web source citations by @ccheng26 in #122
- feat: Support batch processing by @KevinJBoyer in #123
- fix: Include all headers in batch CSV generation by @KevinJBoyer in #126
- feat: Show document.dataset in UI; fix: citation markdown rendering by @yoomlam in #125
- fix: Have superscript citation link to source URL by @yoomlam in #128
- feat: Add tree-based chunking to EDD ingestion by @yoomlam in #121
- refactor: Extract CA EDD system prompt into the chat engine by @yoomlam in #130
- refactor: Use FormattingConfig in format.py by @yoomlam in #129
- feat: Improve batch processing by @KevinJBoyer in #131
- refactor: Add ChatHistory class by @yoomlam in #133
- refactor: Add finalize_result() to remap and replace citation ids in response by @yoomlam in #134
- fix: Add quotes around Makefile ingest arguments by @yoomlam in #135
- fix: Handle blank sublist for EDD ingest by @yoomlam in #136
- feat: Add minimal API v0 by @yoomlam in #132
- refactor: More robust _fix_input_markdown() by @yoomlam in #138
- Notebook: Add edd-chunking.ipynb to debug markdown chunking issues by @yoomlam in #137
- refactor: Exclude user's latest message from chat_history by @yoomlam in #139
- feat: Ingest ImageLA content hub by @KevinJBoyer in #141
- feat: Add conversational memory to v0 API by @yoomlam in #140
- feat: Add chat_message DB table by @yoomlam in #142
- API Spec by @KevinJBoyer in #124
- feat: Add Imagine LA engine by @KevinJBoyer in #143
- feat: Write document questions and answers to csv by @ccheng26 in #127
- feat: add question generation for imagine la and bem dataset by @ccheng26 in #144
- fix: Remove use of getopt for ingestion by @yoomlam in #145
- feat: Investigation of multilingual LLMs by @KevinJBoyer in #109
- fix: ImagineLA scraper: handle '/' suffix for root_url by @yoomlam in #147
- feat: Analyze precision and recall for initial eval dataset by @KevinJBoyer in #150
- feat: Enable resuming ingestion of the EDD website by @yoomlam in #146
- fix: Assign subsections to their own markdown headings by @yoomlam in #148
- feat: Tree-based chunk splitting into subsections by @yoomlam in #149
- fix: Handle table markdown formatting in assertion by @yoomlam in #152
- feat: Literal ai feedback by @ccheng26 in #151
- feat: Whitelist Imagine LA's dev site by @KevinJBoyer in #158
- feat: Citation footnotes open accordions by @ccheng26 in #155
- feat: Scrape LA county policy manual website by @yoomlam in #156
- refactor: remove BEM-specific code and generalize PDF processing by @fg-nava in #159
- refactor: remove Guru card functionality by @fg-nava in #160
- fix: Setup Terraform before AWS credentials by @KevinJBoyer in #163
- fix: Update checkout actions to fix action linting by @KevinJBoyer in #164
- feat: Ingest LA County Policy Manual by @yoomlam in #161
- feat: Use separate Literal project for API by @KevinJBoyer in #166
- feat: Provide direct link to text by @ccheng26 in #162
- feat: Pull dataset values from DB by @KevinJBoyer in #167
- fix: Switch to sequential processing in batch_process to resolve thread-safety issues by @fg-nava in #169
- feat: Export markdown files during ingestion by @yoomlam in #171
- chore: Rename dataset to LA "DPSS Policy" by @yoomlam in #174
- feat: LA DPSS policy: add program name to document.name by @yoomlam in #175
- fix: Use cl.make_async when calling synchronous batch_process by @fg-nava in #176
- fix: Refine memory handling and timeout params for batch processing by @fg-nava in #177
- feat: Add IRS family tax credit webpages as a new dataset by @yoomlam in #178
- feat: Add scraper and ingest code for Public Charge dataset by @ccheng26 in #173
- fix: Use regex pattern for CORS allowed origins by @fg-nava in #182
- revert: Revert batch process and related files to d61791c by @fg-nava in #180
- fix: Unblock Chainlit's websocket during batch processing by @yoomlam in #181
- fix: Limit origins to localhost:5173 and remove pattern from regex by @fg-nava in #184
- fix: revise headers in public charge content by @ccheng26 in #183
- refactor: Simplify command line for scraping datasets by @yoomlam in #186
- dev experience: Add app-shell target to Makefile by @yoomlam in #188
- fix: Cleanup BOM characters from fieldnames in batch processing by @fg-nava in #187
- refactor: Add general ingest_runner by @yoomlam in #189
- feat: Ingest CA FTB dataset by @yoomlam in #190
- refactor: Generalize scrapy-runner by @yoomlam in #192
- feat: Ingest CA WIC dataset by @yoomlam in #191
- Update system prompt for ImagineLA chat engine by @yoomlam in #194
- refactor: Make ingest-runner consistent with scrapy-runner by @yoomlam in #193
- fix: Remove unused ImagineLA chat engine settings from UI by @yoomlam in #195
- feat: Add Covered CA dataset by @yoomlam in #197
- fix: Update Imagine LA Content Hub scraper and ingester by @yoomlam in #198
- feat: Distinguish two separate system prompts in the UI by @yoomlam in #199
- feat: Add Step 2 logic and show Policy Updates in UI by @yoomlam in #200
- feat: ...
v0.0.0: guru-snap, bridges-eligibility-manual, ca-edd-web
Initial release
Chat engines
This release contains three chat engines:
- guru-snap: A prototype that uses content from Guru cards exported as JSON
- bridges-eligibility-manual: A prototype that uses content from Bridges Eligibility Manual PDFs
- ca-edd-web: A chatbot that uses content found on edd.ca.gov
You may wish to deploy this release or run it locally if you need to run guru-snap or bridges-eligibility-manual, as these prototypes will likely be deprecated in a future release.
What's Changed
- feat: Install Application-Flask template by @KevinJBoyer in #1
- DST-257: install infra temp by @ccheng26 in #3
- feat: Add Chainlit by @KevinJBoyer in #2
- Remove Flask and example database models by @KevinJBoyer in #4
- feat: Add MockSentenceTransformer by @KevinJBoyer in #5
- feat: Add LiteLLM by @KevinJBoyer in #6
- DST-263: chainlit healthcheck by @ccheng26 in #7
- DST-260: configure env for pgvector by @ccheng26 in #8
- feat: Add Document and Chunk models by @KevinJBoyer in #9
- DST-258: deploy infra by @ccheng26 in #10
- DST-258: deploy infra- enable tests by @ccheng26 in #11
- feat: Ingest Guru Cards by @KevinJBoyer in #13
- DST-271 feat: generate llm result by @ccheng26 in #12
- fix import by @ccheng26 in #14
- feat: Augment response with retrieved cards by @KevinJBoyer in #15
- feat: Show accordions with cards by @KevinJBoyer in #16
- feat: Require login by @KevinJBoyer in #17
- fix: Pin Terraform version for deploys and migrations by @KevinJBoyer in #18
- perf: Lower database auto-scaling settings by @KevinJBoyer in #19
- feat: add ollama call to get_models by @ccheng26 in #20
- feat: productionize uvicorn by @ccheng26 in #22
- fix: Conditionally import ollama by @yoomlam in #23
- feat: Use URL query 'engine' parameter to set chatbot's configuration by @yoomlam in #25
- fix: set uvicorn worker to 1 by @ccheng26 in #26
- feat: Enable ingest of distinct datasets by @yoomlam in #27
- feat: Set default chat engine to 'guru-snap' by @yoomlam in #28
- feat: Retrieve from database based on filters by @yoomlam in #29
- feat: add similarity score to accordion value by @ccheng26 in #30
- refactor: Use generalized ChunkWithScore by @yoomlam in #32
- refactor: Move db_session and embedding_model args into AppConfig by @yoomlam in #33
- refactor: Remove extraneous MockAppConfig and monkeypatch parameter in tests by @yoomlam in #35
- feat: Add ingest-policy-pdfs command to print out PDF file list by @ccheng26 in #31
- feat: Normalize similarity scores of retrieved Guru cards by @yoomlam in #36
- fix: retrieve build date and service env by @ccheng26 in #34
- feat: Add retrieval and docs_shown thresholds by @yoomlam in #37
- feat: Add LLM selection in Chainlit by @yoomlam in #38
- feat: Use URL query params to set initial chat settings by @yoomlam in #39
- feat: chunk and store BEM pdf by @ccheng26 in #40
- fix: Commit after ingesting Guru cards by @KevinJBoyer in #42
- feat: Enable BEM Chatbot by @KevinJBoyer in #41
- feat: add branding logos by @ccheng26 in #43
- feat: Make ingestion drop dataset if already exists by @KevinJBoyer in #44
- feat: add accordion chunk citations by @ccheng26 in #46
- fix: Rename docs_shown_* variable to chunks_shown_* by @yoomlam in #47
- bug: fix accordion by @ccheng26 in #48
- fix: Read titles from PDF metadata by @KevinJBoyer in #49
- feature: Log metadata for retrieved chunks by @KevinJBoyer in #50
- bug: citation duplicates and accordion overflow by @ccheng26 in #51
- feat: Link to BEM documents by @KevinJBoyer in #52
- feat: Support Jupyter notebooks by @KevinJBoyer in #53
- Jupyter notebook: exploration into pdfminer.six capabilities by @yoomlam in #54
- Jupyter notebook: Investigate unstructured for parsing semantics from PDFs by @KevinJBoyer in #55
- feat: Add extract_outline() PDF utility by @yoomlam in #57
- feat: Group markdown text list items by @KevinJBoyer in #56
- fix: Merge list items only if they have the same heading by @yoomlam in #61
- feat: BEM ingest skeleton by @yoomlam in #59
- feat: Convert list items to markdown_texts by @yoomlam in #60
- DST-401: inline citations by @ccheng26 in #58
- feat: update db schema for chunk by @ccheng26 in #62
- feat: Utility to extract bolded text from PDFs by @yoomlam in #63
- feat: Associate and apply stylings to create bolded markdown by @yoomlam in #64
- feat: Format links as markdown by @yoomlam in #65
- feat: Merges texts that are split across consecutive pages by @yoomlam in #66
- test: Use 707.pdf to test ingest_bem_pdfs.py by @yoomlam in #70
- feat: Save BEM JSON chunks to S3 by @KevinJBoyer in #69
- feat: enrich text using unstructured data by @ccheng26 in #68
- fix: Add heuristics and fixes to improve BEM pdf parsing by @yoomlam in #71
- feat: Update citation UI by @KevinJBoyer in #73
- feat: Split long paragraphs and lists into chunks by @yoomlam in #72
- feat: add dash formatting to list items by @ccheng26 in #74
- feat: add ellipses to start/end of text for chunks by @ccheng26 in #75
- package update: format files and update black by @ccheng26 in #77
- feat: add headings to context by @ccheng26 in #76
- CI: Post test coverage report to PR by @yoomlam in #78
- feat: Group text with the same heading by @ccheng26 in #79
- fix: Address BEM PDF ingestion error cases with bigger chunks by @yoomlam in #80
- feat: Sub-chunk citations by @KevinJBoyer in #81
- fix: Prevent citations from being rendered inline by Chainlit by @KevinJBoyer in ...