Release 3.2.3

@thiswillbeyourgithub thiswillbeyourgithub released this 13 May 16:35
· 302 commits to main since this release

What's new

This release primarily focuses on enhancing context management for embedding models, improving debugging utilities, and updating documentation for better clarity. It also includes several important bug fixes and feature additions.

✨ Features

  • Introduced a new environment variable WDOC_MAX_EMBED_CONTEXT to allow capping the context size for embedding models ([d9e200f8])
    • Documentation for this new variable has been added ([a2408fd0])
  • Enhanced debugging by ensuring debug prints are always active when md_printer is used. This helps in retrieving LLM answers from logs if they weren't saved to a file ([69db1916])
  • Added the current date to summary metadata and headers to help reduce potential LLM hallucinations ([64ca4665])
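
As a rough illustration of the WDOC_MAX_EMBED_CONTEXT cap, here is a minimal sketch. Only the variable name comes from these notes: the helper `capped_embed_context`, its parameters, and the fallback behavior when the variable is unset are assumptions for illustration, not wdoc's actual implementation.

```python
import os


def capped_embed_context(model_context: int, env=os.environ) -> int:
    """Return the embedding context size to use, in tokens.

    Hypothetical helper: if WDOC_MAX_EMBED_CONTEXT is set, the model's
    advertised context is capped to that value; otherwise it is unchanged.
    """
    raw = env.get("WDOC_MAX_EMBED_CONTEXT", "").strip()
    if not raw:
        # Variable unset or empty: assume no cap is wanted.
        return model_context
    cap = int(raw)
    if cap <= 0:
        raise ValueError("WDOC_MAX_EMBED_CONTEXT must be a positive integer")
    return min(model_context, cap)


if __name__ == "__main__":
    # Example: a model advertising 8192 tokens, capped by the environment.
    print(capped_embed_context(8192, {"WDOC_MAX_EMBED_CONTEXT": "4096"}))
```

Passing the environment as a parameter keeps the sketch easy to test without modifying the real process environment.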

🐛 Fixes

  • Text Splitting & Context Handling:
    • Set a max_tokens limit on the text splitter to handle the case where the large language model has a larger context window than the embedding model ([dac6802d])
    • Fixed an edge case where the wdoc max chunk setting could be ignored ([196b3a00])
    • Corrected an old variable name within the text splitting logic ([767bc754])
  • Updated the default model to gemini 2.5 preview without the date suffix, reflecting its renaming on OpenRouter ([22978609])
  • Improved the mechanism for ignoring initial "breathing" or placeholder lines in summaries ([4dbcf158])
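
The text-splitting fix above can be pictured with the following sketch. It is an assumption-laden illustration rather than wdoc's code: the function `split_by_tokens` and its parameters are hypothetical, and whitespace-separated words stand in for real tokenizer tokens.

```python
def split_by_tokens(text: str, llm_context: int, embed_context: int) -> list[str]:
    """Split text into chunks that fit both the LLM and the embedding model.

    Hypothetical sketch: the chunk limit is the smaller of the two context
    sizes, so a chunk sized for the LLM can never overflow the embedder.
    Words are used here as a crude stand-in for tokens.
    """
    max_tokens = min(llm_context, embed_context)
    if max_tokens <= 0:
        raise ValueError("context sizes must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


if __name__ == "__main__":
    # The embedding limit (2) is smaller than the LLM limit (100), so it wins.
    for chunk in split_by_tokens("a b c d e", llm_context=100, embed_context=2):
        print(chunk)
```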

📚 Documentation

  • Clarity and Enhancements:
    • Clarified the usage of save and load functionalities ([9d9642d4]) and specifically advised against using them simultaneously ([5270c350])
    • Made multiple clarifications to the README for better understanding ([9284ff54], [cb4cb519], [f677e5a2], [39e0da55])
    • Updated Ollama examples to recommend snowflake-arctic-embed2 instead of bge-m3 ([d045702b])
    • Added documentation for the WDOC_MAX_EMBED_CONTEXT environment variable ([a2408fd0])
  • Removed a documentation file (summary_rag.md) that was not yet ready for release ([6d20c220])

⚙️ Chore & Maintenance

  • Version bumped to 3.2.3 ([f62a2322]), following an earlier bump to 3.2.2 ([71ac503c])
  • README Updates:
    • Updated TODO items ([8f2cbfd7], [5d090421])
    • Added a PyPI badge for better project visibility ([60ef4112])

Commits details since the last release

bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py

summary_rag.md

bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py

README.md

  • [69db191] by @thiswillbeyourgithub, 40 minutes ago:
    new: now debug print is used anyway when md_printer is used
    this is to make you able to go to the logs to fetch an answer from the
    LLM if you have forgotten to store it to a file

Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithub@users.noreply.github.com>

wdoc/utils/logger.py
wdoc/wdoc.py

wdoc/docs/help.md

wdoc/utils/env.py
wdoc/utils/misc.py

wdoc/utils/misc.py

wdoc/utils/misc.py

wdoc/utils/misc.py

  • [2297860] by @thiswillbeyourgithub, 86 minutes ago:
    fix: set default model to gemini 2.5 preview without date timestamp
    openrouter renamed that model apparently

Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithub@users.noreply.github.com>

README.md
wdoc/utils/env.py

wdoc/docs/help.md

wdoc/docs/help.md

wdoc/docs/examples.md

README.md

README.md

README.md

README.md

README.md

README.md

wdoc/wdoc.py

wdoc/utils/tasks/summarize.py