Release 3.2.3
What's new
This release primarily focuses on enhancing context management for embedding models, improving debugging utilities, and updating documentation for better clarity. It also includes several important bug fixes and feature additions.
✨ Features
- Introduced a new environment variable WDOC_MAX_EMBED_CONTEXT to allow capping the context size for embedding models ([d9e200f8])
  - Documentation for this new variable has been added ([a2408fd0])
- Enhanced debugging by ensuring debug prints are always active when md_printer is used. This helps in retrieving LLM answers from logs if they weren't saved to a file ([69db1916])
- Added the current date to summary metadata and headers to help reduce potential LLM hallucinations ([64ca4665])
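As a rough illustration of what capping the embedding context through an environment variable can look like, here is a minimal sketch. Only the variable name WDOC_MAX_EMBED_CONTEXT comes from this release; the helper `capped_embed_context`, its parameters, and the rule that an unset variable means "no cap" are assumptions made for the example and may not match wdoc's actual behavior (see wdoc/docs/help.md for the authoritative description).

```python
import os
from typing import Mapping, Optional


def capped_embed_context(
    model_context: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Return the context size to use for an embedding model.

    Hypothetical helper, not part of wdoc: if WDOC_MAX_EMBED_CONTEXT is set
    to a positive integer, it acts as an upper bound on model_context.
    """
    env = os.environ if env is None else env
    raw = env.get("WDOC_MAX_EMBED_CONTEXT", "").strip()
    if not raw:
        # Assumption: an unset or empty variable means no cap is applied.
        return model_context
    cap = int(raw)  # raises ValueError if the value is not an integer
    if cap <= 0:
        raise ValueError("WDOC_MAX_EMBED_CONTEXT must be a positive integer")
    return min(model_context, cap)
```

Passing the environment as a mapping keeps the helper easy to exercise without touching the real process environment, for example `capped_embed_context(8192, {"WDOC_MAX_EMBED_CONTEXT": "4096"})`.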
🐛 Fixes
- Text Splitting & Context Handling:
  - Addressed an issue where large language models have more context than embedding models by setting a max_tokens limit for the text splitter ([dac6802d])
  - Fixed an edge case where the wdoc max chunk setting could be ignored ([196b3a00])
  - Corrected an old variable name within the text splitting logic ([767bc754])
- Updated the default model to gemini 2.5 preview to reflect its renaming on OpenRouter ([22978609])
- Improved the mechanism for ignoring initial "breathing" or placeholder lines in summaries ([4dbcf158])
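The text splitting fixes come down to one idea: a chunk must fit the smallest context involved, not just the LLM's. The sketch below shows that idea under stated assumptions. The function names, parameters, and the use of a plain token list in place of a real tokenizer are all hypothetical; wdoc's actual logic lives in wdoc/utils/misc.py and is not reproduced here.

```python
from typing import List, Optional


def effective_chunk_tokens(
    llm_context: int,
    embed_context: int,
    user_max_chunk: Optional[int] = None,
) -> int:
    """Pick a chunk size that fits both the LLM and the embedding model.

    Hypothetical helper: a user-supplied maximum, when given, is also
    honored as an upper bound.
    """
    limit = min(llm_context, embed_context)
    if user_max_chunk is not None:
        limit = min(limit, user_max_chunk)
    return limit


def split_by_tokens(tokens: List[str], max_tokens: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]
```

With an LLM context of 1,000,000 tokens and an embedding context of 8,192 tokens, `effective_chunk_tokens` returns 8,192, so every chunk stays embeddable even though the LLM could accept far more.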
📚 Documentation
- Clarity and Enhancements:
  - Clarified the usage of save and load functionalities ([9d9642d4]) and specifically advised against using them simultaneously ([5270c350])
  - Made multiple clarifications to the README for better understanding ([9284ff54], [cb4cb519], [f677e5a2], [39e0da55])
  - Updated Ollama examples to recommend snowflake-arctic-embed2 instead of bge-m3 ([d045702b])
  - Added documentation for the WDOC_MAX_EMBED_CONTEXT environment variable ([a2408fd0])
- Removed a documentation file (summary_rag.md) that was not yet ready for release ([6d20c220])
⚙️ Chore & Maintenance
- Version bumped to 3.2.3 (following an earlier bump to 3.2.2 [[71ac503c]]) ([f62a2322])
- README Updates:
  - Updated TODO items ([8f2cbfd7], [5d090421])
  - Added a PyPI badge for better project visibility ([60ef4112])
Commits details since the last release
- [f62a232] by @thiswillbeyourgithub, 46 seconds ago:
bump version 3.2.2 -> 3.2.3
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [6d20c22] by @thiswillbeyourgithub, 76 seconds ago:
doc: removed file not yet ready
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
summary_rag.md
- [71ac503] by @thiswillbeyourgithub, 4 minutes ago:
bump version 3.2.1 -> 3.2.2
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [8f2cbfd] by @thiswillbeyourgithub, 3 minutes ago:
todo
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [69db191] by @thiswillbeyourgithub, 40 minutes ago:
new: now debug print is used anyway when md_printer is used
this is so you can go to the logs to fetch an answer from the
LLM if you forgot to store it to a file
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/logger.py
wdoc/wdoc.py
- [a2408fd] by @thiswillbeyourgithub (aider), 66 minutes ago:
docs: Add documentation for WDOC_MAX_EMBED_CONTEXT variable
wdoc/docs/help.md
- [d9e200f] by @thiswillbeyourgithub, 66 minutes ago:
feat: add new env var to cap the context size for embedding models
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/env.py
wdoc/utils/misc.py
- [196b3a0] by @thiswillbeyourgithub, 72 minutes ago:
fix: edge case where wdoc max chunk would be ignored
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/misc.py
- [dac6802] by @thiswillbeyourgithub, 76 minutes ago:
fix: set a limit to max_tokens for the text splitter as large LLMs have more context than embedding models nowadays
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/misc.py
- [767bc75] by @thiswillbeyourgithub, 80 minutes ago:
fix: forgot to rename an old variable name
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/misc.py
- [2297860] by @thiswillbeyourgithub, 86 minutes ago:
fix: set default model to gemini 2.5 preview without date timestamp
openrouter renamed that model apparently
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
wdoc/utils/env.py
- [9d9642d] by @thiswillbeyourgithub, 22 hours ago:
doc: clarify save and load
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/docs/help.md
- [5270c35] by @thiswillbeyourgithub, 22 hours ago:
doc: clarify that load and save shouldn't be used at the same time
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/docs/help.md
- [d045702] by @thiswillbeyourgithub, 23 hours ago:
doc: use snowflake-arctic-embed2 instead of bge-m3 for ollama examples
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/docs/examples.md
- [60ef411] by @thiswillbeyourgithub, 26 hours ago:
add a pypi badge
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [5d09042] by @thiswillbeyourgithub, 7 days ago:
update todo
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [9284ff5] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [cb4cb51] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [f677e5a] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [39e0da5] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
README.md
- [64ca466] by @thiswillbeyourgithub (aider), 10 days ago:
feat: Add current date to summary metadata and header to reduce hallucinations
wdoc/wdoc.py
- [4dbcf15] by @thiswillbeyourgithub, 10 days ago:
enh: better ignoring of first line of summary if just breathing
Signed-off-by: thiswillbeyourgithub 26625900+thiswillbeyourgithub@users.noreply.github.com
wdoc/utils/tasks/summarize.py