What methods do you recommend for chunk generation in RAG applications using local agents with OLLAMA? #31965

Answered by onestardao
gilsonfiho asked this question in Q&A
Hi Gilson, great question. I've been through a very similar struggle.

I also started with LangChain's RecursiveCharacterTextSplitter and noticed the same issue you mentioned: it often cuts sentences awkwardly and breaks the semantic flow.

Eventually I moved to a different approach: instead of cutting on token or character counts, I tried to segment on semantic tension, meaning I aim to keep each chunk internally coherent in meaning. This allows:

- Longer chunks with dense, focused meaning (especially useful for contracts, whitepapers, or scientific texts)
- Chunks that can be reused across different tasks without losing context
- Dynamic overlap depending on the meaning…
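
To make the idea of meaning-based segmentation concrete, here is a minimal sketch of one common way to do it: split the text into sentences, embed each sentence, and start a new chunk wherever the similarity between neighbouring sentences drops. This is not necessarily the method described above. The names `semantic_chunks`, `bow_embed`, `threshold`, and `max_sentences` are hypothetical, and the toy bag-of-words embedding is only a stand-in so the example runs without a model. In practice you would pass in a function that wraps whatever local embedding model you run, and adapt `cosine` to dense vectors. Dynamic overlap is not implemented here.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

# Sparse vector type used by the toy embedding (assumption for this sketch).
Vector = Dict[str, float]


def bow_embed(sentence: str) -> Vector:
    """Toy bag-of-words embedding; a placeholder for a real embedding model."""
    return dict(Counter(re.findall(r"[a-z0-9]+", sentence.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def semantic_chunks(
    text: str,
    embed: Callable[[str], Vector] = bow_embed,
    threshold: float = 0.2,   # hypothetical cut-off; tune per corpus and model
    max_sentences: int = 8,   # hard cap so a chunk cannot grow without bound
) -> List[str]:
    """Group consecutive sentences; cut where adjacent similarity falls below threshold."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    chunks: List[str] = []
    current = [sentences[0]]
    prev_vec = embed(sentences[0])
    for sentence in sentences[1:]:
        vec = embed(sentence)
        # A low similarity to the previous sentence is treated as a topic shift.
        if cosine(prev_vec, vec) < threshold or len(current) >= max_sentences:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
        prev_vec = vec
    chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = (
        "The contract starts on the first of May. "
        "The contract ends on the last of May. "
        "Bananas are yellow fruit. Bananas are sweet fruit."
    )
    for chunk in semantic_chunks(sample):
        print(repr(chunk))
```

Comparing only adjacent sentences keeps the sketch short; comparing each sentence against the running chunk as a whole is a reasonable variation when single sentences are noisy.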

Answer selected by gilsonfiho