[BUG] Markdown Text Splitter - Overlap does not work #5173

@fmancardi

Describe the bug

I created a document store and four file loaders, all using the same Markdown file:

jack_russell_guide_5pages.md

I used four different configurations of splitter type, chunk size, and chunk overlap.
The following images show the outcome of the tests.

Image

The issue seems to be in the Markdown Text Splitter:
Image

Image

Testing with the Recursive Character Text Splitter (chunk size: 1000, overlap: 400) does show the overlap:

Image Image
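
For reference, this is the overlap pattern I expect from those settings. It is only a minimal sketch of a fixed-window character splitter, not the Flowise or LangChain implementation; the function name `split_with_overlap` is hypothetical. With chunk size 1000 and overlap 400, each chunk should start 600 characters after the previous one and repeat its last 400 characters.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Naive fixed-window splitter (illustrative only, not the library's algorithm)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


# Synthetic 2500-character input standing in for the real .md file.
sample = "".join(chr(97 + i % 26) for i in range(2500))
chunks = split_with_overlap(sample, chunk_size=1000, chunk_overlap=400)
```

Real splitters break on separators, so chunk boundaries will not land at exact offsets, but the tail of one chunk should still reappear at the head of the next.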

To Reproduce

  1. Create a document store
  2. Create a document loader of text type
  3. Upload this md file jack_russell_guide_5pages.md
  4. Select chunk size: 1000, chunk overlap: 200
  5. Process
  6. Wait until the document store status is sync
  7. Access the view chunks option for the document loader just created.

No overlap seems to exist between chunks.
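
To check this outside the UI, the chunk texts could be copied out and compared pairwise. The sketch below is a hypothetical helper (the names `shared_overlap` and `overlaps_between` are mine, not part of Flowise); it reports, for each pair of consecutive chunks, the length of the longest suffix of the first chunk that is also a prefix of the second. It assumes the chunk text is compared exactly as stored, so any whitespace trimming done by the splitter could shorten the reported value.

```python
def shared_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def overlaps_between(chunks: list[str]) -> list[int]:
    """Overlap length for each pair of consecutive chunks."""
    return [shared_overlap(a, b) for a, b in zip(chunks, chunks[1:])]


# Toy chunks: the first pair shares "def", the second pair shares nothing.
result = overlaps_between(["abcdef", "defghi", "xyz"])
print(result)  # [3, 0]
```

A list of zeros for the Markdown Text Splitter chunks, next to non-zero values for the Recursive Character Text Splitter chunks, would match what the screenshots show.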

Expected behavior

Overlap is present between consecutive chunks, as happens when using the Recursive Character Text Splitter.

Screenshots

No response

Flow

No response

Use Method

None

Flowise Version

No response

Operating System

None

Browser

None

Additional context

No response
