feat: add advanced configuration for llms-full.txt #9


Open
wants to merge 2 commits into
base: main

kirillmakhonin-brt

This pull request introduces new features to control the content of llms-full.txt, aiming to improve chunking efficiency and context-awareness.

You can now fine-tune the output of the llms-full.txt file by setting the following configuration options:

  • prefix_url_per_page:

    • Type: Boolean
    • Description: If set to true, the URL of each page is prepended to that page's content in the llms-full.txt file. This can give the LLM context about where the content comes from.
  • use_section_separator:

    • Type: String
    • Description: The string used to separate sections in the llms-full.txt file. If the separator is not empty, it is wrapped with \n on both sides.
  • use_section_pages_separator:

    • Type: String
    • Description: The string used to separate pages within a section in the llms-full.txt file. If the separator is not empty, it is wrapped with \n on both sides.
  • prefix_url_base_url:

    • Type: String
    • Description: The base URL used to build page URLs in llms-full.txt. If not set, site_url is used.
  • include_section_content_in_full_output:

    • Type: Boolean
    • Description: If set to true, the content of each section will be included in the full output of llms-full.txt. This can help in maintaining a comprehensive context.
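
To make the interaction between these options concrete, here is a minimal sketch of how they could combine when the file is assembled. It is not the code in this PR: `FullOutputOptions`, `wrap`, `build_full_output`, the shape of `sections`, and the example URLs are all hypothetical names chosen for illustration. It also assumes that pages and sections fall back to a blank line when no separator is set, and it leaves out include_section_content_in_full_output because its exact meaning is not spelled out above.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass
class FullOutputOptions:
    """Hypothetical container mirroring the option names described above."""

    prefix_url_per_page: bool = False
    use_section_separator: str = ""
    use_section_pages_separator: str = ""
    prefix_url_base_url: str = ""


def wrap(separator: str) -> str:
    """Wrap a non-empty separator with newlines; leave an empty one empty."""
    return f"\n{separator}\n" if separator else ""


def build_full_output(sections, options, site_url):
    """Assemble the full text from {section name: [(page path, markdown), ...]}.

    Assumption: an unset separator falls back to a single blank line.
    """
    # Use the explicit base URL if given, otherwise fall back to site_url.
    base_url = options.prefix_url_base_url or site_url
    page_joiner = wrap(options.use_section_pages_separator) or "\n\n"
    section_joiner = wrap(options.use_section_separator) or "\n\n"

    rendered_sections = []
    for pages in sections.values():
        rendered_pages = []
        for path, markdown in pages:
            if options.prefix_url_per_page:
                # urljoin keeps the base path only if base_url ends with "/".
                markdown = f"{urljoin(base_url, path)}\n\n{markdown}"
            rendered_pages.append(markdown)
        rendered_sections.append(page_joiner.join(rendered_pages))
    return section_joiner.join(rendered_sections)


if __name__ == "__main__":
    example_sections = {
        "Guide": [("guide/a/", "# A"), ("guide/b/", "# B")],
        "API": [("api/", "# API")],
    }
    example_options = FullOutputOptions(
        prefix_url_per_page=True,
        use_section_separator="---",
        use_section_pages_separator="***",
    )
    print(build_full_output(example_sections, example_options, "https://example.com/docs/"))
```

With this layout, a chunker could split on the section separator first and on the page separator second, and each chunk would start with the URL of the page it came from.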

@pawamoy
Owner

pawamoy commented Jun 6, 2025

Thanks for the PR 🙂

prefix_url_per_page

Do I understand correctly that this adds the URL of the current page at the top of each markdown page? Why is this useful? Have you noticed cases where it helps the LLM?

use_section_separator

Why is this useful? Have you noticed cases where simple blank lines in between sections are not enough for the LLM to "distinguish" them?

use_section_pages_separator

Same questions.

prefix_url_base_url

for building URLs

What do you mean? Shouldn't the site URL be used, always? Are there cases where it's not desirable?

include_section_content_in_full_output

the content of each section will be included

Do you mean the Markdown lists that list each page in each section? Have you noticed cases where it helps the LLM?


Can you share examples of full output files that use these formatting methods? Is this common? Is it official and documented in the spec (I haven't checked it in a while)?

2 participants