
Add other "non-class" text fields? #10

@jukofyork


I have been experimenting with the Structured Outputs mode a lot recently and, looking at your examples, wondered if it's possible to add other non-class fields, as in these two templates:


Task: Determine if this book is fiction or nonfiction

Examine both the beginning and ending sections of this book:

${start_context}
${end_context}

Analyze the text to determine if this is fiction or nonfiction. Look for these key indicators:

FICTION indicators:

  • Narrative scenes with dialogue and character interactions
  • Descriptions of characters' thoughts and feelings
  • Story-like plot elements and dramatic scenes
  • Use of literary devices like metaphors and vivid descriptions
  • Events that appear imagined rather than documented

NONFICTION indicators:

  • Facts, dates, and real historical events
  • Academic or technical language
  • Citations or references to sources
  • Explanatory or instructional tone
  • Discussion of real people, places or concepts
  • Analytical or argumentative structure

Note: Focus only on distinguishing between fiction and nonfiction. Do not get distracted by specific genres (like romance, mystery, biography, textbook etc).

Respond using this JSON format:

{
    "analysis": "string",
    "genre": "fiction"|"nonfiction"|"unknown"
}

Guidelines:

  • In your analysis, cite specific examples from the text that support your classification.
  • Only classify as "fiction" or "nonfiction" if you are reasonably confident and can see clear evidence.
  • Use "unknown" if the evidence is ambiguous or insufficient.

Task: Locate the start of the main content in this book

Carefully examine the initial section of this book:

${context}

Try to identify where the main narrative (or factual if a non-fiction book) content begins:

  • IGNORE IRRELEVANT SECTIONS: Exclude all "front matter" content such as copyright notices, tables of contents, author's notes, dedications, epigraphs, or other preliminary material.
  • USE THE CONTEXT: Search for chapter headings like "Preface", "Prologue", "Chapter: XXX", "Chapter 1", or even just "1". If there is a table of contents, check if it provides any hints about where the main content starts.
  • LOOK FOR CLUES: For each candidate, examine the preceding line(s) for indicators of chapter headings that might signal the beginning of the main content. Look for Markdown headers, bold or italic formatting, or text in all caps.
  • BE CAUTIOUS: The provided initial book section may not actually contain the transition to the narrative/factual content, or even have any "front matter" content at all. Don't assume it does!

Respond using this JSON format:

{
    "candidates": ["string1", "string2", ...],
    "analysis": "string",
    "first_header": "string",
    "first_paragraph": "string"
}

  1. The candidates array should list all the headers and paragraphs that you think could potentially be the beginning of the book's main content.
  2. Use the analysis string to explain how you evaluated each candidate in relation to the surrounding text, or why you think the book doesn't actually have any "front matter" content.
  3. If you are reasonably confident you have successfully identified the beginning, then set first_header and first_paragraph based on your analysis.
  4. If you are still unsure of where the main content begins, then set both first_header and first_paragraph to null.
  5. If you have identified the first paragraph but think there is no header that goes before it, then set only first_header to null.
  6. If your analysis indicates that there is no "front matter" content present, then set both first_header and first_paragraph to "none".

(The second task isn't really classification; it shows how to chain several of these "free text" fields to help even more.)
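As a rough illustration of how the second template's reply could be consumed, here is a minimal Python sketch that maps the `first_header` / `first_paragraph` rules (items 4 to 6 above) onto a status value. The function name `interpret_start` and the status strings are hypothetical, and treating "header set but paragraph null" as unsure is an assumption, since the template does not cover that case.

```python
import json
from typing import Optional


def interpret_start(raw: str) -> tuple[str, Optional[str], Optional[str]]:
    """Map the model's JSON reply to (status, first_header, first_paragraph).

    status is one of (names are illustrative only):
      "no_front_matter": both fields are the string "none" (item 6)
      "unsure":          first_paragraph is null (item 4)
      "found":           first_paragraph is set; first_header may be None (item 5)
    """
    data = json.loads(raw)
    header = data.get("first_header")
    paragraph = data.get("first_paragraph")

    if header == "none" and paragraph == "none":
        return "no_front_matter", None, None
    if paragraph is None:
        # Assumption: a header without a paragraph is also treated as unsure.
        return "unsure", None, None
    return "found", header, paragraph
```

The `candidates` and `analysis` fields are deliberately ignored here: they exist to give the model room to reason before it commits to the final two fields.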

I've found you can actually get gpt-4o-mini to outperform the larger models (often by a very large margin too!) by simply adding these extra "free text" fields BEFORE the actual classification field.
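A minimal sketch of what "BEFORE" means in practice for the first template, written as a JSON Schema in Python. It assumes the model emits fields in the order they are declared in the schema, so the `analysis` text is generated before the `genre` label. The names `GENRE_SCHEMA`, `RESPONSE_FORMAT`, `parse_genre` and the schema name `genre_classification` are made up for this example and are not taken from this repo.

```python
import json

# Hypothetical schema for the fiction/nonfiction template.
# Assumption: fields are generated in declaration order, so the free-text
# "analysis" is written before the model commits to "genre".
GENRE_SCHEMA = {
    "type": "object",
    "properties": {
        "analysis": {"type": "string"},
        "genre": {"type": "string", "enum": ["fiction", "nonfiction", "unknown"]},
    },
    "required": ["analysis", "genre"],
    "additionalProperties": False,
}

# Shape of a strict JSON-schema `response_format` argument; the schema
# name is invented for illustration.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "genre_classification",
        "strict": True,
        "schema": GENRE_SCHEMA,
    },
}


def parse_genre(raw: str) -> tuple[str, str]:
    """Return (genre, analysis) from the model's JSON reply, checking the label."""
    data = json.loads(raw)
    genre = data["genre"]
    if genre not in GENRE_SCHEMA["properties"]["genre"]["enum"]:
        raise ValueError(f"unexpected genre label: {genre!r}")
    return genre, data["analysis"]
```

Only `genre` is needed downstream; `analysis` can be logged or discarded once the reply has been parsed.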

It might be worth adding this to the examples and making sure the code allows for it, as it is very worthwhile: in my experience with prompts like the ones above, it gains much more than adding few-shot examples.
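To make the request concrete, here is one way "allowing for this" could look: a small helper that takes any number of free-text field names and places them ahead of the class field. This is only a sketch under my own assumptions; `build_schema` and its parameters are hypothetical and do not reflect how the code in this repo is actually structured.

```python
def build_schema(free_text_fields: list[str], class_field: str, labels: list[str]) -> dict:
    """Build a JSON Schema with the free-text fields declared before the class field."""
    # Dicts preserve insertion order, so the free-text fields come first.
    properties = {name: {"type": "string"} for name in free_text_fields}
    if class_field in properties:
        raise ValueError("class field name clashes with a free-text field")
    # The class field is added last so it is generated after the free text.
    properties[class_field] = {"type": "string", "enum": list(labels)}
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }
```

For the first template this would be called as `build_schema(["analysis"], "genre", ["fiction", "nonfiction", "unknown"])`.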
