The SynthLink Catalog is a collection of complex, multi-hop questions designed to test deep search / deep research systems. It is split into categories, each in its own Markdown file:
- Historical Impact Analysis
- Economic and Industrial Shifts
- Environmental and Ecological Consequences
- Scientific and Technological Evolution
- Policy and Social Movements
- STEM and Future Tech
The SynthLink Catalog evaluates deep search responses using a scoring system that measures answer accuracy, source relevance, reasoning quality, fact-checking, and search efficiency. Each question is scored on five metrics:
- F1 Score: Measures how closely the answer matches the expected summary.
- Precision@5 (P@5): Measures the relevance of the top 5 retrieved sources.
- Reasoning Quality Score (RQS): Assesses whether all required reasoning steps are covered.
- Fact-Checking Score (FCS): Ensures answers are verifiable and free of false claims.
- Iterative Efficiency (IE): Evaluates how quickly (in how few search iterations) the correct answer is found.
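As a concrete illustration of the F1 metric, answer accuracy can be computed as token-level overlap between the answer and the expected summary. The whitespace tokenization and lowercasing below are assumptions for this sketch; the catalog's actual matching rules are defined in SynthLink_Scoring_System.md.

```python
from collections import Counter

def token_f1(answer: str, expected: str) -> float:
    """Token-level F1 between an answer and the expected summary.

    Assumption: simple lowercase whitespace tokenization; the real
    SynthLink matching rules may differ.
    """
    ans_tokens = Counter(answer.lower().split())
    exp_tokens = Counter(expected.lower().split())
    # Multiset intersection counts shared tokens with multiplicity.
    overlap = sum((ans_tokens & exp_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(ans_tokens.values())
    recall = overlap / sum(exp_tokens.values())
    return 2 * precision * recall / (precision + recall)

print(token_f1("the treaty reshaped trade", "the treaty reshaped global trade"))
```

Here precision is 1.0 (all four answer tokens appear in the summary) and recall is 0.8 (four of five summary tokens are covered), giving an F1 of about 0.89.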
Scores are combined into a weighted aggregate score (0–1), with weights emphasizing accuracy and reasoning. For details, see SynthLink_Scoring_System.md, or run scripts/score_synthlink.py to compute scores automatically.
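The aggregation step can be sketched as a simple weighted sum. The weights below are illustrative assumptions only (the real values live in SynthLink_Scoring_System.md); they are chosen to emphasize accuracy (F1) and reasoning (RQS), as described above.

```python
# Hypothetical weights summing to 1.0; the actual values are defined
# in SynthLink_Scoring_System.md.
WEIGHTS = {"f1": 0.30, "p_at_5": 0.15, "rqs": 0.25, "fcs": 0.20, "ie": 0.10}

def aggregate_score(metrics: dict) -> float:
    """Combine the five per-question metrics (each in [0, 1]) into one 0-1 score."""
    for name in WEIGHTS:
        value = metrics[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)

example = {"f1": 0.9, "p_at_5": 0.8, "rqs": 0.85, "fcs": 0.95, "ie": 0.7}
print(round(aggregate_score(example), 4))
```

Because every weight and metric lies in [0, 1] and the weights sum to 1, the aggregate is guaranteed to stay in the 0–1 range.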
As a rough benchmark, an aggregate score of ~0.85 indicates excellent performance; see SynthLink_Scoring_Methodology.md for further details.