
Research: Investigate Feasibility of Using LangSmith to Automate Test & Evaluation of RAG #257

@davidgxue

Description



Context

This is an action item resulting from research issue #195.

We are exploring the possibility of automating the test & evaluation process for the RAG application using LangSmith. The objective is to compare responses generated by RAG before and after changes, assess their alignment with a reference answer, and determine the overall improvement in outcomes.

Proposed Approach

Utilize LangSmith's Dataset and Test features (see the LangSmith documentation) to conduct head-to-head comparisons of responses. This involves evaluating whether the RAG application's outputs align with a predefined reference answer and identifying any improvements.
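A minimal sketch of what this could look like with the LangSmith Python SDK. The dataset name, example question, reference answer, and `rag_app` callable are all hypothetical placeholders for illustration; the token-overlap evaluator is a deliberately crude stand-in for LangSmith's LLM-as-judge evaluators.

```python
# Illustrative sketch only. Dataset name, example content, and the
# `rag_app` target are hypothetical; they are not from the issue.

def reference_alignment(run, example):
    """Crude evaluator: fraction of reference-answer tokens that also
    appear in the RAG answer. A real setup would more likely use an
    LLM-as-judge evaluator for semantic alignment."""
    prediction = (run.outputs or {}).get("answer") or ""
    reference = (example.outputs or {}).get("answer") or ""
    ref_tokens = set(reference.lower().split())
    pred_tokens = set(prediction.lower().split())
    score = len(ref_tokens & pred_tokens) / max(len(ref_tokens), 1)
    return {"key": "reference_alignment", "score": score}


def run_langsmith_experiment():
    """Wiring sketch; requires `pip install langsmith` and a
    LANGSMITH_API_KEY environment variable. Not executed here."""
    from langsmith import Client
    from langsmith.evaluation import evaluate

    client = Client()
    dataset = client.create_dataset("rag-eval-demo")  # hypothetical name
    client.create_examples(
        inputs=[{"question": "What does the RAG app index?"}],
        outputs=[{"answer": "It indexes the project documentation."}],
        dataset_id=dataset.id,
    )

    def rag_app(inputs: dict) -> dict:
        # Placeholder target: call the real RAG application here.
        return {"answer": "It indexes the project documentation."}

    # Rerun with experiment_prefix="post-change" after making changes,
    # then compare the two experiments side by side in the LangSmith UI.
    evaluate(
        rag_app,
        data="rag-eval-demo",
        evaluators=[reference_alignment],
        experiment_prefix="pre-change",
    )
```

Running the same dataset through `evaluate` before and after a change yields two experiments whose per-example scores can be compared head to head.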

Implementation Details

No immediate code changes are required. The investigation will likely involve running local scripts to assess the feasibility of integrating LangSmith into the Test & Evaluation workflow. At the end, the script may be uploaded to the repository, depending on whether it is safe to publish publicly.
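One way such a local script could summarize a pre/post comparison, once per-example evaluator scores have been exported from two LangSmith experiments. The score dictionaries (example ID mapped to a score in [0, 1]) are illustrative assumptions, not an actual LangSmith export format.

```python
# Illustrative helper for summarizing a pre- vs. post-change comparison.
# Inputs are assumed exports from two LangSmith experiments: a mapping of
# example ID -> evaluator score in [0, 1]. Field names are made up.

def summarize_change(pre_scores: dict, post_scores: dict) -> dict:
    """Count improved, regressed, and unchanged examples that appear
    in both the pre-change and post-change experiments."""
    improved = regressed = unchanged = 0
    for example_id in pre_scores.keys() & post_scores.keys():
        delta = post_scores[example_id] - pre_scores[example_id]
        if delta > 0:
            improved += 1
        elif delta < 0:
            regressed += 1
        else:
            unchanged += 1
    return {"improved": improved, "regressed": regressed, "unchanged": unchanged}
```

A summary like this would give a quick signal of whether a change to the RAG application helped overall, before digging into individual examples in the LangSmith UI.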

Action Items

  1. Investigate the feasibility of integrating LangSmith into the current test & evaluation process for RAG.
  2. Assess the effectiveness of LangSmith in comparing responses and identifying improvements.
  3. Identify costs associated with using LangSmith's test and evaluation features.
  4. Document findings and considerations regarding the integration of LangSmith.

This research initiative is not expected to result in immediate code changes; it aims to explore the potential benefits of leveraging LangSmith for enhanced Test & Evaluation of RAG.
