
Research: Investigate Feasibility of Using LangSmith to Automate Test & Evaluation of RAG #257

@davidgxue

Description



Context

This is an action item resulting from research issue #195.

We are exploring the possibility of automating the test & evaluation process for the RAG application using LangSmith. The objective is to compare responses generated by RAG before and after changes, assess their alignment with a reference answer, and determine the overall improvement in outcomes.

Proposed Approach

Utilize LangSmith's Dataset and Test features (see the LangSmith documentation) to conduct head-to-head comparisons of responses. This involves evaluating whether the RAG application's outputs align with a predefined reference answer and identifying any improvements.
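A minimal sketch of what this could look like with the LangSmith Python SDK. The dataset name, example question, reference answer, and `rag_app` callable are all hypothetical placeholders for illustration; the token-overlap evaluator is a deliberately crude stand-in for LangSmith's LLM-as-judge evaluators.

```python
# Illustrative sketch only. Dataset name, example content, and the
# `rag_app` target are hypothetical; they are not from the issue.

def reference_alignment(run, example):
    """Crude evaluator: fraction of reference-answer tokens that also
    appear in the RAG answer. A real setup would more likely use an
    LLM-as-judge evaluator for semantic alignment."""
    prediction = (run.outputs or {}).get("answer") or ""
    reference = (example.outputs or {}).get("answer") or ""
    ref_tokens = set(reference.lower().split())
    pred_tokens = set(prediction.lower().split())
    score = len(ref_tokens & pred_tokens) / max(len(ref_tokens), 1)
    return {"key": "reference_alignment", "score": score}


def run_langsmith_experiment():
    """Wiring sketch; requires `pip install langsmith` and a
    LANGSMITH_API_KEY environment variable. Not executed here."""
    from langsmith import Client
    from langsmith.evaluation import evaluate

    client = Client()
    dataset = client.create_dataset("rag-eval-demo")  # hypothetical name
    client.create_examples(
        inputs=[{"question": "What does the RAG app index?"}],
        outputs=[{"answer": "It indexes the project documentation."}],
        dataset_id=dataset.id,
    )

    def rag_app(inputs: dict) -> dict:
        # Placeholder target: call the real RAG application here.
        return {"answer": "It indexes the project documentation."}

    # Rerun with experiment_prefix="post-change" after making changes,
    # then compare the two experiments side by side in the LangSmith UI.
    evaluate(
        rag_app,
        data="rag-eval-demo",
        evaluators=[reference_alignment],
        experiment_prefix="pre-change",
    )
```

Running the same dataset through `evaluate` before and after a change yields two experiments whose per-example scores can be compared head to head.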

Implementation Details

No immediate code changes are required. The investigation will likely involve running local scripts to assess the feasibility of integrating LangSmith into the Test & Evaluation workflow. At the end, the script may be uploaded to the repository, depending on whether it is safe to publish publicly.
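One way such a local script could summarize a pre/post comparison, once per-example evaluator scores have been exported from two LangSmith experiments. The score dictionaries (example ID mapped to a score in [0, 1]) are illustrative assumptions, not an actual LangSmith export format.

```python
# Illustrative helper for summarizing a pre- vs. post-change comparison.
# Inputs are assumed exports from two LangSmith experiments: a mapping of
# example ID -> evaluator score in [0, 1]. Field names are made up.

def summarize_change(pre_scores: dict, post_scores: dict) -> dict:
    """Count improved, regressed, and unchanged examples that appear
    in both the pre-change and post-change experiments."""
    improved = regressed = unchanged = 0
    for example_id in pre_scores.keys() & post_scores.keys():
        delta = post_scores[example_id] - pre_scores[example_id]
        if delta > 0:
            improved += 1
        elif delta < 0:
            regressed += 1
        else:
            unchanged += 1
    return {"improved": improved, "regressed": regressed, "unchanged": unchanged}
```

A summary like this would give a quick signal of whether a change to the RAG application helped overall, before digging into individual examples in the LangSmith UI.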

Action Items

  1. Investigate the feasibility of integrating LangSmith into the current test & evaluation process for RAG.
  2. Assess the effectiveness of LangSmith in comparing responses and identifying improvements.
  3. Identify costs associated with using LangSmith's test and evaluation features.
  4. Document findings and considerations regarding the integration of LangSmith.

This research initiative is not expected to result in immediate code changes; it aims to explore the potential benefits of leveraging LangSmith for enhanced Test & Evaluation of RAG.
