Paper (Win Fast or Lose Slow) | Website (Competitive Agents)
Latency Sensitive Benchmarks (LSB) are specifically designed to evaluate LLM Agents in realistic, latency-sensitive scenarios such as competitive games and high-frequency trading. In these tasks, both latency and accuracy jointly determine the final reward (e.g., game win rate or trading yield). Unlike previous benchmarks, LSB introduces two novel tasks that not only assess the intelligence of LLM agents, but also rigorously evaluate the efficiency of the underlying serving systems and algorithms. By integrating latency, accuracy, and real-world reward into a unified framework, LSB pioneers a new direction for benchmarking—encouraging the development of efficient, adaptive, and latency-aware LLM systems and algorithms. We hope our benchmarks and findings inspire the community to move beyond accuracy-centric evaluation and to build LLM solutions that truly excel in real-world, time-critical applications. We invite you to try LSB and join us in advancing this exciting frontier!
- Diverse Benchmarks: LSB offers two cutting-edge benchmarks, competitive gaming (StreetFighter) and a high-frequency trading backtesting system, capturing the essence of real-world, latency-sensitive tasks.
- Flexible Agent Deployment: Provides LLM agent implementations that support local, remote, and API-based serving, enabling comprehensive evaluation across different system architectures.
- System-Aware Evaluation: Highlights how agent performance varies with different serving systems and hardware configurations, offering actionable insights for both algorithm and system optimization.
Experience how LSB can help you benchmark and improve your LLM agents in truly challenging, real-time environments!
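To make the latency-accuracy coupling concrete, here is a toy scoring model (illustrative only; LSB's actual rewards are game win rate and trading yield): a decision scores only if it is both correct and delivered within the environment's deadline, so a fast, slightly less accurate agent can outscore a slow, more accurate one.

```python
def episode_reward(decisions, deadline_ms):
    """Toy reward: each decision scores only if correct AND within the deadline.

    `decisions` is a list of (correct: bool, latency_ms: float) pairs.
    """
    return sum(1 for correct, latency in decisions if correct and latency <= deadline_ms)

# A fast, less accurate agent beats a slow, perfectly accurate one under a 100 ms deadline:
fast = [(True, 90), (False, 80), (True, 95)]    # 2 correct answers, all on time
slow = [(True, 300), (True, 310), (True, 290)]  # 3 correct answers, all late
```

Under a 100 ms deadline, `fast` scores 2 while `slow` scores 0, mirroring how the benchmarks reward speed and intelligence jointly rather than accuracy alone.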
- Install Diambra:
pip install diambra diambra-arena
- Install huggingface, vllm, and sglang.
- Install other relevant dependencies:
pip install loguru llama_index dotenv gymnasium rich openai
- Register your Diambra account here.
- Install the StreetFighter kernel here, and put the zip file (do not unzip it) at $GAME_PATH (wherever you like).
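After installing, a quick sanity check can confirm the Python dependencies are importable (the package names below are taken from the pip commands above; adjust if your environment differs):

```python
import importlib.util

# Packages required by LSB, per the pip install commands above.
REQUIRED = ["diambra", "loguru", "llama_index", "gymnasium", "rich", "openai"]

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All LSB dependencies are installed.")
```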
Change $GAME_PATH to the root path of the directory where you put the zip file.
cd ./StreetFighter
python3 diambra -r $GAME_PATH -l python3 run_api.py --serving-choice huggingface --agent1 Qwen/Qwen3-4B --agent2 Qwen/Qwen3-8B --logdir "test.log" --device1 cuda:0 --device2 cuda:1
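In a real-time game, an agent that misses a frame deadline effectively forfeits the action. One common pattern is to guard the model call with a latency budget and fall back to a default action on timeout. A minimal sketch (the budget, fallback action, and `decide_fn` interface are illustrative, not LSB's actual agent API in run_api.py):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical fallback; the real action space is defined by the game environment.
FALLBACK_ACTION = "no-op"

def decide_with_budget(decide_fn, budget_s, fallback=FALLBACK_ACTION):
    """Run `decide_fn()` but return `fallback` if it exceeds the latency budget.

    Note: the worker thread is not killed on timeout; its result is simply discarded.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(decide_fn)
        try:
            return future.result(timeout=budget_s)
        except FutureTimeout:
            return fallback
```

This kind of guard is one way an agent can trade a little accuracy (the fallback) for a hard latency bound.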
cd ./HFTBench
python3 Simulation.py --agent_count 1 --device_list cuda:0
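In the trading benchmark, latency matters because a slow decision fills at a later, often worse, price. The following toy per-tick backtest illustrates the effect; it is an illustration only, not the logic in Simulation.py:

```python
def backtest(prices, signal_fn, latency_ticks):
    """Toy backtest: buy one unit whenever signal_fn says 'buy'.

    A decision made at tick t fills `latency_ticks` later, so slower agents
    pay the price at the delayed fill tick.
    """
    cash, position = 0.0, 0
    for t in range(len(prices)):
        if signal_fn(prices[: t + 1]) == "buy":
            fill = t + latency_ticks          # slower agent -> later fill
            if fill < len(prices):
                cash -= prices[fill]          # pay the (possibly worse) fill price
                position += 1
    return cash + position * prices[-1]       # mark open position to the final price
```

On a rising price series, the same buy signal yields less profit the larger `latency_ticks` is, which is the effect the daily-yield metric captures end to end.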
See more details in the HFTBench and StreetFighter directories.
Here we provide results on two RTX 5090 GPUs. More results on H100 are coming soon.
HFTBench results:

| Model Parameter Size | Avg Bitwidth | Latency (ms) ↓ | Daily Yield (%) ↑ |
|---|---|---|---|
| 14B (ours) | 7.2 | 713 | 26.52 |
| 14B | 8 | 801 | 23.14 |
| 14B | 16 | 1302 | 17.20 |
| 7B | 16 | 619 | -3.28 |
| 7B (ours) | 7.6 | 386 | -7.25 |
| 7B | 8 | 394 | -12.94 |
StreetFighter results:

| Model Parameter Size | Avg Bitwidth | Latency (ms) ↓ | Ranking Score ↑ |
|---|---|---|---|
| 3B (ours) | 6.8 | 195 | 5.99 |
| 7B (ours) | 7.2 | 354 | 2.33 |
| 3B | 8 | 222 | 2.19 |
| 3B | 16 | 349 | 0.25 |
| 7B | 8 | 394 | -0.44 |
| 1.5B | 8 | 142 | -1.25 |
We welcome agent designs tested on any hardware. Please create an issue or pull request that includes your code and the serving hardware you used.
If you find Win Fast or Lose Slow useful or relevant to your research, please cite our paper:
@misc{kang2025winfastloseslow,
title={Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs},
author={Hao Kang and Qingru Zhang and Han Cai and Weiyuan Xu and Tushar Krishna and Yilun Du and Tsachy Weissman},
year={2025},
eprint={2505.19481},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.19481},
}
- Self-defined agents
- Per-tick data trading with multiple agents
- FPX support with sglang and vllm engine