A curated collection of resources for building, evaluating, and understanding AI agents and Large Language Models.

locchh/agent-handbook

Agent Engineering Handbook

Survey a broad range of resources (frameworks, libraries, tools, etc.). Choose the most representative ones, learn just enough to build something, then practice until you can see past the details to the underlying patterns.

Figure 1 – A generic agent architecture.

Figure 2 – An AI agent stack that uses the Model Context Protocol (MCP) for unified context (tools, memory, docs).

Figure 3 – A multi-agent AI system.

📚 General Resources

  • Hugging Face Learn: Tutorials and courses on a wide range of ML topics.
  • Hugging Face Docs: Official documentation for Hugging Face libraries and tools.
  • Hugging Face Spaces: A platform to build, share, and host ML demos.
  • Model Spec: A document that specifies desired behavior for OpenAI's models in the OpenAI API and ChatGPT.

🧠 Large Language Models (LLM)

  • Hugging Face LLM Course: A free course on Large Language Models.
  • smol-course: A small, focused course on LLMs from Hugging Face.
  • smollm: The repository for Hugging Face's SmolLM family of small language models.
  • smollm Collection: A Hugging Face collection of SmolLM models and datasets.
  • DSPy: A declarative framework for building modular AI software.

🤖 AI Agents

  • AutoGen: A framework for building AI agents and applications.
  • Hugging Face Agents Course: A course dedicated to building AI agents.
  • smolagents Docs: Documentation for the smolagents library.
  • smolagents GitHub: The source code for the smolagents library.
  • Why LangGraph?: An introduction to the concepts behind LangGraph.
  • LangGraph GitHub: A library for building stateful, multi-actor applications with LLMs.
  • BeeAI: An open-source platform to discover, run, and compose AI agents from any framework.
  • BeeAI Docs: Documentation for the BeeAI platform.

🤝 Protocols

🔍 Observability

  • MLflow: An open-source platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment.
  • MLflow GitHub: The source code for MLflow.
  • Arize Phoenix Docs: Documentation for Phoenix, an open-source ML observability library.
  • Phoenix GitHub: The source code for Arize Phoenix.

🎨 UI

  • Streamlit: A fast way to build and share data apps.
  • Chainlit: An open-source Python package for building production-ready conversational AI.
  • Gradio: A Python library for building web UIs and demos for ML models, including LLMs.

🚀 Deploy

  • agentops.ai: A developer platform for building AI agents and LLM apps, with agent observability for OpenAI, CrewAI, AutoGen, and 400+ LLMs and frameworks.
  • agentops-repo: The Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more.

📊 Evaluation

  • GAIA: A benchmark for evaluating next-generation LLMs (LLMs with augmented capabilities from added tooling, efficient prompting, access to search, etc.).
  • SWE-bench: A benchmark that tests AI systems' ability to resolve GitHub issues.
  • Tau-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains.
  • AgentBench: A benchmark designed to evaluate LLMs as agents across a diverse range of environments.
  • ACPBench: Reasoning about Action, Change, and Planning.
  • PaperBench: Evaluating AI's Ability to Replicate AI Research.
  • WebArena: A Realistic Web Environment for Building Autonomous Agents.

🌐 Engineering Resources

  • Engineering Resources: A collection of resources for building AI agents and Large Language Models.
  • MCP Hub: A centralized repository for Model Context Protocol (MCP) projects.
  • ACP Hub: A centralized repository for Agent Communication Protocol (ACP) projects.
  • A2A Hub: A centralized repository for Agent2Agent (A2A) Protocol projects.
  • Agent Hub: A centralized repository for agent-based projects.
  • LLM Playground: A collection of experiments with large language models across various NLP tasks.
