[UNIT] Add draft Gradio bonus module for agents-course #120
base: main
Changes from all commits
f851589
f475ee3
caf0745
ce7844a
dc44528
@@ -72,8 +72,9 @@ Here is the **general syllabus for the course**. A more detailed list of topics
| 2 | Frameworks | Understand how the fundamentals are implemented in popular libraries: smolagents, LangGraph, LlamaIndex |
| 3 | Use Cases | Let's build some real-life use cases (open to PRs 🤗 from experienced Agent builders) |
| 4 | Final Assignment | Build an agent for a selected benchmark and prove your understanding of Agents on the student leaderboard 🚀 |
| 5 | Bonus Gradio Module | Learn to build and deploy interactive AI agents with Gradio interfaces |

*We are also planning to release some bonus units, stay tuned!*
*We have one bonus unit available for you: the Gradio module helps you create interactive interfaces for your agents. More bonus units coming soon!*
> I would not change that sentence, as it removes the emphasis on other future bonus units.

> Agreed with @Jofthomas. We don't only have Gradio bonus units; for instance, next week we have one that is about fine-tuning, not Gradio.

> Makes sense. Thanks for the reviews.
## What are the prerequisites?
@@ -0,0 +1,85 @@
# Introduction to Gradio for AI Agents

<img src="https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/bonus-gradio/whiteboard.jpg" alt="Gradio Module Planning"/>

In the previous units, we learned how to create powerful AI Agents that can reason, plan, and take actions. But an Agent is only as effective as its ability to interact with users. This is where **Gradio** comes in.

## Why Do We Need a UI for Our Agents?

Meet Sarah, a data scientist who built an amazing AI Agent that can analyze financial data and generate reports. But there's a challenge:

- Her teammates need to interact with the Agent
- Not everyone is comfortable with code or the command line
- Users want to see the Agent's thought process
- The Agent needs to handle file uploads and display visualizations

The solution? A user-friendly interface that makes the Agent accessible to everyone.

## What is Gradio?

Gradio is a Python library that makes it easy to create **beautiful web interfaces** for your AI models and Agents. Think of it as a bridge between your Agent's capabilities and its users:
> I like "web applications" instead of "web interfaces" personally.
With Gradio, you can:

> I think you mentioned adding GIFs. Would be nice to have GIFs here to show off these features.

> Yes, I've been trying to create GIFs for the demos included in the course, but I haven't found any tool that produces good-resolution GIFs from short videos. I've used Windows' native Clipchamp and various free online tools like Ezgif, FreeConvert, etc., but none have provided satisfactory results. Do you have any recommendations for a free tool? Otherwise, I can also embed short videos.

- Create chat interfaces for your Agents in just a few lines of code (see the sketch right after this list)
- Display your Agent's thought process and tool usage
- Handle file uploads and multimedia content that can be useful for your AI agents
- Share your Agent with anyone via a URL
- Deploy your Agent UI to Hugging Face Spaces
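To make the "few lines of code" point concrete, here is a minimal sketch (not taken from the course material; the echo function is a stand-in for a real Agent):

```python
import gradio as gr

# A placeholder "agent": it simply echoes the user's message.
# Swap this function for a call into your real Agent.
def echo_agent(message, history):
    return f"You said: {message}"

# ChatInterface turns the function into a complete chat UI.
gr.ChatInterface(fn=echo_agent, type="messages").launch()
```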
## Why Gradio for Agents?

Gradio is particularly well-suited for AI Agents because it offers:

1. **Native Chat Support**: Built-in components for chat interfaces that match how Agents communicate

2. **Thought Process Visualization**: Special features to display your Agent's reasoning steps and tool usage

3. **Real-time Updates**: Stream your Agent's responses and show its progress

4. **File Handling**: Easy integration of file uploads for Agents that process documents or media

Here's a quick example of how simple it is to create an Agent interface with Gradio:
```python
from smolagents import (
    load_tool,
    CodeAgent,
    HfApiModel,
    GradioUI
)

# Import tool from Hub
image_generation_tool = load_tool("m-ric/text-to-image", trust_remote_code=True)
# Initialize a model
model = HfApiModel()
# Initialize the agent with the image generation tool
agent = CodeAgent(tools=[image_generation_tool], model=model)
# Launch the Gradio agentic UI
GradioUI(agent).launch(share=True)
```

Comment on lines +52 to +59:

> Remove inline comments?

> Are you referring to all of them? I noticed comments in the code examples from other units, so I just followed that pattern. However, I understand your point; it does appear overcrowded with the number of comments. Perhaps I could eliminate some of the unnecessary ones to reduce the clutter.
This creates a complete chat interface:

<need to add an image of the generated ui with an example in the agents-course dataset at https://huggingface.co/datasets/agents-course/course-images>

> I think we should add a sentence here explaining that while GradioUI(agent) launches a default Gradio UI, it doesn't offer the ability to customize the Gradio interface, to build larger UIs that the agent UI can be a part of, or to use other frameworks besides smolagents, which is why we'll learn how to build a UI using Gradio directly.

Note that this launches a default Gradio UI; however, it does not offer the option to customize the interface. We will explore how to create a custom agentic UI with Gradio and how to build more complex systems that incorporate the Gradio agent UI.
## Setting Up Gradio

Before we dive deeper, let's set up Gradio in your environment:

```bash
pip install --upgrade gradio
```
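If you want to confirm the installation (an optional check, not part of the course material), print the installed version:

```python
import gradio as gr

# Any reasonably recent Gradio release should work for this module.
print(gr.__version__)
```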
If you're working in Google Colab or Jupyter, you can restart your runtime after installing Gradio to ensure all components are properly loaded.

In the next section, we'll build our first Agent interface using Gradio's ChatInterface. We'll see how to:
- Create a basic chat UI
- Display the Agent's responses
- Handle user inputs effectively

Ready to build your first Agent UI? Let's move on to the next chapter!
@@ -0,0 +1,241 @@
# Building Your First Agent Interface

Now that we understand why Gradio is useful for Agents, let's create our first Agent interface! We'll start with `gr.ChatInterface`, which provides everything we need to get an Agent up and running quickly.

## The ChatInterface Component

The `gr.ChatInterface` is a high-level component that handles all the essential parts of a chat application (a minimal sketch follows this list):
- Message history management
- User input handling
- Bot response display
- Real-time / streaming updates
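Before wiring in a real Agent, here is a minimal, self-contained sketch of those pieces (not from the course; `slow_echo` is a hypothetical stand-in for an Agent and demonstrates streaming via `yield`):

```python
import time
import gradio as gr

def slow_echo(message, history):
    # Stream the reply by yielding progressively longer strings;
    # ChatInterface renders each yield as a live update in the chat window.
    reply = f"You said: {message}"
    for i in range(len(reply)):
        time.sleep(0.02)
        yield reply[: i + 1]

gr.ChatInterface(fn=slow_echo, type="messages").launch()
```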
## Your First Agent UI

We'll start by installing LangChain and LangGraph for this section. Additionally, set the environment variable `OPENAI_API_KEY` to store your OpenAI API key.
> Btw it'd be great if we could showcase HF inference providers!
```python
# Install the required packages first (in a terminal or a notebook cell):
#   pip install langchain langchain-openai langgraph

# Store your OpenAI API key in an environment variable
import os
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
```
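If you'd rather not hard-code the key in a notebook, a small variation (standard library only) reads it interactively instead:

```python
import os
import getpass

# Prompt for the key only if it isn't already set in the environment.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API key: ")
```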
Let's create a simple Agent that helps with weather information. Here's how to build it:

> Perhaps we should mention to install langchain and langgraph
```python
import os
import gradio as gr
from gradio import ChatMessage
import requests
from typing import Dict, List
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

# Weather and location tools
@tool
def get_lat_lng(location_description: str) -> dict[str, float]:
    """Get the latitude and longitude of a location."""
    return {"lat": 51.1, "lng": -0.1}  # London coordinates as dummy response

@tool
def get_weather(lat: float, lng: float) -> dict[str, str]:
    """Get the weather at a location."""
    return {"temperature": "21°C", "description": "Sunny"}  # Dummy response


def stream_from_agent(message: str, history: List[Dict[str, str]]) -> gr.ChatMessage:
    """Process messages through the LangChain agent with visible reasoning."""

    # Initialize the agent
    llm = ChatOpenAI(temperature=0, model="gpt-4")
    memory = MemorySaver()
    tools = [get_lat_lng, get_weather]
    agent_executor = create_react_agent(llm, tools, checkpointer=memory)

    # Rebuild the conversation: past user messages first, then the new message
    past_messages = []
    for h in history:
        if h["role"] == "user":
            past_messages.append(HumanMessage(content=h["content"]))
    past_messages.append(HumanMessage(content=message))

    messages_to_display = []
    final_response = None

    for chunk in agent_executor.stream(
        {"messages": past_messages},
        config={"configurable": {"thread_id": "abc123"}}
    ):
        # Handle agent's actions and tool usage
        if chunk.get("agent"):
            for msg in chunk["agent"]["messages"]:
                if msg.content:
                    final_response = msg.content

                # Handle tool calls
                for tool_call in msg.tool_calls:
                    tool_message = ChatMessage(
                        content=f"Parameters: {tool_call['args']}",
                        metadata={
                            "title": f"🛠️ Using {tool_call['name']}",
                            "id": tool_call["id"],
                        }
                    )
                    messages_to_display.append(tool_message)
                    yield messages_to_display

        # Handle tool responses
        if chunk.get("tools"):
            for tool_response in chunk["tools"]["messages"]:
                # Find the corresponding tool message
                for msg in messages_to_display:
                    if msg.metadata.get("id") == tool_response.tool_call_id:
                        msg.content += f"\nResult: {tool_response.content}"
                        yield messages_to_display

    # Add the final response as a regular message
    if final_response:
        messages_to_display.append(ChatMessage(content=final_response))
        yield messages_to_display

# Create the Gradio interface
demo = gr.ChatInterface(
    fn=stream_from_agent,
    type="messages",
    title="🌤️ Weather Assistant",
    description="Ask about the weather anywhere! Watch as I gather the information step by step.",
    examples=[
        "What's the weather like in Tokyo?",
        "Is it sunny in Paris right now?",
        "Should I bring an umbrella in New York today?"
    ],
)

demo.launch()
```
This creates a complete chat interface:

<add an image or a gif of the chatinterface. It is available at the location: https://huggingface.co/spaces/ysharma/Your_First_Agent_UI. Note that we first need to set the OpenAI API key as an env variable in the Space>
## Understanding the Key Elements

Let's break down what's happening:

1. **The Agent Function**:
```python
def stream_from_agent(message: str, history: List[Dict[str, str]]) -> gr.ChatMessage:
```
- Processes messages through the LangChain agent
- Takes the current message and chat history
- Uses `yield` to show intermediate tool-usage and thoughts
- Returns the final response
2. **Tools available**:
```python
def get_lat_lng(location_description: str) -> dict[str, float]:
def get_weather(lat: float, lng: float) -> dict[str, str]:
```
- get_lat_lng: gets the latitude and longitude of a place from the place description
- get_weather: gets weather details for the given latitude and longitude
3. **ChatInterface Configuration**:
```python
demo = gr.ChatInterface(
    fn=stream_from_agent,
    title="🌤️ Weather Assistant",
    ...
)
```
- Connects the Agent function
- Sets up the UI appearance
- Configures interactive features
<I can add more of the latest features to the chatbot (like history, sidebar, etc.)>

4. **Streaming Responses**:
- Using `yield` enables real-time updates
- Shows the Agent's thought process and tool-usage in real time
- Keeps users engaged while waiting
We are using `type="messages"` in `gr.ChatInterface`. This passes the history as a list of dictionaries with OpenAI-style "role" and "content" keys.
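For example, by the second turn of the weather conversation, the `history` argument your function receives would look roughly like this (an illustrative sketch, not captured output):

```python
history = [
    {"role": "user", "content": "What's the weather like in Tokyo?"},
    {"role": "assistant", "content": "It's currently 21°C and sunny in Tokyo."},
]
```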
|
||
ChatInterface requires only one parameter: fn, a function that returns LLM responses based on the user input and chat history. There are additional parameters, like `type`, that help control the chatbot's appearance and behavior. We'll explore the ChatInterface class further in the following chapters. | ||
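A side note on the tools: `get_lat_lng` and `get_weather` above return hard-coded dummy values so the example runs without a weather API key. If you want live data, one option is the free Open-Meteo geocoding and forecast APIs; the sketch below assumes those endpoints and response shapes, so double-check them against the Open-Meteo docs before relying on it:

```python
import requests
from langchain_core.tools import tool

@tool
def get_lat_lng(location_description: str) -> dict[str, float]:
    """Get the latitude and longitude of a location (via Open-Meteo geocoding)."""
    resp = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": location_description, "count": 1},
        timeout=10,
    )
    place = resp.json()["results"][0]
    return {"lat": place["latitude"], "lng": place["longitude"]}

@tool
def get_weather(lat: float, lng: float) -> dict[str, str]:
    """Get the current weather at a location (via Open-Meteo forecast)."""
    resp = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": lat, "longitude": lng, "current_weather": True},
        timeout=10,
    )
    current = resp.json()["current_weather"]
    return {
        "temperature": f"{current['temperature']}°C",
        "windspeed": f"{current['windspeed']} km/h",
    }
```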
## Adding More Features

Let's enhance our Agent UI with more capabilities:

1. **Auto-closing the tool-usage accordion after the agent is done populating it.**

Use the metadata dictionary key `status` to tell the Gradio UI when the tool usage is complete.
While the status is "pending", a spinner appears next to the thought title; once the `status` is "done", the thought accordion closes. If `status` is not provided, the accordion stays open and no spinner is displayed.
Comment on lines +175 to +178:

> This first point seems out-of-place, since we haven't introduced ChatMessage. I think it should go in the next chapter. Perhaps in this part, only focus on the UI tweaks listed in point (2) and explain them in a bit more detail?

> Got it!
```python
def stream_from_agent(message: str, history: List[Dict[str, str]]) -> gr.ChatMessage:
    """
    Process messages through the LangChain agent with visible reasoning.
    """
    ...
    # Handle tool calls
    for tool_call in msg.tool_calls:
        tool_message = ChatMessage(
            content=f"Parameters: {tool_call['args']}",
            metadata={
                "title": f"🛠️ Using {tool_call['name']}",
                "id": tool_call["id"],
                "status": "pending",
            }
        )
        messages_to_display.append(tool_message)
        yield messages_to_display
        tool_message.metadata["status"] = "done"
    ...
```
2. **Add more features**
- Allow editing of past messages to regenerate the response.
- Add example icons to the examples.
- Display ChatGPT-style chat history in the UI.
```python
# Create enhanced Gradio interface
demo = gr.ChatInterface(
    fn=stream_from_agent,
    type="messages",
    title="🌤️ Weather Assistant",
    description="Ask about the weather anywhere! Watch as I gather the information step by step.",
    examples=[
        "What's the weather like in Tokyo?",
        "Is it sunny in Paris right now?",
        "Should I bring an umbrella in New York today?"
    ],
    example_icons=[
        "https://cdn3.iconfinder.com/data/icons/landmark-outline/432/japan_tower_tokyo_landmark_travel_architecture_tourism_view-256.png",
        "https://cdn2.iconfinder.com/data/icons/city-building-1/200/ArcdeTriomphe-256.png",
        "https://cdn2.iconfinder.com/data/icons/city-icons-for-offscreen-magazine/80/new-york-256.png"
    ],
    save_history=True,
    editable=True
)
```
Try modifying the example above to create your own Agent interface. Experiment with different use cases like research, article writing, travel planning, _etc._
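For instance, here is a hedged starting point for a research-assistant variant (it assumes `pip install langchain-community duckduckgo-search` and reuses the same LLM setup as earlier; the tool choice and wiring are illustrative, not part of the course):

```python
import gradio as gr
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

search = DuckDuckGoSearchRun()  # a ready-made web-search tool
llm = ChatOpenAI(temperature=0, model="gpt-4")
research_agent = create_react_agent(llm, [search])

def research_fn(message, history):
    # A non-streaming variant: run the agent to completion and return the final answer.
    result = research_agent.invoke({"messages": [("user", message)]})
    return result["messages"][-1].content

gr.ChatInterface(fn=research_fn, type="messages", title="🔎 Research Assistant").launch()
```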
## What's Next?

You now have a foundation for building Agent interfaces! In the next chapter, we'll dive deeper into `gr.ChatMessage` and learn how to display complex Agent behaviors like:
- Tool usage visualization
- Structured thought processes
- Multi-step reasoning
- Nested agents

We'll also explore how to integrate these interfaces with a wider range of LLMs and external tools.
> just a suggestion, not sure if this makes more sense