Proposal: Simplify microagents + support MCP natively #7547

neubig · 2025-03-27T15:42:09Z

Summary

There are two issues:

Microagents are hard to understand, so we should consider how to simplify them.
It would be good to consider first-class support for MCP within microagents.

This issue proposes a way to address both.

Technical Design

Here is a proposal of documentation, and if this looks good we can modify the implementation accordingly.

Micro-agents

A micro-agent is a way to customize the main OpenHands agent. It can be triggered on the fly, giving the main agent additional abilities or instructions that it does not normally have.

This doc describes:

How micro-agents can be defined
The syntax used to define micro-agents
Where you put micro-agents so they can be discovered by OpenHands

Micro-agent Definition

A micro-agent is defined by three aspects:

Its trigger
Its instruction
Its additional tools

Micro-agent Trigger Types

Micro-agents are "triggered" by different events; when its trigger occurs, a micro-agent is activated. OpenHands currently supports three values of trigger_type.

always: a micro-agent with the "always" trigger is always activated. These micro-agents are useful when you want to specify:
- Repository-wide coding conventions. In this case, the micro-agent can be placed in the repo (see below for details) and will always be triggered every time OpenHands works on that repo. Here is the example from OpenHands.
- Organization-wide coding conventions. In the future, we are planning on implementing support for micro-agents that specify coding conventions across an entire organization. If you are interested in this functionality, please thumbs-up or comment on this issue and we will work on implementing it.
keyword: a micro-agent will be triggered when a particular keyword appears in the conversation. For instance, here is an example of a micro-agent that is triggered when the agent says "github", telling OpenHands how to interact with GitHub through the API.
manual: These microagents can only be triggered through manual intervention by the user in the OpenHands interface (or other programmatic means). This feature is particularly useful for microagents that describe how to solve a task, such as todo.
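The three trigger types above can be sketched in a few lines of Python. This is only an illustration of the proposed behavior, not the OpenHands implementation: the `MicroAgent` class, the `is_triggered` function, and the case-insensitive substring match for keywords are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MicroAgent:
    # Hypothetical in-memory form of one micro-agent definition.
    name: str
    trigger_type: str  # "always", "keyword", or "manual"
    keywords: list[str] = field(default_factory=list)


def is_triggered(agent: MicroAgent, message: str, manually_selected: set[str]) -> bool:
    """Return True if the micro-agent should be activated for this message."""
    if agent.trigger_type == "always":
        return True
    if agent.trigger_type == "keyword":
        # Assumption: keywords match as case-insensitive substrings.
        text = message.lower()
        return any(keyword.lower() in text for keyword in agent.keywords)
    if agent.trigger_type == "manual":
        # Assumption: the UI passes in the names the user selected by hand.
        return agent.name in manually_selected
    raise ValueError(f"unknown trigger_type: {agent.trigger_type!r}")


agents = [
    MicroAgent("repo", "always"),
    MicroAgent("github", "keyword", ["github"]),
    MicroAgent("todo", "manual"),
]
active = [a.name for a in agents if is_triggered(a, "Open a GitHub PR", set())]
print(active)  # ['repo', 'github']
```

The "todo" micro-agent stays inactive here because nothing selected it manually; passing `{"todo"}` as the third argument would activate it.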

Micro-agent Instructions

Micro-agent instructions are an additional prompt that is provided to the agent when the micro-agent is triggered. They supply extra information that adjusts the agent's behavior for the situation at hand. They are written in English (or whatever other language you work in). You can see examples in the OpenHands micro-agent directory.

Micro-agent Tools

Triggering micro-agents can optionally provide OpenHands with additional tools.

When additional tools are provided, they are specified through MCP.
The micro-agent gives the location of an MCP server, which OpenHands uses to read in and access the server's API.
The API information is provided to the agent automatically, so you do not need to enumerate the functions in the API yourself.
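One possible reading of this, sketched under assumptions: once the set of activated micro-agents is known, OpenHands would gather their `mcp_location` values and connect to each server once. The function name `collect_mcp_locations`, the dictionary form of a micro-agent, and the location strings are hypothetical; the proposal does not yet say what form a location takes.

```python
def collect_mcp_locations(active_agents: list[dict]) -> list[str]:
    """Gather the distinct mcp_location values of activated micro-agents, in order."""
    locations: list[str] = []
    for agent in active_agents:
        location = agent.get("mcp_location")  # optional field
        if location and location not in locations:
            locations.append(location)
    return locations


# Hypothetical activated micro-agents; the location strings are placeholders.
active_agents = [
    {"name": "repo", "trigger_type": "always", "mcp_location": "mcp/repo-tools"},
    {"name": "github", "trigger_type": "keyword"},
    {"name": "todo", "trigger_type": "manual", "mcp_location": "mcp/todo"},
    {"name": "todo-helper", "trigger_type": "keyword", "mcp_location": "mcp/todo"},
]
print(collect_mcp_locations(active_agents))  # ['mcp/repo-tools', 'mcp/todo']
```

Under this reading, a micro-agent with the "always" trigger makes its tools permanently available, while the others add tools only when they fire.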

For instance, here is an example of a tool that provides access to TODO.

Micro-agent Syntax

All micro-agents use markdown files with YAML frontmatter.

---
name: <Name of the microagent>
trigger_type: <always, keyword, or manual>
keywords:
- <Optional keywords only active when `trigger_type` is "keyword">
mcp_location: <Optional location of an MCP server that provides additional tools>
---

<Markdown with any special guidelines, instructions, and prompts that OpenHands should follow.
Check out the specific documentation for each microagent on best practices for more information.>
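To make the format concrete, here is a minimal sketch that splits such a file into its frontmatter fields and its markdown instructions. It uses only the standard library and understands just the flat keys and the simple list shown above; a real implementation would more likely use a YAML library. The function name `parse_microagent`, the sample file contents, and the `mcp_location` value are assumptions for illustration.

```python
def parse_microagent(text: str) -> tuple[dict, str]:
    """Split a micro-agent file into (frontmatter fields, markdown instructions)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' of the frontmatter")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---' of the frontmatter") from None

    fields: dict = {}
    current_key = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recent key (e.g. keywords).
            items = fields.setdefault(current_key, [])
            if not isinstance(items, list):
                raise ValueError(f"{current_key!r} has both a value and list items")
            items.append(stripped[2:].strip())
        else:
            # Split on the first colon only, so values may contain colons.
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            if value.strip():
                fields[current_key] = value.strip()
    instructions = "\n".join(lines[end + 1:]).strip()
    return fields, instructions


# Hypothetical micro-agent file; the mcp_location value is a placeholder.
SAMPLE = """---
name: github
trigger_type: keyword
keywords:
- github
- pull request
mcp_location: ./mcp/github.json
---

Use the GitHub API rather than the web UI.
"""

fields, instructions = parse_microagent(SAMPLE)
print(fields["trigger_type"], fields["keywords"])
print(instructions)
```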

Micro-agent Location

Micro-agents are located in several places:

Public micro-agents: These are micro-agents that are included in the OpenHands main repo here. They are meant to be general and widely usable by many different people or organizations, and document best practices on how to use OpenHands in general.
Repository micro-agents: These can be included directly in the repo that OpenHands is working on. These micro-agents should be added in the .openhands/microagents/ directory.
Organization micro-agents: We are working on implementing micro-agents that are easily accessible at a cross-repository organizational level, so please comment on this issue if this would be useful to you.
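Discovery of repository micro-agents could look roughly like the sketch below. The function name `find_repo_microagents` is hypothetical, and the sketch assumes that only `.md` files directly inside `.openhands/microagents/` count; whether subdirectories are searched is not specified in this proposal.

```python
import tempfile
from pathlib import Path


def find_repo_microagents(repo_root: Path) -> list[Path]:
    """List markdown files in <repo_root>/.openhands/microagents/, sorted by path."""
    microagents_dir = repo_root / ".openhands" / "microagents"
    if not microagents_dir.is_dir():
        return []  # the repo defines no micro-agents
    # Assumption: non-recursive search for *.md files only.
    return sorted(microagents_dir.glob("*.md"))


# Usage: build a throwaway repo layout and discover its micro-agents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    agents_dir = root / ".openhands" / "microagents"
    agents_dir.mkdir(parents=True)
    (agents_dir / "repo.md").write_text("---\nname: repo\ntrigger_type: always\n---\n")
    (agents_dir / "notes.txt").write_text("not a micro-agent")
    found = [p.name for p in find_repo_microagents(root)]

print(found)  # ['repo.md']
```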


kjenney · 2025-03-27T16:20:33Z

I think it would be substantially better to use Model Context Protocol as a standard rather than managing your own.

caique · 2025-03-27T16:32:11Z

Organization-wide microagents are an essential feature for enterprise adoption. +1 to that.

caique · 2025-03-27T16:36:12Z

I'm not entirely convinced we should mix microagents and MCPs.

I like the idea of supporting both, but I'd delegate the decision to use an MCP to the LLM rather than attach it to the microagents syntax.

In addition, can't we just refer to the MCP in the microagent content? How do you envision the frontmatter processing for this particular field?

ryanhoangt · 2025-03-27T16:58:09Z

In addition, can't we just refer to the MCP in the microagent content? How do you envision the frontmatter processing for this particular field?

I think mcp_location is the path of the config file for the MCP server? We can also provide more info about it in the micro-agent content.

I'd delegate the decision to use an MCP to the LLM rather than attach it to the microagents syntax.

Can you elaborate a bit about this?

caique · 2025-03-27T17:20:54Z

Can you elaborate a bit about this?

I prefer sending MCPs as "available tools" to the LLM and not tied to any specific microagents.

We continue handling the activation of microagents through the existing triggers (always, keyword, or manual).

Then, we delegate the decision of whether to use an MCP to the LLM instead of attaching it to a specific microagent trigger.

Makes sense?

neubig · 2025-03-27T17:22:27Z

mcp_location could be attached to a microagent triggered with "always", in which case the tools are always available. Would that work?

caique · 2025-03-27T17:42:33Z

mcp_location could be attached to a microagent triggered with "always", in which case the tools are always available. Would that work?

That would be the way to register the MCPs?
Instead of a mcp.json (or another config file), we would use an "always" microagent? 😊

neubig · 2025-03-27T17:44:05Z

Yes, it would be the microagent for the whole repo, like repo.md currently. The idea would be that all repo-level customization would go in that file, and the file format would be the same as the other microagent files.

The advantage of this method is that all microagents share the same format, and MCPs can be registered either always or only when certain triggers happen.

caique · 2025-03-27T17:55:00Z

The advantage of this method is that all microagents share the same format, and MCPs can be registered either always or only when certain triggers happen.

Makes sense to me now!

This would allow us to use MCPs without microagents (and vice versa) but also create combinations of both features for advanced use cases!

oconnorjoseph · 2025-03-27T19:01:46Z

Micro-agent Syntax
All micro-agents use markdown files with YAML frontmatter.

---
name: <Name of the microagent>
trigger_type: <always, keyword, or manual>
keywords:
- <Optional keywords only active when `trigger_type` is "keyword">
mcp_location: <Optional location of an MCP server that provides additional tools>
---

<Markdown with any special guidelines, instructions, and prompts that OpenHands should follow.
Check out the specific documentation for each microagent on best practices for more information.>

Hi @neubig, does the above apply to knowledge/ microagents too? The documentation examples in the codebase still use different YAML frontmatter keys: name, type, agent, and triggers.

enyst · 2025-03-27T19:17:51Z

I'm thinking about what we can do to support task better. One way would be to prod the LLM to use a task.md to keep track of where it's at, e.g. break down the task, make itself a checklist with the steps, and check the boxes as it fulfills them.

In some cases, this has worked very well to keep Claude Sonnet (and other models) focused as it went through all of the steps.

jasonburt · 2025-03-27T19:46:46Z

@neubig: Created a ticket for the org-wide feedback that you mentioned.
#7557

Thinking aloud:

Naming and Framing
Micro Agents should be called OpenHands Agent Customization.
The terminology for Repo and Keyword Micro-Agents really should be something like Agent Directives or Agent Context, e.g. Repo Agent Directives, Keyword Directives. They allow users to add more general knowledge to the agent and reduce the need to write longer prompts.

MCP and Task
Tasks (Single Step & Multi Step): They map to planning and user actions. They can be set up for something simple, like the steps around a git commit, or something more advanced, like running end-to-end tests for an application, building out any missing results, and sending an email when complete. Tasks are also good wrappers for MCP or external tools.

Todo
It would be good to map out a few user scenarios and look for gaps. "Organization micro-agents" add another layer of complexity. For a smaller group, a centralized repo that stores best-in-class configurations and agent scripts would work. A larger org would be standardizing on authorized workflows and templates.

caique · 2025-03-27T20:01:29Z

+1 to @jasonburt's thoughts!

Naming and Framing
Micro Agents should be called OpenHands Agent Customization. The terminology for Repo and Keyword Micro-Agents really should be something like Agent Directives or Agent Context, e.g. Repo Agent Directives, Keyword Directives. They allow users to add more general knowledge to the agent and reduce the need to write longer prompts.

That is a great opportunity for us to find better names for these concepts.

IMHO "Microagents" is a very misleading name. The first time I read it, I thought OpenHands would start new agent instances to delegate tasks to. 😅

Adding to that, the "Repository-specific microagents" are easily confused with the "repo.md microagent", which has the repo type.

MCP and Task
Tasks (Single Step & Multi Step): They map to planning and user actions. They can be set up for something simple, like the steps around a git commit, or something more advanced, like running end-to-end tests for an application, building out any missing results, and sending an email when complete. Tasks are also good wrappers for MCP or external tools.

I like the idea of step-by-step tasks and workflows for common tasks. Despite the whole LLM UX being based on natural language, it is super annoying to see the agent make 10 attempts to commit and push.

I have not noticed anything like this in other tools, and I know that I would appreciate having the ability to define a checklist that I can trigger in a conversation.

neubig mentioned this issue Mar 27, 2025

docs: Improve the Microagents usage documentation #7542

Merged

1 task

jasonburt mentioned this issue Mar 27, 2025

Org Wide Coding Conventions #7557

Open

mamoodi added the enhancement New feature or request label Mar 28, 2025

ryanhoangt mentioned this issue Apr 1, 2025

Add support for MCP servers #7620

Closed

6 tasks

ducphamle2 mentioned this issue Apr 1, 2025

feat (backend): Add support for MCP servers natively via CodeActAgent #7637

Merged

8 tasks
