Proposal: Simplify microagents + support MCP natively #7547

Open
neubig opened this issue Mar 27, 2025 · 13 comments
Labels
enhancement New feature or request

Comments

@neubig
Contributor

neubig commented Mar 27, 2025

Summary

There are two issues:

  1. Microagents are hard to understand; let's consider how to fix this.
  2. It would be good to consider first-class support for MCP within microagents.

This issue proposes a way to address both.

Technical Design

Here is a proposal of documentation, and if this looks good we can modify the implementation accordingly.


Micro-agents

A micro-agent is a mechanism for customizing the main OpenHands agent. It can be triggered on the fly, giving the main agent additional abilities or instructions that it does not normally have.

This doc describes:

  1. How micro-agents can be defined
  2. The syntax used to define micro-agents
  3. Where you put micro-agents so they can be discovered by OpenHands

Micro-agent Definition

A micro-agent is defined by three aspects:

  1. Its trigger
  2. Its instruction
  3. Its additional tools

Micro-agent Trigger Types

Micro-agents are "triggered" by different events, and when the trigger occurs, the micro-agent will be activated. Currently OpenHands supports three varieties of trigger_type.

  • always: a micro-agent with the "always" trigger is always activated. These agents are useful when you want to specify:
    • Repository-wide coding conventions. In this case, the micro-agent can be placed in the repo (see below for details) and will always be triggered every time OpenHands works on that repo. Here is the example from OpenHands.
    • Organization-wide coding conventions. In the future, we are planning on implementing support for micro-agents that specify coding conventions across an entire organization. If you are interested in this functionality, please thumbs-up or comment on this issue and we will work on implementing it.
  • keyword: a micro-agent will be triggered when a particular keyword appears in the conversation. For instance, here is an example of a micro-agent that is triggered when the agent says "github", telling OpenHands how to interact with GitHub through the API.
  • manual: These microagents can only be triggered through manual intervention by the user in the OpenHands interface (or other programmatic means). This feature is particularly useful for microagents that describe how to solve a task, such as todo.
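The three trigger types above can be sketched as a single activation check. This is a minimal illustration and not the OpenHands implementation: the `MicroAgent` class, the `is_activated` function, and the case-insensitive substring matching for keywords are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MicroAgent:
    # Hypothetical in-memory form of a micro-agent; field names mirror the proposed frontmatter.
    name: str
    trigger_type: str  # "always", "keyword", or "manual"
    keywords: list[str] = field(default_factory=list)


def is_activated(
    agent: MicroAgent,
    message: str,
    manually_selected: frozenset[str] = frozenset(),
) -> bool:
    """Return True if the micro-agent should be active for this message."""
    if agent.trigger_type == "always":
        return True
    if agent.trigger_type == "keyword":
        # Assumption: a keyword matches as a case-insensitive substring.
        text = message.lower()
        return any(kw.lower() in text for kw in agent.keywords)
    if agent.trigger_type == "manual":
        # Assumption: the UI passes the names the user explicitly selected.
        return agent.name in manually_selected
    raise ValueError(f"unknown trigger_type: {agent.trigger_type!r}")
```

Under these assumptions, a "keyword" micro-agent with the keyword "github" would activate on a message such as "Open a GitHub PR", while a "manual" one stays inactive until its name appears in `manually_selected`.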

Micro-agent Instructions

Micro-agent instructions are an additional prompt that is provided to the agent when the micro-agent is triggered. They supply extra information that adjusts the agent's behavior for the situation at hand. They are written in English (or whatever other language you work in). You can see examples in the OpenHands micro-agent directory.

Micro-agent Tools

Triggering micro-agents can optionally provide OpenHands with additional tools.

When additional tools are provided, they are specified through MCP (Model Context Protocol).
The micro-agent gives the location of an MCP server, which is used to discover and access the API.
The API information is provided to the agent automatically, so you do not need to enumerate the functions in the API yourself.

For instance, here is an example of a tool that provides access to TODO.
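Separately from that example, here is a rough sketch of how the MCP servers to connect to could be derived from whichever micro-agents are currently activated. The function name `collect_mcp_locations`, the dict representation of the frontmatter, and the URL are hypothetical; only the `mcp_location` key comes from this proposal.

```python
def collect_mcp_locations(activated: list[dict]) -> list[str]:
    """Return the distinct mcp_location values of activated micro-agents, in order."""
    locations: list[str] = []
    for meta in activated:
        location = meta.get("mcp_location")  # optional field, so it may be absent
        if location and location not in locations:
            locations.append(location)
    return locations


# Hypothetical frontmatter of two activated micro-agents.
activated = [
    {"name": "repo", "trigger_type": "always"},
    {"name": "todo", "trigger_type": "manual", "mcp_location": "http://localhost:8000/mcp"},
]
print(collect_mcp_locations(activated))  # ['http://localhost:8000/mcp']
```

With this shape, attaching `mcp_location` to an "always" micro-agent makes its tools available in every conversation, while attaching it to a "keyword" or "manual" micro-agent makes them available only once that micro-agent is triggered.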

Micro-agent Syntax

All micro-agents use markdown files with YAML frontmatter.

---
name: <Name of the microagent>
trigger_type: <always, keyword, or manual>
keywords:
- <Optional keywords only active when `trigger_type` is "keyword">
mcp_location: <Optional location of an MCP server that provides additional tools>
---

<Markdown with any special guidelines, instructions, and prompts that OpenHands should follow.
Check out the specific documentation for each microagent on best practices for more information.>
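
As an illustration of the format above, here is a sketch that splits a micro-agent file into its frontmatter and its markdown body. The example file content, the `parse_microagent` name, and the hand-rolled parsing are assumptions for this sketch; it only understands the flat `key: value` and `- item` forms shown in the template, and a real loader would presumably hand the frontmatter to a YAML library instead.

```python
# Illustrative micro-agent file; the instruction text is made up for the example.
EXAMPLE = """\
---
name: github
trigger_type: keyword
keywords:
- github
- git
---

Use the GitHub API rather than a web browser when interacting with GitHub.
"""


def parse_microagent(text: str) -> tuple[dict, str]:
    """Split a micro-agent file into (frontmatter dict, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a '---' frontmatter delimiter")
    try:
        # Index of the closing delimiter, searching after the opening one.
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    meta: dict = {}
    current_key = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had no inline value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError(f"unexpected list item: {stripped!r}")
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            meta[current_key] = value.strip() or []  # empty value starts a list

    if meta.get("trigger_type") not in {"always", "keyword", "manual"}:
        raise ValueError("trigger_type must be always, keyword, or manual")
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = parse_microagent(EXAMPLE)
print(meta["keywords"])  # ['github', 'git']
```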

Micro-agent Location

Micro-agents are located in several places:

  • Public micro-agents: These are micro-agents that are included in the OpenHands main repo here. They are meant to be general and widely usable by many different people or organizations, and document best practices on how to use OpenHands in general.
  • Repository micro-agents: These can be included directly in the repo that OpenHands is working on. These micro-agents should be added in the .openhands/microagents/ directory.
  • Organization micro-agents: We are working on implementing micro-agents that are easily accessible at a cross-repository organizational level, so please comment on this issue if this would be useful to you.
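
Discovery of repository micro-agents could look roughly like the sketch below. The function name `discover_repo_microagents` is hypothetical, and searching subdirectories recursively for `*.md` files is an assumption; only the `.openhands/microagents/` path comes from the text above.

```python
from pathlib import Path


def discover_repo_microagents(repo_root: Path) -> list[Path]:
    """Return the markdown files under <repo_root>/.openhands/microagents/, sorted by path."""
    directory = repo_root / ".openhands" / "microagents"
    if not directory.is_dir():
        return []  # the repo defines no micro-agents
    # Assumption: micro-agents may live in subdirectories, so search recursively.
    return sorted(p for p in directory.rglob("*.md") if p.is_file())
```
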
@kjenney
Contributor

kjenney commented Mar 27, 2025

I think it would be substantially better to use Model Context Protocol as a standard rather than managing your own.

@caique
Contributor

caique commented Mar 27, 2025

Organization-wide microagents are an essential feature for enterprise adoption. +1 to that.

@caique
Contributor

caique commented Mar 27, 2025

I'm not entirely convinced we should mix microagents and MCPs.

I like the idea of supporting both, but I'd delegate the decision to use an MCP to the LLM rather than attach it to the microagents syntax.

In addition, can't we just refer to the MCP in the microagent content? How do you envision the frontmatter processing for this particular field?

@ryanhoangt
Collaborator

ryanhoangt commented Mar 27, 2025

In addition, can't we just refer to the MCP in the microagent content? How do you envision the frontmatter processing for this particular field?

I think mcp_location is the path of the config file for the MCP server? We can also provide more info about it in the micro-agent content.

I'd delegate the decision to use an MCP to the LLM rather than attach it to the microagents syntax.

Can you elaborate a bit about this?

@caique
Contributor

caique commented Mar 27, 2025

Can you elaborate a bit about this?

I prefer sending MCPs as "available tools" to the LLM and not tied to any specific microagents.

We continue handling the activation of microagents through the existing triggers (always, keyword, or manual).

Then, we delegate the decision to use an MCP or not to the LLM instead of attaching it to a specific microagent trigger.

Makes sense?

@neubig
Contributor Author

neubig commented Mar 27, 2025

mcp_location could be attached to a microagent triggered with "always", in which case the tools are always available. Would that work?

@caique
Contributor

caique commented Mar 27, 2025

mcp_location could be attached to a microagent triggered with "always", in which case the tools are always available. Would that work?

That would be the way to register the MCPs?
Instead of an mcp.json (or another config file), we would use an "always" microagent? 😊

@neubig
Contributor Author

neubig commented Mar 27, 2025

Yes, it would be the microagent for the whole repo, like repo.md currently. The idea would be that all repo-level customization would go in that file, and the file format would be the same as the other microagent files.

The advantage of this method is that microagents all come in the same format, and MCPs can either be registered always or only when certain triggers happen.

@caique
Contributor

caique commented Mar 27, 2025

The advantage of this method is that microagents all come in the same format, and MCPs can either be registered always or only when certain triggers happen.

Makes sense to me now!

This would allow us to use MCPs without microagents (and vice versa) but also create combinations of both features for advanced use cases!

@oconnorjoseph
Contributor

Micro-agent Syntax

All micro-agents use markdown files with YAML frontmatter.

---
name: <Name of the microagent>
trigger_type: <always, keyword, or manual>
keywords:
- <Optional keywords only active when `trigger_type` is "keyword">
mcp_location: <Optional location of an MCP server that provides additional tools>
---

<Markdown with any special guidelines, instructions, and prompts that OpenHands should follow.
Check out the specific documentation for each microagent on best practices for more information.>

Hi @neubig, does the above apply to knowledge/ microagents too? The documentation examples on the codebase still use different YAML frontmatter keys of name, type, agent, triggers.

@enyst
Collaborator

enyst commented Mar 27, 2025

I'm thinking about what we can do to support tasks better. One way would be to prod the LLM to use a task.md to keep track of where it's at, e.g. break down the task, make itself a checklist with the steps, and check the boxes as it fulfills them.

In some cases, this has worked very well to keep Claude Sonnet (and other models) focused as it went through all of the steps.

@jasonburt
Contributor

@neubig: Created a ticket for the org-wide feedback that you mentioned.
#7557

Thinking aloud:

Naming and Framing
Micro Agents should be called OpenHands Agent Customization.
The terminology for Repo and Keyword Micro-Agents really should be something like Agent Directives or Agent Context, e.g. Repo Agent Directives, Keyword Directives. They allow users to add more general knowledge to the agent and avoid writing longer prompts.

MCP and Task
Tasks (Single Step & Multi Step): They map to planning and user actions. They can be set up for something simple, like the steps around a git commit, or something more advanced, like running end-to-end tests for an application, building out any missing results, and sending an email when complete. Tasks are also good wrappers for MCP or external tools.

Todo
It would be good to map out a few user scenarios and look for gaps. "Organization micro-agents" add another layer of complexity. For a smaller group, a centralized repo that stores best-in-class configurations and agent scripts would work. A larger org would be standardizing on authorized workflows and templates.

@caique
Contributor

caique commented Mar 27, 2025

+1 to @jasonburt's thoughts!

Naming and Framing
Micro Agents should be called OpenHands Agent Customization. The terminology for Repo and Keyword Micro-Agents really should be something like Agent Directives or Agent Context, e.g. Repo Agent Directives, Keyword Directives. They allow users to add more general knowledge to the agent and avoid writing longer prompts.

That is a great opportunity for us to find better names for these concepts.

IMHO "Microagents" is a very misleading name. The first time I read it, I thought OpenHands would start new agent instances to delegate tasks to. 😅

Adding to that, the "Repository-specific microagents" are easily confused with the "repo.md microagent", which has the repo type.

MCP and Task
Tasks (Single Step & Multi Step): They map to planning and user actions. They can be set up for something simple, like the steps around a git commit, or something more advanced, like running end-to-end tests for an application, building out any missing results, and sending an email when complete. Tasks are also good wrappers for MCP or external tools.

I like the idea of step-by-step tasks and workflows for common tasks. Although the whole LLM UX is based on natural language, it is super annoying to see the agent make 10 attempts to commit and push.

I have not noticed anything like this in other tools, and I know that I would appreciate having the ability to define a checklist that I can trigger in a conversation.


8 participants