feat (backend): Add support for MCP servers natively via CodeActAgent
#7637
base: main
Conversation
(force-pushed the branch from 93ea7a2 to 82e2500)
Updating to pass the workflow tests |
Thanks a lot for the contribution! I took a look at the PR and left some questions; the approach does make sense to me. My main concern is that a lot of new types are being created — it'd be great if we could find a way to simplify them.
This is awesome to see! Let me know if you need anything from my side to merge this PR!
Would love to use MCP directly to test sequential thinking: #7643
Hello! First, thanks a bunch for the contribution!
Just to clarify, this requires us to have an additional method for customization for MCP servers, right? This contrasts with the approach in #7620, which allows us to re-use our current method for customization, namely micro-agents.
Personally I would prefer keeping the user experience simple and not introduce an additional method of customization. But if there are benefits of this approach over the approach in #7620, I'd be happy to discuss them.
@xingyaoww I'm not entirely familiar with the config flow from the frontend -> backend yet. Ideally, users could choose their MCP servers via the frontend, like the LLM API key, while we keep the MCP config in the backend as a fallback. If you could help with that, it would be great! |
@ducphamle2 sounds good! I think we can focus on getting the backend code that uses "config.toml" to work first, and then we can start adding it into the settings page. I think this PR might be in good shape to merge once the comments from @ryanhoangt are addressed ❤️ |
@neubig Based on my understanding, this PR lays the foundation for getting the MCP server configured and having those actions executed. I think it is still feasible for us to do something similar to #7620, where we put some MCP config into a microagent and pass that config along to the backend settings. So we are not actually losing anything? WDYT? |
More than happy to contribute! Yes, it is a different approach compared to using micro-agents. I believe the main benefit of this approach is that the agents will decide when to use the MCP tools, instead of us explicitly stating how and when to trigger them via microagent knowledge. We can also do what @xingyaoww suggested, adding specific MCP servers for the microagent to use, which is a win-win imo |
I asked OpenHands to summarize some offline discussion I had with @neubig - hope it provides some context :)
Hi @ducphamle2 and everyone, thanks for the great work on this PR! I wanted to provide some context from our Slack discussion about the MCP integration approaches. We've been discussing two different approaches for MCP integration:
After discussion, we've decided to move forward with this PR's approach as it:
@neubig raised a concern about introducing an additional method of customization. I think we can address this by:
For next steps, I suggest we:
I'm excited to see this merged so we can start using tools like sequential thinking in OpenHands! Let me know if you need any help with the changes. |
(force-pushed the branch from 2a76ff9 to eda6369)
Thank you for the acknowledgement and all the comments! We'll resolve the issues asap (in the next couple of days, probably) so we can merge this PR. I'll let you know if I need help from your end! |
(force-pushed the branch from f66cf4d to a4fe4fe)
Awesome work! It is already SO much cleaner 🙏 Left a few nits / questions
(force-pushed the branch from a472380 to f78727b)
I am so excited for this feature! |
It's an awesome feature, and a lot of work has gone into it — thank you for doing this @ducphamle2! Is the PR ready for review again? |
@enyst no problem, glad I could help! Yes, it is ready for another round. I'm eager to get this PR merged, as my team and I are also actively implementing custom features using MCP on the repo |
Hey, great work!
I took a look at this PR this afternoon and tried to:
- revert some unnecessary changes
- merge from main
- refactor MCPStdioConfig so we can define new tools in a cleaner way in config.toml
- break mcp.py into multiple files, which hopefully makes things cleaner

This new change would allow us to define tools like this in config.toml, which looks slightly cleaner:
```toml
[mcp-sse]
mcp_servers = ["http://localhost:8000/sse"]

[mcp-stdio.tools]
filesystem = { command = "npx", args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"] }
brave-search = { command = "docker", args = ["run", "-i", "--rm", "-e", "BRAVE_API_KEY", "mcp/brave-search"], env = { "BRAVE_API_KEY" = "xxx" } }
```
I put those changes into this PR: oraichain#8, which will merge into your branch (this PR). Feel free to take a look and LMK if you have any concerns!
One primary concern I have with the existing implementation is that for stdio MCP tools, the backend assumes we have the environment set up for it, but often we don't (e.g., see the log below: the docker/npx commands failed, so the tool initialization also failed).
```
17:57:28 - openhands:ERROR: client.py:119 - Error connecting to npx: [Errno 2] No such file or directory: 'npx'
Traceback (most recent call last):
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 107, in connect_stdio
    await asyncio.wait_for(connection_task, timeout=timeout)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 129, in _connect_stdio_internal
    stdio_transport = await self.exit_stack.enter_async_context(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/contextlib.py", line 659, in enter_async_context
    result = await _enter(cm)
             ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/mcp/client/stdio.py", line 100, in stdio_client
    process = await anyio.open_process(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/anyio/_core/_subprocesses.py", line 184, in open_process
    return await get_async_backend().open_process(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2552, in open_process
    process = await asyncio.create_subprocess_exec(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/subprocess.py", line 224, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 1743, in subprocess_exec
    transport = await self._make_subprocess_transport(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/unix_events.py", line 211, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/unix_events.py", line 820, in _start
    self._proc = subprocess.Popen(
                 ^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'npx'
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/core/cli.py", line 208, in <module>
    loop.run_until_complete(main(loop))
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 674, in run_until_complete
    self.run_forever()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 641, in run_forever
    self._run_once()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 1986, in _run_once
    handle._run()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/core/cli.py", line 113, in main
    mcp_tools = await fetch_mcp_tools_from_config(config.mcp)
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/utils.py", line 109, in fetch_mcp_tools_from_config
    mcp_clients = await create_mcp_clients(
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/utils.py", line 73, in create_mcp_clients
    await client.connect_stdio(
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 119, in connect_stdio
    logger.error(f'Error connecting to {command}: {str(e)}')
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/logging/__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/logging/__init__.py", line 1684, in _log
    self.handle(record)
```
```
17:57:28 - openhands:INFO: utils.py:79 - Connected to MCP server via stdio with command npx
17:57:28 - openhands:INFO: utils.py:68 - Initializing MCP tool [brave-search] for [docker] with stdio connection...
17:57:28 - openhands:ERROR: client.py:119 - Error connecting to docker: [Errno 2] No such file or directory: 'docker'
FileNotFoundError: [Errno 2] No such file or directory: 'docker'
```
(The `docker` traceback is identical to the `npx` one above, differing only in the missing command.)
I'm starting to think we could simplify the implementation a lot if we:
- stop supporting stdio-based servers
- use tools like https://github.com/supercorp-ai/supergateway to convert any stdio server to an SSE server

But even then, I imagine we'll need to wait until the server is fully initialized before we can get the tool list, right?
I think the main pain point is: we need to know what the existing tools are BEFORE we initialize the runtime. And if we want to support "mcp-stdio", it means we need to initialize the client first, which probably means we need to have stuff like "docker"/"npx" installed locally for the backend to use. But lots of users are using OpenHands via
I just tried https://github.com/supercorp-ai/supergateway locally and it seems to me that it is not super hard to use. Maybe we can:
We are just making an assumption that the MCP SSE server should be ONLINE BEFORE the agent starts, which I don't think is too bad? |
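That online-before-start assumption could be enforced with a small readiness probe that retries the SSE endpoint's host/port before tool discovery begins. A hypothetical sketch (the function name and defaults are illustrative, and it treats a successful TCP connect as a good-enough liveness signal):

```python
import socket
import time
from urllib.parse import urlparse


def wait_for_sse_server(url: str, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll the server's TCP port until it accepts connections or the deadline passes."""
    parsed = urlparse(url)
    host = parsed.hostname or 'localhost'
    port = parsed.port or (443 if parsed.scheme == 'https' else 80)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True  # something is listening; proceed to fetch the tool list
        except OSError:
            time.sleep(interval)  # connection refused / timed out; retry until deadline
    return False
```

The backend could call this for each entry in `mcp_servers` at startup and skip (with a warning) any server that never comes up, instead of failing tool discovery outright.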
@ducphamle2 LMK what you think! I can also help with these changes by sending in PRs to your branch if that helps 🙏 |
@xingyaoww I agree with removing the stdio server! Tbh, I have never been a fan of stdio-based MCPs. I always use SSE MCPs because they are plug-and-play, and the MCP client doesn't have to care about the MCP server's runtime environment. I included stdio because I thought you would prefer it, like Claude Desktop / Cursor supporting both standards. The MCP team is cooking a new upgrade for SSE, Streamable HTTP, so people will mostly use that instead of stdio IMO. |
(force-pushed the branch from 930388b to ca7bd14: chore: revert original prompts codeact; chore: re-add system prompt; chore: remove playwright mcp custom; chore: remove unused)
@ducphamle2 awesome! Do you have the bandwidth to help remove the Stdio configs? No worries if not! I can also probably help |
@xingyaoww I'm working on it now, expecting a new commit in the next hour or sooner |
@xingyaoww please take a look once you have time, thanks! Also, I probably need some help passing the tests if you have time: for some reason, they pass locally but not on CI |
End-user friendly description of the problem this fixes or functionality that this introduces.
Add support for MCP servers via CodeActAgent.
TODOs:
Give a summary of what the PR does, explaining any non-trivial design decisions.
This PR enables arbitrary MCP server usage (mainly focused on and tested with SSE MCP servers) by:
- Providing MCP servers in config.toml like below:
- Executing MCP tool calls through the /execute_action endpoint, with the tool inputs carried in action.arguments.
- Wrapping each result in an MCPObservation instance, which is fed into the LLM messages again for the next step.

This design will allow arbitrary MCP servers to be integrated into OpenHands seamlessly without breaking the existing architecture. It will co-exist with the microagents, as the two features are different by design.
We can customize how we handle MCP observations, especially for browser-related MCPs like playwright-mcp.
Some potential limitations (should be addressed in different PRs):
- Potential overlap with built-in tools. However, MCP servers can also have overlapping tool names among themselves, so this problem depends on the MCP server devs rather than on us.
- Needs more tests.
- No MCP authentication yet; this can be added in a different PR.
- Doesn't utilize the multi-agentic flow yet (different PR).
Link of any specific issues this addresses.
#5781
#7547
This is a different approach compared to this PR: #7620. I am more than happy to collaborate with @ryanhoangt to merge and re-use each other's code if possible.