Nova Act can:
- Interact with web interfaces
- Extract information from web pages
- Perform automated UI tasks
Go to nova.amazon.com/act and sign in with your amazon.com account. You can generate your API key and get started building workflows. Currently, Nova Act is only available for US-based users.
No, Nova Act is not available as an AWS product, but AWS users are more than welcome to try it out. During this experimental phase, you need to sign up using your amazon.com account. Refer to Question 1 to learn how to access Nova Act.
Nova Act is a research preview which is free to use. Customers get a daily quota of requests.
Nova Act is currently available in the US. If you are interested in using Nova Act in a different region, please let us know. We are tracking feature requests in our GitHub repo, so please +1 the relevant request and add a comment saying where you'd like to see us expand next.
As of now, Nova Act is only available for US-based users. We have not published timelines for availability outside the US.
We highly encourage users to share their workflows with others in the community. Please make a Pull Request (PR) with your script in the Nova Act GitHub samples folder. Our team will analyze your workflow and, if approved, it will be merged into the repository.
Resources available include:
- Nova Act Web Page: https://nova.amazon.com/act
- Nova Act Blog Post: https://labs.amazon.science/blog/nova-act
- GitHub repository: https://github.com/aws/nova-act
- Code samples: https://github.com/aws/nova-act/tree/main/src/nova_act/samples
For security reasons, Nova Act has guardrails that prevent it from handling password inputs or sensitive authentication data. We recommend using Playwright APIs for these cases. See the "Entering sensitive information" section of our documentation.
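A minimal sketch of that pattern: the prompt moves focus to the password field, and the secret is then typed through the Playwright page object so it never appears in a prompt. The helper name `sign_in`, the prompts, and the login URL are illustrative assumptions, and `nova.page` is assumed to be the Playwright `Page` exposed by the SDK.

```python
from getpass import getpass


def sign_in(nova, username, password):
    """Log in without ever placing the password in a prompt.

    `nova` is assumed to be a started NovaAct session whose `page`
    attribute is a Playwright Page.
    """
    # Let the agent fill the non-sensitive field and focus the password box.
    nova.act(f"enter username {username} and click on the password field")
    # Type the secret directly through Playwright, bypassing the model.
    nova.page.keyboard.type(password)
    nova.act("sign in")


def main():
    # Imported here so the helper above can be exercised without the SDK.
    from nova_act import NovaAct

    # Hypothetical login page, used only for illustration.
    with NovaAct(starting_page="https://example.com/login") as nova:
        sign_in(nova, "janedoe", getpass())
```

Reading the password with `getpass()` keeps it out of the script and out of the shell history as well.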
Currently, Nova Act is limited to browser automation only. We do not support direct computer use yet. However, we have been able to do simple things by launching a browser window pointed to a remote desktop OS VM and then actuating the window.
Nova Act is an independent agentic system in itself, and it works end-to-end from the act() call to the model. It is not designed to be integrated with LangChain.
We are seeing users experiment with Nova Act in many different areas. Common themes include automating tasks in QA testing, market research, and customer support, as well as simulating customer journeys.
Yes, you can set the parameter headless to True to run Nova Act in headless mode. The default is False.
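As a rough sketch, headless mode is assumed here to be a constructor argument alongside `starting_page`. The helper `run_headless` and its `nova_act_cls` parameter are hypothetical; the parameter only exists so the function can be tried with a stand-in class when no browser is available.

```python
def run_headless(start_url, prompt, nova_act_cls=None):
    """Run a single act() call in a headless browser session."""
    if nova_act_cls is None:
        # Import lazily so a stand-in class can be injected for dry runs.
        from nova_act import NovaAct
        nova_act_cls = NovaAct
    # headless=True suppresses the visible browser window.
    with nova_act_cls(starting_page=start_url, headless=True) as nova:
        return nova.act(prompt)
```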
Technical Question 6: Can it copy text from a browser window and then paste it into an installed application, for example Excel?
Currently, Nova Act is limited to browser automation only. However, you can use Python functions to return text, JSON or even create a CSV file.
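One way this could look, as a sketch: ask `act()` for structured output and convert the rows to CSV in ordinary Python. It assumes `act()` accepts a JSON schema through a `schema` argument and that the result exposes `matches_schema` and `parsed_response`; the schema, field names, and helper names below are made up for illustration.

```python
import csv
import io

# Hypothetical schema: a list of objects with two string fields.
ROW_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "price": {"type": "string"},
        },
        "required": ["title", "price"],
    },
}
FIELDS = ["title", "price"]


def extract_rows(nova, prompt):
    """Return the parsed rows, or an empty list if the reply missed the schema."""
    result = nova.act(prompt, schema=ROW_SCHEMA)
    if not result.matches_schema:
        return []
    return result.parsed_response


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The CSV text can then be written to disk or opened in a spreadsheet application outside of Nova Act.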
The SDK only works with the Nova Act model.
Yes, the SDK is currently only available for Python.
Technical Question 9: When running a workflow, will Nova Act ask the user for clarification if needed to confirm certain tasks?
Nova Act does not have a function to ask users for clarification. Nova Act was designed to be fully automated, and we do not expect users to keep monitoring what it is doing. When you create your workflow, you can use Python functions to define the conditions under which the agent performs specific tasks.
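A sketch of such a condition: ask the agent a yes/no question with a boolean schema, then branch in plain Python. The local `YES_NO_SCHEMA` is assumed to behave like the boolean schema the SDK provides, and the helper names and prompts are hypothetical.

```python
# Assumed equivalent of the SDK's boolean schema; defined locally so the
# sketch has no import-time dependency on nova_act.
YES_NO_SCHEMA = {"type": "boolean"}


def ask_yes_no(nova, question):
    """Return True only if the agent answered with a valid boolean `true`."""
    result = nova.act(question, schema=YES_NO_SCHEMA)
    return bool(result.matches_schema and result.parsed_response)


def ensure_signed_in(nova):
    """Branch the workflow in Python instead of asking the user."""
    if ask_yes_no(nova, "Am I logged in?"):
        return "already signed in"
    nova.act("click the sign in link")
    return "navigated to sign in"
```

Treating a reply that does not match the schema as "no" keeps the workflow deterministic when the agent's answer is unusable.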
Technical Question 10: Did the Nova Act team publish any performance metrics using the standard public benchmarks?
Yes, you can refer to the benchmark metrics we published in our blog post. We've focused on scoring >90% on internal evals of capabilities that trip up other models, such as date picking, drop-downs, and pop-ups, and on achieving best-in-class performance on benchmarks like ScreenSpot and GroundUI Web, which most directly measure the ability of our model to actuate the web.
No, Nova Act SDK is not currently supported within those environments.
The Nova Act SDK is officially supported on macOS and Ubuntu. Users have also successfully run the Nova Act SDK on an Ubuntu instance running on WSL2. You can learn more about this setup here.
Breaking down your prompt into more discrete steps can help. Higher level reasoning often takes longer to execute, so taking a step-wise approach may help.
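To illustrate the step-wise approach, the sketch below replaces one broad prompt with a short list of narrow ones, each issued as its own `act()` call. The prompts and the helper `run_steps` are hypothetical examples, not recommended wording.

```python
# One broad prompt such as
#   "find a coffee maker under $50 and add it to the cart"
# is split into small, concrete actions.
STEPS = [
    "search for coffee maker",
    "filter results to items under $50",
    "select the first result",
    "add the item to the cart",
]


def run_steps(nova, steps):
    """Execute each step as a separate act() call, in order."""
    results = []
    for step in steps:
        # Each call has a narrow goal, so less planning is needed per call.
        results.append(nova.act(step))
    return results
```

Running steps separately also makes it easier to see which step is slow or failing.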
Technical Question 14: Is there a way to have Nova Act remember what it did so it could re-use what it learned about the UI?
You can use the Chrome user data directory to save the session state and restart mid-point.
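A hedged sketch of reusing a profile directory across runs: it assumes the constructor accepts `user_data_dir` and a `clone_user_data_dir` flag that, when False, makes the session write to that directory instead of a temporary copy. The helper name, the URL, and the profile path are placeholders.

```python
def persistent_session_kwargs(start_url, profile_dir):
    """Build constructor arguments that reuse one Chrome profile directory."""
    return {
        "starting_page": start_url,
        "user_data_dir": profile_dir,
        # Assumed flag: False means changes are saved back to profile_dir.
        "clone_user_data_dir": False,
    }


def main():
    # Imported here so the helper above can be tried without the SDK.
    from nova_act import NovaAct

    # Placeholder path and URL, for illustration only.
    kwargs = persistent_session_kwargs("https://example.com", "/tmp/nova-profile")
    with NovaAct(**kwargs) as nova:
        nova.act("open the orders page")
```

Note that this preserves browser state such as cookies and logins; it does not store what the model inferred about the UI.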