Joyful UI Automation
Midscene.js is an AI-powered automation SDK that can control the page, perform assertions, and extract data in JSON format, all driven by natural language.
Instruction | Video |
---|---|
Post a tweet | twitter-video-1080p.mp4 |
Use JS code to drive task orchestration, collect information about Jay Chou's concert, and write it into Google Docs | google-doc-1080p.mp4 |
From version v0.10.0, we support a new open-source model named `UI-TARS`. Read more about it in Choose a model.
- Natural Language Interaction 👆: Describe the steps, and let Midscene plan and control the user interface for you
- Understand UI, Answer in JSON 🔍: Provide prompts regarding the desired data format, and then receive the expected response in JSON format.
- Intuitive Assertion 🤔: Make assertions in natural language; it’s all based on AI understanding.
- Experience by Chrome Extension 🖥️: Start immediately with the Chrome Extension. No code is needed while exploring.
- Visualized Report for Debugging 🎞️: With our visualized report file, you can easily understand and debug the whole process.
- Totally Open Source! 🔥: Experience a whole new world of automation development. Enjoy!
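
The first three features above correspond to three kinds of agent calls. The sketch below assumes the method names `aiAction`, `aiQuery`, and `aiAssert` as listed in the Midscene.js API reference. The `AgentLike` interface, the `collectPrices` function, the `Item` shape, and the prompts are hypothetical and only illustrate the call pattern. In a real script the agent would typically come from the Puppeteer or Playwright integration.

```typescript
// Minimal shape of an agent (assumption: mirrors the method names in the
// Midscene.js API reference; real agents come from the integrations).
interface AgentLike {
  aiAction(steps: string): Promise<unknown>;
  aiQuery(dataDemand: string): Promise<unknown>;
  aiAssert(assertion: string): Promise<unknown>;
}

// Hypothetical data shape requested from the page.
interface Item {
  title: string;
  price: number;
}

// Hypothetical flow: interact, extract structured data, then assert.
async function collectPrices(agent: AgentLike, keyword: string): Promise<Item[]> {
  // Natural language interaction: describe the steps and let the agent plan them.
  await agent.aiAction(`type "${keyword}" in the search box, then press Enter`);

  // Understand UI, answer in JSON: describe the desired data format in the prompt.
  const raw = await agent.aiQuery(
    "{title: string, price: number}[], the items in the result list",
  );

  // Intuitive assertion: state the expectation in natural language.
  await agent.aiAssert("the result list contains at least one item");

  // The model's answer is untyped, so validate it before returning.
  if (!Array.isArray(raw)) {
    throw new Error("expected an array from aiQuery");
  }
  return raw.filter(
    (x): x is Item =>
      typeof x === "object" &&
      x !== null &&
      typeof (x as Item).title === "string" &&
      typeof (x as Item).price === "number",
  );
}
```

Typing the agent as a small interface keeps the flow independent of the browser driver, so the same function can be exercised with a stub during development and with a real agent in automation runs.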
- You can use general-purpose LLMs like `gpt-4o`, which work well in most cases. `gemini-1.5-pro` and `qwen-vl-max-latest` are also supported.
- You can also use the `UI-TARS` model, an open-source model dedicated to UI automation. You can deploy it on your own server, which can dramatically improve performance and data privacy.
- Read more in Choose a model.
There are so many UI automation tools out there, and each one seems to be all-powerful. What's special about Midscene.js?
- Debugging Experience: You will soon find that debugging and maintaining automation scripts is the real challenge. No matter how magical the demo looks, you still need to debug the process to keep it stable over time. Midscene.js offers a visualized report file, a built-in playground, and a Chrome Extension to debug the entire process. This is what most developers really need, and we are continuing to improve the debugging experience.
- Open Source, Free, Deploy as You Want: Midscene.js is an open-source project. It is decoupled from any cloud service or model provider, so you can choose either public or private deployment. There is always a suitable plan for your business.
- Integrate with JavaScript: You can always bet on JavaScript 😎
- Home Page: https://midscenejs.com
- Quick Experience by Chrome Extension: this is where you should get started
- Integration
  - Automate with Scripts in YAML: use this if you prefer writing YAML files instead of code
  - Bridge Mode by Chrome Extension: use this to control desktop Chrome with scripts
  - Integrate with Puppeteer
  - Integrate with Playwright
- API Reference
- Choose a model
- Config Model and Provider
Midscene.js is MIT licensed.