Releases · ServiceNow/BrowserGym

New features
- Makefile 🎉 (thanks @imenelydiaker )
```
make install  # local pip install of all packages
make install-demo
make demo
```
- option hide_bid_if_invisible: boolean = False for flatten_axtree_to_str() and flatten_dom_to_str()
bugfixes
- missing timeout=500 for high-level click action (was hanging for very long if unsuccessful)

Contributors

imenelydiaker

Assets 22

23 May 14:43

github-actions

v0.3.2

0d92b7f

v0.3.2: `report_infeasible()` action

browsergym-core

New features
- report_infeasible(reason) action
  Adds a message {"role": "infeasible", "message": reason} in the chat, and terminates the episode. Validations functions which expect the agent to report a task as infeasible can look for a message with the "infeasible" role, and check for its content if a specific reason is required.
  - new high-level action report_infeasible(reason) and action set "infeas"
  - new python primitive report_infeasible_instructions(reason)
  - new message role "infeasible" in the chat
  - new BrowserEnv argument terminate_on_infeasible: boolean = True
Breaking changes
- methods flatten_axtree_to_str() and flatten_dom_to_str() return more compact representations
  - properties som, clickable, visible instead of som="1", clickable="1", visible="1"
  - visible="0" is no more printed
Bugfixes
- minor fixes in browsergym.experiment.loop

Assets 22

22 May 18:52

github-actions

v0.3.1

7f90c98

v0.3.1: experiment loop bugfix

browsergym-experiment

Bugfixes
- save the last observation in browsergym.experiments.loop

Assets 22

22 May 18:34

github-actions

v0.3.0

2037c36

v0.3.0: Agent API refactor

browsergym-experiments

Breaking changes

Agent API refactor

# former API
Agent.action_mapping(action: str) -> str
Agent.observation_mapping(obs: dict) -> Any

# new API
Agent.action_set: AbstractActionSet
Agent.obs_preprocessor(obs: dict) -> Any

Assets 22

17 May 20:34

github-actions

v0.2.6

daf6aa2

v0.2.6: Experiments

browsergym-core

New features
- a bunch of tools to run and records experiments in browsergym.experiments
- chat messages now have a "timestamp" info
- a new simple, lean demo_agent (previous demo agent will be moved somewhere else at some point, expect it to disappear soon)
Bugfixes
- duplicate bid fix (not perfect but should solve 99% of the cases)

Assets 22

14 May 19:59

github-actions

v0.2.2

f80ff73

v0.2.2: Keyword arguments in high-level action space

browsergym-core

minor fix: high-level action parser now properly handles keyword arguments in Python function calls (were converted to non-keyword arguments before)

browsergym-webarena

minor fix: synced with latest webarena version (libwebarena=0.0.3), mostly typo fixes in task intents

Assets 18

10 May 17:10

github-actions

v0.2.1

c636b3f

v0.2.1: Set-of-Marks, visibility, bbox and more!

browsergym-core

New features
- 🎉 Set-of-Marks 🎉 a new method is available to easily overlay element boxes and bid attributes on top of the screenshot, following ideas from WebVoyager and OSWorld
```
from browsergym.utils.obs import overlay_som

...
obs, info = env.reset()
screenshot_with_som = overlay_som(obs["screenshot"], obs["extra_element_properties"], fontsize = 12, linewidth = 2, tag_margin = 2)
```
- new high-level actions upload_file and mouse_upload_file
- new field "extra_element_properties" in each observation. Contains a dict with bid keys, which gives the extra properties computed by browsergym for every element with a bid on the current page. Example:
```
{
  "23": {
    "visibility": 0.6,  # float between 0 and 1
    "bbox": [56, 345, 12, 20],  # [x, y, width, height]
    "clickable": True,  # boolean
    "set_of_marks": False,  # boolean
}
```
- new set_of_marks property (computed with JS tag browsergym_set_of_marks), following WebVoyager implementation (boolean 0 or 1, whether element should be part of the set-of-marks overlay)
- new clickable property, extracted from Chrome's DOMSnapshot's isClickable
- new info fields "action_exec_start", "action_exec_timeout" and "action_exec_stop" after each env.step() call, useful for video editing
- new resizeable_window parameter in BrowserEnv to switch between setting the viewport size via Chrome (previous behavior, resizeable window and viewport) or via Playwright (new default behaviour, viewport is not resizeable)
Breaking changes
- changed visibility tag in JS from browsergym_is_in_viewport (boolean 0 or 1) to browsergym_visibility_ratio (value between 0.0 and 1.0), extracted as the visibility extra property (see new features)
- BrowserEnv parameters viewport (viewport size), slow_mo (pause between playwright calls) and timeout (default playwright timeout) are now provided by the task. They can still be set in the environment's constructor to override the value provided by the task, which will display a warning.
- each task inheriting AbstractBrowserTask must now take a seed at instantiation (in constructor), instead of via the task.setup() method. This is also where each task should decide its desired browser setting by setting its attributes task.viewport, task.slow_mo and task.timeout (see point above)
Refactors
- bid-based high-level actions fail faster (500 ms)
- shorter nested bids with alphabetical bids for iframes (21-53 -> a53)
- fix mouse display position in demo mode (absolute -> fixed)
- modern chat theme
- refactored coordinate computation using Chrome's DOMSnapshot instead of JS, should be more robust to edge cases
- refactored visibility computation using the IntersectionObserver API, should be more robust to edge cases
- more robust frame marking, supports edge cases such as sandboxed iframes, and pdf viewers in <embed> tags