Releases: ServiceNow/BrowserGym
v0.3.4: minor fixes
browsergym-core
-
Refactors
- minor error message refactors
- module-based logging traces
-
Bugfixes
overlay_som()
bugfix divide by 0
browsergym-experiments
-
New features
- new field
StepInfo.task_info
- new field
-
Bugfixes
loop
should be robust to crashes and interrupts
other
- missing requirements for
demo_agent
- Makefile fixes
v0.3.3: Observation feature hide_bid_if_invisible (#43)
browsergym-core
-
New features
- Makefile 🎉 (thanks @imenelydiaker )
make install # local pip install of all packages make install-demo make demo
- option
hide_bid_if_invisible: boolean = False
forflatten_axtree_to_str()
andflatten_dom_to_str()
- Makefile 🎉 (thanks @imenelydiaker )
-
bugfixes
- missing
timeout=500
for high-level click action (was hanging for very long if unsuccessful)
- missing
v0.3.2: `report_infeasible()` action
browsergym-core
-
New features
report_infeasible(reason)
action
Adds a message{"role": "infeasible", "message": reason}
in the chat, and terminates the episode. Validations functions which expect the agent to report a task as infeasible can look for a message with the "infeasible" role, and check for its content if a specific reason is required.- new high-level action
report_infeasible(reason)
and action set "infeas" - new python primitive
report_infeasible_instructions(reason)
- new message role "infeasible" in the chat
- new BrowserEnv argument
terminate_on_infeasible: boolean = True
- new high-level action
-
Breaking changes
- methods
flatten_axtree_to_str()
andflatten_dom_to_str()
return more compact representations- properties
som, clickable, visible
instead ofsom="1", clickable="1", visible="1"
visible="0"
is no more printed
- properties
- methods
-
Bugfixes
- minor fixes in
browsergym.experiment.loop
- minor fixes in
v0.3.1: experiment loop bugfix
browsergym-experiment
-
Bugfixes
- save the last observation in
browsergym.experiments.loop
- save the last observation in
v0.3.0: Agent API refactor
browsergym-experiments
-
Breaking changes
Agent
API refactor# former API Agent.action_mapping(action: str) -> str Agent.observation_mapping(obs: dict) -> Any # new API Agent.action_set: AbstractActionSet Agent.obs_preprocessor(obs: dict) -> Any
v0.2.6: Experiments
browsergym-core
-
New features
- a bunch of tools to run and records experiments in
browsergym.experiments
- chat messages now have a "timestamp" info
- a new simple, lean demo_agent (previous demo agent will be moved somewhere else at some point, expect it to disappear soon)
- a bunch of tools to run and records experiments in
-
Bugfixes
- duplicate bid fix (not perfect but should solve 99% of the cases)
v0.2.2: Keyword arguments in high-level action space
browsergym-core
- minor fix: high-level action parser now properly handles keyword arguments in Python function calls (were converted to non-keyword arguments before)
browsergym-webarena
- minor fix: synced with latest webarena version (libwebarena=0.0.3), mostly typo fixes in task intents
v0.2.1: Set-of-Marks, visibility, bbox and more!
browsergym-core
-
New features
-
🎉 Set-of-Marks 🎉 a new method is available to easily overlay element boxes and
bid
attributes on top of the screenshot, following ideas from WebVoyager and OSWorldfrom browsergym.utils.obs import overlay_som ... obs, info = env.reset() screenshot_with_som = overlay_som(obs["screenshot"], obs["extra_element_properties"], fontsize = 12, linewidth = 2, tag_margin = 2)
-
new high-level actions
upload_file
andmouse_upload_file
-
new field
"extra_element_properties"
in each observation. Contains a dict withbid
keys, which gives the extra properties computed by browsergym for every element with a bid on the current page. Example:{ "23": { "visibility": 0.6, # float between 0 and 1 "bbox": [56, 345, 12, 20], # [x, y, width, height] "clickable": True, # boolean "set_of_marks": False, # boolean }
-
new
set_of_marks
property (computed with JS tagbrowsergym_set_of_marks
), following WebVoyager implementation (boolean 0 or 1, whether element should be part of the set-of-marks overlay) -
new
clickable
property, extracted from Chrome's DOMSnapshot'sisClickable
-
new info fields
"action_exec_start"
,"action_exec_timeout"
and"action_exec_stop"
after eachenv.step()
call, useful for video editing -
new
resizeable_window
parameter inBrowserEnv
to switch between setting the viewport size via Chrome (previous behavior, resizeable window and viewport) or via Playwright (new default behaviour, viewport is not resizeable)
-
-
Breaking changes
- changed visibility tag in JS from
browsergym_is_in_viewport
(boolean 0 or 1) tobrowsergym_visibility_ratio
(value between 0.0 and 1.0), extracted as thevisibility
extra property (see new features) BrowserEnv
parametersviewport
(viewport size),slow_mo
(pause between playwright calls) andtimeout
(default playwright timeout) are now provided by the task. They can still be set in the environment's constructor to override the value provided by the task, which will display a warning.- each task inheriting
AbstractBrowserTask
must now take a seed at instantiation (in constructor), instead of via thetask.setup()
method. This is also where each task should decide its desired browser setting by setting its attributestask.viewport
,task.slow_mo
andtask.timeout
(see point above)
- changed visibility tag in JS from
-
Refactors
- bid-based high-level actions fail faster (500 ms)
- shorter nested bids with alphabetical bids for iframes (
21-53
->a53
) - fix mouse display position in demo mode (
absolute
->fixed
) - modern chat theme
- refactored coordinate computation using Chrome's DOMSnapshot instead of JS, should be more robust to edge cases
- refactored visibility computation using the
IntersectionObserver
API, should be more robust to edge cases - more robust frame marking, supports edge cases such as sandboxed iframes, and pdf viewers in
<embed>
tags
browsergym-miniwob
- fixed goal conversion to text in task
browsergym/miniwob/click-menu-2
v0.1.0rc7
version bump
v0.1.0rc6
rc6