Releases · ServiceNow/BrowserGym

23 Oct 19:06

github-actions

v0.10.1

e67ade2

v0.10.1: Benchmark updates

Minor changes

train / test splits for WorkArena L2 and L3 tasks #203
More fine-grained per-benchmark action sets #202

Full Changelog: v0.10.0...v0.10.1


23 Oct 14:50

github-actions

v0.10.0

e430bb8

v0.10.0: AssistantBench! 🎉

New features

New BrowserGym benchmark AssistantBench, packaged as browsergym-assistantbench. Thanks @oriyor ! #186

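import gymnasium as gym
import browsergym.assistantbench  # importing registers the assistantbench tasks

# one validation task and one test task
env = gym.make("browsergym/assistantbench.validation.12")
env = gym.make("browsergym/assistantbench.test.42")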

Default train/test splits for all benchmarks

miniwob = DEFAULT_BENCHMARKS["miniwob"]  # 125 tasks x 5 seeds
miniwob_train = miniwob.subset_from_split("train")  # 62 tasks x 5 seeds
miniwob_test = miniwob.subset_from_split("test")  # 63 tasks x 5 seeds

Breaking Changes

Various updates and refactors to the new Benchmark class #197 #198 #199

Fixes

Improved experiment logging #182

Full Changelog: v0.9.0...v0.10.0

Contributors

oriyor


19 Oct 01:16

github-actions

v0.9.0

eac373c

v0.9.0: Benchmarks! 🎉

New features

Benchmarks with default config (tasks x seeds) and metadata #173 #191

from browsergym.experiments import BENCHMARKS, Benchmark, EnvArgs
# import path below is an assumption, adjust it to your installed version
from browsergym.experiments.benchmark import HighLevelActionSetArgs

# make a custom benchmark
benchmark = Benchmark(
  name="miniwob_click_test",
  high_level_action_set_args=HighLevelActionSetArgs(
    subsets=["bid"],
    multiaction=False,
    strict=False,
    retry_with_force=False,
    demo_mode="off",
  ),
  env_args_list=[
    EnvArgs(
      task_name="miniwob.click-test",
      task_seed=42,
      max_steps=5,
    )
  ],
)

# use a pre-existing benchmark
miniwob = BENCHMARKS["miniwob_all"]()

# use only a task subset
miniwob_original = miniwob.subset_from_glob(
  column="miniwob_category", glob="original"
)
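To make the "tasks x seeds" idea concrete, here is a minimal standalone sketch of a benchmark as a flat grid of (task, seed) runs, with a glob filter over a metadata column. It does not use BrowserGym itself: TaskRun, expand, the local subset_from_glob function, and the toy metadata table are all hypothetical stand-ins, and the real Benchmark class may work differently.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from itertools import product


@dataclass(frozen=True)
class TaskRun:
    """Hypothetical stand-in for one benchmark entry (not BrowserGym's EnvArgs)."""
    task_name: str
    task_seed: int


def expand(task_names, seeds):
    """Build the full tasks x seeds grid as a flat list of runs."""
    return [TaskRun(name, seed) for name, seed in product(task_names, seeds)]


def subset_from_glob(runs, metadata, column, glob):
    """Keep runs whose metadata value in `column` matches the glob pattern."""
    return [r for r in runs if fnmatch(metadata[r.task_name][column], glob)]


# Toy metadata table (made up for illustration): task name -> column values.
metadata = {
    "toy.task-a": {"category": "original"},
    "toy.task-b": {"category": "original"},
    "toy.task-c": {"category": "extra"},
}

runs = expand(metadata.keys(), seeds=range(5))  # 3 tasks x 5 seeds
original_runs = subset_from_glob(runs, metadata, "category", "original")
print(len(runs), len(original_runs))
```

Filtering on metadata rather than on task names keeps the subset definition independent of how many seeds each task is run with.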

New playwright key modifier "ControlOrMeta" #187
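In Playwright, "ControlOrMeta" resolves to Control on Windows and Linux and to Meta on macOS. The sketch below only illustrates that rule in plain Python. resolve_modifier is a hypothetical helper, not part of Playwright or BrowserGym, and Playwright performs this resolution internally.

```python
import sys


def resolve_modifier(key_combo: str, platform: str = sys.platform) -> str:
    """Replace "ControlOrMeta" in a "+"-separated combo with the platform's modifier."""
    concrete = "Meta" if platform == "darwin" else "Control"
    parts = key_combo.split("+")
    return "+".join(concrete if part == "ControlOrMeta" else part for part in parts)


print(resolve_modifier("ControlOrMeta+a", platform="darwin"))
print(resolve_modifier("ControlOrMeta+a", platform="linux"))
```

With this modifier, a single key combination such as select-all can be written once and behave the same across operating systems.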

Global demo_mode flag #177

import browsergym.core.action

browsergym.core.action.set_global_demo_mode(True)  # boolean

Fixes

Multi-tab actions fix #188

Full Changelog: v0.8.1...v0.9.0


15 Oct 20:04

gasse

v0.8.1

2ce05b2

v0.8.1 - SoM bugfix

Fixes

browsergym-core

fixed a bug with set-of-marks line drawing #184 #185


08 Oct 21:40

github-actions

v0.8.0

8e8e616

v0.8.0: goal_object

browsergym-core

Breaking changes
- goal refactor #110
  obs["goal_object"] now replaces the old obs["goal_image_urls"]
  obs["goal"] is now deprecated
  goal_object contains a list of openai-style messages, which can include an arbitrary mix of text and/or images.
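As a rough illustration of consuming the new format, the sketch below splits a goal into its text and its image URLs. The item schema shown (dicts with "type": "text" or "type": "image_url") follows the common OpenAI content-part convention and is an assumption here, as are the split_goal helper and the sample observation. Check the actual obs["goal_object"] of your version before relying on it.

```python
def split_goal(goal_object):
    """Separate text parts from image URLs in a list of OpenAI-style content items."""
    texts, image_urls = [], []
    for item in goal_object:
        if item.get("type") == "text":
            texts.append(item["text"])
        elif item.get("type") == "image_url":
            image = item["image_url"]
            # Defensive: accept either a {"url": ...} dict or a bare URL string.
            image_urls.append(image["url"] if isinstance(image, dict) else image)
    return "\n".join(texts), image_urls


# Hypothetical observation fragment; the exact schema is an assumption.
obs = {
    "goal_object": [
        {"type": "text", "text": "Buy the item shown in the image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ]
}

goal_text, goal_images = split_goal(obs["goal_object"])
print(goal_text)
print(len(goal_images))
```

An agent that only handles text can fall back to goal_text, while a multimodal agent can pass the items through to its model unchanged.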

browsergym-visualwebarena

Breaking changes
- goal refactor #110, the goal is now a list of openai-style messages with goal images as base64 image_url messages.
Fixes
- goal images are now self-hosted as part of the homepage #171 #165

browsergym-experiments

Improvements
- leaner trace files #169

other

- the legacy demo agent has been removed
- the basic demo agent has been slimmed down and upgraded to support the new goal_object format #110
- other minor changes #166 #164


27 Sep 13:39

github-actions

v0.7.1

e32e56d

v0.7.1: bugfixes

browsergym-core

Dependency bump
- playwright dependency bumped from playwright>=1.32,<1.40 to playwright>=1.39,==1.* (fixes problems with 1.32.1) #159
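To see which versions the old and new constraints admit, here is a small hand-rolled sketch. It handles only plain numeric versions such as "1.39.0" and is not a full PEP 440 implementation. The function names are hypothetical, and pip's own resolver remains the authority on what actually gets installed.

```python
def parse_version(text):
    """Turn "1.39.0" into (1, 39, 0); only plain numeric releases are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_old_pin(version):
    """Check a version against playwright>=1.32,<1.40."""
    v = parse_version(version)
    return (1, 32) <= v < (1, 40)


def satisfies_new_pin(version):
    """Check a version against playwright>=1.39,==1.* (both clauses must hold)."""
    v = parse_version(version)
    return v >= (1, 39) and v[0] == 1


for candidate in ["1.32.1", "1.39.0", "1.47.0", "2.0.0"]:
    print(candidate, satisfies_old_pin(candidate), satisfies_new_pin(candidate))
```

The new constraint drops the problematic 1.32.1 release and accepts any later 1.x release, while still excluding a future 2.x major version.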

browsergym-experiments

Bugfixes
- #155 introduced a bug that occurred when an agent returns action==None; this was fixed by #163


20 Sep 19:58

github-actions

v0.7.0

ed6d699

v0.7.0: changes in experiments

browsergym-experiments

Breaking change
- save package versions to a separate file package_versions.txt instead of the logs #152
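One way to produce such a file is sketched below using only the standard library. This is an assumption-laden illustration rather than BrowserGym's actual code: the save_package_versions helper and the "name==version" line format are hypothetical, and only the file name package_versions.txt comes from the release note.

```python
from importlib import metadata
from pathlib import Path


def save_package_versions(exp_dir):
    """Write one "name==version" line per installed distribution to package_versions.txt."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs with missing metadata
            lines.add(f"{name}=={dist.version}")
    path = Path(exp_dir) / "package_versions.txt"
    # Sort case-insensitively so the file is stable across runs.
    path.write_text("\n".join(sorted(lines, key=str.lower)) + "\n")
    return path
```

Keeping the versions in a separate file leaves the experiment logs shorter and makes it straightforward to diff environments between two runs.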
Bugfixes
- minor fixes #154 #155


19 Sep 13:50

github-actions

v0.6.4

a834154

v0.6.4: bugfix

browsergym-core

Bugfix
- fixed a bug in the catching of "Frame has been detached" errors #153


17 Sep 18:50

github-actions

v0.6.3

2a171c4

v0.6.3: more robust IFrame handling

browsergym-core

Improvements
- ignoring "Frame has been detached" errors during unmarking #148 #147
- more robust handling of missing frames in extract_merged_axtree #148 #146


17 Sep 14:52

github-actions

v0.6.2

79e00a6

v0.6.2: minor update

browsergym-experiments

New features
- new field html_page: str in AgentInfo to enable agents to send an HTML page for visualization #13

