Releases: ServiceNow/BrowserGym
v0.10.1: Benchmark updates
Minor changes
- train / test splits for WorkArena L2 and L3 tasks #203
- More fine-grained per-benchmark action sets #202
Full Changelog: v0.10.0...v0.10.1
v0.10.0: AssistantBench! 🎉
New features
- New BrowserGym benchmark AssistantBench, packaged as browsergym-assistantbench. Thanks @oriyor! #186

      import gymnasium as gym
      import browsergym.assistantbench  # registers the AssistantBench tasks

      env = gym.make("browsergym/assistantbench.validation.12")
      env = gym.make("browsergym/assistantbench.test.42")
- Default train/test splits for all benchmarks

      miniwob = DEFAULT_BENCHMARKS["miniwob"]  # 125 tasks x 5 seeds
      miniwob_train = miniwob.subset_from_split("train")  # 62 tasks x 5 seeds
      miniwob_test = miniwob.subset_from_split("test")  # 63 tasks x 5 seeds
Breaking Changes
Fixes
- Improved experiment logging #182
Full Changelog: v0.9.0...v0.10.0
v0.9.0: Benchmarks! 🎉
New features
- Benchmarks with default config (tasks x seeds) and metadata #173 #191

      from browsergym.experiments import BENCHMARKS, Benchmark

      # make a custom benchmark
      benchmark = Benchmark(
          name="miniwob_click_test",
          high_level_action_set_args=HighLevelActionSetArgs(
              subsets=["bid"],
              multiaction=False,
              strict=False,
              retry_with_force=False,
              demo_mode="off",
          ),
          env_args_list=[
              EnvArgs(
                  task_name="miniwob.click-test",
                  task_seed=42,
                  max_steps=5,
              )
          ],
      )

      # use a pre-existing benchmark
      miniwob = BENCHMARKS["miniwob_all"]()

      # use only a task subset
      miniwob_original = miniwob.subset_from_glob(
          column="miniwob_category", glob="original"
      )
- New playwright key modifier "ControlOrMeta" #187
- Global demo_mode flag #177

      import browsergym.core.action

      browsergym.core.action.set_global_demo_mode(True)  # boolean
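To make the subset_from_glob call in the benchmark example above more concrete, here is a rough sketch of glob-based filtering over a metadata column. It uses the standard-library fnmatch module and is not BrowserGym's implementation; the filter_by_glob helper and the metadata rows are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical task metadata; the real BrowserGym columns and values may differ.
TASKS = [
    {"task_name": "miniwob.click-test", "miniwob_category": "original"},
    {"task_name": "miniwob.click-test-2", "miniwob_category": "original"},
    {"task_name": "miniwob.hypothetical-extra", "miniwob_category": "additional"},
]


def filter_by_glob(tasks, column, glob):
    """Keep the tasks whose value in `column` matches the glob pattern."""
    return [task for task in tasks if fnmatchcase(task[column], glob)]


# An exact pattern behaves like an equality test on the column.
original = filter_by_glob(TASKS, column="miniwob_category", glob="original")

# Wildcards select a family of tasks by name.
click_tasks = filter_by_glob(TASKS, column="task_name", glob="miniwob.click-*")
```

A glob without wildcards, such as "original", selects only rows whose column value is exactly that string, which is presumably the intent of the miniwob_original subset above.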
Fixes
- Multi-tab actions fix #188
Full Changelog: v0.8.1...v0.9.0
v0.8.1 - SoM bugfix
v0.8.0: goal_object
browsergym-core
- Breaking changes
  - goal refactor #110
    - obs["goal_object"] now replaces the old obs["goal_image_urls"]
    - obs["goal"] is now deprecated
    - the new goal_object now contains a list of openai-style messages, which can include an arbitrary mix of text and / or images.
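An agent that previously read obs["goal"] and obs["goal_image_urls"] may want to recover both from the new goal_object. The sketch below assumes each entry is an OpenAI-style content part, either {"type": "text", "text": ...} or {"type": "image_url", "image_url": ...}; that exact layout is an assumption, and the split_goal helper and sample data are hypothetical.

```python
def split_goal(goal_object):
    """Return (text, image_urls) from a list of OpenAI-style content parts."""
    texts, image_urls = [], []
    for part in goal_object:
        if part.get("type") == "text":
            texts.append(part["text"])
        elif part.get("type") == "image_url":
            image = part["image_url"]
            # Assumption: the URL is either given directly as a string
            # or nested under a "url" key.
            image_urls.append(image["url"] if isinstance(image, dict) else image)
    return "\n".join(texts), image_urls


# Hypothetical goal, for illustration only.
goal_object = [
    {"type": "text", "text": "Find this product:"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    {"type": "text", "text": "and add it to the cart."},
]

text, image_urls = split_goal(goal_object)
```

Joining the text parts loses their position relative to the images, so a multimodal agent would more likely pass the list through to its model unchanged.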
browsergym-visualwebarena
- Breaking changes
  - goal refactor #110, the goal is now a list of openai-style messages with goal images as base64 image_url messages.
- Fixes
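For the base64 image_url goal messages mentioned under the breaking changes above, the following sketch shows one way image bytes can be packed into, and recovered from, such a message. It assumes the image is carried as a data URL under message["image_url"]["url"], which is the usual OpenAI convention but is not confirmed by these notes; both helper functions are hypothetical.

```python
import base64


def image_to_message(image_bytes, mime_type="image/png"):
    """Wrap raw image bytes in an OpenAI-style image_url message (data URL)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def message_to_image(message):
    """Recover the raw image bytes from a base64 data-URL message."""
    url = message["image_url"]["url"]
    # A data URL has the form "data:<mime>;base64,<payload>".
    _header, encoded = url.split(",", 1)
    return base64.b64decode(encoded)


# The first 8 bytes of every PNG file, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

message = image_to_message(PNG_SIGNATURE)
decoded = message_to_image(message)
```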
browsergym-experiments
- Improvements
- leaner trace files #169
other
v0.7.1: bugfixes
v0.7.0: changes in experiments
v0.6.4: bugfix
browsergym-core
- Bugfix
  - fixed bug in "Frame has been detached" catching #153
v0.6.3: more robust IFrame handling
v0.6.2: minor update
browsergym-experiments
- New features
  - new field html_page: str in AgentInfo to enable agents to send an HTML page for visualization #13