Benchmark.prepare_backend() #204

Merged
merged 6 commits into from
Oct 24, 2024

Conversation

gasse (Collaborator) commented Oct 23, 2024

No description provided.

@gasse gasse requested a review from recursix October 23, 2024 21:33
gasse (Collaborator, Author) commented Oct 23, 2024

@recursix that's a bit hacky but it should do for now

@@ -55,6 +55,7 @@ class Benchmark(DataClassJsonMixin):
high_level_action_set_args: HighLevelActionSetArgs
is_multi_tab: bool
env_args_list: list[EnvArgs]
full_reset_script: Optional[str]
Collaborator

Hmm, this is a symptom that we should have gone with a class hierarchy instead of a single class for all benchmarks.
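
To make the suggestion concrete, here is a minimal sketch of what a class hierarchy could look like, with backend preparation as an overridable method rather than an optional field on one shared class. All class names, fields, and return values below are hypothetical and chosen for illustration; this is not the actual BrowserGym `Benchmark` class.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class BaseBenchmarkSketch(ABC):
    """Hypothetical base class (illustration only, not the BrowserGym API)."""

    name: str
    is_multi_tab: bool = False

    @abstractmethod
    def prepare_backend(self) -> str:
        """Each benchmark family decides what backend preparation means."""


@dataclass
class StaticBenchmarkSketch(BaseBenchmarkSketch):
    """A benchmark with no backend state to reset (assumed case)."""

    def prepare_backend(self) -> str:
        return f"{self.name}: nothing to prepare"


@dataclass
class ResettableBenchmarkSketch(BaseBenchmarkSketch):
    """A benchmark whose backend needs a full reset (assumed case)."""

    reset_log: list[str] = field(default_factory=list)

    def prepare_backend(self) -> str:
        # A real implementation would trigger the backend reset here;
        # this sketch only records that a reset was requested.
        self.reset_log.append("full_reset")
        return f"{self.name}: backend reset"
```

With this shape, benchmark-specific behavior lives in the subclass, so the shared class would not need an `Optional` field that only some benchmarks use.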

@@ -50,6 +53,29 @@ def __init__(

self.credentials = ACCOUNTS

def full_reset(self):
Collaborator

Helper function instead of duplicated code?

Collaborator Author

What do you mean by duplicated code? This code is new.

Collaborator

It's 90% the same code as on the visualwebarena side.
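
As a rough sketch of the helper-function idea raised in this thread, the shared part of the reset logic could live in one function that both instance classes call. The body of `full_reset` is not shown in this conversation, so everything below is an assumption: the helper name, the `trigger_reset` callable, and the status-code convention are hypothetical stand-ins.

```python
from typing import Callable


def full_reset_backend(name: str, trigger_reset: Callable[[], int]) -> None:
    """Hypothetical shared helper: run a reset and fail loudly on error.

    `trigger_reset` stands in for whatever actually resets the backend;
    it is assumed to return 0 on success and a nonzero status otherwise.
    """
    status = trigger_reset()
    if status != 0:
        raise RuntimeError(f"{name} full reset failed with status {status}")


class WebArenaInstanceSketch:
    """Illustrative stand-in, not the real instance class."""

    def __init__(self, trigger_reset: Callable[[], int]):
        self._trigger_reset = trigger_reset

    def full_reset(self) -> None:
        full_reset_backend("webarena", self._trigger_reset)


class VisualWebArenaInstanceSketch:
    """Illustrative stand-in, not the real instance class."""

    def __init__(self, trigger_reset: Callable[[], int]):
        self._trigger_reset = trigger_reset

    def full_reset(self) -> None:
        full_reset_backend("visualwebarena", self._trigger_reset)
```

Each class keeps only what differs (here, its name and how the reset is triggered), while the error handling is written once.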

recursix previously approved these changes Oct 24, 2024
@gasse gasse changed the title from "New Benchmark field full_reset_script" to "Benchmark.prepare_backend()" Oct 24, 2024
recursix previously approved these changes Oct 24, 2024
@gasse gasse merged commit 444599b into main Oct 24, 2024
13 checks passed
@gasse gasse deleted the webarena_reset branch October 24, 2024 14:52
@gasse gasse mentioned this pull request Nov 7, 2024
qipeng pushed a commit to orby-ai-engineering/BrowserGym that referenced this pull request Nov 20, 2024
2 participants