How do I solve this problem? I am running mind2web and am not sure how to carry out this task. Could you give some specific guidance? Thanks #119
Comments
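Hi, @Ethan-2004 "Heartbeat failed" occurs because the Task worker cannot connect to the Task controller. Judging from your first screenshot, your assigner can connect to the controller, so I suggest checking whether communication between the worker and the controller is working properly. |
Hi, @zhc7, I hope to get your help. |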
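A quick way to test the worker-to-controller link is a plain TCP connection attempt from the machine that runs the worker. The sketch below is only a minimal illustration, not part of AgentBench; the host and port are placeholders (assumptions), so substitute the address your controller actually listens on.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address (assumption): replace with your controller's host and port.
    controller_host, controller_port = "127.0.0.1", 5000
    reachable = can_connect(controller_host, controller_port)
    print(f"controller reachable: {reachable}")
```

If this returns False from the worker's machine or container but True from the assigner's, the problem is likely in the network path (firewall, container networking, wrong address) rather than in the task itself.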
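I am encountering a similar issue with the following task: ltp-std. It appears that the server cannot be established using the command |
Hi, @Joe-2002 Are you running on Windows? On Windows, the os environment does occasionally run into this problem, so we recommend running in a Linux environment. This seems to be a different problem; you could open a separate issue to discuss it. |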
Hi, @Taishi-N324. Can #120 solve your problem? We can't reproduce your problem currently. |
This code does solve my problem.
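Time: Tuesday, February 27, 2024, 5:49 PM
Subject: Re: [THUDM/AgentBench] How do I solve this problem? I am running mind2web and am not sure how to carry out this task. Could you give some specific guidance? Thanks (Issue #119)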
|
Hi, @zhc7, this does not solve my problem. Configure start_task.yaml with the following content:
Configure default.yaml in the configs/assignments directory as follows:
I'm going with this setup.
|
Hi, @Taishi-N324 Are you running on Windows or Linux? Is there any error log after executing |
Hi, @zhc7 Thank you for your response. I am running this on an AWS Linux instance. Previously, I was using Docker in rootless mode, but after switching to running it as root, tasks like m2w-std, alfworld-std and webshop-std started working. However, tasks like cg-std and ltp-std are still not functioning. As reported in #63, it seems that the workers are not responding or are not being assigned to inference. https://github.com/THUDM/AgentBench/blob/main/src/assigner.py#L161-L236 |
Hi, @Taishi-N324. Please note that, unlike other tasks, ltp does not use a prebuilt Docker environment, so the cause may be that some dependencies fail to meet requirements. As for the other tasks you mentioned, please make sure you have downloaded the corresponding Docker images. In any case, any error messages would be very helpful. In addition, there's a difference between Also, please note that some environments may take a while to start. As for switching to root, I have no idea. This project does not require root, so this might not be directly related to the problem. |
Hi, @zhc7, thank you for your assistance. For the LTP task, the issue was that the number of rounds was not being properly loaded from https://github.com/THUDM/AgentBench/blob/main/configs/tasks/ltp.yaml#L6, where it is specified as 25 rounds; instead, 50 rounds were being used as per https://github.com/THUDM/AgentBench/blob/main/src/server/tasks/ltp/task.py#L372. This discrepancy led to errors due to longer sequence lengths. After adjusting it to 25 rounds, the evaluation proceeded correctly. I assume the paper's evaluations use 25 rounds, right? Regarding the cg-std task, the error encountered is as follows: {"index": 19, "error": "START_FAILED", "info": "{"detail":"Error: Worker not responding\n"}", "output": null, "time": {"timestamp": 1709288724712, "str": "2024-03-01 19:25:24"}} |
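When many samples fail, it can help to count records like the one above by their "error" field before digging into individual cases. The following is a rough sketch under the assumption that results are stored one JSON object per line; that layout, and the function name `tally_errors`, are assumptions for illustration rather than something confirmed in this thread.

```python
import json
from collections import Counter
from typing import Iterable


def tally_errors(lines: Iterable[str]) -> Counter:
    """Count records by their "error" field, skipping blank or unparsable lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Ignore lines that are not valid JSON.
        if isinstance(record, dict) and record.get("error"):
            counts[record["error"]] += 1
    return counts


if __name__ == "__main__":
    # Sample shaped like the record above, with the inner quotes escaped.
    sample = [
        '{"index": 19, "error": "START_FAILED", '
        '"info": "{\\"detail\\":\\"Error: Worker not responding\\\\n\\"}", '
        '"output": null}',
    ]
    print(tally_errors(sample))
```

A high count for a single error type such as START_FAILED suggests a setup problem with the task worker rather than a problem with individual samples.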
@zhc7 |
Thank you for pointing that out. I'm glad you solved the problem. I will update the repo.
Detailed evaluation procedure and settings can be found in the paper. We used
Thank you for your detailed investigation. We are aware that this task might occasionally get stuck, but we can't reproduce it or figure out when or why it happens. If you manage to solve it, we'd be happy to merge your PR! |
Hi @zhc7, I will look into the cg-std task when I have some time. Once the issue is resolved, I will send a pull request. Regarding m2w-std, it seems that there are only 100 tasks available for evaluation, but according to the paper, there should be 177 tasks. Why is it that there are only 100? |
Hi @zhc7, Regarding the debugging of the card_game task, it appears that the hanging issue was due to the lack of execution permissions for |
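A missing execute bit of this kind can be checked and repaired from Python on Linux. The sketch below is a generic illustration: the helper name `ensure_executable` is hypothetical, and the file to pass in is whichever binary or script the task needs to run, which is not specified here.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add execute bits to path if the owner bit is missing.

    Returns True if the mode was changed, False if it was already executable.
    """
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False
    # Equivalent to `chmod a+x path` on POSIX systems.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return True


# Usage (hypothetical path, replace with the real file):
#   changed = ensure_executable("/path/to/binary")
```

Running such a check once before starting the task worker makes a permission problem visible immediately, instead of showing up later as an unresponsive worker.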
Our apologies. There was an update to the mind2web task, and for various reasons the statistics in the paper are slightly out of date. The actual number should be 100.
Great! Thanks. I'm glad you solved the problem. Would you like to make a pull request? If I understood correctly, |
This is my error:
This is my configuration file: