
Running lunarlander_dqn_deploy inside Docker fails #793

Closed
Tracked by #548
Eric-Zhao1 opened this issue Apr 29, 2024 · 6 comments
Labels
bug (Something isn't working), docker (Add or update dockerfile)

Comments

@Eric-Zhao1

The Docker image is ding:nightly, pulled via docker pull opendilab/ding:nightly.
I am running the agent model trained with the DQN algorithm: final.pth.tar
The error is as follows:
[error screenshot]

Has anyone run into a similar issue?

@PaParaZz1 added the bug and docker labels on Apr 30, 2024
@xyzQ2

xyzQ2 commented May 1, 2024

Error:

AttributeError Traceback (most recent call last)
Cell In[9], line 32
29 print(f'Deploy is finished, final episode return is: {returns}')
31 if __name__ == "__main__":
---> 32 main(main_config=main_config, create_config=create_config, ckpt_path='/Users/qboy/Downloads/rl/final.pth.tar')

Cell In[9], line 14
12 main_config.exp_name = 'default' # Set the name of the experiment to be run in this deployment, which is the name of the project folder to be created
13 cfg = compile_config(main_config, create_cfg=create_config, auto=True) # Compile and generate all configurations
---> 14 env = DingEnvWrapper(gym.make(cfg.env.env_id), EasyDict(env_wrapper='default')) # Add the DI-engine environment decorator upon the gym's environment instance
15 #env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video') # Enable the video recording of the environment and set the video saving folder
16 model = DQN(**cfg.policy.model) # Import model configuration, instantiate DQN model

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/gym/envs/registration.py:581, in make(id, max_episode_steps, autoreset, apply_api_compatibility, disable_env_checker, **kwargs)
578 env_creator = spec_.entry_point
579 else:
580 # Assume it's a string
--> 581 env_creator = load(spec_.entry_point)
583 mode = _kwargs.get("render_mode")
584 apply_human_rendering = False

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/gym/envs/registration.py:61, in load(name)
52 """Loads an environment with name and returns an environment creation function
53
54 Args:
(...)
58 Calls the environment constructor
59 """
60 mod_name, attr_name = name.split(":")
---> 61 mod = importlib.import_module(mod_name)
62 fn = getattr(mod, attr_name)
63 return fn

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/importlib/__init__.py:126, in import_module(name, package)
124 break
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1050, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1027, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:992, in _find_and_load_unlocked(name, import_)
...
--> 435 _Box2D.RAND_LIMIT_swigconstant(_Box2D)
436 RAND_LIMIT = _Box2D.RAND_LIMIT
438 def b2Random(*args):

AttributeError: module '_Box2D' has no attribute 'RAND_LIMIT_swigconstant'

Code (exactly from the tutorial example)
import gym # Load the gym library, which is used to standardize the reinforcement learning environment
import torch # Load the PyTorch library for loading the Tensor model and defining the computing network
from easydict import EasyDict # Load EasyDict for instantiating configuration files
from ding.config import compile_config # Load configuration related components in the DI-engine config module
from ding.envs import DingEnvWrapper # Load environment related components in the DI-engine env module
from ding.policy import DQNPolicy, single_env_forward_wrapper # Load policy related components in the DI-engine policy module
from ding.model import DQN # Load model related components in the DI-engine model module
from dizoo.box2d.lunarlander.config.lunarlander_dqn_config import main_config, create_config # Load DI-zoo lunarlander environment and DQN algorithm related configurations


def main(main_config: EasyDict, create_config: EasyDict, ckpt_path: str):
    main_config.exp_name = 'default' # Set the name of the experiment to be run in this deployment, which is the name of the project folder to be created
    cfg = compile_config(main_config, create_cfg=create_config, auto=True) # Compile and generate all configurations
    env = DingEnvWrapper(gym.make(cfg.env.env_id), EasyDict(env_wrapper='default')) # Add the DI-engine environment wrapper on top of the gym environment instance
    # env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video') # Enable video recording of the environment and set the video saving folder
    model = DQN(**cfg.policy.model) # Import the model configuration and instantiate the DQN model
    state_dict = torch.load(ckpt_path, map_location='cpu') # Load model parameters from file
    model.load_state_dict(state_dict['model']) # Load the parameters into the model
    policy = DQNPolicy(cfg.policy, model=model).eval_mode # Import the policy configuration and model, instantiate the DQN policy, and switch to evaluation mode
    forward_fn = single_env_forward_wrapper(policy.forward) # Use the single-environment policy decorator to wrap the DQN policy's decision method
    obs = env.reset() # Reset and initialize the environment to get the initial observation
    returns = 0. # Initialize the total reward
    while True: # Let the agent's policy interact with the environment in a loop until the episode ends
        action = forward_fn(obs) # Decide an action based on the observed state
        obs, rew, done, info = env.step(action) # Execute the action, interact with the environment, and get the next observation, the reward of this step, the done signal, and other info
        returns += rew # Accumulate the reward return
        if done:
            break
    print(f'Deploy is finished, final episode return is: {returns}')

if __name__ == "__main__":
    main(main_config=main_config, create_config=create_config, ckpt_path='/Users/qboy/Downloads/rl/final.pth.tar')

Attempted Fix
Uninstalled Box2D and reinstalled with same issue
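
A quick way to tell whether the failure is in the local Box2D build itself rather than in DI-engine is to import it directly (a minimal sketch; 'LunarLander-v2' is the env_id used by the tutorial config):

import Box2D # if this import alone raises the same AttributeError, the pybox2d install is broken
print(getattr(Box2D, '__version__', 'unknown'))

import gym
env = gym.make('LunarLander-v2') # constructing the env exercises the Box2D bindings
obs = env.reset()
print('Box2D OK, initial obs type:', type(obs))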

Overall, I am finding issues with most of the code examples. If you are no longer supporting the library, no problem, please say so. Thank you.

@PaParaZz1
Member

The Docker image is ding:nightly, pulled via docker pull opendilab/ding:nightly. I am running the agent model trained with the DQN algorithm: final.pth.tar. The error is as follows: [error screenshot]

Has anyone run into a similar issue?

This looks like a problem with saving the replay video after training finishes (a missing libx264 library), but we ran a similar replay-saving task in the latest docker pull opendilab/ding:nightly image (IMAGE ID 01c195e0ee17) and did not hit this problem. Could you check whether your image matches, whether your gym version is 0.25.1, and what exact code you use to save the video?
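
For reference, the gym version can be checked inside the container with (a minimal sketch):

import gym
print(gym.__version__) # the working nightly image reports 0.25.1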

@Eric-Zhao1
Author

The Docker image is ding:nightly, pulled via docker pull opendilab/ding:nightly. I am running the agent model trained with the DQN algorithm: final.pth.tar. The error is as follows: [error screenshot]
Has anyone run into a similar issue?

This looks like a problem with saving the replay video after training finishes (a missing libx264 library), but we ran a similar replay-saving task in the latest docker pull opendilab/ding:nightly image (IMAGE ID 01c195e0ee17) and did not hit this problem. Could you check whether your image matches, whether your gym version is 0.25.1, and what exact code you use to save the video?

Latest opendilab/ding:nightly image; the gym version is 0.25.1.
The code I run is from https://di-engine-docs.readthedocs.io/zh-cn/latest/01_quickstart/hello_world_for_DI_zh.html,
the section "Set a small goal first: make your agent move". The code is as follows:

import gym # Load the gym library, used to standardize the reinforcement learning environment
import torch # Load the PyTorch library, used to load the Tensor model and define the computing network
from easydict import EasyDict # Load EasyDict, used to instantiate configuration files
from ding.config import compile_config # Load configuration related components from the DI-engine config module
from ding.envs import DingEnvWrapper # Load environment related components from the DI-engine env module
from ding.policy import DQNPolicy, single_env_forward_wrapper # Load policy related components from the DI-engine policy module
from ding.model import DQN # Load model related components from the DI-engine model module
from dizoo.box2d.lunarlander.config.lunarlander_dqn_config import main_config, create_config # Load DI-zoo lunarlander environment and DQN algorithm related configurations


def main(main_config: EasyDict, create_config: EasyDict, ckpt_path: str):
    main_config.exp_name = 'lunarlander_dqn_deploy' # Set the name of this deployment run, i.e. the name of the project folder to be created
    cfg = compile_config(main_config, create_cfg=create_config, auto=True) # Compile and generate all configurations
    env = DingEnvWrapper(gym.make(cfg.env.env_id), EasyDict(env_wrapper='default')) # Add the DI-engine environment wrapper on top of the gym environment instance
    env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video') # Enable video recording of the environment and set the video saving folder
    model = DQN(**cfg.policy.model) # Import the model configuration and instantiate the DQN model
    state_dict = torch.load(ckpt_path, map_location='cpu') # Load model parameters from the model file
    model.load_state_dict(state_dict['model']) # Load the parameters into the model
    policy = DQNPolicy(cfg.policy, model=model).eval_mode # Import the policy configuration and model, instantiate the DQN policy, and select evaluation mode
    forward_fn = single_env_forward_wrapper(policy.forward) # Use the single-environment policy decorator to wrap the DQN policy's decision method
    obs = env.reset() # Reset and initialize the environment to get the initial observation
    returns = 0. # Initialize the total reward
    while True: # Let the agent's policy interact with the environment in a loop until the episode ends
        action = forward_fn(obs) # Decide an action based on the observed state
        obs, rew, done, info = env.step(action) # Execute the action, interact with the environment, and get the next observation, the reward of this step, the done signal, and other environment info
        returns += rew # Accumulate the reward return
        if done:
            break
    print(f'Deploy is finished, final episode return is: {returns}')

if __name__ == "__main__":
    main(main_config=main_config, create_config=create_config, ckpt_path='./final.pth.tar')

@Eric-Zhao1
Author

I can confirm the problem is caused by saving the video, corresponding to the line
env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video')
I tried installing the libx264 library earlier, but that did not fix it. Is there a known solution?
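
For reference, a minimal sketch that isolates the replay-saving path with random actions (no checkpoint needed; the replay_path below is hypothetical, and it assumes DingEnvWrapper exposes the underlying action_space):

import gym
from easydict import EasyDict
from ding.envs import DingEnvWrapper

env = DingEnvWrapper(gym.make('LunarLander-v2'), EasyDict(env_wrapper='default'))
env.enable_save_replay(replay_path='./replay_test') # hypothetical path; if this loop crashes, the issue is in video encoding, not the policy
obs = env.reset()
done = False
while not done:
    # Random actions are enough to trigger the video-encoding path at episode end
    obs, rew, done, info = env.step(env.action_space.sample())
print('Episode finished; check ./replay_test for the saved video')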

@PaParaZz1
Member

I can confirm the problem is caused by saving the video, corresponding to the line env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video'). I tried installing the libx264 library earlier, but that did not fix it. Is there a known solution?

This is solved now. The cause is that the ffmpeg version installed in the official PyTorch base image is 4.3.0, which conflicts with the libx264 library shipped in the image. Installing a lower ffmpeg version with conda install -c conda-forge ffmpeg==4.2.2 lets the video be generated normally.
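
To confirm which ffmpeg the environment picks up after the downgrade, a quick check (sketch):

import subprocess
out = subprocess.run(['ffmpeg', '-version'], capture_output=True, text=True)
print(out.stdout.splitlines()[0]) # expect a line starting with: ffmpeg version 4.2.2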

@Eric-Zhao1
Author

I can confirm the problem is caused by saving the video, corresponding to the line env.enable_save_replay(replay_path='./lunarlander_dqn_deploy/video'). I tried installing the libx264 library earlier, but that did not fix it. Is there a known solution?

This is solved now. The cause is that the ffmpeg version installed in the official PyTorch base image is 4.3.0, which conflicts with the libx264 library shipped in the image. Installing a lower ffmpeg version with conda install -c conda-forge ffmpeg==4.2.2 lets the video be generated normally.

Tested it and it indeed works now. Solid fix 👍
