HELP: failed to run gym_hybrid_pdqn_config.py #664

Closed
5 of 11 tasks
Tracked by #548
xujinming01 opened this issue May 15, 2023 · 9 comments

xujinming01 commented May 15, 2023

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • system worker bug
    • system utils bug
    • code design/refactor
    • documentation request
    • new feature request
  • I have visited the readme and doc
  • I have searched through the issue tracker and pr tracker
  • I have mentioned version numbers, operating system and environment, where applicable:
import ding, torch, sys
print(ding.__version__, torch.__version__, sys.version, sys.platform)

v0.4.7 1.12.1+cu102 3.8.16 (default, Mar  2 2023, 03:21:46) 
[GCC 11.2.0] linux

Install DI-engine by:

git clone https://github.com/opendilab/DI-engine.git
cd DI-engine
pip install .

then I run the config:

python dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py
Traceback (most recent call last):
  File "dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py", line 76, in <module>
    serial_pipeline([main_config, create_config], seed=0, max_env_step=int(1e7))
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/entry/serial_entry.py", line 97, in serial_pipeline
    stop, eval_info = evaluator.eval(learner.save_checkpoint, learner.train_iter, collector.envstep)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/worker/collector/interaction_serial_evaluator.py", line 232, in eval
    policy_output = self._policy.forward(obs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/policy/pdqn.py", line 429, in _forward_eval
    action_args = self._eval_model.forward(data, mode='compute_continuous')['action_args']
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/model/wrapper/model_wrappers.py", line 422, in forward
    output = self._model.forward(*args, **kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/model/template/pdqn.py", line 137, in forward
    return getattr(self, mode)(inputs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/model/template/pdqn.py", line 151, in compute_continuous
    cont_x = self.encoder[1](inputs)  # size (B, encoded_state_shape)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/model/common/encoder.py", line 168, in forward
    x = self.act(self.init(x))
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
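
A note on the CUDA error above: "no kernel image is available" usually means the installed PyTorch binary was not compiled for the GPU's compute capability. The following is a minimal sketch for checking that, assuming a machine with a CUDA device. `arch_supported` and `report` are hypothetical helper names, while `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are existing PyTorch calls. The check is approximate: it only looks at compiled `sm_XY` binaries and ignores PTX forward compatibility.

```python
def arch_supported(capability, arch_list):
    """Return True if some compiled sm_XY binary can run on the device.

    A binary built for sm_XY runs on devices with the same major
    version X and a minor version >= Y.
    """
    major, minor = capability
    for arch in arch_list:
        digits = arch[3:] if arch.startswith("sm_") else ""
        if not digits.isdigit():
            continue  # skip entries such as "compute_70"
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if a_major == major and a_minor <= minor:
            return True
    return False


def report():
    """Print the device capability and the architectures in this PyTorch build."""
    import torch  # imported here so the helper above works without torch

    if not torch.cuda.is_available():
        print("CUDA is not available in this environment")
        return
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("compiled architectures:", archs)
    print("binary kernel available:", arch_supported(capability, archs))
```

Calling `report()` in the affected environment prints both values. If the last line reports `False`, installing a PyTorch build that targets a newer CUDA toolkit is the likely remedy.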

So I set cuda=False in gym_hybrid_pdqn_config.py, and then I got:

Traceback (most recent call last):
  File "dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py", line 76, in <module>
    serial_pipeline([main_config, create_config], seed=0, max_env_step=int(1e7))
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/entry/serial_entry.py", line 101, in serial_pipeline
    new_data = collector.collect(train_iter=learner.train_iter, policy_kwargs=collect_kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/worker/collector/sample_serial_collector.py", line 317, in collect
    self._output_log(train_iter)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/worker/collector/sample_serial_collector.py", line 354, in _output_log
    'reward_mean': np.mean(episode_return),
  File "<__array_function__ internals>", line 200, in mean
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 3464, in mean
    return _methods._mean(a, axis=axis, dtype=dtype,
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/numpy/core/_methods.py", line 165, in _mean
    arr = asanyarray(a)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (121,) + inhomogeneous part.
Exception ignored in: <function SampleSerialCollector.__del__ at 0x7fea545e0040>
Traceback (most recent call last):
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/worker/collector/sample_serial_collector.py", line 196, in __del__
    self.close()
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/worker/collector/sample_serial_collector.py", line 186, in close
    self._env.close()
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/envs/env_manager/subprocess_env_manager.py", line 635, in close
    p.send(['close', None, None])
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

I want to find out why episode_return has an inhomogeneous shape, but the error does not arise at a fixed iteration: sometimes it appears at iteration 1000, sometimes at 4000, sometimes elsewhere. I don't know how to debug this.
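
One way to narrow this down, sketched under the assumption that `episode_return` (the name in the traceback) is a plain Python list mixing scalars with arrays or nested lists: record the index and shape of every entry whose shape differs from the majority, just before the `np.mean` call. `shape_of` and `find_odd_entries` are hypothetical helpers, not part of DI-engine.

```python
from collections import Counter


def shape_of(x):
    """Best-effort shape: use .shape if present, recurse into lists/tuples, else ()."""
    if hasattr(x, "shape"):
        return tuple(x.shape)
    if isinstance(x, (list, tuple)):
        inner = shape_of(x[0]) if x else ()
        return (len(x),) + inner
    return ()  # plain scalar


def find_odd_entries(values):
    """Return (index, shape) for every entry whose shape is not the most common one."""
    shapes = [shape_of(v) for v in values]
    if not shapes:
        return []
    common, _ = Counter(shapes).most_common(1)[0]
    return [(i, s) for i, s in enumerate(shapes) if s != common]


# Example: three scalar returns and one return wrapped in a list.
print(find_odd_entries([1.0, 2.0, [3.0], 4.0]))
```

Because the failure shows up at a different iteration on each run, calling `find_odd_entries(episode_return)` inside a `try/except ValueError` around the logging code and printing the result would capture the offending entry whenever it occurs.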

@PaParaZz1 PaParaZz1 added the bug Something isn't working label May 16, 2023
@PaParaZz1 PaParaZz1 self-assigned this May 16, 2023
@PaParaZz1
Member

How did you install the gym_hybrid package? You may need to install our modified version.

P.S. installation command: cd gym-hybrid && pip install .

@xujinming01
Author

Yes, I installed the dizoo version in dizoo/gym_hybrid/envs/gym-hybrid.
Do you get the same error?

@PaParaZz1
Member

I have reproduced this bug and will fix it within two days. You can follow this issue to track progress.

@PaParaZz1
Member

@xujinming01 You can use the above commit to test whether this bug has been fixed.

@xujinming01
Author

Before going that far, another error arose.
I created a new conda environment, then cloned and installed the up-to-date repo and gym-hybrid.

python dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py
Traceback (most recent call last):
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/envs/env_manager/base_env_manager.py", line 111, in __init__
    self._observation_space = self._env_ref.observation_space
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/dizoo/gym_hybrid/envs/gym_hybrid_env.py", line 133, in observation_space
    return self._observation_space
AttributeError: 'GymHybridEnv' object has no attribute '_observation_space'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py", line 76, in <module>
    serial_pipeline([main_config, create_config], seed=0, max_env_step=int(1e7))
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/entry/serial_entry.py", line 56, in serial_pipeline
    collector_env = create_env_manager(cfg.env.manager, [partial(env_fn, cfg=c) for c in collector_env_cfg])
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/envs/env_manager/base_env_manager.py", line 531, in create_env_manager
    return ENV_MANAGER_REGISTRY.build(manager_type, env_fn=env_fn, cfg=manager_cfg)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/utils/registry.py", line 96, in build
    raise e
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/utils/registry.py", line 82, in build
    return build_fn(*obj_args, **obj_kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/envs/env_manager/subprocess_env_manager.py", line 79, in __init__
    super().__init__(env_fn, cfg)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/ding/envs/env_manager/base_env_manager.py", line 120, in __init__
    self._env_ref.reset()
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/dizoo/gym_hybrid/envs/gym_hybrid_env.py", line 48, in reset
    self._env = gym.make(self._env_id)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/gym/envs/registration.py", line 662, in make
    env = env_creator(**_kwargs)
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/gym_hybrid/environments.py", line 248, in __init__
    super(MovingEnv, self).__init__(
  File "/home/jinming/anaconda3/envs/ding-dev/lib/python3.8/site-packages/gym_hybrid/environments.py", line 106, in __init__
    self.bg = cv2.cvtColor(self.bg, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.7.0) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

How about you? @super1603

@PaParaZz1
Member

It may be an OpenCV version issue. You can try pip install opencv-python==4.5.5.64

@xujinming01
Author

That did not work. The error message is unchanged except for the last line:

cv2.error: OpenCV(4.5.5) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

@PaParaZz1
Member

You need to check why these images (bg.jpg, target.png) are empty in your repo. If they were deleted, OpenCV receives an empty source array, which triggers this assertion.
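
A quick way to run that check is sketched below. It relies on one fact about OpenCV: `cv2.imread` returns `None` instead of raising when it cannot read a file, so the failure only surfaces later in `cvtColor`. Where the images live inside the installed package is an assumption here, so the sketch simply searches the whole package directory. `find_assets` and `installed_package_dir` are hypothetical helper names.

```python
import importlib.util
from pathlib import Path


def find_assets(root, names=("bg.jpg", "target.png")):
    """Map each asset name to the non-empty files with that name found under root."""
    root = Path(root)
    return {
        name: [p for p in root.rglob(name) if p.stat().st_size > 0]
        for name in names
    }


def installed_package_dir(package="gym_hybrid"):
    """Return the directory of the installed package, or None if it is not importable."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def main():
    pkg_dir = installed_package_dir()
    if pkg_dir is None:
        print("gym_hybrid is not installed in this environment")
        return
    for name, paths in find_assets(pkg_dir).items():
        print(name, "->", paths if paths else "missing or empty")
```

Running `main()` in the failing environment shows whether the images were copied into site-packages at all. An empty result for either file would be consistent with the `!_src.empty()` assertion in the traceback.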

@super1603

I don't get that error; it seems to have fixed my bug. My OpenCV version is 4.7.0.72.
