
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead. #1198

Closed
Killpit opened this issue May 26, 2023 · 8 comments · Fixed by #1472

Killpit commented May 26, 2023

I tried following the RL tutorial from here. Even though it targets CUDA, I use MPS, and when I tried to apply the same code with MPS logic I couldn't make it work; it works perfectly without the has_mps check, using the original CUDA logic.


device = "cpu" if not torch.has_mps else "mps:0"
num_cells = 256  #number of cells in each layer
lr = 3e-4
max_grad_norm = 1.0

frame_skip = 1
frames_per_batch = 1000 // frame_skip
#For a complete training, bring the number of frames up to 1M
total_frames = 50_000 // frame_skip

sub_batch_size = 64  #cardinality of the sub-samples gathered from the current
#data in the inner loop
num_epochs = 10  #optimization steps per batch of data collected
clip_epsilon = (0.2)  #clip value for PPO loss
gamma = 0.99
lmbda = 0.95
entropy_eps = 1e-4

base_env = GymEnv("InvertedDoublePendulum-v4", device=device, frame_skip=frame_skip)

env = TransformedEnv(
    base_env,
    Compose(
        #normalize observations
        ObservationNorm(in_keys=["observation"]),
        DoubleToFloat(in_keys=["observation"]),
        StepCounter(),
    ),
)

env.transform[0].init_stats(num_iter=1000, reduce_dim=0, cat_dim=0)

print("normalization constant shape:", env.transform[0].loc.shape)

vmoens commented May 27, 2023

We currently have limited coverage of MPS, but it's a good point.
Let me see how we can make sure that we support that too!
Here it seems that the error occurs when we convert a float64 tensor from CPU to MPS while reading it from gym. Without the full error stack it's hard to say exactly what's going on, but changing the corresponding env spec to use float32 instead of float64 could solve it (my guess is that this occurs during a call to env.observation_spec.encode).

Killpit commented May 28, 2023

This is the full error:

Traceback (most recent call last):
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/torch_rl.py", line 38, in <module>
    base_env = GymEnv("InvertedDoublePendulum-v4", device=device, frame_skip=frame_skip)
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/venv/lib/python3.10/site-packages/torchrl/envs/libs/gym.py", line 589, in __init__
    super().__init__(**kwargs)
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/venv/lib/python3.10/site-packages/torchrl/envs/libs/gym.py", line 373, in __init__
    super().__init__(**kwargs)
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/venv/lib/python3.10/site-packages/torchrl/envs/common.py", line 967, in __init__
    self._make_specs(self._env)  # writes the self._env attribute
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/venv/lib/python3.10/site-packages/torchrl/envs/libs/gym.py", line 522, in _make_specs
    observation_spec = _gym_to_torchrl_spec_transform(
  File "/Users/atatekeli/PycharmProjects/PyTorchProjects/venv/lib/python3.10/site-packages/torchrl/envs/libs/gym.py", line 226, in _gym_to_torchrl_spec_transform
    low = torch.tensor(spec.low, device=device, dtype=dtype)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
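For reference, the failing line can be reproduced outside torchrl (a minimal illustration on an MPS-enabled PyTorch build; the array size is just an example):

import numpy as np
import torch

# The gym spec conversion does roughly this: build a float64 tensor directly on MPS.
low = np.zeros(11, dtype=np.float64)  # stands in for the Box.low bound coming from gym
torch.tensor(low, device="mps", dtype=torch.float64)
# -> TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework
#    doesn't support float64. Please use float32 instead.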

vmoens commented May 30, 2023

I see.
I believe the proper way to work around this would be to provide you with two transforms: one that maps double to float (we already have that, DoubleToFloatTransform) and another that casts the content to MPS.
The second one should not be too difficult to come up with.
I'll get this done soon, stay tuned.

@jonahclarsen

@vmoens Any update?

vmoens linked a pull request Aug 30, 2023 that will close this issue

vmoens commented Aug 30, 2023

This should be fixed:
Add a DoubleToFloat transform after creating the env on cpu and then map the data with DeviceCastTransform to the MPS device.
Feel free to reopen if needed
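A minimal sketch of what that recipe could look like (an illustration using current class names, not code from the linked PR; depending on the torchrl version, device handling in some transforms may still misbehave, as the follow-up comments below show):

import torch

from torchrl.envs import Compose, DoubleToFloat, ObservationNorm, StepCounter, TransformedEnv
from torchrl.envs.libs.gym import GymEnv
from torchrl.envs.transforms import DeviceCastTransform

device = "mps" if torch.backends.mps.is_available() else "cpu"

# Build the gym env on CPU so the float64 specs never have to live on MPS.
base_env = GymEnv("InvertedDoublePendulum-v4", device="cpu")

env = TransformedEnv(
    base_env,
    Compose(
        ObservationNorm(in_keys=["observation"]),
        DoubleToFloat(),                                        # cast float64 -> float32 while still on CPU
        StepCounter(),
        DeviceCastTransform(device=device, orig_device="cpu"),  # then move the data to MPS
    ),
)
env.transform[0].init_stats(num_iter=1000, reduce_dim=0, cat_dim=0)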

@EkaterinaAbramova

Could you please provide a working example? I get the MPS float64 error when running base_env = GymEnv("InvertedDoublePendulum-v4", device=device, frame_skip=frame_skip). Thank you so much!

@EkaterinaAbramova

Hi vmoens, I am back to trying to resolve this issue. I'd appreciate it if you could provide a tutorial with Python code that works on MPS. It is not at all clear what the workaround is, as I am still getting errors unfortunately. I have run this code:

import torch
print(torch.__version__)                  # GPU acceleration is available in this version
print(torch.backends.mps.is_available())  # macOS is 12.3+ and an MPS device is available
print(torch.backends.mps.is_built())      # this PyTorch build has MPS support

import torchrl
print(torchrl.__version__)

import tensordict
print(tensordict.__version__)
'''
TensorDict is like a Python dictionary with some extra tensor features. 
Many modules need to be told what key to read (in_keys) and what key to write (out_keys) in the tensordict they will receive. 
Usually, if out_keys is omitted, it is assumed that the in_keys entries will be updated in-place. 
'''

from collections import defaultdict

import matplotlib.pyplot as plt

from tqdm import tqdm

from tensordict.nn import TensorDictModule
from tensordict.nn.distributions import NormalParamExtractor

from torch import nn

from torchrl.collectors import SyncDataCollector

from torchrl.data.replay_buffers import ReplayBuffer
from torchrl.data.replay_buffers.samplers import SamplerWithoutReplacement
from torchrl.data.replay_buffers.storages import LazyTensorStorage

from torchrl.envs import (Compose, DoubleToFloat, ObservationNorm, StepCounter, TransformedEnv)
from torchrl.envs.libs.gym import GymEnv
from torchrl.envs.libs.gym import set_gym_backend
from torchrl.envs.utils import check_env_specs, set_exploration_mode
from torchrl.envs.transforms import DoubleToFloat, DeviceCastTransform

from torchrl.modules import ProbabilisticActor, TanhNormal, ValueOperator

from torchrl.objectives import ClipPPOLoss
from torchrl.objectives.value import GAE
 
device = "mps" 
frame_skip = 1 # action to be executed in current time-step only

with set_gym_backend("gym"):
    base_env = GymEnv("InvertedDoublePendulum-v4", device="cpu", frame_skip=frame_skip)

env = TransformedEnv(
    base_env,
    Compose(
        ObservationNorm(in_keys=["observation"]), # normalise observations (make it about Standard Normal)
        DoubleToFloat(),   
        StepCounter(),                            # count the number of steps before the environment is terminated
        DeviceCastTransform(device=device, orig_device="cpu"),
    ),
)

env.transform[0].init_stats(num_iter=1000, reduce_dim=0, cat_dim=0) 

check_env_specs(env)

And I get an error during the check:

check_env_specs(env)
Traceback (most recent call last):

  Cell In[10], line 1
    check_env_specs(env)

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/utils.py:435 in check_env_specs
    real_tensordict = env.rollout(3, return_contiguous=return_contiguous)

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/common.py:1797 in rollout
    tensordict = self.reset()

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/common.py:1480 in reset
    tensordict_reset = self._reset(tensordict, **kwargs)

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/transforms/transforms.py:760 in _reset
    tensordict_reset = self.transform._reset(tensordict, tensordict_reset)

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/transforms/transforms.py:1020 in _reset
    tensordict_reset = t._reset(tensordict, tensordict_reset)

  File ~/anaconda3/envs/gpu-torchrl-latest/lib/python3.10/site-packages/torchrl/envs/transforms/transforms.py:5077 in _reset
    step_count = torch.where(~expand_as_right(reset, step_count), step_count, 0)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices.

Please could you resolve the issue?

vmoens commented Jan 22, 2024

#1827 should fix this issue!
Do not hesitate to open a separate issue next time :)
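For readers who cannot upgrade yet, a simple fallback that sidesteps the problem entirely is to keep the whole env pipeline on CPU and only move the collected data to MPS for the network side; a minimal sketch (not part of #1827):

import torch

from torchrl.envs import Compose, DoubleToFloat, ObservationNorm, StepCounter, TransformedEnv
from torchrl.envs.libs.gym import GymEnv

# Everything env-side stays on CPU, so no float64 tensor ever has to be created on MPS.
env = TransformedEnv(
    GymEnv("InvertedDoublePendulum-v4", device="cpu"),
    Compose(
        ObservationNorm(in_keys=["observation"]),
        DoubleToFloat(),
        StepCounter(),
    ),
)
env.transform[0].init_stats(num_iter=1000, reduce_dim=0, cat_dim=0)

# Collected data is a TensorDict; move it to MPS only where the neural networks need it.
rollout = env.rollout(3)
rollout = rollout.to("mps" if torch.backends.mps.is_available() else "cpu")
print(rollout)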
