
add GPU number and lazy load img to GPU #437

Open · wants to merge 2 commits into main

Conversation

sword4869

Hello,

  1. I added a GPU number argument to the ModelParams class.
  2. To save GPU memory, I stopped loading the image to the GPU when creating the camera and instead load it lazily during training. This greatly reduces GPU memory usage: I tested on an RTX 2080 Ti, and usage stayed under 2 GB at the start of training and under 5 GB even at 30k iterations.
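The idea can be sketched as below; this is a toy stand-in, not the repo's actual Camera class, and the lazy_load argument name here is illustrative:

```python
import torch

class Camera:
    """Toy sketch of the lazy-load idea: keep the ground-truth image on
    the CPU at construction time and move it to the GPU only on demand."""

    def __init__(self, image: torch.Tensor, lazy_load: bool = True):
        # With lazy_load, hundreds of cameras no longer hold their
        # images in GPU memory up front.
        device = torch.device("cpu") if lazy_load else torch.device("cuda")
        self.original_image = image.clamp(0.0, 1.0).to(device)

    def gt_image(self) -> torch.Tensor:
        # One image is transferred per training iteration instead.
        return self.original_image.cuda()
```

Only the image used by the current iteration ever occupies GPU memory; the rest stay in host RAM.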

@pablospe

pablospe commented Nov 23, 2023

This is a great improvement in memory usage; it should be merged into the main branch. @Snosixtyboo

scene/cameras.py Outdated
Comment on lines 39 to 40
if lazy_load:
    self.data_device = torch.device("cpu")

@pablospe pablospe Nov 23, 2023


Maybe equivalent?

if lazy_load:
    data_device = "cpu"

try:
    self.data_device = torch.device(data_device)
...

If this is correct, perhaps lazy_load can be replaced directly with data_device (which would then need to be exposed in train.py). Perhaps also add a note in the README; there is an open question about memory usage and the different approaches when one does not have 24 GB.
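If they are indeed equivalent, folding the flag into the device argument could look like this; resolve_data_device is a hypothetical helper name, and the fallback mirrors the existing try/except in scene/cameras.py:

```python
import torch

def resolve_data_device(data_device: str, lazy_load: bool = False) -> torch.device:
    # A lazy_load flag collapses into the existing data_device argument.
    if lazy_load:
        data_device = "cpu"
    try:
        return torch.device(data_device)
    except Exception as e:
        print(e)
        print(f"[Warning] Custom device {data_device} failed, falling back to cuda")
        return torch.device("cuda")
```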

Author

After your reminder, I realized that the other tensors are moved to the GPU directly via `.cuda()`, while data_device only affects the images, so it can indeed function as a lazy load.

There is no need to expose data_device in train.py, as it is already specified in arguments/__init__.py.

All in all, nothing needs to be modified. If possible, I suggest changing the "24 GB VRAM" wording in the README, as it can easily make people think that this is the minimum configuration.


I am not sure I understand your reply. What do you mean by `.cuda()`? I think lazy_load was a good option to add, so you don't need to modify arguments/__init__.py every time. Could you expand on how you would modify the README to clarify that data_device can be used for lazy loading?

Author

"Other tensors were directly migrated through `.cuda()`" means precisely that they are fixed to the CUDA device, e.g. when the Gaussians are created.

There is no need to modify data_device in arguments/__init__.py every time. train.py uses argparse.ArgumentParser to parse the command-line arguments, so data_device can be set with python train.py --data_device cpu -s <path to COLMAP or NeRF Synthetic dataset>.
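A minimal stand-in showing how an argparse-exposed data_device works (this is a toy parser, not the repo's ParamGroup machinery):

```python
import argparse

parser = argparse.ArgumentParser(description="toy stand-in for train.py's CLI")
# Mirrors the data_device field defined in arguments/__init__.py.
parser.add_argument("--data_device", type=str, default="cuda")
parser.add_argument("-s", "--source_path", type=str, default="")

# Equivalent to: python train.py --data_device cpu -s /data/scene
args = parser.parse_args(["--data_device", "cpu", "-s", "/data/scene"])
print(args.data_device)  # cpu
```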

# scene/cameras.py
# if here data_device is specified as `cpu`, self.original_image is on cpu
try:
    self.data_device = torch.device(data_device)
except Exception as e:
    print(e)
    print(f"[Warning] Custom device {data_device} failed, fallback to default cuda device")
    self.data_device = torch.device("cuda")

self.original_image = image.clamp(0.0, 1.0).to(self.data_device)
self.image_width = self.original_image.shape[2]
self.image_height = self.original_image.shape[1]

if gt_alpha_mask is not None:
    self.original_image *= gt_alpha_mask.to(self.data_device)
else:
    self.original_image *= torch.ones((1, self.image_height, self.image_width), device=self.data_device)

# train.py
# original_image is sent to gpu while iterating
gt_image = viewpoint_cam.original_image.cuda()
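A back-of-the-envelope calculation shows why deferring this transfer matters; the 1080p resolution and 300-camera count below are illustrative assumptions, not numbers from the PR:

```python
# GPU memory occupied by ground-truth images if every camera's image
# sits on the GPU up front (float32 RGB at 1920x1080).
bytes_per_image = 3 * 1080 * 1920 * 4           # C * H * W * sizeof(float32)
mb_per_image = bytes_per_image / 2**20           # ~23.7 MB per image

n_cameras = 300                                  # illustrative capture size
total_gb = n_cameras * bytes_per_image / 2**30   # ~7 GB just for images

print(f"{mb_per_image:.1f} MB per image, {total_gb:.1f} GB for {n_cameras} cameras")
```

Keeping the images in host RAM and transferring one per iteration removes nearly all of that resident footprint.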

[screenshot: data_device in README]

Author

I modified the README to clarify the lazy load and recommitted it.


thanks for the clarifications!

@sword4869 sword4869 closed this Nov 24, 2023
@sword4869 sword4869 reopened this Nov 25, 2023
@NiklasVoigt

Typo: it should be device with a c: --data_device cpu

@sword4869 (Author)

> typo, should be device with a c --data_device cpu

Thanks for your careful check.

@Ky1eYang

Maybe we can reduce memory usage further by lazily calling the loadCam function:

def lazy_call(f, *args, **kwargs):
    return lambda: f(*args, **kwargs)

def cameraList_from_camInfos(cam_infos, resolution_scale, args, lazy_load):
    camera_list = []
    for id, c in enumerate(cam_infos):
        if lazy_load:
            # Defer loadCam; the closure runs only when it is called.
            camera_list.append(lazy_call(loadCam, args, id, c, resolution_scale))
        else:
            camera_list.append(loadCam(args, id, c, resolution_scale))
    return camera_list

In cameraList_from_camInfos, we can delay executing loadCam and run it during training instead:

# Pick a random Camera
if not viewpoint_stack:
    viewpoint_stack = scene.getTrainCameras().copy()
viewpoint_cam = viewpoint_stack.pop(randint(0, len(viewpoint_stack)-1))
if scene.lazy_load:
    viewpoint_cam = viewpoint_cam()
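The deferral can be checked in isolation; in this toy example (plain Python, where load_cam is a stand-in for the real loadCam) nothing runs until the thunk is invoked:

```python
calls = []

def lazy_call(f, *args, **kwargs):
    # Zero-argument closure; f runs only when the closure is called.
    return lambda: f(*args, **kwargs)

def load_cam(cam_id):
    calls.append(cam_id)  # stands in for the expensive image load
    return f"camera-{cam_id}"

# Building the list performs no loading...
camera_list = [lazy_call(load_cam, i) for i in range(3)]
assert calls == []

# ...the load happens only when a camera is drawn from the stack.
cam = camera_list[1]()
assert cam == "camera-1" and calls == [1]
```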
