Can't pickle weakref objects #29

Open
Fayzan-Bhatti opened this issue Oct 29, 2021 · 7 comments

Comments

@Fayzan-Bhatti

I am building my project: data is fetched from my database for a specific Project_id, and then my model is trained using an LSTM. The epochs run without problems, but once training finishes, an Internal Server Error is shown.


admin.py

    def build(self, request, queryset):
        count = 0

        for p in queryset:
            if build_id(p.project_management.id):
                count += 1
            else:
                messages.warning(request, f"Could not build model for {p}")

        messages.success(
            request, f"Successfully built models for {count} projects")

    build.short_description = "Build models for selected Projects"

build.py
Here the model is built for a specific Project_id. Only model.pkl gets written, and it is incomplete. The other files, scaler_in.pkl and scaler_out.pkl, are not saved to the target folder at all.

def build_id(project_id):
    # get directory path to store models in
    path = fetch_model_path(project_id, True)

    # train model
    model, scaler_in, scaler_out = train_project_models(project_id)

    # ensure model was trained
    if model is None:
        return False

    # store models
    store_model(f'{path}/model.pkl', model)
    store_model(f'{path}/scaler_in.pkl', scaler_in)
    store_model(f'{path}/scaler_out.pkl', scaler_out)

    # clear current loaded model from memory
    keras_clear()

    return True

utils.py

def store_model(path, model):
    with open(path, 'wb') as f:
        model_file = File(f)
        pickle.dump(model, model_file)

When I comment out pickle.dump(model, model_file), the files model.pkl, scaler_in.pkl, and scaler_out.pkl are saved with 0 KB of data. If the .pkl files already exist with data, they are removed and the project build reports success. I debugged this code, and the Django debugger tool shows that the page is temporarily moved.


output

Epoch 1/4
11/11 [==============================] - 9s 302ms/step - loss: 0.4594 - val_loss: 0.2777
Epoch 2/4
11/11 [==============================] - 2s 177ms/step - loss: 0.1039 - val_loss: 0.0395
Epoch 3/4
11/11 [==============================] - 2s 170ms/step - loss: 0.0545 - val_loss: 0.0361
Epoch 4/4
11/11 [==============================] - 2s 169ms/step - loss: 0.0414 - val_loss: 0.0551
Internal Server Error: /turboai/turboAI/jaaiparameters/
Traceback (most recent call last):
  File "E:\.Space\project\venv\lib\site-packages\django\core\handlers\exception.py", line 47, in inner
    response = get_response(request)
  File "E:\.Space\project\venv\lib\site-packages\django\core\handlers\base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\contrib\admin\options.py", line 616, in wrapper
    return self.admin_site.admin_view(view)(*args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\utils\decorators.py", line 130, in _wrapped_view
    response = view_func(request, *args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\views\decorators\cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\contrib\admin\sites.py", line 232, in inner
    return view(request, *args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\utils\decorators.py", line 43, in _wrapper
    return bound_method(*args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\utils\decorators.py", line 130, in _wrapped_view
    response = view_func(request, *args, **kwargs)
  File "E:\.Space\project\venv\lib\site-packages\django\contrib\admin\options.py", line 1723, in changelist_view
    response = self.response_action(request, queryset=cl.get_queryset(request))
  File "E:\.Space\project\venv\lib\site-packages\django\contrib\admin\options.py", line 1408, in response_action
    response = func(self, request, queryset)
  File "E:\.Space\project\TurboAnchor\turboAI\admin.py", line 125, in build
    if build_id(p.project_management.id):
  File "E:\.Space\project\TurboAnchor\turboAI\build.py", line 48, in build_id
    store_model(f'{path}/model.pkl', model)
  File "E:\.Space\project\TurboAnchor\turboAI\utils.py", line 154, in store_model
    pickle.dump(model, model_file)
TypeError: can't pickle weakref objects
[29/Oct/2021 17:50:31] "POST /turboai/turboAI/jaaiparameters/ HTTP/1.1" 500 126722
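
The traceback also explains the incomplete files. open(path, 'wb') creates model.pkl before pickle.dump starts, and the TypeError then aborts build_id before the two scaler files are ever written. Below is a minimal sketch of that behaviour using only the standard library. The Layer class, the SimpleNamespace "model", and store_model_safely are hypothetical stand-ins, not the project's code or a Keras API; the sketch only assumes that the real model holds a weak reference somewhere inside it.

```python
import os
import pickle
import tempfile
import weakref
from types import SimpleNamespace


class Layer:
    """Hypothetical stand-in for a framework-internal object."""


layer = Layer()
# Stand-in "model": plain weights plus a weak reference, as frameworks may keep.
model = SimpleNamespace(weights=[0.1, 0.2, 0.3], layer_ref=weakref.ref(layer))


def store_model_safely(path, obj):
    """Serialize in memory first, so a failure never leaves a partial file."""
    data = pickle.dumps(obj)  # raises TypeError before any file is touched
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, path)  # move the finished file into place


workdir = tempfile.mkdtemp()
naive_path = os.path.join(workdir, "model.pkl")
safe_path = os.path.join(workdir, "model_safe.pkl")
scaler_path = os.path.join(workdir, "scaler_in.pkl")

# Naive version, as in the post: the file exists before pickling starts.
try:
    with open(naive_path, "wb") as f:
        pickle.dump(model, f)
    naive_error = None
except TypeError as exc:
    naive_error = exc

# Safer version: nothing is written when pickling fails.
try:
    store_model_safely(safe_path, model)
    safe_error = None
except TypeError as exc:
    safe_error = exc

# Objects without weak references are stored normally.
store_model_safely(scaler_path, {"min": 0.0, "max": 1.0})
```

This only keeps broken files off the disk; it does not make the model itself picklable. For a Keras model, the framework's own model.save(...) is the usual alternative to pickle, while plain scaler objects can typically still be pickled.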
@Mengzibin

To solve this problem, look at lines 181-182:

torch.save(ema, os.path.join(opt.output_dir, now + 'ema.pth'))
torch.save(ema2, os.path.join(opt.output_dir, now + 'ema2.pth'))

Change the code to the following:

torch.save(ema.state_dict(), os.path.join(opt.output_dir, now + 'ema.pth'))
torch.save(ema2.state_dict(), os.path.join(opt.output_dir, now + 'ema2.pth'))

Then look at lines 348-349:

torch.save(ema, os.path.join(opt.output_dir, 'ema.pth'))
torch.save(ema2, os.path.join(opt.output_dir, 'ema2.pth'))

Change the code to the following:

torch.save(ema.state_dict(), os.path.join(opt.output_dir, 'ema.pth'))
torch.save(ema2.state_dict(), os.path.join(opt.output_dir, 'ema2.pth'))
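
The idea behind this change is to save only the plain state of the EMA object rather than the object itself, then rebuild the object and load the state into it. Here is a rough sketch of that round trip using a toy class and the standard library. Param and MovingAverage are hypothetical stand-ins rather than the torch_ema implementation, and the weak references in the toy class are an assumption about why the real object cannot be pickled.

```python
import pickle
import weakref


class Param:
    """Hypothetical stand-in for a model parameter."""

    def __init__(self, value):
        self.value = value


class MovingAverage:
    """Toy EMA tracker; not the torch_ema implementation."""

    def __init__(self, params, decay):
        self.decay = decay
        self.shadow = [p.value for p in params]
        # Weak references to live parameters are what pickle would reject.
        self._refs = [weakref.ref(p) for p in params]

    def update(self):
        """Blend the current parameter values into the running average."""
        for i, ref in enumerate(self._refs):
            p = ref()
            if p is not None:
                self.shadow[i] = self.decay * self.shadow[i] + (1 - self.decay) * p.value

    def state_dict(self):
        """Return plain data only, with no weak references."""
        return {"decay": self.decay, "shadow": list(self.shadow)}

    def load_state_dict(self, state):
        self.decay = state["decay"]
        self.shadow = list(state["shadow"])


params = [Param(1.0), Param(2.0)]
ema = MovingAverage(params, decay=0.5)
params[0].value = 3.0
ema.update()

# Save only the plain state, then rebuild the object and load the state into it.
blob = pickle.dumps(ema.state_dict())
restored = MovingAverage(params, decay=0.999)
restored.load_state_dict(pickle.loads(blob))
```

Any code that later reads such a file has to construct the object first and call load_state_dict on it, because the file no longer contains the object.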

@somnus1840

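@Mengzibin With this change the code does run, but later, at inference time, none of extract_shapes, render_video, and so on work. The inference code contains ema=torch.load(os.path.output,generator_ddp.parameters()), and because the ema.pth and generator.pth saved during training are of different types, the generator model cannot be copied into ema with copy_to.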

@Scxw010516

@somnus1840 How did you solve this problem in the end? Could you share your solution for reference?

@somnus1840

Sure. I changed the part of the inference stage that loads ema to

ema = ExponentialMovingAverage(generator.parameters(), decay=0.999)
ema.load_state_dict(torch.load(os.path.join(opt.load_dir, "ema.pth"), map_location=device))

and that was enough. As I recall, this solution borrows the approach from the closed issue "type error: can't pickle weakref objects". I am not sure whether you have seen it.

@Scxw010516

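Thank you for your reply; the problem above was solved a few days ago. I have recently run into a new problem, though, and would like to ask about it in case you have hit it too: I wanted to train starting from the pretrained model provided by the authors, but loading its ema.pth fails with the following error: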

Traceback (most recent call last):
  File "D:\Projects\pi-GAN_Reappearance\venv\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "D:\Projects\pi-GAN_Reappearance\train.py", line 89, in train
    ema.load_state_dict(torch.load(os.path.join(opt.load_dir, 'ema.pth'), map_location=device))
  File "D:\Projects\pi-GAN_Reappearance\venv\lib\site-packages\torch_ema\ema.py", line 257, in load_state_dict
    self.decay = state_dict["decay"]
TypeError: 'ExponentialMovingAverage' object is not subscriptable

It seems to mean that this kind of data access cannot be performed on the EMA object, but I am not sure what causes it.
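
The traceback suggests that torch.load returned an ExponentialMovingAverage object rather than a dict, which would happen if the pretrained ema.pth was written with torch.save(ema, ...) instead of torch.save(ema.state_dict(), ...). Assuming that is the cause, a possible workaround is to accept both file layouts when loading. Below is a minimal sketch of that idea using a toy class and no file I/O. MovingAverage and as_state_dict are hypothetical names, and whether an object unpickled from an older file has a compatible state_dict() in the installed torch_ema version is untested here.

```python
from collections.abc import Mapping


class MovingAverage:
    """Toy stand-in for an EMA tracker; not the torch_ema class."""

    def __init__(self, decay, shadow):
        self.decay = decay
        self.shadow = list(shadow)

    def state_dict(self):
        return {"decay": self.decay, "shadow": list(self.shadow)}

    def load_state_dict(self, state):
        self.decay = state["decay"]  # raises TypeError if `state` is not subscriptable
        self.shadow = list(state["shadow"])


def as_state_dict(loaded):
    """Return a plain state mapping whether `loaded` is a mapping or a whole object."""
    if isinstance(loaded, Mapping):
        return loaded
    if hasattr(loaded, "state_dict"):
        return loaded.state_dict()
    raise TypeError(f"unsupported checkpoint type: {type(loaded).__name__}")


# Two shapes a loaded checkpoint can take (simulated here without files):
whole_object = MovingAverage(0.999, [1.0, 2.0])    # file held the object itself
state_only = {"decay": 0.9, "shadow": [3.0, 4.0]}  # file held a state dict

# Passing the whole object straight to load_state_dict reproduces the error.
first = MovingAverage(0.5, [0.0, 0.0])
try:
    first.load_state_dict(whole_object)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Normalizing first makes both layouts load the same way.
second = MovingAverage(0.5, [0.0, 0.0])
second.load_state_dict(as_state_dict(whole_object))

third = MovingAverage(0.5, [0.0, 0.0])
third.load_state_dict(as_state_dict(state_only))
```

In train.py this would mean wrapping the result of torch.load(...) in such a helper before passing it to ema.load_state_dict.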

@wangfei173

Have you solved this problem? I would like to know how you solved it.

@wangfei173

@Scxw010516
TypeError: 'ExponentialMovingAverage' object is not subscriptable
Have you solved this problem? I would like to know how you solved it.
