Missing key(s) in state_dict: "blocks.1.synapse.weight_g", "blocks.1.synapse.weight_v", "blocks.3.synapse.weight_g", "blocks.3.synapse.weight_v", "blocks.4.synapse.weight_g", "blocks.4.synapse.weight_v" #371

Open
HAMBUK opened this issue Dec 6, 2024 · 1 comment
Labels
0-needs-review 1-bug Something isn't working

HAMBUK commented Dec 6, 2024

Describe the bug
After updating lava-dl to 0.6.0, the way weights are named appears to have changed.
When I load the .pt file (trained in the lava-dl 0.6.0 environment) for inference, I get this error:

Missing key(s) in state_dict: "blocks.1.synapse.weight_g", "blocks.1.synapse.weight_v", "blocks.3.synapse.weight_g", "blocks.3.synapse.weight_v", "blocks.4.synapse.weight_g", "blocks.4.synapse.weight_v"

Problem cause:
The way weights are named when the .pt file is written may have changed with the update, while the loading code stayed the same. That would explain the key mismatch.
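
To see exactly which names differ, the keys the model expects can be compared with the keys stored in the file. This is a minimal sketch using plain Python sets. The helper name `diff_state_dict_keys` and the checkpoint key `blocks.1.synapse.weight` are hypothetical, since the error above only lists the missing keys.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, in the sense PyTorch reports them."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Keys the model expects but the file does not contain.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the file contains but the model does not expect.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative key names only (the checkpoint key is an assumption).
model_keys = ["blocks.1.synapse.weight_g", "blocks.1.synapse.weight_v"]
checkpoint_keys = ["blocks.1.synapse.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the two inputs would be `network.state_dict().keys()` and `torch.load(path).keys()`.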

Solution:
I downgraded lava-dl to 0.5.0, trained again to get a new .pt file, and confirmed that it now works fine.

Environment (please complete the following information):
Ubuntu 22.04 LTS
Lava-dl 0.6.0
PyTorch 2.1.0
Python 3.10.0


@HAMBUK HAMBUK added the 1-bug Something isn't working label Dec 6, 2024

HAMBUK commented Dec 9, 2024

I have since found that this is not a lava-dl version issue.
During training, right after saving the best .pt file, I also load it back and export it to a .net file, as shown below:

    if stats.testing.best_accuracy:
        epochs_without_improvement = 0
        print("Current Process Saved.")
        torch.save(network.state_dict(), trained_folder + "/network.pt")
        network.load_state_dict(torch.load(trained_folder + '/network.pt'))
        network.export_hdf5(trained_folder + '/network.net')

This was the problem: when exporting the .pt to .net, the network's weight structure is changed from that epoch onward.
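
One way to confirm this kind of side effect is to snapshot the `state_dict` key names before and after the export call. The sketch below is only an illustration: `keys_changed_by` and `FakeNetwork` are hypothetical, and the fake's renaming of `weight_g`/`weight_v` to `weight` merely imitates the behavior described here. It is not taken from lava-dl.

```python
def keys_changed_by(get_keys, action):
    """Run action() and report which key names disappeared or appeared."""
    before = set(get_keys())
    action()
    after = set(get_keys())
    return sorted(before - after), sorted(after - before)


class FakeNetwork:
    """Stand-in for a real network; NOT lava-dl code."""

    def __init__(self):
        self.params = {"synapse.weight_g": 1.0, "synapse.weight_v": 2.0}

    def state_dict(self):
        return dict(self.params)

    def export_hdf5(self, filename):
        # Imitates an export that rewrites parameters in place (assumption).
        g = self.params.pop("synapse.weight_g")
        v = self.params.pop("synapse.weight_v")
        self.params["synapse.weight"] = g * v


network = FakeNetwork()
removed, added = keys_changed_by(
    lambda: network.state_dict().keys(),
    lambda: network.export_hdf5("network.net"),
)
print("removed:", removed)
print("added:", added)
```

If both lists come back empty for the real network, the export did not rename any parameters and the cause lies elsewhere.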

My export_hdf5 function is below:

def export_hdf5(self, filename):
    # network export to hdf5 format
    h = h5py.File(filename, "w")
    layer = h.create_group("layer")
    for i, b in enumerate(self.blocks):
        b.export_hdf5(layer.create_group(f"{i}"))

So it works fine when I save only the .pt files during training and do the conversion later, as a separate step.
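
The separate conversion step could look like the sketch below. `convert_checkpoint`, `make_network`, and `load_state` are hypothetical names; `make_network` stands for whatever builds a fresh instance of the same network class used in training. The export runs on that fresh instance, so the network that is still being trained is never touched.

```python
def convert_checkpoint(make_network, checkpoint_path, export_path, load_state=None):
    """Load a saved state_dict into a fresh network and export it.

    load_state is any callable that maps a path to a state_dict;
    it defaults to torch.load on the CPU.
    """
    if load_state is None:
        import torch  # imported lazily; only needed for the default loader

        def load_state(path):
            return torch.load(path, map_location="cpu")

    network = make_network()  # fresh instance, separate from the training one
    network.load_state_dict(load_state(checkpoint_path))
    network.export_hdf5(export_path)
    return network


# Hypothetical usage after training has finished:
#   convert_checkpoint(Network, trained_folder + "/network.pt",
#                      trained_folder + "/network.net")
```

During training, the loop then only needs the `torch.save(network.state_dict(), ...)` line.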

@HAMBUK HAMBUK closed this as completed Dec 9, 2024
@HAMBUK HAMBUK reopened this Dec 9, 2024