When training a multi-head module with DER, there appears to be a bug: the compute_dataset_logits() function is expected to return a tensor, but it seemingly produces a dictionary.
To replicate, simply run multihead.py with the DER strategy.
I tried to work around the problem by adding a check in the aforementioned function that detects whether the model output is a dictionary and, if so, converts its values into the desired tensor type (lines 48 to 52 of der.py):
# If the model output is a dict (e.g. one entry per task), take the first value.
if isinstance(out, dict):
    out = list(out.values())[0]
However, this conversion sometimes yields tensors with seemingly arbitrary sizes ([128, 6] or [128, 9] instead of the expected [128, 10]).
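For what it's worth, the varying widths probably come from the per-task heads, each of which only has as many outputs as the classes seen in that task, so taking the first dict value directly gives [128, 6] or [128, 9]. Below is a minimal shape-level sketch of the workaround I have in mind, assuming the output is a dict mapping task labels to per-task logits and that 10 is the total number of classes; the helper name and the right-padding scheme are my own assumptions, not Avalanche API:

import torch
import torch.nn.functional as F

def to_fixed_width_logits(out, total_classes=10):
    # Hypothetical helper (not part of Avalanche): if the model output is a
    # {task_label: logits} dict, zero-pad the per-task tensor on the right to
    # `total_classes` columns so every stored tensor has the expected shape,
    # e.g. [128, 10] instead of [128, 6] or [128, 9].
    if isinstance(out, dict):
        # Assumes a single-task mini-batch, i.e. exactly one entry in the dict;
        # with several tasks per batch the per-task tensors would still need to
        # be merged according to how head outputs map to global class indices.
        logits = next(iter(out.values()))
        out = F.pad(logits, (0, total_classes - logits.shape[1]))
    return out

# Quick check with one of the shapes reported above
print(to_fixed_width_logits({0: torch.randn(128, 6)}).shape)  # torch.Size([128, 10])

Whether the padded columns actually line up with the class indices the DER buffer expects depends on the benchmark's class ordering, so I would treat this only as a shape fix, not a semantic one.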