Why confidence and the distance for an original video is coming Low and High respectively? #66

Open
Himanshu21135 opened this issue Apr 8, 2024 · 0 comments


@joonson I have a question about the code in SyncNetInstance.py.

[Image: Screenshot 2024-04-08 132416]

In the function calc_pdist, the reason for using a window is to account for the offset, right?
As I understand it, torch.stack(dists,1) returns a tensor of shape (lastframe, window_size), and mdist is then computed from it. I am unable to understand the logic behind mdist = torch.mean(torch.stack(dists,1),1): the mean is taken across the columns, which gives mdist a shape of (1,31), i.e. simply a list of 31 values.
Could you please explain why the mean is taken across columns? From my understanding, the mean should be taken across rows, which would give a shape of (lastframe, 1), i.e. one mean per frame over the window.
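
To make the shapes in question concrete, here is a minimal sketch. It assumes that calc_pdist returns one tensor of window_size distances per video frame; `windowed_distances` is a hypothetical stand-in written for illustration, not the repository's code, and the random features and sizes are made up. Under that assumption, torch.stack(dists,1) comes out as (window_size, lastframe), so a mean over dimension 1 averages over frames and leaves one value per candidate shift, while a mean over dimension 0 would leave one value per frame.

```python
import torch
import torch.nn.functional as F


def windowed_distances(video_feat, audio_feat, vshift=15):
    """Hypothetical stand-in: distances from each video frame to the audio
    features at every candidate shift in [-vshift, +vshift].

    Returns a list with one tensor of shape (2 * vshift + 1,) per frame.
    """
    win_size = 2 * vshift + 1
    # Zero-pad the audio features along time so every frame has a full window.
    audio_padded = F.pad(audio_feat, (0, 0, vshift, vshift))
    dists = []
    for i in range(len(video_feat)):
        frame = video_feat[i:i + 1].repeat(win_size, 1)   # (win_size, dim)
        window = audio_padded[i:i + win_size]             # (win_size, dim)
        dists.append(F.pairwise_distance(frame, window))  # (win_size,)
    return dists


torch.manual_seed(0)
num_frames, dim, vshift = 50, 8, 15        # made-up sizes for illustration
video_feat = torch.randn(num_frames, dim)  # random stand-in features
audio_feat = torch.randn(num_frames, dim)

dists = windowed_distances(video_feat, audio_feat, vshift)
stacked = torch.stack(dists, 1)     # (win_size, num_frames)
mdist = torch.mean(stacked, 1)      # (win_size,): mean over frames, one value per shift
per_frame = torch.mean(stacked, 0)  # (num_frames,): mean over shifts, one value per frame

print(stacked.shape, mdist.shape, per_frame.shape)
```

If this sketch matches the real function, the 31 values in mdist would be one averaged distance per candidate offset (2 * 15 + 1 = 31), not one value per frame.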

I also ran an experiment in which I computed the distance and confidence for an original video that was not dubbed. The distance I get is quite high and the confidence is very low, whereas I expected the distance to be low and the confidence to be high. I then created a dubbed video of a speaker saying the same statement as in the original file using the Wav2Lip model and computed the distance and confidence again; this distance is comparatively lower than the distance computed for the original video.
What would be the reason for this?
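
For reference when comparing the two videos, below is a small sketch of how the two numbers could relate. It assumes, without confirmation from this thread, that the reported distance is the minimum of the averaged distance curve and that confidence is the gap between the curve's median and that minimum. `offset_and_confidence` is a hypothetical helper, and both curves are made-up values, not measurements.

```python
import torch


def offset_and_confidence(mdist, vshift):
    """Hypothetical helper: best offset, minimum distance, and confidence
    from an averaged distance curve with 2 * vshift + 1 entries."""
    min_dist, min_idx = torch.min(mdist, 0)
    offset = vshift - int(min_idx)                # 0 when the dip is at the centre
    conf = float(torch.median(mdist) - min_dist)  # assumed: median minus minimum
    return offset, float(min_dist), conf


vshift = 15
win_size = 2 * vshift + 1

# Made-up curve with a sharp dip at the centre shift.
sharp = torch.full((win_size,), 12.0)
sharp[vshift] = 6.0

# Made-up curve that is almost flat across all shifts.
flat = torch.full((win_size,), 12.0)
flat[vshift] = 11.5

print(offset_and_confidence(sharp, vshift))
print(offset_and_confidence(flat, vshift))
```

Under these assumptions, a nearly flat curve gives a high minimum distance and a low confidence at the same time, which is the pattern described for the original video.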

Please share your views on why the mean is taken across columns rather than across rows.

Himanshu21135 changed the title from "Why confidence and the distance for an original video is coming high?" to "Why confidence and the distance for an original video is coming Low and High respectively?" on Apr 8, 2024