[PR] Saving Inferred 3D Hand Keypoints #89

Open
wants to merge 9 commits into
base: main

Conversation

Kaszanas
Contributor

@Kaszanas Kaszanas commented Nov 21, 2024

As the name suggests.

There were some unused imports in demo.py.

As described in the issues below, this should calculate the 3D keypoints and save them with np.save.
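A minimal sketch of the save step, to illustrate what "save them with np.save" could look like. The array shape, the placeholder data, and the file name are assumptions for illustration and are not taken from the PR diff.

```python
import os
import tempfile

import numpy as np

# Placeholder for the camera-frame keypoints; the (2, 21, 3) shape is an
# assumption (2 detected hands, 21 keypoints each, xyz coordinates).
keypoints_cam = np.zeros((2, 21, 3), dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output file name, chosen only for this example.
    out_path = os.path.join(tmp_dir, "keypoints_3d.npy")
    np.save(out_path, keypoints_cam)

    # Load the file back to confirm the array survives the round trip.
    loaded = np.load(out_path)

round_trip_ok = loaded.shape == keypoints_cam.shape and np.array_equal(
    loaded, keypoints_cam
)
```

np.save writes a single array to a .npy file, so a pkl export would only be needed if extra metadata had to travel with the keypoints.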

Related:

@geopavlakos, could you verify whether this resolves the issues above?

@dhingratul

The concatenation doesn't work fyi.

@Kaszanas
Contributor Author

> The concatenation doesn't work fyi.

I am just now testing the solution. Do you know how to make it work?

@Kaszanas
Contributor Author

Kaszanas commented Nov 23, 2024

I think this works now, @dhingratul. I am not sure of the exact math behind how these values should be transformed, but given the description:

> You can get the 3D coordinates of the hand keypoints in the camera frame by adding pred_cam_t_full (which is calculated here) to out['pred_keypoints_3d'] (which is calculated here). Then you can export the sum in a pkl file.

I think this is how it is supposed to work.

Specifically, this piece of code performs the addition for each of the keypoints:

    # Iterating over all of the samples in the batch:
    for i in range(len(pred_keypoints_3d)):
        batch_element_keypoints = pred_keypoints_3d[i]
        batch_element_cam_t = pred_cam_t_full[i]

        for keypoint_idx in range(len(batch_element_keypoints)):
            # Adding the camera translation to the keypoints:
            batch_element_keypoints[keypoint_idx] += batch_element_cam_t

Some input from @geopavlakos would be appreciated. I am also fairly sure that this can be done with simpler NumPy calls, but then I lose track of the dimensions.

Apparently this could be:

pred_keypoints_3d += pred_cam_t_full[:, np.newaxis, :]

But it would be great to get a review from someone.
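To check that the one-liner matches the nested loop, here is a small sketch on random data. The shapes (2, 21, 3) for the keypoints and (2, 3) for the camera translation are assumptions, and both variables are taken to be NumPy arrays rather than tensors.

```python
import numpy as np

# Assumed shapes: keypoints (batch, 21, 3), camera translation (batch, 3).
rng = np.random.default_rng(0)
pred_keypoints_3d = rng.normal(size=(2, 21, 3))
pred_cam_t_full = rng.normal(size=(2, 3))

# Explicit version: add each sample's translation to each of its keypoints.
looped = pred_keypoints_3d.copy()
for i in range(len(looped)):
    for keypoint_idx in range(len(looped[i])):
        looped[i][keypoint_idx] += pred_cam_t_full[i]

# Broadcast version: (2, 3) -> (2, 1, 3), which NumPy stretches to (2, 21, 3).
broadcast = pred_keypoints_3d + pred_cam_t_full[:, np.newaxis, :]

same = np.allclose(looped, broadcast)
```

If `same` is true, the two forms are interchangeable for arrays of these shapes; the new axis only tells NumPy to reuse one translation vector across all 21 keypoints of a sample.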

@dhingratul

I am still getting around this, but did you visualize and check?

@Kaszanas
Contributor Author

Kaszanas commented Nov 30, 2024

> I am still getting around this, but did you visualize and check?

I didn't have the time (or the knowledge); I stopped at stepping through the debugger and inspecting the results as numbers.

The shapes matched: I think they were (2, 3) for pred_cam_t_full and (2, 21, 3) for pred_keypoints_3d. So adding a new axis and broadcasting the addition over the last dimension makes sense?
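Short of a full visualization, a numeric sanity check is possible. The sketch below uses random stand-in data with the assumed shapes and checks two properties a pure translation should have: distances between keypoints are unchanged, and each hand's centroid moves by exactly its translation vector. It only validates the arithmetic, not whether the result is correct in the camera frame. The names `keypoints_local`, `cam_t`, and `pairwise_distances` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
keypoints_local = rng.normal(size=(2, 21, 3))  # stand-in for pred_keypoints_3d
cam_t = rng.normal(size=(2, 3))                # stand-in for pred_cam_t_full

keypoints_cam = keypoints_local + cam_t[:, np.newaxis, :]


def pairwise_distances(points):
    """Return (batch, n, n) distances for points of shape (batch, n, 3)."""
    diff = points[:, :, np.newaxis, :] - points[:, np.newaxis, :, :]
    return np.linalg.norm(diff, axis=-1)


# A translation must not change the distances between keypoints.
shape_preserved = np.allclose(
    pairwise_distances(keypoints_local), pairwise_distances(keypoints_cam)
)

# Each hand's centroid must shift by exactly its own translation vector.
centroid_shift = keypoints_cam.mean(axis=1) - keypoints_local.mean(axis=1)
shift_matches = np.allclose(centroid_shift, cam_t)
```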

@Kaszanas
Contributor Author

> I am still getting around this, but did you visualize and check?

Based on the discussion in #88 (comment), I think we need some input from @geopavlakos and @dhingratul.
