Add Biwi Kinect Head Pose dataset. #3903

Merged
merged 23 commits into from
May 31, 2022

Conversation

dnaveenr
Contributor

This PR adds the Biwi Kinect Head Pose dataset.

Dataset Request : Add Biwi Kinect Head Pose Database #3822

The Biwi Kinect Head Pose Database was acquired with the Microsoft Kinect sensor, a structured IR light device. It contains 15K images of 20 people (6 females and 14 males), 4 of whom were recorded twice.

For each frame, there is:

  • a depth image (.bin file),
  • a corresponding RGB image (both 640x480 pixels),
  • an annotation (inside a .txt file): the ground truth is the 3D location of the head and its rotation.

The dataset structure is as follows :

- 01.obj
- 01
     - frame_00003_depth.bin
     - frame_00003_pose.txt
     - frame_00003_rgb.png
     .
     .
     .
- 02.obj
- 02
     - frame_00003_depth.bin
     - frame_00003_pose.txt
     - frame_00003_rgb.png
     .
     .
     .

Preview of frame_00003_pose.txt :

0.988397 0.0731349 0.133128 
-0.0441539 0.976945 -0.208876 
-0.145334 0.200575 0.968838 

126.665 40.4515 876.198 
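
A minimal sketch of how such a pose file could be parsed. It assumes, based only on the preview above, that the first three rows hold the 3x3 rotation matrix and the last row holds the 3D head center; `parse_pose` is a hypothetical helper name, not part of the dataset script.

```python
from typing import List, Tuple


def parse_pose(text: str) -> Tuple[List[List[float]], List[float]]:
    """Split the content of a *_pose.txt file into (rotation, center).

    Assumed layout: three rows of three floats (rotation matrix),
    a blank line, then one row of three floats (head center).
    """
    # Ignore blank lines and convert every remaining row to floats.
    rows = [[float(v) for v in line.split()] for line in text.splitlines() if line.strip()]
    if len(rows) != 4 or any(len(row) != 3 for row in rows):
        raise ValueError("expected three rotation rows and one center row")
    return rows[:3], rows[3]


sample = """0.988397 0.0731349 0.133128
-0.0441539 0.976945 -0.208876
-0.145334 0.200575 0.968838

126.665 40.4515 876.198
"""
rotation, center = parse_pose(sample)
print(center)
```

Parsing is cheap here because each file contains only twelve numbers, so doing it at generation time should not need a precomputed metadata file.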

I have used the following dataset features :

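import datasets

features = datasets.Features(
    {
        "person_id": datasets.Value("string"),
        "frame_number": datasets.Value("string"),
        "depth_image": datasets.Value("string"),
        "rgb_image": datasets.Image(),
        # 3D location of the head: one (x, y, z) vector
        "3D_head_center": datasets.Sequence(datasets.Value("float32"), length=3),
        # head rotation: a 3x3 matrix, as in the *_pose.txt preview
        "3D_head_rotation": datasets.Array2D(shape=(3, 3), dtype="float32"),
    }
)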

I am giving the path to the depth_image here.

I need some inputs for the following :

  1. For each person, the dataset has the following additional information:
     - For each sequence, the corresponding .obj file represents a head template deformed to match the neutral face of that specific person.
     - In each folder, two .cal files contain calibration information for the depth and the color camera, e.g., the intrinsic camera matrix of the depth camera and the global rotation and translation to the RGB camera.

     How can we represent these features?

  2. For _generate_examples, should I parse the directories and fetch the required information? This would mean reading each .txt file to obtain the "3D_head_center" and "3D_head_rotation" values. Alternatively, we could precompute the feature values, store them in a metadata file, and yield examples from that file in _generate_examples. Which approach would you recommend?

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 13, 2022

The documentation is not available anymore as the PR was closed or merged.

@mariosasko
Collaborator

Thanks for the detailed explanation of the structure!

  1. IMO it makes the most sense to yield one example for each person (24 examples in total), so the features dict should be similar to this:
  from datasets import Array2D, Features, Image, Sequence, Value

  # calibration layout shared by the rgb and the depth camera
  cal = {
      "intrisic_mat": Array2D(shape=(3, 3), dtype="float32"),
      "extrinsic_mat": {
          "rotation": Array2D(shape=(3, 3), dtype="float32"),
          "translation": Sequence(Value("float32"), length=3),
      },
  }
  features = Features({
      "rgb": Sequence(Image()),  # for the png frames
      "rgb_cal": cal,
      "depth": Sequence(Value("string")),  # for the depth frames
      "depth_cal": cal,  # the same as "rgb_cal"
      "head_pose_gt": Sequence({"center": Sequence(Value("float32"), length=3), "rotation": Array2D(shape=(3, 3), dtype="float32")}),
      "head_template": Value("string"),  # for the person's obj file
  })

We can add a "Data Processing" section to the card to explain how to parse the files.

  2. Yes, it's ok to parse the files as long as it doesn't take too much time/memory (e.g., it's ok to parse the *_pose.txt or *.cal files, but it's better to leave the *_depth.bin or *.obj files unprocessed and yield the paths to them).
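
As a rough illustration of the "one example per person, yield paths for the heavy files" idea, here is a sketch that only uses the standard library. It assumes the directory layout shown earlier in this thread (a `01` folder next to a `01.obj` file); `iter_people` is a hypothetical helper, and the real loading script may be organized differently.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterator, Tuple


def iter_people(root: Path) -> Iterator[Tuple[str, Dict[str, object]]]:
    """Yield (person_id, example) for every person folder under root."""
    for person_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Frame stems such as "frame_00003", derived from the rgb files.
        stems = sorted(
            p.name[: -len("_rgb.png")] for p in person_dir.glob("frame_*_rgb.png")
        )
        yield person_dir.name, {
            "rgb": [str(person_dir / f"{s}_rgb.png") for s in stems],
            # Depth frames and the head template stay unprocessed: paths only.
            "depth": [str(person_dir / f"{s}_depth.bin") for s in stems],
            # Pose files are small, so a real script could parse them here.
            "pose": [str(person_dir / f"{s}_pose.txt") for s in stems],
            "head_template": str(root / f"{person_dir.name}.obj"),
        }


# Demo on a tiny fake tree that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "01").mkdir()
    (root / "01.obj").touch()
    for suffix in ("rgb.png", "depth.bin", "pose.txt"):
        (root / "01" / f"frame_00003_{suffix}").touch()
    people = dict(iter_people(root))

print(people["01"]["depth"])
```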

@dnaveenr dnaveenr changed the title from "Add for Biwi Kinect Head Pose dataset." to "Add Biwi Kinect Head Pose dataset." Mar 18, 2022
@dnaveenr
Contributor Author

Thanks for the suggestions @mariosasko, yielding one example for each person would make things much easier.
Okay. I'll look at parsing the files and then displaying the information.

@dnaveenr
Contributor Author

dnaveenr commented Mar 20, 2022

Added the following :

  • Features, I have included sequence_number and subject_id along with the features you had suggested.
  • Tested loading of the dataset along with dummy_data and full_data tests.
  • Created the dataset_infos.json file.

To-Do :

  • Update Dataset Cards with more details.
  • "Data Processing" section

Any inputs on what to include in the "Data Processing" section ?

@dnaveenr
Contributor Author

@mariosasko Please could you review this when you get time. Thank you.

@dnaveenr
Contributor Author

dnaveenr commented Apr 2, 2022

In the Data Processing section, I've added example code for a compressed binary depth image file. Updated the Readme as well.
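
For readers who prefer Python, here is a hedged sketch of a decoder for a compressed depth file. The byte layout is an assumption that this thread does not spell out: a little-endian int32 width and height, followed by alternating runs of (int32 count of empty pixels, int32 count of filled pixels, that many int16 depth values). Please check it against the example code in the dataset card before relying on it; `decode_depth` is a hypothetical name.

```python
import struct
from typing import List


def decode_depth(buf: bytes) -> List[List[int]]:
    """Decode an (assumed) run-length encoded depth buffer into rows of pixels."""
    width, height = struct.unpack_from("<ii", buf, 0)
    offset = 8
    total = width * height
    pixels = [0] * total  # empty pixels stay at 0
    p = 0
    while p < total:
        # Each run starts with the number of empty, then filled, pixels.
        num_empty, num_full = struct.unpack_from("<ii", buf, offset)
        offset += 8
        if p + num_empty + num_full > total:
            raise ValueError("run exceeds image size")
        p += num_empty
        pixels[p : p + num_full] = struct.unpack_from(f"<{num_full}h", buf, offset)
        offset += 2 * num_full
        p += num_full
    return [pixels[r * width : (r + 1) * width] for r in range(height)]


# Synthetic 4x2 buffer built by hand, not taken from the dataset.
buf = (
    struct.pack("<ii", 4, 2)
    + struct.pack("<ii3h", 2, 3, 10, 20, 30)
    + struct.pack("<ii2h", 1, 2, 40, 50)
)
depth = decode_depth(buf)
print(depth)
```

The demo only round-trips a hand-made buffer, so it shows the decoding logic under the stated assumption rather than confirming the real file format.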

@dnaveenr
Contributor Author

dnaveenr commented Apr 8, 2022

@mariosasko / @lhoestq , Please could you review this when you get time. Thank you.

Member

@lhoestq lhoestq left a comment

Cool thanks ! I left a few comments :)

It looks like the CI on pyarrow 5 is failing because of an issue with Array2D that is unrelated to this dataset:

AttributeError: 'Array2DExtensionType' object has no attribute 'wrap_array'

Before merging this dataset we'll have to fix this issue. Let me create an issue on GitHub.

@lhoestq
Member

lhoestq commented Apr 12, 2022

Created an issue here: #4152

@dnaveenr
Contributor Author

Got it. Thanks for the comments. I've collapsed the C++ code in the readme and added the suggestions.

@lhoestq
Member

lhoestq commented May 6, 2022

Hi ! The AttributeError bug has been fixed, feel free to merge master into your branch ;)

@dnaveenr dnaveenr requested a review from lhoestq May 17, 2022 14:54
@dnaveenr
Contributor Author

dnaveenr commented May 23, 2022

I haven't been able to figure out why the CI is failing. The error shown is:

E           ValueError: The following issues have been found in the dataset cards:
E           README Parsing:
E           list index out of range
E           The following issues have been found in the dataset cards:
E           README Validation:
E           list index out of range

Any inputs would be helpful.

@lhoestq
Member

lhoestq commented May 25, 2022

I think it's because there are tabs in the C++ code, can you replace them with regular spaces please ?

(then in another PR we can maybe fix the Readme parser to support text indented with tabulations)
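
A small sketch of one way to do the replacement, using only the built-in `str.expandtabs`. The tab width of 4 and the sample snippet are assumptions for illustration; the actual README content is not shown in this thread.

```python
def tabs_to_spaces(text: str, tab_size: int = 4) -> str:
    """Replace every tab with spaces, aligning to tab stops on each line."""
    # str.expandtabs restarts its column count after each newline,
    # so it can be applied to the whole document at once.
    return text.expandtabs(tab_size)


snippet = "int main() {\n\treturn 0;\n}\n"
cleaned = tabs_to_spaces(snippet)
print(cleaned)
```

To apply it to a file, read the README as text, pass the content through `tabs_to_spaces`, and write it back.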

Member

@lhoestq lhoestq left a comment

Thanks ! I played with the dataset a bit and I find it not very convenient to have the images grouped together. Don't you think it would be more practical to have one example = one image in this dataset ?

Add contributions info.

Co-authored-by: Quentin Lhoest <[email protected]>
@dnaveenr
Contributor Author

@lhoestq, initially the idea was to have one example = one image, with an additional field holding the frame_number. But for each subject we have a head template and calibration information for the depth and the color camera, which are common to all the examples for that subject. Also, the images are consecutive frames.
@mariosasko suggested this structure, and it made sense to group the images together for a particular subject.

@dnaveenr
Contributor Author

Don't you think it would be more practical to have one example = one image in this dataset ?

Having one example = one image would be good but since we have a head template, calibration information for the depth and the color camera which is common to all the images for that subject and the images being continuous frames, I think it makes sense to group the images together for each subject. This will make the feature representation easier.

@lhoestq
Member

lhoestq commented May 31, 2022

Ok I see, sounds good then. Users can still separate the images if they want to
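
One way a user could separate the images afterwards is sketched below with plain dictionaries. It assumes a per-subject example shaped like the features discussed earlier, with "head_pose_gt" stored as a dict of per-frame lists; `split_frames`, the "frame_index" field, and all sample values are illustrative only.

```python
from typing import Any, Dict, Iterator

# Fields that hold one entry per frame; everything else is shared per subject.
PER_FRAME = ("rgb", "depth", "head_pose_gt")


def split_frames(example: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Turn one per-subject example into one record per frame."""
    shared = {k: v for k, v in example.items() if k not in PER_FRAME}
    pose = example["head_pose_gt"]
    for i, (rgb, depth) in enumerate(zip(example["rgb"], example["depth"])):
        yield {
            **shared,  # head template, calibration, ... repeated on every frame
            "frame_index": i,
            "rgb": rgb,
            "depth": depth,
            "head_pose_gt": {
                "center": pose["center"][i],
                "rotation": pose["rotation"][i],
            },
        }


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
subject = {  # placeholder values, not real dataset content
    "head_template": "01.obj",
    "rgb": ["frame_00003_rgb.png", "frame_00004_rgb.png"],
    "depth": ["frame_00003_depth.bin", "frame_00004_depth.bin"],
    "head_pose_gt": {
        "center": [[126.665, 40.4515, 876.198], [0.0, 0.0, 0.0]],
        "rotation": [identity, identity],
    },
}
frames = list(split_frames(subject))
print(len(frames))
```

The shared fields are duplicated on every frame record, which is the trade-off the grouping by subject avoids.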

Member

@lhoestq lhoestq left a comment

It's all good then, thanks for adding this dataset ! :)

@lhoestq
Member

lhoestq commented May 31, 2022

The CI fails are unrelated to this PR and fixed on master, merging !

@lhoestq lhoestq merged commit 55e6b30 into huggingface:master May 31, 2022
@dnaveenr
Contributor Author

Great. Thanks @lhoestq , I think we can close this issue now. ( #3822 )

@dnaveenr dnaveenr deleted the add_biwi_kinect_head_pose_dataset branch May 31, 2022 17:02