
Commit f0103ea: Initial commit
cdoersch committed May 13, 2019 (1 parent: 7364b4f)
Showing 11 changed files with 852 additions and 0 deletions.
54 changes: 54 additions & 0 deletions README.md
@@ -0,0 +1,54 @@
Exploiting temporal context for 3D human pose estimation in the wild
==
<!--- # Bundle-Adjusted Poses for Kinetics-400 -->

[Exploiting temporal context for 3D human pose estimation in the wild](http://arxiv.org/abs/1905.04266) uses temporal information from videos to correct errors in single-image 3D pose estimation. In this repository, we provide results from applying this algorithm on the [Kinetics-400](https://deepmind.com/research/open-source/open-source-datasets/kinetics/) dataset. Note that this is not an exhaustive labeling: at most one person is labeled per frame, and frames which the algorithm has identified as outliers are not labeled.

The archive contains a single `.pkl` file for each video where bundle adjustment succeeded. Let `N` be the number of frames that the algorithm considers inliers. Then the `.pkl` file contains a map with the following keys:

* `time`: Array of size `N`, where each element is the time in seconds since the start of the 10-second Kinetics clip (not the start of the whole video).
* `smpl_shape`: Array of size `Nx10`, where each row is the SMPL shape for one example.
* `smpl_pose`: Array of size `Nx72`, where each row is the SMPL pose for one example.
* `3d_keypoints`: Array of size `Nx19x3`, where each slice is the 19 cocoplus joints obtained from the SMPL model using the custom keypoint regressor described below.
* `2d_keypoints`: Array of size `Nx19x2`, where each slice is the 19 cocoplus joints reprojected from the SMPL model, using the custom keypoint regressor described below, in `(x,y)` coordinates. These coordinates are normalized to the image frame: therefore, (0, 0) and (1,1) are the top-left and bottom-right corners respectively.
* `cameras`: Array of size `Nx3`, containing the scale and translation that map the SMPL 3D joint locations to `2d_keypoints`. `cameras[:,0]` is the scale and `cameras[:,1:3]` is the translation. Thus, if `x` is the `19x3` array of 3D keypoints in `(x,y,z)` format produced by the SMPL model for frame `i`, then `2d_keypoints[i]` can be computed as `cameras[i,0]*(x[:,0:2]+cameras[i,1:3])` (see the loading sketch after this list).
* `vertices`: Array of size `Nx6890x3`. These are the vertices of the SMPL mesh computed from `smpl_shape` and `smpl_pose` using the neutral body model from [HMR](https://github.com/akanazawa/hmr).
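
As a concrete illustration, below is a minimal sketch (not part of the release) of loading one of these `.pkl` files and reproducing `2d_keypoints` from `3d_keypoints` and `cameras`. The file name is a placeholder, and the files are assumed to load with the standard `pickle` module (pickles written under Python 2 may need `encoding='latin1'` when loaded under Python 3).

```python
# Minimal sketch: load one per-video .pkl and reproject the stored 3D joints
# with the stored per-frame camera. 'some_clip.pkl' is a placeholder for any
# file extracted from the archive.
import pickle

import numpy as np

with open('some_clip.pkl', 'rb') as f:  # hypothetical file name
  data = pickle.load(f)  # may need encoding='latin1' under Python 3

x3d = data['3d_keypoints']   # (N, 19, 3) cocoplus joints
cams = data['cameras']       # (N, 3): scale, tx, ty

scale = cams[:, 0][:, None, None]   # (N, 1, 1), broadcast over joints
trans = cams[:, None, 1:3]          # (N, 1, 2)
reproj = scale * (x3d[:, :, 0:2] + trans)   # (N, 19, 2), normalised (x, y)

# Sanity check: this should closely match the stored normalised 2D keypoints.
print(np.abs(reproj - data['2d_keypoints']).max())
```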

The dataset can be downloaded [here](https://storage.cloud.google.com/temporal-3d-pose-kinetics/temporal_3d_pose_kinetics.tar.gz) (325 GB). A significantly smaller archive, which omits `vertices` but is otherwise identical, is available [here](https://storage.cloud.google.com/temporal-3d-pose-kinetics/temporal_3d_pose_kinetics_noverts.tar.gz) (2.7 GB).

## Joint regressor

We also provide a custom [joint regressor](https://storage.cloud.google.com/temporal-3d-pose-kinetics/custom_joint_regressor.pkl) specific to our pose estimator (since there are slight differences between the 2D joints we used for bundle adjustment and those used for SMPL). This is a `6890x19` array that can be used as a drop-in replacement for the `cocoplus_regressor` distributed in the public [HMR repository](https://github.com/akanazawa/hmr), and it is required to extract the `3d_keypoints` above from the estimated poses. It was learned using ground truth from the [Human3.6m dataset](http://vision.imar.ro/human3.6m/).
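
As a rough sketch (not taken from the repository), the regressor can also be applied directly to the released `vertices` to recover the same 19 cocoplus joints. The file names below are placeholders, and the regressor is assumed to load as a dense `6890x19` array; convert it first if it is stored in a sparse format.

```python
# Hedged sketch: apply the custom 6890x19 regressor to SMPL vertices to get
# the 19 cocoplus joints. File names are placeholders; 'some_clip.pkl' must
# come from the full archive, which contains 'vertices'.
import pickle

import numpy as np

with open('custom_joint_regressor.pkl', 'rb') as f:
  regressor = np.asarray(pickle.load(f))   # assumed dense, shape (6890, 19)

with open('some_clip.pkl', 'rb') as f:
  data = pickle.load(f)

verts = data['vertices']                                  # (N, 6890, 3)
joints_3d = np.einsum('vj,nvc->njc', regressor, verts)    # (N, 19, 3)
```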

## Pretrained Model
This [TensorFlow checkpoint](https://storage.cloud.google.com/temporal-3d-pose-kinetics/model-894621.tar.gz) was trained using the procedure outlined in our paper. That is, it uses the above dataset as well as standard HMR 3D data. The checkpoint is compatible with [HMR](https://github.com/akanazawa/hmr).

## Visualising data

- You need to install [`youtube-dl`](https://github.com/ytdl-org/youtube-dl) and [`ffmpeg`](http://ffmpeg.org) to download the Kinetics videos for visualisation.
- Download the faces of the SMPL mesh for visualisation (also used in the quick mesh sketch after this list): `wget https://github.com/akanazawa/hmr/raw/master/src/tf_smpl/smpl_faces.npy`
- The required Python packages are listed in `requirements.txt`. We recommend creating a new virtual environment and running `pip install -r requirements.txt`.
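
For a quick look at a single mesh without downloading any videos, the following hedged sketch plots one frame's SMPL mesh with Matplotlib, using the `smpl_faces.npy` file downloaded above and the `vertices` field from the full archive. The pickle path is a placeholder; the supported visualisation path is the demo command below.

```python
# Hedged sketch: render one frame's SMPL mesh as a triangulated surface using
# the downloaded smpl_faces.npy. 'some_clip.pkl' is a placeholder for a file
# from the full (325 GB) archive, which contains 'vertices'.
import pickle

import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, registers 3D projection

faces = np.load('smpl_faces.npy')          # (13776, 3) vertex indices
with open('some_clip.pkl', 'rb') as f:
  data = pickle.load(f)

v = data['vertices'][0]                    # (6890, 3), first inlier frame

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_trisurf(v[:, 0], v[:, 1], v[:, 2], triangles=faces)
plt.show()
```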

To run the demo:

`python run_visualise.py --filename <path_to_downloaded_pickle_file>`

## Credits
- The Kinetics download scripts are from [ActivityNet](https://github.com/activitynet/ActivityNet/tree/master/Crawler/Kinetics)
- The renderer to visualise the SMPL model is from [HMR](https://github.com/akanazawa/hmr)

## Reference

If you use this data, please cite

```tex
@InProceedings{Arnab_CVPR_2019,
  author = {Arnab, Anurag* and
            Doersch, Carl* and
            Zisserman, Andrew},
  title = {Exploiting temporal context for 3D human pose estimation in the wild},
  booktitle = {Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2019}
}
```
180 changes: 180 additions & 0 deletions plot_utils.py
@@ -0,0 +1,180 @@
# Copyright 2018 DeepMind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================

"""Create plots with Matplotlib to visualise the result."""

import StringIO
import matplotlib.pyplot as plt
import numpy as np

HMR_JOINT_NAMES = [
    'right_ankle',
    'right_knee',
    'right_hip',
    'left_hip',
    'left_knee',
    'left_ankle',
    'right_wrist',
    'right_elbow',
    'right_shoulder',
    'left_shoulder',
    'left_elbow',
    'left_wrist',
    'neck',
    'head_top',
    'nose',
    'left_eye',
    'right_eye',
    'left_ear',
    'right_ear',
]

MSCOCO_JOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder',
    'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist',
    'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle',
    'right_ankle'
]

coco_to_hmr = []
for name in MSCOCO_JOINT_NAMES:
  index = HMR_JOINT_NAMES.index(name)
  coco_to_hmr.append(index)

PARENTS_COCO_PLUS = [
    1, 2, 8, 9, 3, 4, 7, 8, 12, 12, 9, 10, 14, -1, 13, -1, -1, 15, 16
]
COLOURS = []
for name in HMR_JOINT_NAMES:
  if name.startswith('left'):
    c = 'r'
  elif name.startswith('right'):
    c = 'g'
  else:
    c = 'm'
  COLOURS.append(c)


def plot_keypoints_2d(image,
                      joints_2d,
                      ax=None,
                      show_plot=False,
                      title='',
                      is_coco_format=False):
  """Plot 2d keypoints overlaid on image."""

  if ax is None:
    fig = plt.figure()
    ax = fig.add_subplot(111)

  if hasattr(ax, 'set_axis_off'):
    ax.set_axis_off()

  if is_coco_format:
    kp = np.zeros((len(HMR_JOINT_NAMES), 2))
    kp[coco_to_hmr, :] = joints_2d
    joints_2d = kp

  if image is not None:
    ax.imshow(image)

  joint_colour = 'c' if not is_coco_format else 'b'
  # Joints at (0, 0) mark missing detections; give them marker size 0.
  s = 30 * np.ones(joints_2d.shape[0])
  for i in range(joints_2d.shape[0]):
    x, y = joints_2d[i, :]
    if x == 0 and y == 0:
      s[i] = 0

  ax.scatter(
      joints_2d[:, 0].squeeze(),
      joints_2d[:, 1].squeeze(),
      s=s,
      c=joint_colour)

  for idx_i, idx_j in enumerate(PARENTS_COCO_PLUS):
    if idx_j >= 0:
      pair = [idx_i, idx_j]
      x, y = joints_2d[pair, 0], joints_2d[pair, 1]
      if x[0] > 0 and y[0] > 0 and x[1] > 0 and y[1] > 0:
        ax.plot(x.squeeze(), y.squeeze(), c=COLOURS[idx_i], linewidth=1.5)

  if image is not None:
    ax.set_xlim([0, image.shape[1]])
    ax.set_ylim([image.shape[0], 0])

  if title:
    ax.set_title(title)

  if show_plot:
    plt.show()

  return ax


def plot_summary_figure(img,
                        joints_2d,
                        rend_img_overlay,
                        rend_img,
                        rend_img_vp1,
                        rend_img_vp2,
                        save_name=None):
  """Create plot to visualise results."""

  fig = plt.figure(1, figsize=(20, 12))
  plt.clf()

  plt.subplot(231)
  plt.imshow(img)
  plt.title('Input')
  plt.axis('off')

  ax_skel = plt.subplot(232)
  ax_skel = plot_keypoints_2d(img, joints_2d, ax_skel)
  plt.title('Joint Projection')
  plt.axis('off')

  plt.subplot(233)
  plt.imshow(rend_img_overlay)
  plt.title('3D Mesh overlay')
  plt.axis('off')

  plt.subplot(234)
  plt.imshow(rend_img)
  plt.title('3D mesh')
  plt.axis('off')

  plt.subplot(235)
  plt.imshow(rend_img_vp1)
  plt.title('Other viewpoint (+60 degrees)')
  plt.axis('off')

  plt.subplot(236)
  plt.imshow(rend_img_vp2)
  plt.title('Other viewpoint (-60 degrees)')
  plt.axis('off')

  plt.draw()

  if save_name is not None:
    buf = StringIO.StringIO()
    plt.savefig(buf, format='jpg')
    buf.seek(0)

    with open(save_name, 'w') as fp:
      fp.write(buf.read(-1))
  else:
    plt.show()

  return fig

7 changes: 7 additions & 0 deletions requirements.txt
@@ -0,0 +1,7 @@
matplotlib==2.2.3
numpy==1.11.3
absl-py
scipy==1.2.1
scikit-video==1.1.11
opencv-python==4.0.0.21
opendr==0.78