Sapiens-Pytorch-Inference

Minimal code and examples for running inference with the Sapiens human foundation models in PyTorch.

Why

Make the models easy to run through a SapiensPredictor class that runs multiple tasks simultaneously.
Provide examples for running the models on images, videos, and a webcam in real time.
Download models automatically from HuggingFace if they are not available locally.
Provide a script for ONNX export. However, ONNX inference is not recommended because it is slow.
Add object detection so that the model can be run on each detected person. However, this mode is disabled because it produces worse results.

Caution

Use the 1B models, since the smaller models are noticeably less accurate (especially for segmentation).
Exported ONNX models are too slow.
Input sizes other than 768x1024 don't produce good results.
Running Sapiens models on a cropped person produces worse results, even if you crop a wider rectangle around the person.
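
As a rough illustration of the input-size caution, the sketch below computes how an arbitrary image could be scaled and padded to fit a 768x1024 canvas without distorting it. It assumes 768 is the width and 1024 the height, and `letterbox_params` is a hypothetical helper, not part of sapiens_inference. Whether padding like this improves results for a given image has not been verified here.

```python
def letterbox_params(width, height, target_w=768, target_h=1024):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    Assumption: the target canvas is 768 pixels wide and 1024 pixels high.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Use the smaller scale so that both sides fit inside the canvas.
    scale = min(target_w / width, target_h / height)
    new_w = min(target_w, max(1, round(width * scale)))
    new_h = min(target_h, max(1, round(height * scale)))

    # Center the resized image; the remaining border is padding.
    pad_left = (target_w - new_w) // 2
    pad_top = (target_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top


if __name__ == "__main__":
    # A 1920x1080 frame is limited by its width.
    print(letterbox_params(1920, 1080))
```

The returned values can then be passed to any resize and padding routine, for example the ones in OpenCV.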

Installation

pip install sapiens-inferece

Or, clone this repository:

git clone https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference.git
cd Sapiens-Pytorch-Inference
pip install -r requirements.txt

Usage

import cv2
from imread_from_url import imread_from_url
from sapiens_inference import SapiensPredictor, SapiensConfig, SapiensDepthType, SapiensNormalType

# Load the model
config = SapiensConfig()
config.depth_type = SapiensDepthType.DEPTH_03B  # Disabled by default
config.normal_type = SapiensNormalType.NORMAL_1B  # Disabled by default
predictor = SapiensPredictor(config)

# Load the image
img = imread_from_url("https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference/blob/assets/test2.png?raw=true")

# Estimate the maps
result = predictor(img)

cv2.namedWindow("Combined", cv2.WINDOW_NORMAL)
cv2.imshow("Combined", result)
cv2.waitKey(0)

SapiensPredictor

The SapiensPredictor class runs multiple tasks simultaneously. It has the following methods:

SapiensPredictor(config: SapiensConfig) - Load the model with the specified configuration.
__call__(img: np.ndarray) -> np.ndarray - Estimate the maps for the input image.
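
Since the predictor is called like a function on one image at a time, its throughput can be estimated with a small timing wrapper. The sketch below is a minimal example under that assumption. `measure_fps` is a hypothetical helper that is not part of sapiens_inference, and it works with any callable that takes a frame.

```python
import time


def measure_fps(predictor, frames, warmup=1):
    """Estimate frames per second of `predictor` over `frames`.

    The first `warmup` frames are run but not timed, because the first
    calls to a model are often slower than later ones.
    """
    frames = list(frames)
    if len(frames) <= warmup:
        raise ValueError("need more frames than warmup iterations")

    for frame in frames[:warmup]:
        predictor(frame)

    start = time.perf_counter()
    for frame in frames[warmup:]:
        predictor(frame)
    elapsed = time.perf_counter() - start

    timed = len(frames) - warmup
    return timed / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in predictor that returns its input unchanged.
    print(measure_fps(lambda frame: frame, range(100)))
```

With a real predictor, `frames` would be a list of images loaded in the same way as in the Usage section.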

SapiensConfig

The SapiensConfig class configures the model. It has the following attributes:

dtype: torch.dtype - Data type to use. Default: torch.float32.
device: torch.device - Device to use. Default: cuda if available, otherwise cpu.
depth_type: SapiensDepthType - Depth model to use. Options: OFF, DEPTH_03B, DEPTH_06B, DEPTH_1B, DEPTH_2B. Default: OFF.
normal_type: SapiensNormalType - Normal model to use. Options: OFF, NORMAL_03B, NORMAL_06B, NORMAL_1B, NORMAL_2B. Default: OFF.
segmentation_type: SapiensSegmentationType - Segmentation model to use (Always enabled for the mask). Options: SEGMENTATION_03B, SEGMENTATION_06B, SEGMENTATION_1B. Default: SEGMENTATION_1B.
detector_config: DetectorConfig - Configuration for the object detector. Default: {model_path: str = "models/yolov8m.pt", person_id: int = 0, confidence: float = 0.25}. Disabled as it produces worse results.
minimum_person_height: float - Minimum height ratio of the person to detect. Default: 0.5 (50%). Not used if the object detector is disabled.
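
To make the minimum_person_height attribute more concrete, the sketch below shows one plausible way to apply such a ratio. It assumes the ratio is the height of a detected box divided by the image height, and that boxes are given as (x1, y1, x2, y2) in pixels. `filter_by_height` is a hypothetical helper, not the library's implementation.

```python
def filter_by_height(boxes, image_height, minimum_person_height=0.5):
    """Keep boxes whose height covers at least the given ratio of the image.

    Assumption: each box is (x1, y1, x2, y2) in pixel coordinates.
    """
    if image_height <= 0:
        raise ValueError("image_height must be positive")

    kept = []
    for x1, y1, x2, y2 in boxes:
        height_ratio = (y2 - y1) / image_height
        if height_ratio >= minimum_person_height:
            kept.append((x1, y1, x2, y2))
    return kept


if __name__ == "__main__":
    detections = [(10, 0, 200, 600), (300, 100, 380, 500)]
    # Only the first box spans at least 50% of a 1000-pixel-high image.
    print(filter_by_height(detections, image_height=1000))
```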

Examples

Image Sapiens Predictor (Normal, Depth, Segmentation):

python image_predictor.py

Video Sapiens Predictor (Normal, Depth, Segmentation) (demo video: https://youtu.be/hOyrnkQz1NE?si=jC76W7AY3zJnZhH4):

python video_predictor.py

Webcam Sapiens Predictor (Normal, Depth, Segmentation):

python webcam_predictor.py

Image Normal Estimation:

python image_normal_estimation.py

Image Human Part Segmentation:

python image_segmentation.py

Image Pose Estimation:

python image_pose_estimation.py

Video Normal Estimation:

python video_normal_estimation.py

Video Human Part Segmentation:

python video_segmentation.py

Webcam Normal Estimation:

python webcam_normal_estimation.py

Webcam Human Part Segmentation:

python webcam_segmentation.py

Export to ONNX

To export the model to ONNX, run the following script:

python onnx_export.py seg03b

The available models are seg03b, seg06b, seg1b, depth03b, depth06b, depth1b, depth2b, normal03b, normal06b, normal1b, normal2b.
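
The model names accepted by the export script appear to follow the same task and size pattern as the option names listed under SapiensConfig (for example, seg03b and SEGMENTATION_03B). The sketch below validates a name against the list above and derives the matching option name. The correspondence is inferred from the naming alone, and `option_name_for` is a hypothetical helper that is not part of the repository.

```python
import re

# Model names copied from the list of available models above.
AVAILABLE_MODELS = (
    "seg03b", "seg06b", "seg1b",
    "depth03b", "depth06b", "depth1b", "depth2b",
    "normal03b", "normal06b", "normal1b", "normal2b",
)

# Assumed mapping from the short task prefix to the option name prefix.
TASK_PREFIXES = {"seg": "SEGMENTATION", "depth": "DEPTH", "normal": "NORMAL"}


def option_name_for(model_name):
    """Map an export name such as 'seg03b' to 'SEGMENTATION_03B'."""
    match = re.fullmatch(r"(seg|depth|normal)(03b|06b|1b|2b)", model_name)
    if match is None or model_name not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model name: {model_name!r}")
    task, size = match.groups()
    return f"{TASK_PREFIXES[task]}_{size.upper()}"


if __name__ == "__main__":
    for name in AVAILABLE_MODELS:
        print(name, "->", option_name_for(name))
```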

Original Models

The original models are available at HuggingFace: https://huggingface.co/facebook/sapiens/tree/main/sapiens_lite_host

License: Creative Commons Attribution-NonCommercial 4.0 International (https://github.com/facebookresearch/sapiens/blob/main/LICENSE)

References

Sapiens: https://github.com/facebookresearch/sapiens
Sapiens Lite: https://github.com/facebookresearch/sapiens/tree/main/lite
HuggingFace Model: https://huggingface.co/facebook/sapiens
