
consider other options for textured depth streaming #4

Closed
bmegli opened this issue Mar 3, 2020 · 7 comments
Labels
planning high level plans

Comments

bmegli commented Mar 3, 2020

The current textured depth streaming works.

However, I expected that the bitrate requirements would be lower.

This may be due to the hacky, suboptimal encoding of infrared in chroma.
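
For illustration, a minimal sketch of what packing both streams into a single P010LE frame before HEVC Main10 encoding could look like (an assumed layout with depth in the 10-bit luma plane and infrared in the interleaved chroma plane, not necessarily this repository's exact code):

#include <stdint.h>

/* Assumed packing: 16-bit depth into the 10-bit luma plane of a P010LE frame,
 * 8-bit infrared into the interleaved Cb/Cr plane. Note the 4:2:0 chroma
 * subsampling: infrared ends up at half resolution in both dimensions,
 * which is part of why this combined encoding is suboptimal. */
static void pack_p010le(const uint16_t *depth, const uint8_t *ir,
                        int width, int height,
                        uint16_t *luma, uint16_t *chroma)
{
	for(int y = 0; y < height; ++y)
		for(int x = 0; x < width; ++x) /* keep the 10 most significant bits of depth */
			luma[y * width + x] = depth[y * width + x] & 0xFFC0;

	for(int y = 0; y < height / 2; ++y)
		for(int x = 0; x < width / 2; ++x)
		{	/* one infrared sample per 2x2 block, written to both Cb and Cr */
			uint16_t v = (uint16_t)ir[2 * y * width + 2 * x] << 8;
			chroma[y * width + 2 * x] = v;
			chroma[y * width + 2 * x + 1] = v;
		}
}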

Totally subjectively, good results require around 8 Mbit/s at 848x480, 30 fps, no B frames.

Again subjectively, I would expect around 5 Mbit/s to be enough, based on the facts that:

  • 4 Mbit/s seemed enough for reasonable quality point clouds with HEVC Main10 encoding
  • 1 Mbit/s seemed enough for reasonable quality infrared with H.264 encoding

An alternative approach would encode depth and infrared separately.

How this would affect quality and latency is complex.

From the Intel Programmers' Reference Manual for Kaby Lake:

Media VDBOX:

The encoding process is partitioned across host software, the GPE engine, and the MFX engine. The generation of transport layer, sequence layer, picture layer, and slice header layer must be done in the host software. GP hardware is responsible for compressing from Slice Data Layer down to all macro-block and block layers. Specifically, GPE w/ VME acceleration is for motion vector estimation, motion estimation, and code decision.

HCP (HEVC Coding Pipeline):

Supports Video Command Streamer (VCS):

  • Shared with MFX HW pipeline, and at any one time, only one pipeline (MFX or HCP) and one
    operation (decoding or encoding) can be active

It seems that some of the operations could run concurrently, others not.
With that in mind, the simplest way to check is through an experiment/benchmark.
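
One way such an experiment could be structured (a sketch only; encode_stream() is a hypothetical stand-in for feeding a fixed number of frames to one hardware encoder instance):

#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in: feed N frames to one encoder instance
 * (e.g. depth HEVC Main10 or infrared H.264) and wait for the results. */
extern void encode_stream(void *encoder);

static double now_ms(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

static void *encode_thread(void *encoder)
{
	encode_stream(encoder);
	return NULL;
}

/* If the parallel wall time is close to the sequential one, the two encodes
 * are effectively serialized on the shared MFX/HCP pipeline. */
void benchmark(void *depth_encoder, void *ir_encoder)
{
	double t0 = now_ms();
	encode_stream(depth_encoder);
	encode_stream(ir_encoder);
	double sequential = now_ms() - t0;

	pthread_t thread;
	t0 = now_ms();
	pthread_create(&thread, NULL, encode_thread, ir_encoder);
	encode_stream(depth_encoder);
	pthread_join(thread, NULL);
	double parallel = now_ms() - t0;

	printf("sequential %.1f ms, parallel %.1f ms\n", sequential, parallel);
}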

bmegli added the planning high level plans label Mar 3, 2020

bmegli commented Mar 6, 2020

Related to #2


bmegli commented Mar 9, 2020

Before proceeding, it is important to know how much time it currently takes to encode to HEVC Main10.

This will be architecture dependent, model dependent, and possibly even dependent on the current CPU/GPU clock in GHz, but some idea of the timings is necessary.
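
A straightforward way to collect such timings (a sketch; encode_frame() is a hypothetical stand-in for submitting one frame to the hardware encoder and waiting for the encoded data):

#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for one hardware encode (e.g. one HEVC Main10 depth frame). */
extern void encode_frame(void *encoder, const void *frame);

static double now_ms(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Collect min/avg/max per-frame encoding time over n frames. */
void time_encoding(void *encoder, const void **frames, int n)
{
	double min = 1e9, max = 0.0, sum = 0.0;

	for(int i = 0; i < n; ++i)
	{
		double t0 = now_ms();
		encode_frame(encoder, frames[i]);
		double dt = now_ms() - t0;
		if(dt < min) min = dt;
		if(dt > max) max = dt;
		sum += dt;
	}

	printf("encode time: min %.1f ms, avg %.1f ms, max %.1f ms\n", min, sum / n, max);
}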


bmegli commented Mar 9, 2020

From the depth encoding time benchmark:

scenario                     i7-7820hk   LPA m3-7y30
848x480 depth HEVC Main10    7-9 ms      8-10 ms
640x360 depth HEVC Main10    5-6 ms      6-7 ms
848x480 ir HEVC Main         5-8 ms      6-8 ms
848x480 ir H264              3-5 ms      3-5 ms
640x360 ir H264              3-4 ms      3-4 ms

It seems that at 848x480 @ 30 fps it should be possible to encode depth and ir separately (HEVC + HEVC or HEVC + H264).

In fact, there is enough time left for some postprocessing (if needed): at 30 fps the frame budget is ~33 ms, while the worst case from the table is about 10 ms for depth plus 8 ms for ir, i.e. ~18 ms even if the two encodes are fully serialized.


bmegli commented Jun 11, 2020

There is an Intel whitepaper:

Depth image compression by colorization for Intel® RealSense™ Depth Cameras, which describes depth encoding in RGB using the hue color space, giving 10 and a half bits of depth encoding.

The method is interesting but will not work correctly with most hardware encoders. The reason is chroma subsampling (the Intel encoding requires 4:4:4, which most hardware encoders in the wild don't support).
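
For reference, the colorization idea maps each depth value onto the hue circle so the information is spread across all three RGB channels; a rough sketch of such a mapping (one possible piecewise formulation, not necessarily the whitepaper's exact, recovery-optimized variant):

#include <stdint.h>

/* Map a depth value already normalized to [0, 1529] onto the hue circle,
 * i.e. ~10.5 bits (log2(1530)) of range spread over three 8-bit channels. */
static void depth_to_hue_rgb(int d, uint8_t *r, uint8_t *g, uint8_t *b)
{
	if(d <= 255)       { *r = 255;                  *g = (uint8_t)d;          *b = 0; }
	else if(d <= 510)  { *r = (uint8_t)(510 - d);   *g = 255;                 *b = 0; }
	else if(d <= 765)  { *r = 0;                    *g = 255;                 *b = (uint8_t)(d - 510); }
	else if(d <= 1020) { *r = 0;                    *g = (uint8_t)(1020 - d); *b = 255; }
	else if(d <= 1275) { *r = (uint8_t)(d - 1020);  *g = 0;                   *b = 255; }
	else               { *r = 255;                  *g = 0;                   *b = (uint8_t)(1530 - d); }
}

Recovering depth from such colors needs the chroma to survive unsubsampled, which is exactly where the 4:4:4 requirement comes from.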


bmegli commented Jun 21, 2020

I am satisfied with how HVS point cloud streaming works, closing for now.

bmegli closed this as completed Jun 21, 2020

bmegli commented Jun 21, 2020

Another interesting whitepaper by Intel:

Enabling High Quality Volumetric VOD Streaming Over Broadband and 5G

Some early comments after reading:

  • whitepaper discusses ~100 Mbps bandwidth (~10x data reduction)
    • but for long-range wireless transmission even 10 Mbps is a lot (~100x data reduction)
  • motion-to-photon latency
    • this is addressed mostly by having the data on the device
    • performing rendering locally on the device (not remotely)
    • and reacting to user motion immediately (headset)
    • this should not be confused with the glass-to-glass latency of 3D sensor streaming
      • glass-to-glass may and will be higher than a few ms
      • and represents the update rate of the 3D world
  • the experiments use 65k vertices
    • but even the D435 at 848x480 may produce up to 848*480 ≈ 400k vertices per frame (!)
    • at the same time the texture resolution considered is 2k
    • this may mean low quality of the 3D geometry relative to the texture
  • the paper is only concerned with decoding time
    • but the hard part is real-time encoding
    • decoding is typically an order of magnitude faster
    • so it looks like the paper is concerned with offline processing of the data (encoding) and only later serving it to a real-time VR framework
