Flesh out support for VK_FORMAT_R16G16_S10_5_NV. #864

MarkCallow · 2024-03-01T03:09:13Z

Code is believed complete. Tests have yet to be added. Hence draft status.

This includes a new round-trip test of vk2dfd and dfd2vk for all Vulkan formats except multi-plane and modifications to both functions, or rather their generators, to fix support for some formats, including VK_FORMAT_R16G16_S10_5_NV and the YUV 422 single plane formats. Those listed required changes tointerpretDFD.

Fixes #859.

MarkCallow · 2024-03-01T03:15:20Z

@fluppeteer, please review. I was unable to add your GitHub id to the reviewers list.

lib/dfdutils/createdfd.c

lib/dfdutils/dfd.h

lib/dfdutils/dfd2vk.inl

fluppeteer · 2024-03-16T19:29:01Z

lib/dfdutils/interpretdfd.c

-                        /* Uh-oh, we've seen this channel before. */
-                        return i_UNSUPPORTED_NONTRIVIAL_ENDIANNESS;
+                        if (model == KHR_DF_MODEL_YUVSDA && sampleChannel == KHR_DF_CHANNEL_YUVSDA_Y) {
+                            if (sampleChannelPtr == R) {


Note that strictly speaking this should be checking whether the coordinates for the sample are different, since a big-endian 16-bit YUYV channel would be using two samples for each of Y0, U, Y1 and V. I presume that's rare enough that it's not worth the code to handle it, at least for now (which is where the NONTRIVIAL_ENDIANNESS comes in).

IIRC correctly @fluppeteer I sent you an email asking some questions about samplePosition. All of the YUVSDA format DFDs we are dealing with have different sample positions for each sample not just Y1 and Y2. I am unaware of any Vulkan YUVSDA formats that would have samplePositions that are all the same or all 0 (as in RGBSDA). Hence I did not test samplePosition. The keyword here is "unaware". I think you have a far better understanding of YUV than I do. If you think it should be tested I will add that.

FYI, for the single-plane 4:2:2 YUV formats (YUYV) we set the sample positions to be cosited even when creating KTX files (Y: 64, 128, 0, 0; U: 64, 128, 0, 0; Y: 192, 128, 0, 0; V: 64, 128, 0, 0), considering that there are currently no explicit controls to configure cositing explicitly.

You did - and it sounded like something was broken. I'm hoping to get back and look at that shortly, but I'd failed to register that you needed to know for this (which, I guess, should have been obvious). Sorry I've only just found time.

For a 422 format, Y should be 0, and one of the X values should be 0 - the indexing should be happening relative to the minimum coordinate of any sample (which I didn't explicitly say, I suspect). If you're doing a texture lookup that places the sample location at a half-pixel offset relative to the coordinate space (which is common), that's a separate transformation, although it would be helpful to have made that explicit in the introduction somewhere. I'll try to get that in with the next update.

The Y(/G) samples should be at X = 0 and 1. The Cb and CR (/B and R) samples will likely be either at X = 0 or X = 0.5, depending on the representation (for 4:2:0 formats, typical JPEG and MPEG disagree, for example); Vulkan doesn't embed that in the format, it's in the samplerYCbCrConversion, so you may have to pick a representation that may be inaccurate if the format is the only thing you have access to.

I'll look as soon as I've had lunch to see whether that 0.5 offset you mentioned is still there and what's causing it.

What I'm referring to here is running on a big-endian architecture, at which point you have multiple samples representing each channel (in order to handle the non-contiguous bits when seen in little-endian canonical order). If you had a big-endian 16-bit-per-channel YUV422 format and looked at it on a little-endian machine (or used the nominally canonical little-endian representation) then you'd have 8 samples, with two each for each channel, so a total of four covering the two Y samples. Is anyone actually supporting this? Questionable. I suspect there are more urgent things to fix.

Ah - thanks @aqnuep - sorry, crossed over. :-) I'll look at the code shortly.

Just for me to understand though, why is 0 the midpoint of the pixel for sample position if the sample positions are unsigned integers? How could then one express e.g. sample positions left of the midpoint?

Interesting. 0 isn't the midpoint so much as it's the origin of the repeating pixel pattern, which is assumed to map to the coordinates that your texture sampling code/hardware uses; in Vulkan terms they're integer i,j coordinates. That's a per-axis thing - I could imagine a 2x2 coordinate format (the result of checkerboard rendering?) with samples at 1,0 and 0,1. A 4:4:4 format which had chroma offset by half a pixel relative to luma (because... reasons?) would probably place one sample at 0 and the next at 0.5 (128). If the implementation (as with Vulkan's texturing) maps fractional coordinates such that the integer sample grid isn't a floor operation, that's its own choice.

But you do have a point; I could imagine a format where, in the interests of orthogonality, you might want to express a sample at a negative offset relative to what all the other formats you're dealing with would call "0", and that relocating a different sample to "0" would cause problems. I've kind of assumed that fractional coordinates happen at the higher coordinate end - you can have a sample at 0 and 0.5, but not -0.5 and 0. Arguably the latter is what you get if you flip a 4:2:2 image with an odd width. Cosited-even samples already have to flip to cosited-odd under similar circumstances, and transpositions already have to flip the direction of downsampling, so it's not entirely unreasonable.

I think in the specific case of a we-don't-know-what-it-is 4:2:2 format the origin should still align to the luma sample locations. You have a problem with any offset that it may not be possible to express the chroma sample location within the texel block coordinate range, if it falls on a boundary. The problem feels common enough that it's a bit unfortunate to have to rely on an extension block to disambiguate it, though. You could work around it by having the texel block size in coordinates different from the expressible subsample range, but that complicates the decoder, and once you're handling format specially the whole DFD goes out of the window.

I wonder whether this is worth adopting four of the "flags" bits to indicate inverting the coordinate system for each coordinate (unless I've forgotten using them for something; I'd thought about using a flag to indicate a bitfield channel representation, which would reduce the number of samples in some cases, but it's a relatively minor win for a doubling in the amount of decode logic). That might actually simplify the "flipping a 4:2:2" case. I need to think more, but if you have any thoughts on the subject, do let me know.

I'll admit to being lost in this conversation. As the current sample positions pre-date this PR we'll leave them in place, not delay this PR and continue the discussion. It sounds like sample positions are generally implementation dependent. Since the DFD here is describing the data in the KTX file, not for any particular implementation, maybe the 0 and 1 that was suggested somewhere in here is the best option. Answering the question of how image processing operations would use the sample positions may help to make the solution clear. It is not a question I can answer.

We need to capture this conversation to continue it after this PR is closed. Should I create an issue?

Thanks Mark - please do. It relates to 4:2:2 and 4:2:0 formats in general, but certainly not specifically this one, and we should handle it separately.

lib/dfdutils/makedfd2vk.pl

fluppeteer · 2024-03-16T19:56:57Z

I think this mostly looks good, except that the multi-channel packing convention wires got crossed. I've added notes where I think the numbers should be different. Please poke me if you change them and want a re-review, or want to push back; I'll try to respond promptly this time.

Includes restoring SCALED formats to vkformat_list.inl and exporting dfd2vk via internal exports.

to dfd2vk and support for fixed point to interpretDFD.

Remove excess space. Co-authored-by: Andrew Garrard <[email protected]>

Use defines for RGBA to YUV mapping for clarity.

fluppeteer · 2024-03-19T21:22:32Z

The #define approach looks much easier to read (not a phrase I'll often say) - thanks for clarifying, and sorry again to have caused the churn!

Assuming all the test are happy, I think this is good to go, and we can talk about coordinate offsets separately.

MarkCallow marked this pull request as draft March 1, 2024 03:09

MarkCallow requested a review from aqnuep March 1, 2024 03:10

aqnuep reviewed Mar 11, 2024

View reviewed changes

lib/dfdutils/createdfd.c Show resolved Hide resolved

MarkCallow force-pushed the issue859 branch from 83eefaa to ab9363b Compare March 14, 2024 08:14

MarkCallow marked this pull request as ready for review March 15, 2024 09:04