-
According to the libvips color documentation, I suppose you could use Nx to do the transformation yourself, perhaps using these transforms, but that would require some development by you. Alternatively, you could leverage eVision by @cocoa-xu, which is based upon OpenCV. You could either use If you attach a sample image in YUV colourspace, I'm willing to do some experimentation on an image/evision version of what you're looking for.
-
Thanks @mat-hek. I'm a little confused by the image example.
-
You can tell from my comment that this is not an area of expertise for me at all! I've been doing some further reading and so far my findings are:
Therefore, thinking about your problem, the best I can come up with so far is:
I wonder if using
-
There is an FFmpeg command-line interface library at https://hexdocs.pm/ffmpex/readme.html that might help. PS: Not giving up on this, just not sure of the path forward.
-
OK, one more thought. I do think eVision will be the better tool for this (better than
I have no doubt that I have oversimplified the details above, but I think it's the right general direction. For converting back to YUV there appear to be a number of different color spaces that meet your requirements:
iex> Evision.Constant.cv_COLOR_RGB<tab>
cv_COLOR_RGB2BGR/0 cv_COLOR_RGB2BGR555/0 cv_COLOR_RGB2BGR565/0 cv_COLOR_RGB2BGRA/0
cv_COLOR_RGB2GRAY/0 cv_COLOR_RGB2HLS/0 cv_COLOR_RGB2HLS_FULL/0 cv_COLOR_RGB2HSV/0
cv_COLOR_RGB2HSV_FULL/0 cv_COLOR_RGB2Lab/0 cv_COLOR_RGB2Luv/0 cv_COLOR_RGB2RGBA/0
cv_COLOR_RGB2XYZ/0 cv_COLOR_RGB2YCrCb/0 cv_COLOR_RGB2YUV/0 cv_COLOR_RGB2YUV_I420/0
cv_COLOR_RGB2YUV_IYUV/0 cv_COLOR_RGB2YUV_YV12/0 cv_COLOR_RGBA2BGR/0 cv_COLOR_RGBA2BGR555/0
cv_COLOR_RGBA2BGR565/0 cv_COLOR_RGBA2BGRA/0 cv_COLOR_RGBA2GRAY/0 cv_COLOR_RGBA2mRGBA/0
cv_COLOR_RGBA2RGB/0 cv_COLOR_RGBA2YUV_I420/0 cv_COLOR_RGBA2YUV_IYUV/0 cv_COLOR_RGBA2YUV_YV12/0
If that's not enough, then using
-
Thanks for this very comprehensive answer!
Indeed it is, I should have made this clear. There is a simple file format called Y4M that holds plain YUV, but it's for video, so it can carry multiple YUV frames, and I don't think it's popular.
Indeed, FFmpeg can do this, but I think it's too heavy for the job, especially if I already have a stream of decoded YUV frames.
I'm afraid Membrane doesn't support this, because I'm trying to add it there now 😄
Here comes the first problem: how do I convert YUV to the OpenCV matrix? I cannot take it from the camera, and for
It seems doable with OpenCV, but a bit tricky, especially when the overlay image has some opacity in it. Would it be feasible to use Image to do that once I have RGB?
Makes sense
I'm good once I have the output YUV. Then I'm going to use Membrane to stream further. Thanks again, OFC I'll post here if I have something ;)
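Regarding the Y4M format mentioned above: since it is just a text header plus raw frames, wrapping a plain YUV 4:2:0 buffer in it is a cheap way to make the data viewable in common players. The following is a minimal sketch, not part of Image or Membrane. The module name `Y4MSketch`, the default frame rate of 25, and the `C420jpeg` chroma tag are assumptions chosen for illustration, and it only handles 8-bit frames with even dimensions.

```elixir
defmodule Y4MSketch do
  # Hypothetical helper: wraps raw 8-bit planar YUV 4:2:0 frames in a Y4M stream.

  # Stream header: magic string, then space-separated parameters, then a newline.
  # "Ip" marks progressive frames and "A1:1" square pixels.
  def header(width, height, fps \\ 25) do
    "YUV4MPEG2 W#{width} H#{height} F#{fps}:1 Ip A1:1 C420jpeg\n"
  end

  # Each frame is the literal "FRAME\n" marker followed by the Y, U and V planes.
  def frame(yuv, width, height) do
    expected = width * height + 2 * div(width, 2) * div(height, 2)

    if byte_size(yuv) != expected do
      raise ArgumentError, "expected #{expected} bytes, got #{byte_size(yuv)}"
    end

    ["FRAME\n", yuv]
  end

  # Concatenates the header and all frames into one binary.
  def encode(frames, width, height) do
    IO.iodata_to_binary([header(width, height) | Enum.map(frames, &frame(&1, width, height))])
  end
end
```

For example, `File.write!("image.y4m", Y4MSketch.encode([File.read!("image.yuv")], 1920, 1080))` would wrap a single 1920x1080 frame, assuming `image.yuv` holds exactly one such frame.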
-
I did some more googling and perhaps this Python lib encapsulates most of what you're after? It looks well written and should convert to Elixir/Nx reasonably well. I'm ok to tackle an Elixir implementation if you think it meets your needs. I'll make it a new library.
-
Yes, this is easy in
-
Ok, so it turns out that my friends from Video compositor had the same problem and they wrote the RGB <-> YUV 420 conversion by hand, so I rewrote it in Elixir. I suppose it's not very performant and I'm not sure I did everything right, but it seems to work! Here's the POC:
Mix.install([:vix, :image])
defmodule YUV420pToRGB do
  # Converts planar 8-bit YUV 4:2:0 (Y plane, then U, then V) to packed RGB.
  def convert(image, width) do
    # The Y plane takes 2/3 of the buffer; U and V take 1/6 each.
    y_size = trunc(byte_size(image) * 2 / 3)
    uv_size = trunc(byte_size(image) * 1 / 6)

    <<y_plane::binary-size(y_size), u_plane::binary-size(uv_size), v_plane::binary-size(uv_size)>> =
      image

    IO.iodata_to_binary(convert_rows(y_plane, u_plane, v_plane, width))
  end

  defp convert_rows(<<>>, <<>>, <<>>, _width) do
    []
  end

  # Two luma rows share a single row of U and V samples.
  defp convert_rows(y_plane, u_plane, v_plane, width) do
    uv_width = div(width, 2)
    <<y_row::binary-size(width), y_next_row::binary-size(width), y_plane::binary>> = y_plane
    <<u_row::binary-size(uv_width), u_plane::binary>> = u_plane
    <<v_row::binary-size(uv_width), v_plane::binary>> = v_plane

    [
      convert_pixels(y_row, u_row, v_row),
      convert_pixels(y_next_row, u_row, v_row) | convert_rows(y_plane, u_plane, v_plane, width)
    ]
  end

  defp convert_pixels(<<>>, <<>>, <<>>) do
    []
  end

  # Two horizontally adjacent luma samples share one U and one V sample.
  defp convert_pixels(<<y1, y2, y_row::binary>>, <<u, u_row::binary>>, <<v, v_row::binary>>) do
    [convert_pixel(y1, u, v), convert_pixel(y2, u, v) | convert_pixels(y_row, u_row, v_row)]
  end

  # Full-range BT.601 coefficients (the same ones JPEG/JFIF uses).
  defp convert_pixel(y, u, v) do
    r = clamp(y + 1.40200 * (v - 128.0))
    g = clamp(y - 0.34414 * (u - 128.0) - 0.71414 * (v - 128.0))
    b = clamp(y + 1.77200 * (u - 128.0))
    <<r, g, b>>
  end

  defp clamp(number) when number > 255, do: 255
  defp clamp(number) when number < 0, do: 0
  defp clamp(number), do: round(number)
end
defmodule RGBToYUV444p do
  # Converts packed RGB to planar 8-bit YUV 4:4:4 (full Y, U and V planes).
  def convert(image) do
    do_convert(image, {[], [], []})
  end

  # The planes are accumulated in reverse, so flip them before concatenating.
  defp do_convert(<<>>, {y_plane, u_plane, v_plane}) do
    IO.iodata_to_binary([Enum.reverse(y_plane), Enum.reverse(u_plane), Enum.reverse(v_plane)])
  end

  defp do_convert(<<r, g, b, image::binary>>, {y_plane, u_plane, v_plane}) do
    y = clamp(0.299 * r + 0.587 * g + 0.114 * b)
    u = clamp(-0.168736 * r - 0.331264 * g + 0.5 * b + 128.0)
    v = clamp(0.5 * r + -0.418688 * g + -0.081312 * b + 128.0)
    do_convert(image, {[y | y_plane], [u | u_plane], [v | v_plane]})
  end

  defp clamp(number) when number > 255, do: 255
  defp clamp(number) when number < 0, do: 0
  defp clamp(number), do: round(number)
end
defmodule YUV444pToYUV420p do
  # Subsamples the U and V planes by 2 in both directions; the Y plane is kept as-is.
  def convert(image, width) do
    plane_size = byte_size(image) |> div(3)

    <<y_plane::binary-size(plane_size), u_plane::binary-size(plane_size),
      v_plane::binary-size(plane_size)>> = image

    IO.iodata_to_binary([y_plane, convert_rows(u_plane, width), convert_rows(v_plane, width)])
  end

  defp convert_rows(<<>>, _width) do
    []
  end

  defp convert_rows(plane, width) do
    <<row::binary-size(width), next_row::binary-size(width), plane::binary>> = plane
    [convert_pixels(row, next_row) | convert_rows(plane, width)]
  end

  defp convert_pixels(<<>>, <<>>) do
    []
  end

  # Each output sample is the average of a 2x2 block of input samples.
  defp convert_pixels(<<a, b, row::binary>>, <<c, d, next_row::binary>>) do
    result = round((a + b + c + d) / 4)
    [<<result>> | convert_pixels(row, next_row)]
  end
end
yuv = File.read!("image.yuv")
rgb = YUV420pToRGB.convert(yuv, 1920)
{:ok, vix} = Vix.Vips.Image.new_from_binary(rgb, 1920, 1080, 3, :VIPS_FORMAT_UCHAR)
overlay = Image.open!("membrane.png")
composed = Image.compose!(vix, overlay)
{:ok, composed_srgb} = Vix.Vips.Image.write_to_binary(composed)
# It seems that Vips cannot convert to plain RGB, so drop the alpha band by hand
composed_rgb = for <<r, g, b, _a <- composed_srgb>>, do: <<r, g, b>>, into: <<>>
yuv444 = RGBToYUV444p.convert(composed_rgb)
output_yuv = YUV444pToYUV420p.convert(yuv444, 1920)
# Convert back to RGB to inspect the result
rgb = YUV420pToRGB.convert(output_yuv, 1920)
{:ok, vix} = Vix.Vips.Image.new_from_binary(rgb, 1920, 1080, 3, :VIPS_FORMAT_UCHAR)
Image.write!(vix, "output.jpeg")

Output:

I may need support for yuv422 and yuv444 and maybe others - I'm not sure if there's something else actually used out there - so if you're willing to help, we can collaborate and make it image_yuv ;) The Python library you mentioned seems right, but it seems it only supports yuv444.
-
That's very cool, thank you.
It supports
It also supports:
-
Net net, I'm going to try and re-implement the Python package and use your code (and hopefully your engagement) to verify, test and improve.
-
I'm creating a repo called Integrate with
-
And ... small change in plan. I've been overcomplicating things. I'm just going to add an I also have the matrices for converting between BT.601 (SD), BT.709 (HD) and BT.2020 (UHDTV) and RGB. I expect to have a version you can test out on your Tuesday. I've learned a lot already from this - thanks for the opportunity.
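To illustrate where such matrices come from: each standard publishes two luma coefficients (Kr and Kb), and the 3x3 RGB to YCbCr matrix follows from them. The sketch below derives the full-range matrix for the three standards. It is an illustration only and not the code that ended up in Image; the module name `YCbCrMatrixSketch` is hypothetical, and it assumes full-range 8-bit values, whereas video is often limited-range.

```elixir
defmodule YCbCrMatrixSketch do
  # Luma coefficients {Kr, Kb} from each ITU-R recommendation.
  @coefficients %{
    bt601: {0.299, 0.114},
    bt709: {0.2126, 0.0722},
    bt2020: {0.2627, 0.0593}
  }

  # Returns the 3x3 full-range RGB -> YCbCr matrix as a list of rows.
  def rgb_to_ycbcr(standard) do
    {kr, kb} = Map.fetch!(@coefficients, standard)
    kg = 1.0 - kr - kb

    [
      [kr, kg, kb],
      [-kr / (2 * (1 - kb)), -kg / (2 * (1 - kb)), 0.5],
      [0.5, -kg / (2 * (1 - kr)), -kb / (2 * (1 - kr))]
    ]
  end

  # Applies the matrix to one 8-bit pixel and adds the 128 offset to the chroma values.
  def convert_pixel({r, g, b}, standard) do
    [y, cb, cr] =
      for [mr, mg, mb] <- rgb_to_ycbcr(standard), do: mr * r + mg * g + mb * b

    {clamp(y), clamp(cb + 128), clamp(cr + 128)}
  end

  defp clamp(value), do: value |> round() |> max(0) |> min(255)
end
```

For `:bt601` this reproduces the coefficients used in the hand-written POC earlier in the thread (for example, -0.168736 and -0.331264 in the Cb row).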
-
One other limitation (at least in the first iteration) will be that the data is 8 bits per channel. It shouldn't be too hard to adapt to 10 or 12 bit data. Is that something you're likely to need? I'm not confident the rest of
-
I've got a basic implementation up and running on my Mac M1 Max, 64GB. The results are not correct, but the computational effort isn't likely to change. This is using your 1920x1080 test image at the top of this thread. The results are quite promising:
Less than a millisecond and less than 12KB of memory is pretty good, I think. The only thing is that the results are incorrect :-). Well, it's late at night so I'll tackle it again in the morning.
-
The code is located at https://github.com/elixir-image/image/blob/yuv/lib/image/yuv.ex but it certainly isn't producing the expected result yet. I'm following the principles of libvips/libvips#2561 where the key function is
You can try it for yourself if you're curious by:
{:ok, f} = File.open("/path/to/image.yuv")
data = IO.binread(f, :all)
decoded = Image.YUV.decode(data, 1920, 1080, :C420)
{:ok, rgb} = Image.YUV.to_rgb(decoded, 1920, 1080, :C420, :bt601)
Time for sleep.
-
The sleep helped. We are up and running for YUV image decoding. In the The performance looks pretty good (all due to
I'll work next on image encoding to YUV, which should be pretty straightforward now. Feel free to try out the
-
I've added The performance of converting RGB to YUV is much slower than YUV to RGB and I'm not sure why. I'll look into it later today - it won't change the API. For now I think this is API complete other than Here is the benchmark data:
-
The YUV->RGB->YUV conversion works well, thanks! I have some trouble trying to convert RGB->YUV after doing the composition:
overlay = Image.open!("membrane.png")
yuv = File.read!("image.yuv")
{:ok, rgb} = Image.YUV.new_from_binary(yuv, 1920, 1080, :C420)
composed = Image.compose!(rgb, overlay)
{:ok, output_yuv} = Image.YUV.write_to_binary(composed, :C420)
I'm getting
-
That will be because your composed image has an alpha band and will need to be flattened first. The error messages aren't great, I agree, but they come straight from Would you try a I'm hesitant to flatten by default, but perhaps I should since the code doesn't support YUV with alpha anyway. Thoughts?
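For readers unfamiliar with the term, "flattening" means compositing the RGBA pixels over a solid background so that the alpha band can be dropped, leaving three bands for the YUV conversion. A minimal pure-Elixir sketch of that idea on packed 8-bit RGBA data follows. It is not how libvips or Image implement it; the module name `FlattenSketch` and the default black background are assumptions for illustration.

```elixir
defmodule FlattenSketch do
  # Composites packed 8-bit RGBA pixels over a solid background colour and
  # drops the alpha band, returning packed RGB.
  def flatten(rgba, background \\ {0, 0, 0}) do
    {bg_r, bg_g, bg_b} = background

    for <<r, g, b, a <- rgba>>, into: <<>> do
      <<blend(r, bg_r, a), blend(g, bg_g, a), blend(b, bg_b, a)>>
    end
  end

  # "Over" blend of one channel with non-premultiplied alpha, rounded to nearest.
  defp blend(fg, bg, alpha), do: div(fg * alpha + bg * (255 - alpha) + 127, 255)
end
```

Fully opaque pixels pass through unchanged and fully transparent pixels become the background colour, so the output always has exactly three bytes per pixel.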
-
OK, I've pushed a commit that flattens the image in
-
Handling libvips errors is a broader problem - fix it in one place and 100 more crop up. I'm still trying to find a better way to wrap those errors but I haven't got there yet.
-
I'll work on some tests and improve the docs but otherwise I think Is there anything else you need from
-
That's possibly related to the fact that I'll also run your script on my system (default concurrency of 10) and see what result I get.
-
I'm seeing only a 20% speed up too. I've tried a lot of different combinations with no change in concurrency. I am seeing CPU utilisation of over 90% (of the available 10 cores) though, so I assume the issue is CPU exhaustion, not some kind of lock contention. I looked at CPU utilisation just running one conversion (no concurrency) and it used up to 6 cores, so libvips is definitely threading (as expected). The key performance constraint is the chroma subsampling (image resizing in libvips terms). I'll try a different sampling kernel and see if that makes a performance difference. Of course, even if it does, it may have an effect on image quality (chroma at least).
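To make the speed versus quality trade-off concrete, here is a small pure-Elixir sketch comparing two ways of halving a chroma plane: nearest-neighbour, which keeps one sample per 2x2 block, and a box filter, which averages the block. These are not the libvips kernels, just the two simplest options; the module name `ChromaSubsampleSketch` is hypothetical, and the plane is assumed to be 8-bit with even width and height.

```elixir
defmodule ChromaSubsampleSketch do
  # Nearest-neighbour: keep the top-left sample of every 2x2 block (cheapest).
  def nearest(plane, width) do
    for [top, _bottom] <- row_pairs(plane, width),
        [sample, _right] <- Enum.chunk_every(top, 2, 2, :discard),
        into: <<>>,
        do: <<sample>>
  end

  # Box filter: average every 2x2 block (smoother chroma, more arithmetic).
  def box(plane, width) do
    for [top, bottom] <- row_pairs(plane, width),
        {[a, b], [c, d]} <-
          Enum.zip(
            Enum.chunk_every(top, 2, 2, :discard),
            Enum.chunk_every(bottom, 2, 2, :discard)
          ),
        into: <<>>,
        do: <<div(a + b + c + d + 2, 4)>>
  end

  # Splits a packed plane into pairs of rows, each row a list of samples.
  defp row_pairs(plane, width) do
    plane
    |> :binary.bin_to_list()
    |> Enum.chunk_every(width)
    |> Enum.chunk_every(2, 2, :discard)
  end
end
```

The nearest-neighbour version touches a quarter of the samples but discards the rest, which can produce visible colour fringing on sharp edges; the box filter reads every sample.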
-
I've managed to get the to_yuv function down from 58ms to 43ms on my machine. Still not fast enough for your needs, I know. The improvement comes from using Further testing shows the biggest performance hit is I did some testing using Nx which did not improve performance. Since the bottleneck is Still not giving up, I have some more things to try.
Basic conversion
Convert to YUV: 3.8ms
Full encode: 43ms
-
So, for the record, here's what I ended up with so far. This script puts an overlay image over a video:
Mix.install([
  :membrane_h264_plugin,
  :membrane_h264_ffmpeg_plugin,
  :membrane_file_plugin,
  :membrane_hackney_plugin,
  :req,
  :image
])
defmodule Membrane.OverlayFilter do
  use Membrane.Filter
  alias Membrane.RawVideo
  alias Vix.Vips.Image, as: Vimage
  alias Vix.Vips.Operation

  def_input_pad :input, accepted_format: RawVideo
  def_output_pad :output, accepted_format: RawVideo
  def_options overlay_image: [spec: String.t()]

  @impl true
  def handle_init(_ctx, options) do
    overlay_planes = open_overlay(options.overlay_image)
    {[], %{overlay_planes: overlay_planes}}
  end

  @impl true
  def handle_buffer(:input, buffer, ctx, state) do
    %RawVideo{width: width, height: height} = ctx.pads.input.stream_format
    {overlay_y, overlay_u, overlay_v} = state.overlay_planes
    {image_y, image_u, image_v} = open_planes(buffer.payload, width, height)
    composed_y = compose_to_binary(image_y, overlay_y)
    composed_u = compose_to_binary(image_u, overlay_u)
    composed_v = compose_to_binary(image_v, overlay_v)
    composed = composed_y <> composed_u <> composed_v
    {[buffer: {:output, %{buffer | payload: composed}}], state}
  end

  defp open_overlay(path) do
    overlay = Image.open!(path)
    {:ok, overlay_yuv} = Image.YUV.write_to_binary(overlay, :C420)
    planes = open_planes(overlay_yuv, Image.width(overlay), Image.height(overlay))
    add_alpha(planes, overlay)
  end

  defp open_planes(yuv, width, height) do
    half_width = div(width, 2)
    half_height = div(height, 2)
    y_size = width * height
    uv_size = half_width * half_height
    <<y::binary-size(y_size), u::binary-size(uv_size), v::binary-size(uv_size)>> = yuv
    {:ok, y} = Vimage.new_from_binary(y, width, height, 1, :VIPS_FORMAT_UCHAR)
    {:ok, u} = Vimage.new_from_binary(u, half_width, half_height, 1, :VIPS_FORMAT_UCHAR)
    {:ok, v} = Vimage.new_from_binary(v, half_width, half_height, 1, :VIPS_FORMAT_UCHAR)
    {y, u, v}
  end

  defp add_alpha(planes, image) do
    alpha = image[3]
    downsized_alpha = Operation.subsample!(alpha, 2, 2)
    {y, u, v} = planes
    y = Operation.bandjoin!([y, alpha])
    u = Operation.bandjoin!([u, downsized_alpha])
    v = Operation.bandjoin!([v, downsized_alpha])
    {y, u, v}
  end

  defp compose_to_binary(image, overlay) do
    composed = Image.compose!(image, overlay, x: -1, y: 0)
    {:ok, binary} = Vimage.write_to_binary(composed[0])
    binary
  end
end
defmodule Example do
  def run() do
    import Membrane.ChildrenSpec
    pipeline = Membrane.RCPipeline.start_link!()

    File.write!(
      "membrane.png",
      Req.get!("https://avatars.githubusercontent.com/u/25247695?s=200&v=4").body
    )

    Membrane.RCPipeline.exec_actions(pipeline,
      spec:
        child(%Membrane.Hackney.Source{
          location:
            "https://raw.githubusercontent.com/membraneframework/static/gh-pages/samples/big-buck-bunny/bun33s_720x480.h264",
          hackney_opts: [follow_redirects: true]
        })
        |> child(Membrane.H264.Parser)
        |> child(Membrane.H264.FFmpeg.Decoder)
        |> child(%Membrane.OverlayFilter{overlay_image: "membrane.png"})
        |> child(Membrane.H264.FFmpeg.Encoder)
        |> child(%Membrane.File.Sink{location: "output.h264"})
    )
  end
end

Example.run()
Process.sleep(:infinity)

Thanks again @kipcole9 and @akash-akya for your help!
-
Well done @mat-hek. I'm going to publish an update to Given you aren't using anything
-
One other small thing would be to test on an image that has an odd number of pixels on one or both dimensions. The integer division when calculating the size of the U and V planes might not work as expected. I think I've seen some other implementations that do the equivalent of
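To show why odd dimensions matter: with plain integer division, a 5-pixel-wide image gets only 2 chroma columns, so the last luma column has no chroma sample. One common convention (not necessarily the one referred to above) is to round the chroma dimensions up instead. The sketch below computes plane sizes that way for 8-bit YUV 4:2:0; the module name `PlaneSizeSketch` is hypothetical, and whether a given producer actually pads its planes like this is an assumption that should be checked against real data.

```elixir
defmodule PlaneSizeSketch do
  # Plane sizes in bytes for 8-bit YUV 4:2:0, rounding chroma dimensions up so
  # the last column and row of an odd-sized image still get a chroma sample.
  def i420_sizes(width, height) do
    chroma_width = div(width + 1, 2)
    chroma_height = div(height + 1, 2)
    chroma_size = chroma_width * chroma_height
    %{y: width * height, u: chroma_size, v: chroma_size}
  end

  # Splits a buffer into its three planes using the rounded-up sizes.
  def split(yuv, width, height) do
    %{y: y_size, u: u_size, v: v_size} = i420_sizes(width, height)
    <<y::binary-size(y_size), u::binary-size(u_size), v::binary-size(v_size)>> = yuv
    {y, u, v}
  end
end
```

For even dimensions the result is identical to using `div(width, 2)` and `div(height, 2)`, so the rounding only changes behaviour in the odd case.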
-
I've published Image version 0.41.0 with the following changelog entry:
Enhancements
-
Hi there, thanks for a great library! I'd like to overlay a PNG image on top of a YUV image and get a YUV image. Does Image support YUV?