Crash in g_utf8_validate #19

Closed
szotsaki opened this issue Sep 15, 2018 · 7 comments
@szotsaki

[This is a copy of mpv-player/mpv/issues/6118].

mpv version and platform

v0.29.0
built on Jul 22 2018
ffmpeg library versions:
libavutil 56.14.100
libavcodec 58.18.100
libavformat 58.12.100
libswscale 5.1.100
libavfilter 7.16.100
libswresample 3.1.100
ffmpeg version: 4.0.2

Reproduction steps

Start a radio stream that reports an "icy-title" tag (the currently playing track) containing an invalid UTF-8 character. In my case, the title was probably in another encoding.

Expected behaviour

mpv skips the invalid character.

Actual behaviour

mpv crashes with the following message:

A: 01:09:21 / 01:09:37 (99%) Cache: 16s+498KB
File tags:
 icy-title: NEK - LAURA NON C�

(process:15290): GLib-CRITICAL **: 15:09:02.059: g_variant_new_string: assertion 'g_utf8_validate (string, -1, NULL)' failed
dbus[15290]: arguments to dbus_message_iter_append_basic() were incorrect, assertion "_dbus_check_is_valid_utf8 (*string_p)" failed in file dbus-message.c line 2754.
This is normally a bug in some application using the D-Bus library.

  D-Bus not built with -rdynamic so unable to print a backtrace

They closed it with:

Looks like a problem in the mpris plugin, as mpv by itself does not use dbus.

At the time this bug happened, I was on commit d741f7a.

@hoyon
Owner

hoyon commented Sep 16, 2018

Does your system have GLib >= 2.52? The code that ensures non-UTF-8 strings are handled properly depends on a function added in GLib 2.52, and it is skipped on systems with an older GLib version.

@lgbaldoni
Contributor

Chiming in: I have GLib 2.54.3 and I'm experiencing the same problem.

@szotsaki
Author

Yes, I have 2.56.2.

@hoyon
Owner

hoyon commented Sep 18, 2018

Do you have a link to a radio stream or file I can use to test this issue with?

@traycold

I have the same issue. Below is an example of the error, along with the URL of a radio stream that causes it. It does not always happen, only when the song title or author contains "special" characters (which is quite frequent on this station, given the type of music, classical, and its authors).

>mpv http://stream.srg-ssr.ch/m/rsc_it/aacp_96  
Playing: http://stream.srg-ssr.ch/m/rsc_it/aacp_96
 (+) Audio --aid=1 (aac 2ch 44100Hz)
AO: [pulse] 44100Hz stereo 2ch float
A: 00:00:07 / 00:00:15 (50%) Cache:  7s+152KB
File tags:
 icy-title: Leopold Antonin Kozeluch - Sinfonia in fa maggiore
A: 00:15:51 / 00:17:37 (89%) Cache: 106s+2MB
File tags:
 icy-title: Fr�d�ric Chopin - Valzer brillante in la bemolle maggiore op. 34 n. 1 

(process:3442): GLib-CRITICAL **: 14:53:32.327: g_variant_new_string: assertion 'g_utf8_validate (string, -1, NULL)' failed

@hoyon
Owner

hoyon commented Sep 28, 2018

I've had a look into the issue and it seems that, according to http://icecast.org/docs/icecast-2.4.1/config-file.html, many radio streams that are not Ogg streams use Latin-1 encoding instead of UTF-8. Radio streams can specify their character encoding, but I'm not sure whether that is actually done, or how I would get that information from mpv.

It may be possible to detect the encoding of the text and convert it accordingly, but I'm not sure how to do that reliably without introducing more bugs in the text conversion.

I am already calling g_utf8_make_valid, but it seems to let these strings through without catching the fact that they are not valid UTF-8.
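
For the Latin-1 case discussed above, a minimal sketch of the conversion could look like the following. It is plain C with no GLib dependency, and the function name `latin1_to_utf8` is hypothetical, not something taken from the plugin. It assumes the caller has already decided that the input is Latin-1; because every byte sequence is a valid Latin-1 string, the conversion itself can never signal a wrong guess, which is exactly the reliability concern raised above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: re-encode a NUL-terminated ISO-8859-1 (Latin-1)
 * string as UTF-8. Every Latin-1 byte maps to the Unicode code point with
 * the same value, so bytes >= 0x80 become a two-byte UTF-8 sequence.
 * Returns a newly allocated string (caller frees), or NULL if malloc fails.
 */
static char *latin1_to_utf8(const char *str)
{
    assert(str != NULL);

    /* Worst case: every input byte expands to two output bytes. */
    char *out = malloc(2 * strlen(str) + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (const unsigned char *s = (const unsigned char *)str; *s; s++) {
        if (*s < 0x80) {
            *p++ = (char)*s;                    /* ASCII is unchanged */
        } else {
            *p++ = (char)(0xC0 | (*s >> 6));    /* lead byte: 110xxxxx */
            *p++ = (char)(0x80 | (*s & 0x3F));  /* continuation: 10xxxxxx */
        }
    }
    *p = '\0';
    return out;
}
```

With this sketch, the Latin-1 byte 0xE9 ("é" in the "Frédéric Chopin" title reported above) becomes the two bytes 0xC3 0xA9.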

@hoyon hoyon closed this as completed in 298fec3 Jan 17, 2020
@hoyon
Owner

hoyon commented Jan 17, 2020

Sorry this took so long. I've decided the best thing to do is to call g_utf8_validate myself before passing the string to GLib, and to substitute a placeholder string if it still isn't valid after trying to fix it. I guess it's better than crashing.
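
The final guard described here can be sketched roughly as follows. This is not the code from commit 298fec3: to stay free of dependencies, a hand-written `is_valid_utf8` stands in for g_utf8_validate, the repair attempt is left out, and both the `string_or_placeholder` name and the placeholder text are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder text; the plugin's actual string may differ. */
#define PLACEHOLDER "(invalid UTF-8)"

/*
 * Stand-in for g_utf8_validate(): returns 1 if the NUL-terminated string is
 * well-formed UTF-8, 0 otherwise. Rejects overlong encodings, UTF-16
 * surrogates, and code points above U+10FFFF.
 */
static int is_valid_utf8(const char *str)
{
    assert(str != NULL);
    const unsigned char *s = (const unsigned char *)str;

    while (*s) {
        unsigned char c = s[0];
        size_t n;              /* number of continuation bytes expected */
        unsigned long cp, min; /* decoded code point and its minimum value */

        if (c < 0x80) { s++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;         /* stray continuation or invalid lead byte */

        for (size_t i = 1; i <= n; i++) {
            /* A NUL terminator fails this check, so we never read past it. */
            if ((s[i] & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (s[i] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        s += n + 1;
    }
    return 1;
}

/* Return the string itself if it is safe to hand on, else the placeholder. */
static const char *string_or_placeholder(const char *str)
{
    return is_valid_utf8(str) ? str : PLACEHOLDER;
}
```

The point of the design is that the check runs before the string reaches g_variant_new_string, so a malformed title yields a harmless placeholder instead of a failed assertion inside GLib or D-Bus.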
