Allow users to mark audio as reviewed #2517

Open
AndiPersti opened this issue Aug 20, 2020 · 7 comments
Labels
enhancement Issue that describes a problem that requires a change in the current functionalities of Tatoeba.

Comments

@AndiPersti
Contributor

This was suggested by mramosch in a PM.

Similar to reviewing sentences, it should be possible to review audio recordings. This review should be displayed on the sentence page.
In addition it may be useful to add a search filter in order to only find sentences with reviewed audio.
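As a rough illustration of the two requests above (a review attached to a recording, and a filter for sentences with reviewed audio), here is a minimal Python sketch. The `AudioRecording` class, its `review` field and the `"ok"`/`"not_ok"` values are hypothetical and are not taken from the Tatoeba codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioRecording:
    """Hypothetical record for one audio file attached to a sentence."""
    sentence_id: int
    author: str
    # Assumed convention: "ok", "not_ok", or None when nobody has reviewed it yet.
    review: Optional[str] = None


def sentences_with_reviewed_audio(recordings):
    """Return sorted ids of sentences having at least one recording reviewed as "ok"."""
    return sorted({r.sentence_id for r in recordings if r.review == "ok"})


recordings = [
    AudioRecording(1, "alice", "ok"),
    AudioRecording(2, "bob"),               # not reviewed yet
    AudioRecording(3, "carol", "not_ok"),   # reviewed, but flagged
    AudioRecording(1, "dave"),              # second recording for sentence 1
]
print(sentences_with_reviewed_audio(recordings))
```

Whether a sentence with one approved and one unreviewed recording should match the filter is a design decision; this sketch treats a single approved recording as sufficient.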

AndiPersti added the enhancement label on Aug 20, 2020
@soliloquist-tatoeba

I'd like to see this feature being implemented, but is there a specific reason CMs are not allowed to check and import audio? Is it because it's too unintuitive and easy to get messed up? A native speaker can determine better if an audio recording is poorly pronounced, heavily accented or doesn't match the sentence, before it's imported. I remember recordings of some users being criticized for not sounding like a native speaker. I think it would be better to check them before importing than reviewing later.

@jiru
Member

jiru commented Aug 22, 2020

@soliloquist-tatoeba There is no specific reason. The whole process of importing audio is rather archaic and has lots of room for improvement. The current process is more or less like this:

  1. Upload files using SFTP to a temporary folder.
  2. Import all the files at once using a special admin page.

This model assumes a single user in charge of import, which is CK at the moment. As far as I know, CK also handles new audio contributors who reach out to [email protected], checking the sound quality of a few samples before sending them prepared lists made for the Shtooka recorder. Note that it’s now possible for users to create their own lists as Tatoeba lists and then download them in a specific format that Shtooka is able to read.

If we want to allow CMs to upload, check and import audio recordings, then we need to:

  1. replace the above step 1 with HTTP upload
  2. allow partial import of uploaded files based on language
  3. allow listening from the import page
  4. allow CMs to access the HTTP upload page and the import page
  5. update documentation about audio contributions
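To make item 2 of the list above (partial import based on language) more concrete, here is a small Python sketch. It assumes that uploaded files are named after the sentence id (for example `101.mp3`) and that a mapping from sentence id to language code is available; both the naming scheme and the `group_uploads_by_language` helper are assumptions for illustration, not a description of the existing import page.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_uploads_by_language(filenames, sentence_langs):
    """Group uploaded audio files by the language of the sentence they belong to.

    filenames: iterable of names like "101.mp3" (assumed naming scheme).
    sentence_langs: dict mapping sentence id (int) to a language code.
    Returns (groups, skipped): a dict of language -> filenames, and the
    files that could not be matched to a known sentence.
    """
    groups = defaultdict(list)
    skipped = []
    for name in filenames:
        stem = PurePosixPath(name).stem
        # Only purely numeric stems can be sentence ids in this sketch.
        lang = sentence_langs.get(int(stem)) if stem.isdecimal() else None
        if lang is None:
            skipped.append(name)
        else:
            groups[lang].append(name)
    return dict(groups), skipped


uploads = ["101.mp3", "102.mp3", "203.mp3", "notes.txt", "999.mp3"]
langs = {101: "deu", 102: "deu", 203: "fra"}
groups, skipped = group_uploads_by_language(uploads, langs)

# A reviewer for German could then listen to and import only this subset.
print(groups.get("deu", []))
print(skipped)
```

Reporting unmatched files separately, rather than silently dropping them, would let whoever runs the import see which uploads need attention.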

Not trivial, but definitely doable. That was for the technical part. Now, I also want to ask if putting CMs in charge of audio review and import is a good idea, and if CMs are okay with that. It’s definitely better than having CK do all the work like currently, but at least I’d like to hear other CMs’ opinion.

@jiru
Member

jiru commented Aug 22, 2020

Related: #2271

@mramosch

> A native speaker can determine better if an audio recording is poorly pronounced, heavily accented or doesn't match the sentence, before it's imported. I remember recordings of some users being criticized for not sounding like a native speaker. I think it would be better to check them before importing than reviewing later.

Proof-listening audio is not only about pronunciation and intonation but also about technical issues, e.g. chopped-off tails and heads. So I could easily check the technical aspects of sentences in languages other than my mother tongue.

However, after having listened to almost 6000 sentences by other contributors and having done a second pass on my own almost 7000 sentences, there are still almost 20,000 left for me to check in German. So this is of course the preferred way to go when you listen for interpretation, intonation etc. and technical quality at the same time ;-)

However, if there’s no way to indicate to other proof-listeners that batches of recordings have already been checked, someone else might do the same work again, just to find out that there are no flaws at all (the optimal case).

There is also no dedicated GERMAN language homepage (or one for any other language, for that matter) where German 'staff' could announce these issues in a centralized manner (without having to hope for the right search results on the Wall and having to read through tons of threads). Even leaving language bias aside, if Tatoeba offered such a facility it would attract more active people to check regularly for news on their language and field of work.

When you're doing things at scale, every click counts, so there is no time for hitting 2-4 buttons just to confirm a checked audio recording. After finishing an audio list (as I did with gretelen's audio list of 5000+ sentences, because I found a lot of mistakes when doing a quick cross-check on some of them), I want to hit one single button that batch-marks all audio recordings in one fell swoop, of course after all 298 chopped-off recordings and 100 more that had to be moved to new sentences (created manually by me as alternatives) have been handled according to their problems.

So this is a different UI (functionality/workflow) than for the average Joe who wants to mark a recording as OK once in a blue moon. This information is only really useful if it is provided at scale (for search criteria etc.), and I guess 'read only' is enough for the average user. But for us large-scale workers, 'write' needs to be much simpler, because we are talking about tens of thousands of extra clicks for a proof-listening session of 5000 sentences.
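The single-button batch action described above could look roughly like the following Python sketch. The `batch_mark_reviewed` function and the idea of passing in the ids of recordings already flagged as problematic are assumptions made for illustration; nothing like this exists in Tatoeba as far as this thread shows.

```python
def batch_mark_reviewed(list_sentence_ids, flagged_ids, reviewed):
    """Mark every recording of an audio list as reviewed in one step.

    list_sentence_ids: ids of the sentences in the audio list.
    flagged_ids: ids the proof-listener reported as problematic
                 (chopped off, wrong sentence, ...); these are left unmarked.
    reviewed: set of already reviewed ids, updated in place.
    Returns the number of recordings newly marked as reviewed.
    """
    to_mark = set(list_sentence_ids) - set(flagged_ids) - reviewed
    reviewed |= to_mark
    return len(to_mark)


reviewed = {3}  # sentence 3 was reviewed in an earlier session
count = batch_mark_reviewed(range(1, 11), flagged_ids=[2, 5], reviewed=reviewed)
print(count, sorted(reviewed))
```

The point of the sketch is that the proof-listener only has to act on the exceptions; everything else in the list is confirmed with one call.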

Moving a recording does not only mean renaming the file, but also correcting the text in the title field of the MP3 tag etc., and for hundreds of files this is not a feasible workflow. Getting thousands of new recordings to proof-listen before they are imported, just as audio files without the context of the sentence page, is equally useless: sometimes you need to read the source sentence of a translation, and even listen to the source recording, to make an informed decision about a recorded sentence. That is only possible in an environment like the sentence page itself, not with loose audio files.

My way around this missing functionality right now is to listen to all 30,000 German sentences and hope that someone at Tatoeba provides me with a list of all new German additions (in the same format as the user/audio list format), so that I can simply say, if asked, that ALL German recordings are proof-listened!

Where the rest of the community can find this information (without personally asking me or someone who happens to know that this work was already done by someone else) is still an unsolved issue...

———

@jiru
Member

jiru commented Aug 24, 2020

@mramosch Thanks a million for thoroughly clarifying your actual need. This is very valuable information that you are giving us. About having a dedicated space for a community of a certain language to share and organize itself in its own language, this is something I have been thinking about for a while too. I can’t find anything on the Wall and on Github so I’ll just create a new issue.

@mramosch

Would it be possible to have an option for e-mail notifications that owners of audio files could activate in order to be notified of new comments on their recorded sentences?

At the moment it is only the sentence owner and those who have already added a comment to the corresponding sentence. And of course explicitly those targeted with @user.

Since people who find faulty audio files usually write a comment but do not tag (@Audio: ISSUE) etc., the creator of the soundtrack cannot react or provide a correction or a replacement.
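The notification rule requested above can be sketched in Python as follows. The current recipients (sentence owner, previous commenters, users mentioned with @user) come from the description above; the opt-in set for audio authors, the function name and the simple `@(\w+)` mention pattern are assumptions for illustration.

```python
import re

# Simplified mention pattern; real usernames may allow other characters.
MENTION = re.compile(r"@(\w+)")


def comment_recipients(comment_text, comment_author, sentence_owner,
                       previous_commenters, audio_authors, audio_opt_in):
    """Return the sorted usernames to notify about a new sentence comment."""
    # Current behaviour as described: owner, earlier commenters, @mentions.
    recipients = {sentence_owner} | set(previous_commenters)
    recipients |= set(MENTION.findall(comment_text))
    # Proposed addition: audio authors who enabled the (hypothetical) option.
    recipients |= {a for a in audio_authors if a in audio_opt_in}
    # Never notify the person who wrote the comment; drop a missing owner.
    recipients.discard(comment_author)
    recipients.discard(None)
    return sorted(recipients)


result = comment_recipients(
    comment_text="The end is cut off, @mod_b please check",
    comment_author="listener",
    sentence_owner="owner_a",
    previous_commenters=["old_c"],
    audio_authors=["voice_d", "voice_e"],
    audio_opt_in={"voice_d"},
)
print(result)
```

With an opt-in like this, the author of a recording would hear about a reported problem even when the commenter does not mention them explicitly.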

I think the audio environment should (even if it is still treated as a third-class member) receive a few small extensions to raise the awareness of the community a little bit that something is happening in Audioland ;-)

@jiru
Member

jiru commented Aug 25, 2020

@mramosch Yes, of course. Please create new issues on Github for each of the improvements you’d like to see.


4 participants