Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accuraterip-checksum: convert to python C extension #274

Merged
merged 2 commits into from
Oct 26, 2019

Conversation

mtdcr
Copy link
Contributor

@mtdcr mtdcr commented May 10, 2018

  • calculate v1 and v2 checksums at once
  • let libsndfile handle both WAV and FLAC

I tested this using the following script:

#!/usr/bin/env python

import accuraterip

v1, v2 = accuraterip.compute('Nirvana - Nevermind (Remastered)/01. Nirvana - Smells Like Teen Spirit.flac', 1, 12)
assert(v1 == 0xA0D81775)
assert(v2 == 0x280694C8)

v1, v2 = accuraterip.compute('Nirvana - Nevermind (Remastered)/02. Nirvana - In Bloom.flac', 2, 12)
assert(v1 == 0x92EC1D7D)
assert(v2 == 0x1AE5FD7A)

v1, v2 = accuraterip.compute('Nirvana - Nevermind (Remastered)/12. Nirvana - Something in the Way - Endless, Nameless.flac', 12, 12)
assert(v1 == 0x3EAA3B67)
assert(v2 == 0x14C921A5)

print("OK")

This may need some more testing. When called with invalid input files, compute() returns None, None. Note that 0 may be a valid checksum, so the statement if not v1_sum in accurip.py may have produced false negatives in the past.

@mtdcr mtdcr force-pushed the accuraterip-module branch 5 times, most recently from c35368c to aca9dbc Compare May 12, 2018 12:53
@mtdcr
Copy link
Contributor Author

mtdcr commented May 12, 2018

I updated my branch to fix the encoding of the filename passed to accuraterip.compute().

@RecursiveForest RecursiveForest self-requested a review May 23, 2018 02:12
@RecursiveForest
Copy link
Contributor

I like this. I will give this a more thorough review & testing shortly. A simple to answer question is whether the existing unit tests are testing this properly or not.

@mtdcr mtdcr changed the title RFC: accuraterip-checksum: convert to python C extension accuraterip-checksum: convert to python C extension Jun 30, 2018
@MerlijnWajer
Copy link
Collaborator

What is the advantage over a binary - unix style? Seems more complex to me.

@mtdcr
Copy link
Contributor Author

mtdcr commented Jul 1, 2018

My commit removes more lines than it adds or modifies. Therefore I'd argue that complexity actually gets reduced.

Benefits are:

  • Both checksums (v1 and v2) are calculated at once, reducing execution time and use of resources.
  • No need to run make to install the executable, because the module gets installed by setup.py just like the rest of whipper.
  • Simple and stable interface between native Python code and the module.
  • Supports FLAC directly through libsndfile, removing the conversion step from flac to wav before calling the executable.

Please have a look at how simple arc.py became!

It would be possible to simplify accurip.py a little more, because the module returns either both checksums or none, whereas the executable could fail to calculate one checksum but succeed with the other (if the file becomes unreadable between both invocations), but I preferred to keep the commit small.

@MerlijnWajer
Copy link
Collaborator

MerlijnWajer commented Jul 1, 2018

I would say that (1) can be done in the binary quite easily too - but I never did it because the (initial, switching to native code instead of gst stuff) speedup was so significant I didn't need to make it more complex. (2) is fair, but binaries can also be installed by setup.py AFAIK (3) I would say that simple program output + exit code is also a stable interface (4) I purposefully re-used the flac binary, so that was a feature. ;)

I personally find python modules cumbersome to debug (more so than simply binaries for sure), but I don' feel that strongly about this in general and the interface looks slightly more simple (but a simple interface can hide a lot of complexity), so if people want this merged, it's fine by me.

@MerlijnWajer
Copy link
Collaborator

Oh, and thanks for taking the time to write this.

@mtdcr
Copy link
Contributor Author

mtdcr commented Jul 31, 2018

@RecursiveForest, are you going to review this PR?

@mtdcr
Copy link
Contributor Author

mtdcr commented Jul 31, 2018

Regarding your question about the unit tests: I guess they do cover it. See test_common_accurip.py.

Copy link
Member

@Freso Freso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of this, for the reason that whipper is currently a Python project. People who contribute to it expect its code to be Python. This PR introduced a glob of C code unique to this project, which means any maintenance would have to happen in the scope of whipper as well. This means that whipper now at any time needs contributors who can both do C and Python. (Not the most unreasonable criteria, but definitely a steeper one that Python-only.)

If accuraterip-checksum.c is going to get forked into a Python C extension (which I am not inherently against), I think it'd be better to make the fork outside of whipper and can then be pulled back in.

Copyright : GPL
Description : A C99 commandline program to compute the AccurateRip checksum of singletrack WAV files.
Implemented according to http://www.hydrogenaudio.org/forums/index.php?showtopic=97603
Git : https://github.com/leo-bogert/accuraterip-checksum
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer accurate: the code in this accuraterip-checksum.c is not the same in the linked repository.

@mtdcr
Copy link
Contributor Author

mtdcr commented Mar 1, 2019

This PR introduced a glob of C code unique to this project, which means any maintenance would have to happen in the scope of whipper as well.

I don't get your point. The "original" author didn't seem to understand C very well either. The current checksum routines were verbatim copies from a message board. Therefore there's no external maintainer and thus maintenance already happens in the scope of whipper now. Well, actually maintenance didn't happen anywhere at all, until this PR.

Second, IIRC, there was an implementation in Python, either in whipper or its predecessor, which was replaced with C code due to its bad performance. See MerlijnWajer's second comment in this PR.

Of course I'm not arguing to set this module in stone and keep it forever. Keeping it local to whipper's repository gives you the freedom to replace it with any superior solution you're going to invent in the future. And I don't even think it would be hard to do, but it may certainly be done in a second iteration.

This PR just improves the status quo a little bit, because I actually took the time to try and understand how the checksum works (and how the old checksum differs from the newer one), which the original author apparently didn't. You can even use this new code as a blueprint for efficient Python code later.

Btw., I rebased the patch locally the last time I ripped a CD. It's been a while, but I can push an updated version, if anybody is interested.

And, frankly, I don't understand why moving it out of the whipper repository would improve anything. It would just live under https://github.com/whipper-team/ and have the same maintainers as now.

@spvkgn
Copy link

spvkgn commented Mar 2, 2019

Btw., I rebased the patch locally the last time I ripped a CD. It's been a while, but I can push an updated version, if anybody is interested.

I'm use this PR patch for whipper deb package on Launchpad PPA. It would be useful if you update your branch with 0.7.3 sources. Thanks!

@MerlijnWajer
Copy link
Collaborator

Would you be up for creating an executable and also a python extension, sharing the same core?

I definitely like code and performance improvements. And yeah, the current code (in whipper) is not stellar, I didn't bother cleaning up the code, I just fixed it to work for my use case.

I like the idea of cleaned up code, I just prefer to have the feature available as a binary (as well, at least).

@JoeLametta JoeLametta changed the base branch from master to develop March 16, 2019 18:04
@mtdcr mtdcr force-pushed the accuraterip-module branch 3 times, most recently from 9cd821c to ea3b817 Compare May 9, 2019 12:20
@mtdcr
Copy link
Contributor Author

mtdcr commented May 9, 2019

@spvkgn: Sorry, I was busy with other things and forgot about this PR. Few minutes ago, I rebased it on today's develop branch. This version didn't receive any testing, but there were no significant changes since the previous attempt.

@Freso: I removed the reference to the former Git repository as requested.

@MerlijnWajer: I don't think an additional executable would improve anything, but confuse new users. Couldn't you just use https://github.com/leo-bogert/accuraterip-checksum directly, if you needed it outside of whipper?

@MerlijnWajer
Copy link
Collaborator

How would it confuse new users? Using the repo you linked directly at this point differs significantly from the code in this PR.

* calculate v1 and v2 checksums at once
* let libsndfile handle both WAV and FLAC

Signed-off-by: Andreas Oberritter <[email protected]>
@mtdcr
Copy link
Contributor Author

mtdcr commented Sep 21, 2019

@MerlijnWajer How about this script as a replacement for your binary? Is there anything else blocking this PR?

#!/usr/bin/env python
# SPDX-License-Identifier: GPL-3.0-only

import accuraterip
import sys


if len(sys.argv) == 2 and sys.argv[1] == '--version':
    print('accuraterip-checksum version 2.0')
    exit(0)

use_v1 = None
if len(sys.argv) == 4:
    offset = 0
    use_v1 = False
elif len(sys.argv) == 5:
    offset = 1
    if sys.argv[1] == '--accuraterip-v1':
        use_v1 = True
    elif sys.argv[1] == '--accuraterip-v2':
        use_v1 = False

if use_v1 is None:
    print('Syntax: accuraterip-checksum [--version / --accuraterip-v1 / --accuraterip-v2 (default)] filename track_number total_tracks')
    exit(1)

filename = sys.argv[offset + 1]
track_number = int(sys.argv[offset + 2])
total_tracks = int(sys.argv[offset + 3])

v1, v2 = accuraterip.compute(filename, track_number, total_tracks)
if use_v1:
    print('%08X' % v1)
else:
    print('%08X' % v2)

@MerlijnWajer
Copy link
Collaborator

That looks clean. Can you add that to setup.py so it gets installed as executable?

@mtdcr
Copy link
Contributor Author

mtdcr commented Oct 20, 2019

@MerlijnWajer: Done.

@RecursiveForest: Could you please review this PR or remove yourself from the list of pending reviewers?

@JoeLametta
Copy link
Collaborator

/remind me to remove RecursiveForest from the list of reviewers if we get no reply in 1 week

@MerlijnWajer
Copy link
Collaborator

@JoeLametta - I'm good with getting this merged whenever.

@MerlijnWajer
Copy link
Collaborator

Thanks for the work and putting up with me, @mtdcr ! :)

@JoeLametta JoeLametta merged commit d7f6d68 into whipper-team:develop Oct 26, 2019
@JoeLametta
Copy link
Collaborator

Merged, thanks!

@ArchangeGabriel
Copy link

So just to be sure, this means we should not use an external accuraterip-checksum anymore, right?

@JoeLametta
Copy link
Collaborator

@mtdcr Hi, starting with the next release whipper will be Python 3 only (porting progress is in #411). It seems that the current code of the accuraterip-checksum extension is not compatible with Python 3:

import accuraterip

ImportError: /home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/whipper-0.7.4.dev109+gd626c9c-py3.5-linux-x86_64.egg/accuraterip.cpython-35m-x86_64-linux-gnu.so: undefined symbol: Py_InitModule

Could you help us porting it to Python 3?

Thanks,
Joe

@JoeLametta
Copy link
Collaborator

So just to be sure, this means we should not use an external accuraterip-checksum anymore, right?

Correct.

@mtdcr
Copy link
Contributor Author

mtdcr commented Oct 30, 2019

undefined symbol: Py_InitModule

@JoeLametta can you try this patch? It passes python3 setup.py build now, but I didn't try to execute it.
mtdcr@5183f93

@JoeLametta
Copy link
Collaborator

JoeLametta commented Oct 30, 2019

Thanks, I've merged the patch in the py3 branch (commit 0fc330c): according to Travis CI the tests related to accuraterip-checksum aren't failing anymore.

@mtdcr mtdcr deleted the accuraterip-module branch October 30, 2019 20:20
@ArchangeGabriel
Copy link

@JoeLametta Thanks. Also I think that this part of the instructions should be removed since the setup.py already handles this (and the Makefile is gone for instance).

@JoeLametta
Copy link
Collaborator

I think that this part of the instructions should be removed since the setup.py already handles this (and the Makefile is gone for instance).

Done! 👌

@mtdcr
Copy link
Contributor Author

mtdcr commented Nov 6, 2019

I think that this part of the instructions should be removed since the setup.py already handles this (and the Makefile is gone for instance).

Done!

Did you forget to push it to develop?

@ArchangeGabriel
Copy link

No, he pushed as part of the Python3 porting PR #411.

@JoeLametta
Copy link
Collaborator

No, he pushed as part of the Python3 porting PR #411.

That's right (9edc98d).

archlinux-github pushed a commit to archlinux/aur that referenced this pull request Jul 30, 2022
accuraterip-checksum converted to C extension:
whipper-team/whipper#274

Switch to YAML library (ruamel) for formatting log files:
whipper-team/whipper#415

Port to Python 3:
whipper-team/whipper#411
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants