Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3 Support #8

Closed
lxp opened this issue Mar 11, 2019 · 41 comments
Closed

Python 3 Support #8

lxp opened this issue Mar 11, 2019 · 41 comments
Milestone

Comments

@lxp
Copy link
Member

lxp commented Mar 11, 2019

Python 2 will be end-of-life in 2020 (https://www.python.org/dev/peps/pep-0373/#update), so we have to add support for Python 3.

@lxp lxp modified the milestones: 2.0, 3.0 Mar 11, 2019
@stevenxxiu
Copy link
Contributor

Is Python 2 still needed? If not I can try port it.

@lxp
Copy link
Member Author

lxp commented Jan 3, 2020

I would be happy to have your help for porting to Python 3. The main goal is to have it running on Python 3. As a bonus it would be great, if it still runs on Python 2.7 with compatibility layers like python-future [1].

I already gave it a try last year using futurize [2] and manual fixes. I have now pushed my work in the python3 branch. You can use this as a starting point, if you want. However, the test cases do not pass yet.

To run the unit and integration tests, you can use the following commands:

~/cfv/test$ python3 test.py --unit --exit-early
*** testing all_unittests_suite
test_lchoplen_compose (test_strutil.chopTestCase) ... ok
test_lchoplen_simple (test_strutil.chopTestCase) ... ok
test_lchoplen_wide (test_strutil.chopTestCase) ... ok
test_rchoplen_compose (test_strutil.chopTestCase) ... ok
test_rchoplen_simple (test_strutil.chopTestCase) ... ok
test_rchoplen_wide (test_strutil.chopTestCase) ... ok
test_compose_noprecombined (test_strutil.uwidthTestCase) ... ok
test_halfwidth (test_strutil.uwidthTestCase) ... ok
test_nonspacing (test_strutil.uwidthTestCase) ... ok
test_simple (test_strutil.uwidthTestCase) ... ok
test_wide (test_strutil.uwidthTestCase) ... ok
test_get_path_key (test_caching.AbsPathKeyTest) ... ok
test_rename (test_caching.AbsPathKeyTest) ... ok
test_nocase_findfile (test_caching.RelPathKeyTest) ... ok
test_nocase_findfile_parent (test_caching.RelPathKeyTest) ... ok
path_split (cfv.osutil)
Doctest: cfv.osutil.path_split ... ok
strippath (cfv.osutil)
Doctest: cfv.osutil.strippath ... ok
lchoplen (cfv.strutil)
Doctest: cfv.strutil.lchoplen ... ok
rchoplen (cfv.strutil)
Doctest: cfv.strutil.rchoplen ... ok
human_int (benchmark)
Doctest: benchmark.human_int ... ok

----------------------------------------------------------------------
Ran 20 tests in 0.072s

OK
.OK (False)

~/cfv/test$ python3 test.py --exit-early
>>> testing...
...................
>>> failed test: ~/cfv/test/cfv -ZNVRMUI --unquote=no --fixpaths="" --strippaths=0 --showpaths=auto-relative --progress=no --announceurl=url --noprivate_torrent -r -p /tmp/tmpgudzgbaz --renameformat="%(name)s-%(count)i%(ext)s" -C -t sha512 
/tmp/tmpgudzgbaz/test.ext.end : undecodable filename
/tmp/tmpgudzgbaz/test2.foo : undecodable filename
/tmp/tmpgudzgbaz/test3 : undecodable filename
/tmp/tmpgudzgbaz/d2/test4.foo : undecodable filename
4 files, 0 OK, 4 file errors.  0.002 seconds, 0.0K/s
FAILED (16)
  File "test.py", line 1890, in <module>
    failed += all_tests()

  File "test.py", line 1636, in all_tests
    ren_test(fmt)

  File "test.py", line 914, in ren_test
    test_generic('%s -C -t %s' % (cmd, f), cfv_test)

  File "test.py", line 287, in test_generic
    test_log_results(cfvtest.cfvenv + cfvtest.cfvfn + ' ' + cmd, s, o, r, kw)

  File "test.py", line 272, in test_log_results
    test_log_finish(cmd, s, r, o, kw)

  File "test.py", line 255, in test_log_finish
    traceback_str = '\n'.join(traceback.format_stack())

[1] https://python-future.org/
[2] https://python-future.org/futurize.html

@stevenxxiu
Copy link
Contributor

It's been quite tedious so far, and I reckon having it working with Python 3 is more than enough. Got most of the tests working now, just 32 left!

@stevenxxiu
Copy link
Contributor

I'm not sure if I'll continue porting this however, the code doesn't look all that well written from reading it. Maybe I'm better off using other utilities.

But would be happy if you can continue where I left off.

@lxp
Copy link
Member Author

lxp commented Jan 6, 2020

Thanks for your effort! I am sad to hear that you stopped your porting work for now. However, I think you already got a lot further with your port, than my last try. I know the code base is old and not easy to port. Did you find any major show stopper that requires a lot of re-work?

I do not know about any similar tool with that many features, that's why I am trying to keep cfv alive. If you find some other tool with a similar feature-set, I am happy to hear about it myself.

I will give your branch a try, once time allows it.
If you decide to continue your work, you are welcome of course :)

@stevenxxiu
Copy link
Contributor

Well the tool has good intentions but it just isn't that well written. This makes me rather nervous in using it. The main problem I saw was that it treated strings and bytes the same, and lots of functions were called assuming they were the same thing. Files were opened in both 'w' and 'wb' and passed to functions as if they were the same thing.

The torrent files were parsed in a rather non-documented way, with the torrent file specifying its own encoding. I somehow got those tests to pass.

I agree there aren't any similar tools, but there are individual tools for their own file formats, like cksfv for cfv and md5sum, sha1sum for the hash files, aria2c for torrent files, etc. I'm gonna just use the individual tools.

But yes I can continue a bit more today see if I can make more progress. I don't intend on using the tool though.

@stevenxxiu
Copy link
Contributor

stevenxxiu commented Jan 7, 2020

Alright got the tests down to 26 failing ones. The remaining ones all appear to be due to a similar reason. I'm not quite sure what the test does however:

>>> failed test: stdin/out of -ZNVRMUI --unquote=no --fixpaths="" --strippaths=0 --showpaths=auto-relative --progress=no --announceurl=url --noprivate_torrent -tbsdmd5 -C -f- with file data4 
MD5 (data4) = 96d879de0782e286d4031a8de9e351f2
-: 1 files, 1 OK.  0.000 seconds, 19038674702.3K/s
- : undecodable filename
1 files, 0 OK, 1 file errors.  0.000 seconds, 0.0K/s
FAILED (3) ((0, 16))

Perhaps you could debug it with both Python 2 & 3 and see how it's failing?

I might stop working on it here on, since doubt I'm gonna use this in the future anyway.

@lxp
Copy link
Member Author

lxp commented Jan 8, 2020

Thanks for your hard work! I will look into it, when I have some free time.

@alfs
Copy link

alfs commented Apr 11, 2020

Just a heads up - cfv is marked for autoremoval from testing on 2020-04-17
because of the python2 removal (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=936288).

I guess we can re-add the package later on unless the current state of the py3 port is complete enough?

@lxp
Copy link
Member Author

lxp commented Apr 15, 2020

Thanks for this heads up!
Sadly, it is unrealistic that there will be a working py3 version until 2020-04-17.
However, my plan is still to release a Python 3 version.
Is there a way to extend the deadline and keep the py2 version a bit longer in Debian?

@mmuehlenhoff
Copy link

cfv isn't going to be removed entirely from Debian by 2020-04-17, but only from testing, see https://en.wikipedia.org/wiki/Debian#Branches

As such. it won't be part of the next stable release unless it gets ported to Python 3. If a port is complete and upload to Debian by end of the year, that will still allow it to re-enter for the next release.

@alfs
Copy link

alfs commented Apr 17, 2020

Sorry, don't think so. It would block removal of python2. Anyway, lets just ITP it once py3 version is functional enough. By "functional", I think we can disable certain functions if they are blocking the porting progress (i.e. perhaps fixing some uncommon checksum files can be postponed to later if there are py3 problems with those).

@bentolor
Copy link

As cfv is no longer present in Ubuntu 20.04 LTS (and I also removed pyhton2 for good, finally): Any recommendations for similar working alternatives?

@stevenxxiu
Copy link
Contributor

@bentolor Some alternatives I use are md5sum, sha1sum, cksfv. For torrents I use aria2c to verify checksums.

@Clover2k
Copy link

Clover2k commented Apr 30, 2020

@bentolor Some alternatives I use are md5sum, sha1sum, cksfv. For torrents I use aria2c to verify checksums.

I tried cksfv a few days ago but in checking my (very) old sfv files on CD and some DVD, it doesn't find the content of the subdirectories and it gives error even if the file exists and is intact.
md5sum and sha1sum in my case are not useful since I used a different hash algoritm.

@amazingdash
Copy link

@bentolor @stevenxxiu @Clover2k rhash with -rc does more or less the same thing as cfv, performance is good and it supports most hash formats including bittorrent. It lacks cfv's -m though.

@matt-mueller
Copy link

Hi, just thought some background might be helpful, or at least interesting. cfv started back in the Python 1.5.2 days before unicode string was a thing. I was working on converting it to use the Python2 unicode support but it might have been in a weird state when I stopped. If I recall, the reason for passing around byte/unicode strings equivalently was to try to handle invalid filename encodings. At least in Python 2, I recall that if filenames didn't decode properly with the current locale, they would be returned as a byte string instead of a unicode string. I don't know how Python 3 handles this. That might also be the reasoning behind the non-standard torrent handling that was mentioned, I don't recall that.
Now, whether these attempts to handle invalid encodings was wise or not is another question...

I also apologize for the inscrutable tests.

@lxp
Copy link
Member Author

lxp commented Jul 25, 2020

@matt-mueller Thanks for sharing these insights. I would be happy if you could join the project here at Github. I think Github encourages people more to contribute to the project than Sourceforge does.

I still plan to have a working cfv version for Python 3 at some point. It is just not progressing that fast because at least for myself the pain of still having Python 2 around was not that big.

@Terry-Kennedy
Copy link

I tried the cfv-python3.zip file with code from 05-24-2020 (latest) with Python 3.8 on FreeBSD 12.2, and it appears that there are some new incompatibilities. Trying to use cfv as-is results in:
TypeError: a bytes-like object is required, not 'str'
I'm not really familiar with Python programming, but a wholesale replacement of open mode 'rb' with mode 'r' results in:
TypeError: can't concat str to bytes
Not that it would be a workable solution, since that would probably cause problems with newlines on at least some platforms.

Can someone suggest a path forward, or at least things for me to try? I don't think trying to futurize the code again would work as there were additional changes made after futurizing. I've looked at the alternatives and none of them seem appealing:
cksfv - doesn't create, only verifies
bsdsfv - limited to 1024 files for some reason
pure-sfv - doesn't seem to display progress information, even with -v

@lxp
Copy link
Member Author

lxp commented Jul 7, 2021

Sadly, Python 3 support is still in an incomplete state. I started the Python 3 port some time ago in the python3 branch. In the meantime @stevenxxiu continued the work in pull request #11. Some tests are still failing, so it is not ready for release yet. However, you could give it a try.
If you invest some time testing, I would appreciate it if you could report back your findings in pull request #11.
Also further contributions are welcome :)

@Terry-Kennedy
Copy link

I was able to build and install the code from there. A quick test of functionality seems to be working. What is the method to run the tests without installing external test scaffolding? A simple:
python3.8 setup.py install test
ends with:
running test
WARNING: Testing via this command is deprecated and will be removed in a future
version. Users looking for a generic test entry point independent of test runner
are encouraged to use tox.

I am very encouraged by the progress made by others so far and would like to help. Also, should the various changes made to the cfv-project/cfv master branch since the python3 fork be looked at and possibly cherry-picked for inclusion in the stevenxxiu/cfv python3 branch?

@lxp
Copy link
Member Author

lxp commented Jul 8, 2021

You can run the tests via python3 test/test.py. You can specify a few options (see --help) but most important right now are the tests that are run without options. You can also take a look at the Travis CI pipeline for a few hints: https://github.com/cfv-project/cfv/blob/master/.travis.yml

I took some time to work through some changes made by @stevenxxiu. I ended up cherry-picking some fixes and also implementing some fixes differently (see my recent push to python3 branch). I only worked through to commit b69e34a, so @stevenxxiu's branch still has a lot more fixes. However, I want to carefully review them, as a few fixes are suboptimal, because they do not fix the root-cause, which most of the time was introduced by my previous porting effort.

The changes from the master branch are currently not that important for the python3 branch, as they are only changing the CI pipeline from Travis CI to Github Actions. Of course, in the end we want to have those changes in the python3 version, but there is still a long way to go. In the python3 branch we are still struggling to have the test suite not crash... I am not even talking about having all tests pass.

@stevenxxiu
Copy link
Contributor

@Terry-Kennedy

cksfv - doesn't create, only verifies

cksfv both creates and verifies. See https://man.archlinux.org/man/cksfv.1#EXAMPLES.

@lxp
Copy link
Member Author

lxp commented Jul 11, 2021

I made a lot of progress in the last days. All normal testcases pass now, at least on my Linux system. However, I haven't tried to use it in real-world scenarios yet. Also there seem to be some issues with the long tests.

@Terry-Kennedy I think it would be a good opportunity, if you could try the python3 branch on your system. Please also run the test suite with something like python3 test/test.py and report back any issues, preferably in new Github issues.

@Terry-Kennedy
Copy link

I was finally able to get back to this today (a combination of work and vacation was interfering). It looks pretty good in summary:

tests finished: ok: 3167 failed: 1
which is a vast improvement over my previous testing (where the test scaffolding itself fails after reporting a bunch of individual failed tests).

The one failure in both "testing" and "testing without mmap" is:

failed test: /sysprog/terry/cfv-python3/test/cfv -ZNVRMUI --unquote=no --fixpaths="" --strippaths=0 --showpaths=autorelative --progress=no --announceurl=url --noprivate_torrent --strippaths=none -T -f teststrip-none.csv4
/data1 : file size does not match (13!=33)
teststrip-none.csv4: 1 files, 0 OK, 1 badsize. 0.000 seconds, 0.0K/s
FAILED (4)
File "test/test.py", line 1876, in
File "test/test.py", line 1719, in all_tests
File "test/test.py", line 269, in test_generic
File "test/test.py", line 254, in test_log_results
File "test/test.py", line 237, in test_log_finish

I also get a bunch of:
max open files is big (3771108) clipping to 4096. Use --long to try the real value
which I assume is just because I'm running this on an account with no limits at all.

This is with Python 3.8.11 on FreeBSD 12.2-STABLE r370177 if that matters.

Thanks to everyone for your hard work on this! With a little further testing we should be able to get this back into the FreeBSD ports tree.

@Terry-Kennedy
Copy link

I decided to create a FreeBSD port in case anyone wants to try this. This is not an official port although I plan on submitting it once we're all happy with with the state of the python3 branch.

FreeBSD-cfv-port.zip

@lxp
Copy link
Member Author

lxp commented Jul 30, 2021

@Terry-Kennedy Thank you for testing.

I think the failure is just a hiccup with the test suite on your system. Basically, the test suite assumes that /data1 does not exist. However, it seems to exist on your system, which leads to this unexpected error. Can you verify this?

The clipping of max open files is a feature of the test suite, it keeps the test duration and resource usage within bounds. I would suggest you to reduce the max open files limit to <=4096 before running the tests (on Linux you can do this using ulimit -n 4096, not sure how to do it on FreeBSD). If the max open files limit is clipped, the manyfiles_test cannot really detect issues with leaking file descriptors. Alternatively, you could add the --long option to also enable longer and more I/O intensive tests, which also runs the manyfiles_test with your real limit. With your current limit the test will create more than 3771108 files.

Thanks for creating the FreeBSD port. I would be glad, if you could submit it, when the time has come :)

@Terry-Kennedy
Copy link

Yes, I do have a partition named /data1 (I'm the guy with 512TB in his dining room 8-):
Filesystem Size Used Avail Capacity Mounted on
rz1:/storage/data 83T 22T 61T 27% /data1

If I temporarily unmount it (and remove the mountpoint) the test passes. May I suggest changing the "/data1" reference (and any other similar references) in the tests to something like "/ZxCvB" that people are less likely to run into? OTOH, maybe things are just more 'fun' here at the Edge Case Saloon. (cf. "On Testing" by Bill Sempf)

Doing "ulimit -n 4096" lets those other tests pass without warning, so we're at 100% on FreeBSD 12.2. I see that a subsequent commit appears to address this.

I'll be glad to submit the port. It would be best to wait until we have something where it makes sense to tag it as cfv-3_0_0 or similar - the port framework gets upset if the checksum of the tarball it downloads doesn't match what is coded in the port. In fact, if I re-run the port build now it fails because the python3 branch changed since Thursday.

I believe that a bunch of the other *BSDs base their ports on FreeBSD's, so once the port lands back in FreeBSD it should be picked up by the others.

@lxp
Copy link
Member Author

lxp commented Jul 31, 2021

For the /data1 issue, please see #26.

The commit regarding the max open file limit does not change the code behavior. It just fixes the issue in the CI pipeline by calling ulimit -n 4096. So you still have to do it on your local system too. You will most likely see now additional warnings that I added regarding external testing tools. They are not mandatory, but it makes sense to test cfv created checksum files against external tools like sha256sums or cksfv. Previously, the tests were silently skipped, if the tools were not available.

I still want to do some more practical testing before releasing 3.0.0. However, I could release 3.0.0-beta1, if that helps.
What I am also unsure about, if I want to have Windows support in the 3.0.0 release, just delay it to some later release or drop it at all. Are there any Windows users using cfv?

@lxp
Copy link
Member Author

lxp commented Aug 1, 2021

@Terry-Kennedy I just tagged and released version 3.0.0.dev0. I hope this helps with the FreeBSD port.

@Terry-Kennedy
Copy link

Request to reinstate the FreeBSD port submitted as PR 257546:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257546

@Terry-Kennedy
Copy link

Committed to FreeBSD ports:
https://cgit.freebsd.org/ports/commit/?id=0377086a1f58526c3026ae470cd8191838b4c1f9

If anyone following this issue has contacts at any of the other *BSDs, feel free to ask them to pick this up from FreeBSD.

@lxp
Copy link
Member Author

lxp commented Aug 3, 2021

@Terry-Kennedy Thank you!

@hydrargyrum
Copy link

What's missing for releasing 3.0.0?

@lxp
Copy link
Member Author

lxp commented Apr 25, 2022

What's missing for releasing 3.0.0?

The only things missing for 3.0.0 is more testing, some minor code formatting changes (see #19) and fixing the PyPI project description formatting (https://pypi.org/project/cfv/3.0.0.dev0/).
I personally did not use the 3.0.0.dev0 pre-release in any real-world scenarios yet. Therefore, I am not confident enough to release 3.0.0.

If you want to give it a try, you can simply install it via pip. On Python 3, the 3.0.0.dev0 pre-release version is automatically selected, because no stable release is available yet. So you only need to run:

pip install cfv

Please report back any problems and also if it is just working for you :)

@lxp lxp mentioned this issue Oct 29, 2022
@lxp
Copy link
Member Author

lxp commented Oct 30, 2022

Version 3.0.0 is out now 🎉 🎉 🎉 : https://github.com/cfv-project/cfv/releases/tag/v3.0.0

Thanks to everyone supporting the Python 3 port, especially @stevenxxiu!

@Terry-Kennedy This would be a good time to update the FreeBSD port :)
@alfs If you can help resurrecting the Debian package, please leave a comment here: #4

Closing here :)

@lxp lxp closed this as completed Oct 30, 2022
@Terry-Kennedy
Copy link

Terry-Kennedy commented Oct 31, 2022 via email

@lxp
Copy link
Member Author

lxp commented Nov 1, 2022

Submitted: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267449

-- Terri

Thank you!
If you plan to keep the FreeBSD port up-to-date, you can also provide us some install instructions for inclusion in the README.md.
I thought about something like this:

### From OS-specific Repositories

#### FreeBSD

To install from FreeBSD port (maintained by @Terry-Kennedy):

```sh
...install cfv...
```

@Terry-Kennedy
Copy link

Terry-Kennedy commented Nov 1, 2022 via email

@Terry-Kennedy
Copy link

Terry-Kennedy commented Nov 4, 2022 via email

@lxp
Copy link
Member Author

lxp commented Nov 7, 2022

Thank you @Terry-Kennedy! I have updated the README.md with your instructions. If you find any mistake or want to improve the wording, just leave a comment or directly open a pull request :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants