csvgrep fails on (giant) csvclean'ed CSV on 2.7 only #617
Comments
Does the same error occur when using the latest code from GitHub? You can install it with:
|
Yes, it does still occur. |
Is the backtrace different? The backtrace above doesn't match the current code. |
Ah, yeah, sorry:
|
It seems to be that at some point, a row doesn't have a column with index 4. Can you try with the 617 branch?
|
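One way to check the "row without a column at index 4" theory outside of csvkit is to scan the file with Python's standard `csv` module and report every record that is too short. This is a minimal sketch, not csvkit's own code: `find_short_rows` and the sample data are hypothetical, it assumes Python 3, and it treats the index as zero-based, which may need adjusting to match how `-c 4` is interpreted.

```python
import csv
import io


def find_short_rows(lines, column_index):
    """Yield (record_number, row) for each record with no cell at column_index.

    `lines` is any iterable of text lines, e.g. a file opened with newline="".
    Records are counted after CSV parsing, so a quoted cell containing line
    breaks still counts as a single record.
    """
    for record_number, row in enumerate(csv.reader(lines), start=1):
        if len(row) <= column_index:
            yield record_number, row


# Hypothetical sample: the third record has only two cells.
sample = io.StringIO(
    'a,b,c,d,e\n'
    '1,2,3,4,2016-05-12\n'
    '"multi\nline",2\n'
    '5,6,7,8,9\n'
)
short = list(find_short_rows(sample, 4))
for record_number, row in short:
    print(record_number, row)
```

For a real file, the `io.StringIO` sample would be replaced by something like `open(path, newline="")`, so that the `csv` module handles line breaks inside quoted cells itself.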
Yes, I agree that's what's wrong. Is that a thing that csvclean is supposed to fix? The command seems to fail right away with 617:
|
Ah, indeed. I fixed the commit, so we can try again. |
It fails in the same spot, but with a different traceback:
|
Thanks - I've written a test now, so it should finally work. |
It worked! Thanks! |
I found a weird issue where csvkit works on Python 3.5 but not 2.7. This is not a problem for my workflow (since 3.5 works fine), but I wanted to report it in case it's useful. I'm trying to grep through a giant (3.1 GB) CSV that contains some cells with internal line breaks (so normal grep won't work right).

I keep getting a `list index out of range` error; full traceback is below. The error occurs in an identical manner on both my original file and the one that I ran through `csvclean`. The command looks like this: `csvgrep -v -c 4 -m "2016-05-12" jeremys_giant_csv_out.csv`.

When I run the command like that (without directing the output to a file), everything around the last line printed appears to be well-formed. Indeed, when I separate out the 200 lines into their own file, `csvgrep` can handle it without problems.

Got any idea on how to fix the problem? Happy to share the file in private, under frieNDA, if we can figure out a good way to get you 3.1 GB. Is it possible that `csvgrep` has issues with huge files? Should I split it into pieces (manually checking to make sure that `head`/`tail` don't cut off a row in the middle, at an internal line break)? If so, what's a dependably-small-enough size?

Additional possibly-relevant information: the file was generated by Ruby's CSV library. I'm running csvkit 0.9.1, installed with `pip`, on Python 2.7. The same file and the same command work fine on 3.5.
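If splitting the file ever did become necessary, cutting it by parsed record rather than by physical line would avoid the `head`/`tail` problem of breaking a row at an internal line break. The following is only a sketch under assumptions: `split_records` and `chunk_to_csv` are hypothetical helper names, the sample data is made up, Python 3 is assumed, and nothing here says whether `csvgrep` has a size limit at all.

```python
import csv
import io


def split_records(reader, records_per_chunk):
    """Yield lists of parsed rows, at most records_per_chunk rows each.

    Because rows come from csv.reader, a quoted cell that spans several
    physical lines always stays inside one row, and so inside one chunk.
    """
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) == records_per_chunk:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def chunk_to_csv(header, rows):
    """Serialize one chunk back to CSV text, repeating the header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample with line breaks inside quoted cells.
sample = io.StringIO('id,note\n1,"line one\nline two"\n2,plain\n3,"x\ny"\n')
reader = csv.reader(sample)
header = next(reader)
chunks = list(split_records(reader, 2))
pieces = [chunk_to_csv(header, rows) for rows in chunks]
print(len(pieces))
```

Each entry in `pieces` is a complete CSV document with its own header, so it could be written to a separate file and searched independently.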