CSV parser fails when trying to import XLSX #108

Closed
funkybob opened this issue Aug 19, 2013 · 8 comments

@funkybob

Using 0.9.11, I get the following when trying to import an XLSX file:

Traceback (most recent call last):
  File "...", line 81, in parse
    self.source_data = tablib.import_set(source)
  File ".../site-packages/tablib/core.py", line 1006, in import_set
    format.import_set(data, stream)
  File ".../site-packages/tablib/formats/_csv.py", line 41, in import_set
    for i, row in enumerate(rows):
  File ".../site-packages/tablib/packages/unicodecsv/__init__.py", line 54, in next
    row = self.reader.next()
Error: line contains NULL byte

@funkybob
Author

I just tried master, and it's now the yaml parser failing with:

  File ".../site-packages/tablib/packages/yaml/reader.py", line 200, in update
    exc.encoding, exc.reason)
ReaderError: 'utf8' codec can't decode byte #x8e: invalid start byte in "<string>", position 10

@funkybob
Author

So basically, it seems the parsers aren't failing cleanly when encoding is the reason they fail...
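
For context, an XLSX file is a ZIP container, so its bytes are not text in any encoding, which is consistent with both errors above. A minimal sketch of a pre-check that could reject such input with one clear message before any text parser sees it. `looks_binary` and `ensure_text_input` are hypothetical names, not part of tablib, and the NUL-byte test is only a heuristic:

```python
import io
import zipfile


def looks_binary(data: bytes) -> bool:
    """Heuristic check (not tablib API): is this payload clearly not plain text?"""
    # XLSX is a ZIP archive, so a valid ZIP structure rules out CSV/YAML/JSON.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return True
    # UTF-8 text normally contains no NUL bytes; only the first 1 KiB is inspected.
    # Assumption: UTF-16 input is not expected here (it would be misclassified).
    return b"\x00" in data[:1024]


def ensure_text_input(data: bytes) -> bytes:
    """Raise one clear error instead of a parser-specific one (hypothetical helper)."""
    if looks_binary(data):
        raise ValueError("input looks like a binary file, not a text format")
    return data
```

With a guard like this, a caller would see a single `ValueError` rather than a CSV or YAML internals traceback, whichever parser happened to run first.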

@djrobstep
Contributor

The YAML parser also breaks when calling import_set with a TSV file.

yaml.scanner.ScannerError: while scanning for the next token
found character '\t' that cannot start any token

@funkybob
Author

Sadly, this and the pickling bug mean I've had to abandon my use of this library.

@kennethreitz
Contributor

This project is in a bit of a crisis state: it's really useful, and I use it regularly. However, I wrote it several years ago and haven't touched it since. In order to get the project into a stable state, I'm closing all issues and pull requests to get a "fresh slate".

Don't take this as aggressive; it's just necessary for the project to make any progress any time soon (it's pretty clear the project is effectively unmaintained at the moment). Great things to come! Please watch the GitHub logs and feel free to re-open this discussion soon. I just need to really get it into a good state first.

✨ ❤️ ✨

@iurisilvio
Collaborator

Reopening this issue because it is a real bug and should be fixed.

Should import_set just ignore exceptions if any formatter accepts the input?

@iurisilvio iurisilvio reopened this May 1, 2014
@iurisilvio iurisilvio removed the crisis label May 1, 2014
@iurisilvio iurisilvio added the bug label Sep 6, 2014
@iurisilvio
Collaborator

We have to ignore exceptions when we don't know the format.
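
A minimal sketch of that idea, not tablib's actual implementation: each candidate format gets a detector, any exception a detector raises is treated as "not this format", and a single clear error is raised if nothing matches. The names `detect_format`, `DETECTORS`, and `UnsupportedFormat` are hypothetical, and the two detectors are stand-ins built on the standard library:

```python
import csv
import json


class UnsupportedFormat(Exception):
    """Raised when no candidate format recognises the input (hypothetical)."""


def detect_json(text):
    # json.loads raises ValueError (JSONDecodeError) on anything that is not JSON.
    json.loads(text)
    return True


def detect_csv(text):
    # csv.Sniffer.sniff raises csv.Error when it cannot determine a delimiter.
    csv.Sniffer().sniff(text[:1024], delimiters=",;\t")
    return True


# Order matters: stricter formats are tried before more permissive ones.
DETECTORS = [("json", detect_json), ("csv", detect_csv)]


def detect_format(text):
    """Return the name of the first format whose detector accepts the input."""
    for name, detector in DETECTORS:
        try:
            if detector(text):
                return name
        except Exception:
            # A failing detector only means "not this format"; try the next one.
            continue
    raise UnsupportedFormat("could not determine the format of the input")
```

The broad `except Exception` is deliberate in the detection step only. Once a format has been chosen, errors from the actual import should still propagate so that genuinely malformed files are reported.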

claudep added a commit to claudep/tablib that referenced this issue Oct 4, 2019
Autodetection was added for the odf format.
claudep added a commit that referenced this issue Oct 4, 2019
Autodetection was added for the odf format.
@claudep
Contributor

claudep commented Oct 4, 2019

I cannot say I solved all issues, but tablib should be a little more robust now with respect to autodetection. Feel free to open new tickets if you can reproduce crashes on master.

@claudep claudep closed this as completed Oct 4, 2019