csvsql can complete without any output #428

Closed
stucka opened this issue Jul 17, 2015 · 9 comments

@stucka

stucka commented Jul 17, 2015

I've never seen this before. csvsql on one file is doing something weird: No error messages, but also no output whatsoever.

The data is clean AFAIK, and was already processed once by a different csvkit utility. I get no response at all, whether I run without a sniffing limit, with a modest sniffing limit like 20 MB, or with sniffing disabled.

The data set is free, but large:
http://download.cms.gov/openpayments/PGYR14_P063015.ZIP

I rename the 5.5gb (!) file, then do this:
csvgrep -c 12 -m FL gnrl.csv >gnrlfl.csv

That gets me a 350mb file.
First try:
csvsql -i mysql gnrlfl.csv >gnrlfl.sql
The result is 0 bytes and no message. I dropped the redirect, dropped the MySQL-specific dialect, changed the sniffing limit, disabled sniffing, and ran on the original file. I still get no error message and no output except for a couple of blank lines.

@7below

7below commented Dec 4, 2015

I am getting the same issue on a large (1.6 GB, 14 million rows) CSV. It works with a smaller subset created from the top 100 rows.
I'm trying to use csvsql to execute a query and output another CSV, i.e. in-memory SQL. Does anyone know of any size limitations to this, or is it just related to how much system memory is available?
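
If available memory does turn out to be the limit, one possible way around it is to stream the CSV into an on-disk SQLite database in batches and run the query there. The following is only a sketch using Python's standard library, not how csvsql works internally: `load_csv`, the batch size, and the all-TEXT column types are assumptions made for illustration, and no type inference is attempted.

```python
import csv
import itertools
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn, csv_path, table, batch_size=10_000):
    """Stream csv_path into a new all-TEXT table, batch_size rows at a time.

    Hypothetical helper: only one batch of rows is held in memory at once.
    Rows whose length differs from the header will raise an error.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(f"{quote_ident(c)} TEXT" for c in header)
        conn.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
        placeholders = ", ".join("?" for _ in header)
        insert = f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})"
        while True:
            # Pull the next batch lazily; an empty batch means end of file.
            batch = list(itertools.islice(reader, batch_size))
            if not batch:
                break
            conn.executemany(insert, batch)
        conn.commit()
```

Usage would look like `conn = sqlite3.connect("scratch.db")`, then `load_csv(conn, "big.csv", "big")`, then an ordinary `conn.execute("SELECT ...")`. The file names here are placeholders.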

@jpmckinney
Member

@onyxfish Not sure how 4ab4d2f closed this?

@onyxfish
Collaborator

Oops, I put that commit message on the repo. I didn't realize I'd inadvertently closed a ticket.

@onyxfish onyxfish reopened this Jan 23, 2016
@jpmckinney
Member

@stucka I can't download http://download.cms.gov/openpayments/PGYR14_P063015.ZIP. Is there any similar file that produces this behavior?

@jpmckinney
Member

Closing: one month without response.

@Prosserc

Prosserc commented May 29, 2018

I'm getting the same issue on a CSV file in the NASA Cassini dataset (http://archive.redfour.io/cassini/cassini_data.zip). After downloading and unzipping, the file is under ./curious_data/data/INMS/inms.csv. The file is large (~5 GB).

When I run the command csvsql inms.csv -i postgresql --tables "import.inms" --no-constraints from the INMS directory, I get no output and no error. I expected the SQL that creates the table to be written to STDOUT.


As a workaround for anyone else with this problem I did manage to generate the sql by running the command against a subset of the file, piping the output of head inms.csv through to csvsql rather than providing a file:

head inms.csv | csvsql -i postgresql --tables "import.inms" --no-constraints > import.sql
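
For platforms without head, the same sampling step can be sketched in Python. This is a minimal sketch, not part of csvkit: `write_sample` is a hypothetical helper, and the default of 10 lines simply mirrors head's default. Like head, it cuts on physical lines, so a quoted field containing a newline could be split.

```python
import itertools


def write_sample(src_path, dst_path, n_lines=10):
    """Copy the first n_lines lines of src_path to dst_path.

    Hypothetical helper: the source is read lazily, so memory use does
    not depend on the size of the source file.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # islice stops after n_lines lines without reading the rest.
        dst.writelines(itertools.islice(src, n_lines))
    return dst_path
```

After `write_sample("inms.csv", "inms_sample.csv")`, csvsql can be run on the sample file with the same options as above. The sample file name is a placeholder.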

@jpmckinney
Member

Can you run csvsql with -v to show the error, as described here?
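
One possible reason a non-verbose run can look silent: if a command-line tool prints only the exception message unless -v is given (an assumption here, not something this thread confirms about csvsql), then an exception raised without a message, such as a bare MemoryError, shows up as an empty line. The sketch below illustrates that effect; `format_error` is a hypothetical stand-in, not csvkit's actual code.

```python
def format_error(exc, verbose=False):
    """Return what a terse error handler might print for exc.

    Hypothetical stand-in for a CLI error handler; not csvkit's code.
    """
    if verbose:
        # Verbose mode: include the exception type, so output is never blank.
        return f"{type(exc).__name__}: {exc}"
    # Terse mode: message only, which is empty for a bare MemoryError.
    return str(exc)


if __name__ == "__main__":
    print(repr(format_error(MemoryError())))                # ''
    print(repr(format_error(MemoryError(), verbose=True)))  # 'MemoryError: '
    print(repr(format_error(ValueError("bad column"))))     # 'bad column'
```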

@Prosserc

Prosserc commented May 29, 2018

@jpmckinney running csvsql inms.csv -i postgresql --tables "import.inms" --no-constraints --verbose produces the error:

$ csvsql inms.csv -i postgresql --tables "import.inms" --no-constraints --verbose
Traceback (most recent call last):
  File "/usr/bin/csvsql", line 9, in <module>
    load_entry_point('csvkit==0.9.1', 'console_scripts', 'csvsql')()
  File "/usr/lib/python3/dist-packages/csvkit/utilities/csvsql.py", line 160, in launch_new_instance
    utility.main()
  File "/usr/lib/python3/dist-packages/csvkit/utilities/csvsql.py", line 106, in main
    **self.reader_kwargs
  File "/usr/lib/python3/dist-packages/csvkit/table.py", line 201, in from_csv
    contents = f.read()
MemoryError

I guess this makes sense if the program attempts to read the whole file into memory, as this is on a low-memory laptop (4 GB) and the file is > 5 GB.
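
The difference between reading a whole file with `f.read()` (the call in the traceback) and streaming it row by row can be measured on a small synthetic file. This is a rough sketch under stated assumptions: the helper names and the made-up data are illustrative only, and `tracemalloc` counts Python-level allocations, not total process memory.

```python
import csv
import os
import tempfile
import tracemalloc


def peak_bytes(func, path):
    """Return the peak traced allocation, in bytes, while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def read_all(path):
    # Same pattern as the traceback: the whole file becomes one string.
    with open(path, newline="", encoding="utf-8") as f:
        contents = f.read()
    return len(contents)


def stream_rows(path):
    # Streaming alternative: holds only the current row and a small I/O buffer.
    with open(path, newline="", encoding="utf-8") as f:
        return sum(1 for _ in csv.reader(f))


def make_csv(n_rows):
    """Write a synthetic CSV (made-up data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value"])
        writer.writerows((i, i * 2) for i in range(n_rows))
    return path


if __name__ == "__main__":
    path = make_csv(100_000)
    print("read all:", peak_bytes(read_all, path), "bytes")
    print("streamed:", peak_bytes(stream_rows, path), "bytes")
    os.remove(path)
```

The read-all peak grows with file size, while the streamed peak stays roughly constant, which is consistent with a 5 GB file failing on a 4 GB machine.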

@jpmckinney
Member

Ah, yes, and it makes sense that taking a subset using head works.
