Idea: import CSV to memory, run SQL, export in a single command #272
Maybe also support …
How about … And what happens if you provide a filename too? I'm tempted to say that the …
Problem: …
So, I do things like this a lot, too. I like the idea of piping in from stdin. Something like this would be nice to do in a makefile:

cat file.csv | sqlite-utils --csv --table data - 'SELECT * FROM data WHERE col="whatever"' > filtered.csv

If you assumed that you're always piping out the same format you're piping in, the option names don't have to change. Depends how much you want to change formats.
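For comparison, here's roughly how that pipeline looks with the `memory` sub-command this thread ended up converging on. This is a sketch only, and the assumption that piped data is exposed to SQL as a table named `stdin` is mine rather than something stated in this comment:

```bash
# Sketch: filter a CSV piped in on stdin and write CSV back out.
# Assumes the piped data becomes a table named "stdin" in the in-memory db.
cat file.csv | sqlite-utils memory - \
  "select * from stdin where col = 'whatever'" --csv > filtered.csv
```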
This is going to need to be a separate command, for relatively non-obvious reasons.
Passing a database filename and a SQL string to sqlite-utils is equivalent to using the explicit sub-command, because `query` is the default sub-command.
But... this means that making the filename optional doesn't actually work - because then a lone positional argument is ambiguous: it could be a database filename or a SQL query.
So instead, I'm going to add a new sub-command. I'm currently thinking `memory`.
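A sketch of the ambiguity being described, using placeholder file and query names (the concrete examples from the original comment were lost):

```bash
# Today this invocation...
sqlite-utils data.db "select * from t"
# ...is shorthand for the explicit default sub-command:
sqlite-utils query data.db "select * from t"
# If the filename were optional, a single argument could be either a database
# path or a SQL string - hence a separate sub-command for the new behaviour:
sqlite-utils memory data.csv "select * from data"
```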
I still think I need to use a new sub-command for this. Another option: allow multiple arguments which are filenames, and use the extension (or sniff the content) to decide what to do with them.
This would require the last positional argument to always be a SQL query, and would treat all other positional arguments as files that should be imported into memory.
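A sketch of how that multi-file form might look. The file names are placeholders, and the assumption that each file becomes a table named after its stem is mine:

```bash
# Everything except the final argument is a file to load into the in-memory
# database; the final argument is the SQL query to run against them.
sqlite-utils memory customers.csv orders.json \
  "select * from customers join orders on orders.customer_id = customers.id"
```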
Another option: allow an optional …
One catch: how to treat …
That's fine for CSV, but what about TSV or JSON or nl-JSON? Maybe this: …
Bit weird though. The alternative would be to support this: …
But that's verbose compared to the version without the long option.
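Since the concrete spellings in this comment didn't survive, here is a purely illustrative sketch of the trade-off being described. The flag spellings below are hypothetical stand-ins, not how the shipped command actually handles input formats:

```bash
# Hypothetical terse form: reuse a short per-format flag for piped input.
cat data.tsv | sqlite-utils memory - "select * from stdin" --tsv
# Hypothetical verbose form: an explicit long option naming the stdin format.
cat data.tsv | sqlite-utils memory - "select * from stdin" --stdin-format=tsv
```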
Solution: …
The documentation already covers this
https://sqlite-utils.datasette.io/en/latest/cli.html#running-queries-and-returning-json
Mainly for debugging purposes it would be useful to be able to save the created in-memory database back to a file again later. This could be done with a `--dump` option.
Maybe instead (or as-well-as) offer `--save`.
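Per the commit list later in this thread, both options shipped. A sketch of how they might be used, assuming `--save` takes the target path as its argument and that the SQL query is optional when saving or dumping:

```bash
# Write the assembled in-memory database out to a file.
sqlite-utils memory data.csv --save saved.db
# Or emit a SQL dump of it to stdout instead.
sqlite-utils memory data.csv --dump > dump.sql
```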
Got a prototype working!
Moving this to a PR.
Here's a radical idea: what if I combined this with the existing default query behaviour? The trick here would be to detect if the arguments passed on the command-line refer to SQLite databases or if they refer to CSV/JSON data that should be imported into temporary tables. Detecting a SQLite database file is actually really easy - they all start with the same binary string:

>>> open("my.db", "rb").read(100)
b'SQLite format 3\x00...

(Need to carefully check that a CSV file starting with that same string couldn't be mis-detected as a database.) So then what would the semantics of the combined command be?
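The same magic-header check, sketched from the shell rather than Python (illustrative only - the `is_sqlite` helper is mine, not part of sqlite-utils):

```bash
# The first 15 bytes of every SQLite database file are "SQLite format 3".
is_sqlite() {
  [ "$(head -c 15 "$1")" = "SQLite format 3" ]
}

if is_sqlite my.db; then echo "SQLite database"; else echo "treat as CSV/JSON"; fi
```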
The complexity here is definitely in the handling of a combination of SQLite database files and CSV filenames. Also, I'm not 100% sold on this as being better than having a separate `memory` sub-command.
But...
Plus, could I make this change to the existing command without breaking backwards compatibility?
I wonder if there's a better name for this than `memory`?
I think `memory` is good.
Also, it helps emphasize that the file you are querying will be loaded into memory, so probably don't try this against a 1GB CSV file.
Columns from data imported from CSV in this way are currently treated as TEXT.
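That means numeric comparisons need an explicit cast in the query for now. A sketch, where `stats.csv` and its `age` column are made-up example names:

```bash
# CSV values load as TEXT, so cast before comparing numerically.
sqlite-utils memory stats.csv \
  "select * from stats where cast(age as integer) >= 30" --csv
```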
* Turn SQL errors into click errors
* Initial CSV-only prototype of sqlite-utils memory, refs #272
* Implement --save plus tests for --save and --dump, refs #272
* Re-arranged CLI query documentation, refs #272
* Re-organized CLI query docs, refs #272
* Docs for --save and --dump plus made SQL optional for those, refs #273
* Replaced one last :memory: example
* Documented --attach option for memory command, refs #272
* Improved arrangement of CLI query documentation
I'll split the remaining work out into separate issues.
Wrote this up on my blog here: https://simonwillison.net/2021/Jun/19/sqlite-utils-memory/ - with a video demo here: https://www.youtube.com/watch?v=OUjd0rkc678
I quite often load a CSV file into a SQLite DB, then do stuff with it (like export results back out again as a new CSV) without any intention of keeping the CSV file around afterwards.
What if sqlite-utils could do this for me? Something like this:
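The concrete example from the original issue text didn't survive, but based on the `memory` command that eventually shipped it would look something like this (the file name and query are placeholders):

```bash
# Load data.csv into an in-memory table, run SQL against it, and write the
# results back out as CSV - no database file left behind.
sqlite-utils memory data.csv "select * from data limit 10" --csv > filtered.csv
```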