[Feature Request]: LOAD INTO VALUES #9955

Closed
1 task done
fengttt opened this issue Jun 8, 2023 · 7 comments
Assignees
Labels
attention/dev-design-required attention/doc-influence need to complete design document attention/needs-doc-discussion the user behavior is not clear, documentation job cannot proceed kind/feature priority/p-1 source/dev issues from devs
Milestone

Comments

@fengttt
Contributor

fengttt commented Jun 8, 2023

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

SQL already supports inserting many rows in a single statement:
insert into t values (1, 2), (3, 4), ...

Potentially we can extend SQL with our own binary form:
insert into t binary values [format] ['protocolbuffer', 'matrixone'] 0x'XXXXXXXX';

With format 'matrixone', the 0x'XXXX' payload is simply a serialized MatrixOne batch.
With format 'protocolbuffer', the payload can be a simple protocol buffer encoding of a batch.

The statement can also be prepared, e.g. insert into t binary values format 'protocolbuffer' ?
The client then serializes a protocol buffer, binds it to the parameter, and executes the statement.
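
As a rough client-side sketch of the idea above: serialize the rows to bytes, then either render them as a 0x'...' literal or pass them as the parameter of the prepared statement. Everything here is an assumption for illustration only. `encode_batch` is a hypothetical placeholder that packs integers with `struct`; it is not the MatrixOne batch format or a real protocol buffer schema, neither of which is defined in this thread. The statement text follows the syntax proposed above, which is not confirmed to exist in any released version.

```python
import struct

def encode_batch(rows):
    """Hypothetical stand-in for a real batch serializer.

    Packs every value as a little-endian signed 64-bit integer.
    A real client would use the published protocol buffer definition
    or the MatrixOne batch format instead.
    """
    return b"".join(struct.pack("<q", v) for row in rows for v in row)

def hex_literal(payload: bytes) -> str:
    """Render bytes in the 0x'...' style used by the proposal."""
    return "0x'" + payload.hex().upper() + "'"

rows = [(1, 2), (3, 4)]
payload = encode_batch(rows)

# Variant 1: inline the payload as a literal (proposed syntax, not confirmed).
inline_stmt = (
    "insert into t binary values format 'protocolbuffer' "
    + hex_literal(payload) + ";"
)

# Variant 2: keep the statement fixed and bind the bytes as a parameter.
prepared_stmt = "insert into t binary values format 'protocolbuffer' ?"
params = (payload,)
```

The second variant is the one the proposal favors for repeated inserts, since the statement text stays constant and only the serialized bytes change between executions.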

Describe the feature you'd like

We can either publish the protocol buffer definition and let users serialize it themselves, or publish our own client software that
turns insert into ... values ... into insert into ... binary values ...

Our own client driver is definitely much less intrusive for customers, but it means we need to fork and maintain JDBC, the Python client, etc., which is intrusive in a different way.

I think protocol buffer has enough language bindings that we can simply use that, or Parquet/Arrow. Note that we may need to support Parquet/Arrow anyway as part of the external table work.

But our internal use (logging) can simply use our own batch format.

Describe implementation you've considered

No response

Documentation, Adoption, Use Case, Migration Strategy

No response

Additional information

No response

@fengttt
Contributor Author

fengttt commented Jun 13, 2023

So far the formats that come to mind are:

  1. Our own batch format
  2. CSV
  3. JSON array or JSON Lines
  4. Other binary formats such as a well-defined protocol buffer, Parquet, Arrow, etc.

I think 1, 2, 3 will be great. Doing 4 is even better but optional.
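
To make the text formats (2 and 3) concrete, here is a minimal sketch that renders the same rows as CSV, as a JSON array, and as JSON Lines using only the Python standard library. The sample rows and the function names are hypothetical, and the row shape (one JSON array per record) is an assumption; the thread does not specify how records should be laid out in JSON.

```python
import csv
import io
import json

# Hypothetical sample rows for a two-column table t.
rows = [(1, 2), (3, 4)]

def to_csv(rows):
    """Render rows as CSV text, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

def to_json_array(rows):
    """Render rows as one JSON array containing one array per record."""
    return json.dumps([list(r) for r in rows])

def to_json_lines(rows):
    """Render rows as JSON Lines: one JSON array per line."""
    return "".join(json.dumps(list(r)) + "\n" for r in rows)

csv_text = to_csv(rows)
json_array_text = to_json_array(rows)
json_lines_text = to_json_lines(rows)
```

JSON Lines and CSV can both be produced and consumed record by record, whereas a single JSON array generally has to be closed before it is valid, which may matter for large payloads.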

@domingozhang domingozhang added the attention/doc-influence need to complete design document label Jul 5, 2023
@dengn
Contributor

dengn commented Jul 7, 2023

Doc-related questions:

  1. What is the priority of this feature?
  2. This seems to be a big feature; it needs a specification of the actual implementation.

@dengn dengn added the attention/needs-doc-discussion the user behavior is not clear, documentation job cannot proceed label Jul 7, 2023
@domingozhang domingozhang added the attention/being-discussed features being discussed label Jul 11, 2023
@fengttt fengttt added the priority/p1 Medium priority feature that should be implemented in this version label Jul 28, 2023
@sukki37 sukki37 modified the milestones: 1.0.0, 1.1.0 Aug 24, 2023
@sukki37 sukki37 added priority/p0 Critical feature that should be implemented in this version attention/dev-design-required source/dev issues from devs and removed priority/p1 Medium priority feature that should be implemented in this version attention/being-discussed features being discussed attention/needs-doc-discussion the user behavior is not clear, documentation job cannot proceed labels Aug 24, 2023
@sukki37 sukki37 assigned nnsgmsone and unassigned ouyuanning Aug 28, 2023
@domingozhang domingozhang changed the title [Feature Request]: INSERT INTO BIANRY VALUES [Feature Request]: LOAD INTO VALUES Aug 31, 2023
@domingozhang
Contributor

Use the new syntax:
LOAD INTO insert into t values [format]

@fengttt
Contributor Author

fengttt commented Aug 31, 2023

Current load syntax:

load data infile 'path_to_file.csv' into table t;

We pretty much want an inline file -- that is,

load data ...
inputdata = 'string values'

Or we can even overload the current syntax:

inputfile {path="inline", data="....", format="csv/json"}
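
A minimal sketch of how a client might assemble the overloaded clause above from in-memory text. The helper name `inline_clause` is hypothetical, and the escaping rule (backslash-escaping backslashes and double quotes) is purely an assumption, since the thread does not define how the data string is quoted. Only the clause is built here, because the rest of the statement is not spelled out in the proposal.

```python
def inline_clause(data: str, fmt: str) -> str:
    """Build the proposed inputfile {...} clause for inline data.

    Assumption: backslashes and double quotes inside the data are
    backslash-escaped. The actual quoting rules are not specified
    in this thread.
    """
    if fmt not in ("csv", "json"):
        raise ValueError("format must be 'csv' or 'json'")
    escaped = data.replace("\\", "\\\\").replace('"', '\\"')
    return f'inputfile {{path="inline", data="{escaped}", format="{fmt}"}}'

# Hypothetical CSV payload with two records.
clause = inline_clause("1,2\n3,4\n", "csv")
```

Overloading the existing clause keeps a single code path for file and inline loads, at the cost of giving the special path value "inline" a reserved meaning.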

@nnsgmsone nnsgmsone mentioned this issue Sep 5, 2023
7 tasks
@florashi181
Contributor

Syntax to be clarified this week. @domingozhang

@florashi181 florashi181 added priority/p-1 and removed priority/p0 Critical feature that should be implemented in this version labels Sep 5, 2023
@lacrimosaprinz lacrimosaprinz added the attention/needs-doc-discussion the user behavior is not clear, documentation job cannot proceed label Sep 6, 2023
@lacrimosaprinz
Contributor

lacrimosaprinz commented Sep 13, 2023

@nnsgmsone
Contributor

Completed by 11633


8 participants