Remove table names from column names for df() call #1256

Merged
xzdandy merged 9 commits into staging from csv-column-names on Oct 7, 2023

Conversation

pchunduri6
Contributor

This PR removes table names from the DataFrame column names returned by the df() call. Users can then save query results with to_csv() and load the resulting CSV files back into EvaDB at a later time, which helps with long-running or expensive queries.

Example:

select_query = cursor.query(
    f"SELECT * FROM {repo_name}_StargazerList;"
).df()

select_query.to_csv("stargazers_list.csv", index=False)

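# Later: recreate the table and load the saved CSV back into EvaDB
cursor.query(
    f"""
    CREATE TABLE IF NOT EXISTS {repo_name}_StargazerList(
        github_username TEXT(1000));
    """
).df()

# The f prefix is required so that {repo_name} is interpolated
cursor.query(
    f"LOAD CSV 'stargazers_list.csv' INTO {repo_name}_StargazerList;"
).df()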
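
To illustrate why the prefix matters for this round trip, here is a small pandas-only sketch. It assumes that, before this change, df() returned column names of the form `table.column`; the exact name `repo_stargazerlist.github_username` is made up for illustration, and the helper `strip_table_prefix` is hypothetical rather than EvaDB's actual implementation.

```python
import io

import pandas as pd


def strip_table_prefix(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column names keep only the part after the last '.'."""
    return df.rename(columns=lambda name: str(name).rsplit(".", 1)[-1])


def csv_header(df: pd.DataFrame) -> str:
    """Write the frame to an in-memory CSV and return its header line."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    return buf.getvalue().splitlines()[0]


# Hypothetical query result with a table-qualified column name (assumption).
prefixed = pd.DataFrame({"repo_stargazerlist.github_username": ["alice", "bob"]})

header_before = csv_header(prefixed)
header_after = csv_header(strip_table_prefix(prefixed))
```

With the prefix in place, the CSV header would not match the `github_username` column declared in the CREATE TABLE statement; once it is stripped, the header and the table schema line up.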

Do we need the table names for any use case? One example is disambiguating duplicate column names produced by two different functions, such as object_detector_1.labels and object_detector_2.labels.
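
The duplicate-name concern can be made concrete with a short sketch. It only uses pandas and the standard library; the helper `colliding_names` and the `video.id` column are hypothetical and are not part of EvaDB. It reports which column names would clash once the prefix is dropped.

```python
from collections import Counter

import pandas as pd


def colliding_names(columns) -> list:
    """Names that would appear more than once after dropping the 'prefix.' part."""
    stripped = [str(c).rsplit(".", 1)[-1] for c in columns]
    counts = Counter(stripped)
    return sorted(name for name, n in counts.items() if n > 1)


# Two functions that both produce a 'labels' column, as in the question above.
df = pd.DataFrame(
    {
        "object_detector_1.labels": [["car"]],
        "object_detector_2.labels": [["person"]],
        "video.id": [0],
    }
)

collisions = colliding_names(df.columns)
```

One possible design, offered here only as a suggestion, would be to strip prefixes when this list is empty and keep the qualified names otherwise.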

@pchunduri6 pchunduri6 linked an issue Oct 5, 2023 that may be closed by this pull request
@xzdandy xzdandy added this to the v0.3.8 milestone Oct 6, 2023
@xzdandy
Collaborator

xzdandy commented Oct 6, 2023

Please do not merge yet. I am making minor changes to the PR.

@xzdandy xzdandy merged commit d6cb3a5 into staging Oct 7, 2023
7 checks passed
@xzdandy xzdandy deleted the csv-column-names branch October 7, 2023 06:14
a0x8o pushed a commit to alexxx-db/eva that referenced this pull request Oct 30, 2023
…#1256)

Co-authored-by: Andy Xu <[email protected]>
Co-authored-by: Andy Xu <[email protected]>
a0x8o pushed a commit to alexxx-db/eva that referenced this pull request Nov 22, 2023
…#1256)

Development

Successfully merging this pull request may close these issues.

select_query.to_csv() adds table name to the csv file
2 participants