BUG: The csv file cannot be read if there are square brackets in the csv file path or full path. #6828

Open
2 of 3 tasks
JacobKwon opened this issue Dec 15, 2023 · 4 comments
Assignees
Labels
bug 🦗 Something isn't working External Pull requests and issues from people who do not regularly contribute to modin P3 Very minor bugs, or features we can hopefully add some day.

Comments

@JacobKwon

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

import modin.pandas as pd  # run with MODIN_ENGINE=dask, as noted in the description

df = pd.read_csv(
    "/home/user/[DE]/[PROJECT]/total_data_231127.csv",
    names=["seq", "A", "B"],
    usecols=["A", "B"],
    sep="\t",
    dtype=str,
    na_values=["\\N"],
)

Issue Description

At first I thought this was a Korean (UTF-8) encoding problem, because I am using the Korean version of Ubuntu 22.04 and the file path contains Korean characters, so I ran several tests to isolate the error.

After removing the Korean characters, I called read_csv with the full file path, but the same error occurred.
The screenshots below show the test with square brackets in the path.
(Using "MODIN_ENGINE" = "dask")

[Screenshot: modin1]

[Screenshot: modin2]

I also tried the following variations just in case, but they did not help.

[Screenshot: modin3]

[Screenshot: modin4]

For now, everything works well after removing the square brackets from the path.
However, I thought it would be useful to report this, so I am filing a bug report.
Thank you.

Expected Behavior

FileNotFoundError: [Errno 2] No such file or directory: '/home/user/[DE]/[PROJECT]/total_data_231127.csv'

Error Logs

Replace this line with the error backtrace (if applicable).

Installed Versions

  • Ubuntu 22.04.2 LTS (Korean)

  • conda 22.9.0

  • jupyter notebook 6.5.3

  • pip list print
    ...
    dask 2023.5.0
    modin 0.23.1.post0
    modin-spreadsheet 0.1.2
    ...

@JacobKwon JacobKwon added bug 🦗 Something isn't working Triage 🩹 Issues that need triage labels Dec 15, 2023
@YarShev
Collaborator

YarShev commented Dec 15, 2023

cc @anmyachev

@anmyachev
Collaborator

Hello @JacobKwon! Thanks for your contribution, and sorry for the slow response.

First of all, I would like to clarify: do paths with square brackets work when you use pandas instead of Modin? (You may have already tried this.)

@JacobKwon
Author

Hello, @anmyachev
No problem about the late response; I know the time difference is considerable. 😉

As you suspected, I had already tested pandas read_csv.
After seeing your reply, I ran the test once more so that I could attach screenshots to this response. 👍

  • Using Pandas
    [Screenshot: 1]

  • Using Modin
    [Screenshot: 2]

This is actually my first bug report on GitHub, so I'm worried that I might have made a mistake. 🙄
Thank you.

@anmyachev anmyachev added External Pull requests and issues from people who do not regularly contribute to modin and removed Triage 🩹 Issues that need triage labels Dec 20, 2023
@anmyachev
Collaborator

@JacobKwon I can reproduce the problem, and I seem to have found the cause. Modin uses the fsspec library in cases where pandas doesn't: fsspec/filesystem_spec#1476
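
For readers wondering why brackets could matter at all: in glob-style pattern matching, `[DE]` is a character class that matches a single `D` or `E`, not the literal text `[DE]`. The sketch below uses only Python's standard `fnmatch` and `glob` modules to illustrate that behavior. It is an assumption, not something confirmed in this thread, that the fsspec code path fails for a similar reason; the thread only links the upstream issue. The path is shortened from the report for illustration.

```python
import fnmatch
import glob

# Shortened, illustrative version of the path from the report.
path = "/home/user/[DE]/total_data_231127.csv"

# Treated as a glob pattern, "[DE]" means "one character, D or E",
# so the pattern does not match its own literal text.
matches_itself = fnmatch.fnmatchcase(path, path)

# The same pattern does match a directory literally named "D".
matches_d_dir = fnmatch.fnmatchcase("/home/user/D/total_data_231127.csv", path)

# glob.escape() rewrites "[" as "[[]" so the brackets are taken literally.
escaped = glob.escape(path)
matches_when_escaped = fnmatch.fnmatchcase(path, escaped)

print(matches_itself, matches_d_dir, matches_when_escaped)
print(escaped)
```

If a library expands a user-supplied path as a glob pattern before opening it, a file under a directory literally named `[DE]` would not be found, which would be consistent with the FileNotFoundError in the report. Renaming the directory, as the reporter did, sidesteps the question entirely.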

@anmyachev anmyachev added the P3 Very minor bugs, or features we can hopefully add some day. label Jan 26, 2024
@anmyachev anmyachev self-assigned this Jun 28, 2024