datafusion-cli not installed #9294

Open
l1t1 opened this issue Feb 20, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@l1t1

l1t1 commented Feb 20, 2024

Describe the bug

https://arrow.apache.org/datafusion/user-guide/cli.html
the CLI cannot run

datafusion-cli
-bash: datafusion-cli: command not found

To Reproduce

pip install datafusion

datafusion-cli

Expected behavior

the CLI runs

Additional context

No response

l1t1 added the bug label Feb 20, 2024
@l1t1
Author

l1t1 commented Feb 20, 2024

The Python module itself works:

python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datafusion                                                                          
>>> datafusion.__version__
'35.0.0'
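
The two observations above (the module imports, but the shell cannot find the command) can be checked together. The following is only a minimal sketch: the helper name `diagnose` is hypothetical, and the distribution and executable names are taken from this report rather than from any packaging metadata.

```python
import shutil
from importlib import metadata


def diagnose(dist_name="datafusion", exe_name="datafusion-cli"):
    """Report the installed package version and where the executable is on PATH."""
    try:
        pkg_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The Python distribution is not installed in this environment.
        pkg_version = None
    # shutil.which returns None when no matching executable is found on PATH.
    exe_path = shutil.which(exe_name)
    return {"package_version": pkg_version, "executable": exe_path}


if __name__ == "__main__":
    print(diagnose())
```

A result with a package version but no executable path would match the symptom described in this issue.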

@viirya
Member

viirya commented Feb 20, 2024

Hmm, I think this issue should maybe be moved to https://github.com/apache/arrow-datafusion-python?

@l1t1
Author

l1t1 commented Feb 20, 2024

l1t1 closed this as completed Feb 20, 2024
l1t1 reopened this Feb 21, 2024
@viirya
Member

viirya commented Feb 21, 2024

datafusion-cli is in this repo. I meant that its PyPI packaging and release should be done at https://github.com/apache/arrow-datafusion-python.

I think the issue is not with datafusion-cli itself or its functionality, but with the PyPI package.

@viirya
Member

viirya commented Feb 21, 2024

Anyway, it is okay to keep it open to get visibility. :)

@alamb
Contributor

alamb commented Feb 21, 2024

I think there is a brew package if you are on Mac:

brew install datafusion

@Jefffrey
Contributor

Actually, does the datafusion pypi package even include datafusion-cli in the first place? 🤔

I did a quick search through https://github.com/apache/arrow-datafusion-python and found no actual mention of datafusion-cli (though in fairness I'm not familiar with that repo or the process of packaging a Python package)

Maybe the documentation on that CLI user guide page is mistaken on that account? Relevant PR: #8389

cc @Weijun-H
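
One way to answer the question above locally, rather than by searching the repository, is to inspect what an installed distribution actually recorded. This is a rough sketch, not the project's own tooling: the helper names `cli_files` and `console_scripts` are hypothetical, and it assumes the package was installed by an installer that keeps a file record (as pip does).

```python
from importlib import metadata


def cli_files(dist_name, cli_name="datafusion-cli"):
    """Return recorded paths of dist_name whose file name starts with cli_name."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None  # the distribution is not installed at all
    # dist.files can be None when the installer kept no record of installed files.
    return [str(p) for p in (dist.files or []) if p.name.startswith(cli_name)]


def console_scripts(dist_name):
    """Return the names of console-script entry points declared by dist_name."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return [ep.name for ep in dist.entry_points if ep.group == "console_scripts"]


if __name__ == "__main__":
    print(cli_files("datafusion"))
    print(console_scripts("datafusion"))
```

If both calls return an empty list for `datafusion`, the installed package ships neither a bundled binary nor a launcher script under that name.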

@Weijun-H
Member

> Actually, does the datafusion pypi package even include datafusion-cli in the first place? 🤔
>
> I did a quick search through apache/arrow-datafusion-python and found no actual mention of datafusion-cli (though in fairness I'm not familiar with that repo or the process of packaging a Python package)
>
> Maybe the documentation on that CLI user guide page is mistaken on that account? Relevant PR: #8389
>
> cc @Weijun-H

After checking the documentation in apache/arrow-datafusion-python, I found that the current PyPI installation instructions for the CLI are incorrect, @Jefffrey. Perhaps it's time to implement pip install datafusion-cli 🤔?

@SteveLauC
Contributor

SteveLauC commented Feb 22, 2024

> Perhaps it's time to implement pip install datafusion-cli

I am not sure how the DataFusion release procedure works, but if you want to automate it in CI, maturin can help.

I have done this for Topgrade; take a look at this PR if you want to see a real-world example of what it looks like.

@l1t1
Author

l1t1 commented Mar 8, 2024

pip install datafusion-cli works now, thanks.

l1t1 closed this as completed Mar 8, 2024
@MohamedAbdeen21
Contributor

Hey @l1t1, the current datafusion-cli on PyPI is meant to be a test. It's not automated for future releases, as the PR is not yet merged. I'd appreciate it if you could reopen the issue so that it can be closed after the PR is merged.

l1t1 reopened this Mar 10, 2024
@andygrove
Member

I left a comment on #9452 (comment)

DataFusion is a Rust project and datafusion-cli is already available via cargo, which is the default package manager for Rust. If we want to use Python packaging for datafusion-cli, it seems logical to do that in the DataFusion Python repository.

@l1t1
Author

l1t1 commented May 15, 2024

version 37.1.0 still has the issue

D:\>pip install datafusion -U
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: datafusion in d:\python38\lib\site-packages (36.0.0)
Collecting datafusion
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/90/7e/09877d816952ff90f2bdcd49c45b199e20b226708068fa6a5bfb7d8ed51a/datafusion-37.1.0-cp38-abi3-win_amd64.whl (16.8 MB)
     ---------------------------------------- 16.8/16.8 MB 40.9 MB/s eta 0:00:00
Requirement already satisfied: pyarrow>=11.0.0 in d:\python38\lib\site-packages (from datafusion) (15.0.0)
Requirement already satisfied: numpy<2,>=1.16.6 in d:\python38\lib\site-packages (from pyarrow>=11.0.0->datafusion) (1.21.0)
Installing collected packages: datafusion
  Attempting uninstall: datafusion
    Found existing installation: datafusion 36.0.0
    Uninstalling datafusion-36.0.0:
      Successfully uninstalled datafusion-36.0.0
Successfully installed datafusion-37.1.0

D:\mathhigh>datafusion-cli
DataFusion CLI v36.0.0
❯
\q
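
In the log above, pip reports datafusion 37.1.0 while the CLI banner reports v36.0.0, which would be consistent with the two being installed from separate distributions (an inference, not something the log states). A small sketch for comparing the two versions follows. The helper names are hypothetical, and it assumes the installed binary accepts a `--version` flag.

```python
import re
import shutil
import subprocess
from importlib import metadata

# Matches a plain x.y.z version such as 36.0.0.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def extract_version(text):
    """Return the first x.y.z version found in text, or None."""
    match = VERSION_RE.search(text)
    return match.group(0) if match else None


def cli_version(exe_name="datafusion-cli"):
    """Run `<exe_name> --version` and return the reported version, or None."""
    exe = shutil.which(exe_name)
    if exe is None:
        return None  # executable not found on PATH
    # Assumption: the binary prints its version to stdout for --version.
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    return extract_version(result.stdout)


def package_version(dist_name="datafusion"):
    """Return the installed version of the Python distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("python package:", package_version())
    print("cli binary:", cli_version())
```

Differing values would suggest that upgrading `datafusion` with pip does not upgrade the `datafusion-cli` binary.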

@MohamedAbdeen21
Contributor

Hey @l1t1, as per Andy's comments on #9452, datafusion-cli releases should be handled in the python repo.

8 participants