[Enhancement] Install trino and pandas python lib in docker image #63

Open
shaofengshi opened this issue Aug 19, 2024 · 2 comments

Comments

@shaofengshi

Currently, the first step in the Jupyter notebook for Trino is to install the trino and pandas libraries; see https://github.com/apache/gravitino-playground/blob/main/init/jupyter/gravitino-trino-example.ipynb. This step requires internet access, so users with network problems may be blocked from continuing their evaluation. In addition, after this step runs, Jupyter reports that the kernel may need to be restarted ("Note: you may need to restart the kernel to use updated packages."), which is confusing.

To improve the user experience, we can install these dependencies when building the Docker image.

@shaofengshi

shaofengshi commented Aug 19, 2024

Another dependency, used in "gravitino-fileset-example.ipynb", is the "hdfs" Python library:

"pip install hdfs"

and also gravitino:

"pip install gravitino"

@shaofengshi

I see that the Jupyter notebook uses the "jupyter/minimal-notebook" image, which we do not build ourselves, so we cannot pre-install our dependencies unless we build a new image.
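
If a custom image is built, a small check like the one below could confirm that the dependencies are really pre-installed, or let a notebook's first cell skip the "pip install" step when nothing is missing. This is only a sketch: the helper name `missing_modules` is hypothetical, and it assumes each library's import name matches the pip name quoted in this issue (trino, pandas, hdfs, gravitino).

```python
import importlib.util

# Assumption: import names match the pip package names mentioned in this issue.
REQUIRED_MODULES = ["trino", "pandas", "hdfs", "gravitino"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        # These would still need "pip install" (and possibly a kernel restart).
        print("Not pre-installed:", ", ".join(missing))
    else:
        print("All notebook dependencies are already installed.")
```

Because the check uses `importlib.util.find_spec`, it does not import the packages themselves, so it is cheap to run during an image build or at the top of a notebook.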
