caching option should use view when possible #81

Open

cccs-jc opened this issue Mar 7, 2023 · 0 comments

Collaborator

cccs-jc commented Mar 7, 2023

Currently, the DataFrame created by the sparksql magic is cached using
df.cache()

This API does not let us name the cached data. With the SQL version of this API, the cached data can be given a name, which then appears in the Spark UI.

When a user creates a cached view
%%sparksql --output skip --view user_view_name --cache --eager

we could cache it using the SQL API as follows:
df.createOrReplaceTempView("tmp_df_view")

spark.sql("CACHE TABLE user_view_name as select * from tmp_df_view")

spark.sql("DROP VIEW tmp_df_view")
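The three steps above could be wrapped in a small helper. This is only a minimal sketch: `cache_as_named_view` is a hypothetical name, not an existing function in sparksql magic, and the unique suffix on the intermediate view name is an added assumption to avoid clobbering a user view that happens to be called `tmp_df_view`. It assumes `spark` is a `SparkSession` and `df` is the DataFrame produced by the magic. It has not been checked here whether the named view stays usable after the intermediate view is dropped on every Spark version, so that step should be verified before adopting it.

```python
import uuid


def cache_as_named_view(spark, df, view_name):
    """Cache `df` under `view_name` through the SQL API (hypothetical helper).

    Mirrors the steps proposed in this issue: register an intermediate temp
    view, run CACHE TABLE ... AS SELECT so the cache carries `view_name`,
    then drop the intermediate view.
    """
    # Assumption: a random suffix keeps the intermediate name from colliding
    # with a view the user created themselves.
    tmp_view = f"tmp_df_view_{uuid.uuid4().hex}"
    df.createOrReplaceTempView(tmp_view)
    try:
        # CACHE TABLE <name> AS <query> creates a temp view called <name>
        # and caches its contents under that name.
        spark.sql(f"CACHE TABLE {view_name} AS SELECT * FROM {tmp_view}")
    finally:
        # Remove the intermediate view even if the CACHE TABLE statement fails.
        spark.sql(f"DROP VIEW IF EXISTS {tmp_view}")
    return tmp_view
```

With `--view user_view_name --cache`, the magic would then call `cache_as_named_view(spark, df, "user_view_name")` in place of `df.cache()`.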
