Replies: 1 comment
Main takeaways from the call:
-
Hi,
at my organization we have built a refined integration of Dagster Pipes with Databricks. It automatically uploads the script and bootstraps any resources it needs, such as libraries.
We have also built our own integration with EMR, which offers similar properties,
and both integrate neatly with local PySpark.
This turns the PaaS compute provider, such as DBR or EMR, into an implementation detail, similar to what is outlined here: https://georgheiler.com/2023/12/11/dagster-dbt-duckdb-as-new-local-mds. One can use, e.g., Photon on DBR where necessary, but otherwise run directly on EMR (i.e. without the DBU service fee). This makes cost optimization easy, because the same business logic can be executed in all three modes: locally for developer productivity, and on either of the two cloud PaaS engines.
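To make the idea concrete, here is a minimal sketch in plain Python, not our actual code and not the Dagster API. All names (`ComputeBackend`, `select_backend`, `run_business_logic`, `etl_job.py`) are hypothetical, and each backend only returns a description of what it would do instead of launching anything. The point is only that the business logic refers to a mode name, while the choice of local PySpark, Databricks, or EMR stays behind one interface.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class ComputeBackend(Protocol):
    """Hypothetical interface: anything that can run a PySpark script."""

    name: str

    def describe_launch(self, script_path: str) -> str: ...


@dataclass(frozen=True)
class LocalPySparkBackend:
    name: str = "local"

    def describe_launch(self, script_path: str) -> str:
        return f"run {script_path} in a local PySpark session"


@dataclass(frozen=True)
class DatabricksBackend:
    name: str = "databricks"
    use_photon: bool = False  # assumption: Photon is an opt-in per job

    def describe_launch(self, script_path: str) -> str:
        engine = "Photon" if self.use_photon else "standard runtime"
        return f"upload {script_path} and submit a Databricks job ({engine})"


@dataclass(frozen=True)
class EmrBackend:
    name: str = "emr"

    def describe_launch(self, script_path: str) -> str:
        return f"upload {script_path} and submit an EMR step"


# Registry of the three execution modes mentioned in the post.
BACKENDS: Dict[str, ComputeBackend] = {
    "local": LocalPySparkBackend(),
    "databricks": DatabricksBackend(use_photon=True),
    "emr": EmrBackend(),
}


def select_backend(mode: str) -> ComputeBackend:
    """Look up a backend by mode name; fail loudly on unknown modes."""
    try:
        return BACKENDS[mode]
    except KeyError:
        raise ValueError(f"unknown compute mode: {mode!r}") from None


def run_business_logic(mode: str, script_path: str = "etl_job.py") -> str:
    """The caller only names a mode; the provider is an implementation detail."""
    return select_backend(mode).describe_launch(script_path)


if __name__ == "__main__":
    for mode in ("local", "databricks", "emr"):
        print(mode, "->", run_business_logic(mode))
```

In a real setup the mode would typically come from deployment configuration (for example an environment variable), so that developers default to local execution and production picks whichever cloud engine is cheaper for the job at hand.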
We would be open to sharing it with the community, but I do not yet know what the best format would be, since certain assumptions in our code are only valid internally.
Would someone on the Dagster team be up for a video call so that we can discuss how an MR would need to be structured?