Add dagster_databricks package for Databricks integration #2468

Merged 25 commits on Jun 9, 2020
Commits
1971950
Add dagster-databricks package
sd2k May 18, 2020
eb5fc56
Reference Databricks docs in dagster-databricks configs module
sd2k May 19, 2020
ec0ffee
Move build_pyspark_zip into dagster_pyspark utils module
sd2k May 19, 2020
ec90193
Fix style/minor issues in dagster-databricks
sd2k May 19, 2020
554d210
Add references to Databricks storage docs in 'main' script
sd2k May 19, 2020
e3f0d8d
Add comment explaining global vars in databricks_step_main.py
sd2k May 20, 2020
8ba8a25
Fix Python 2 issues in dagster-databricks
sd2k May 21, 2020
69a4109
Check invariants when setting up storage in Databricks job
sd2k May 21, 2020
a758afc
Fix dependencies in dagster-databricks/tox.ini
sd2k May 21, 2020
45328be
Move 'secret_scope' field into inner credentials object to simplify D…
sd2k May 21, 2020
10a4ab8
isort dagster-databricks
sd2k Jun 4, 2020
2298e53
Add pylint to tox.ini for dagster_databricks
sd2k Jun 4, 2020
c23534f
Install dagster-databricks in 'make install_dev_python_modules'
sd2k Jun 4, 2020
d507426
Reference GitHub issue for better storage setup in databricks_step_ma…
sd2k Jun 4, 2020
c2ebb50
Uncomment dagster-azure related config
sd2k Jun 4, 2020
7ff55fe
Replace assert_called_once with call_count for Python3.5 compat
sd2k Jun 4, 2020
4f53e82
Fix lint errors in databricks.py
sd2k Jun 4, 2020
8a23cf0
Improve handling of libraries by including required libs by default
sd2k Jun 4, 2020
e2ab165
Fix version to match other dagster libraries
sd2k Jun 4, 2020
e0a7461
Specify supported_pythons to exclude Python 3.8 from dagster-databric…
sd2k Jun 5, 2020
d9adf92
Add README for dagster-databricks
sd2k Jun 5, 2020
f374fbc
Install dagster-databricks in dagster-examples tox.ini
sd2k Jun 5, 2020
4d704ce
Update snapshot test for dagster example using databricks
sd2k Jun 6, 2020
6c1bfc6
Add API docs for dagster_databricks
sd2k Jun 6, 2020
52eddd7
Add coveragerc for dagster-databricks
Jun 9, 2020
Reference GitHub issue for better storage setup in databricks_step_main.py
sd2k committed Jun 8, 2020

sd2k Ben Sully
commit d5074265b3626e52f1b39a9dcdd8b05cfc76ea7c
@@ -76,6 +76,9 @@ def setup_storage(step_run_ref):
     At least one of S3 or ADLS2 storage should be provided in config, so that the run can
     save intermediate files to a location accessible by the original process which launched
     the job.
+
+    This requires modifying the 'sc' global which isn't great.
+    https://github.com/dagster-io/dagster/issues/2492 tracks a better solution
     '''
     root_storage = step_run_ref.environment_dict['storage']
     check.invariant(
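
The hunk is cut off at the invariant check. As a rough illustration of the rule the docstring states (at least one of S3 or ADLS2 storage must be configured), here is a minimal standalone sketch. It is not the code from this commit: the helper name `pick_storage_backends`, the config keys `'s3'` and `'adls2'`, and the use of `ValueError` in place of `check.invariant` are all assumptions made for the example.

```python
# Hypothetical sketch, not the code from this PR.
# Assumes the storage config is a dict keyed by backend name and that the
# backend keys are 's3' and 'adls2' (the docstring only names S3 and ADLS2).

SUPPORTED_BACKENDS = ("s3", "adls2")


def pick_storage_backends(environment_dict):
    """Return the supported storage backends present in the run config.

    Raises ValueError when none is configured, mirroring the docstring's
    requirement that intermediates be saved somewhere the launching
    process can read them.
    """
    root_storage = environment_dict.get("storage", {})
    backends = [name for name in SUPPORTED_BACKENDS if name in root_storage]
    if not backends:
        raise ValueError(
            "No compatible storage found in config: expected one of "
            + ", ".join(SUPPORTED_BACKENDS)
        )
    return backends
```

A caller could then set up credentials only for the backends returned, which keeps the validation separate from the step that mutates the 'sc' global.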