Drops external outputs and updates external data guides #4574

Merged
merged 4 commits into from
Jun 8, 2023

@dberenbaum dberenbaum commented May 29, 2023

Part of #4513.

This drops external outputs but also makes other changes to the guides about external data:

  • Reorganizes into a page in data management about importing external data, and another one in pipelines about external deps and outs.
  • For importing external data, separates it from external dependencies into its own page and highlights how to avoid unnecessary downloads/uploads.
  • For external deps and outs, drops external cache, focuses on no-cache external outputs, and mentions plans to address lack of caching with cloud versioning.
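
For readers skimming the thread, a no-cache external output could look roughly like this in `dvc.yaml`. This is a minimal sketch, not taken from the PR diff: the stage name `upload`, the bucket `s3://mybucket`, and the file `data.txt` are hypothetical, and the `cache: false` entry is what `--outs-no-cache` is expected to produce.

```yaml
# Hypothetical stage: names, paths, and bucket are illustrative assumptions.
stages:
  upload:
    # Stream the result straight to external storage.
    cmd: aws s3 cp data.txt s3://mybucket/data.txt
    deps:
      - data.txt
    outs:
      # External output tracked without caching.
      - s3://mybucket/data.txt:
          cache: false
```

With `cache: false`, DVC records the output in the pipeline but keeps no cached copy of it, which is the lack of caching the last bullet refers to.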

Edit: closes #520

@dberenbaum dberenbaum requested review from a team May 29, 2023 13:50
locations outside your local <abbr>project</abbr>, or stream your data outputs
directly to some external location, like cloud storage or HDFS.

## How external dependencies work

The entire external dependencies section is moved from https://dvc.org/doc/user-guide/data-management/importing-external-data#how-external-dependencies-work and is otherwise untouched, except that the import examples are removed: they are covered in other guides and aren't directly relevant to pipeline dependencies.

@shcheklein shcheklein temporarily deployed to dvc-org-external-data-jczlbz7q May 29, 2023 13:54 Inactive

github-actions bot commented May 29, 2023

Link Check Report

There were no links to check!

@dberenbaum dberenbaum added the blocked Waiting/Blocked over some dependency label May 30, 2023
@dberenbaum

@efiop I'm hesitant to merge this because it relies on keeping --outs-no-cache, so maybe we should take a look and make sure this makes sense to implement first.

@dberenbaum

@efiop Still need to update the ref and other areas to drop --external everywhere.

@shcheklein shcheklein temporarily deployed to dvc-org-external-data-jczlbz7q June 7, 2023 19:34 Inactive
@dberenbaum

> @efiop Still need to update the ref and other areas to drop --external everywhere.

Done. PTAL 🙏.

@dberenbaum dberenbaum mentioned this pull request Jun 7, 2023
@efiop efiop merged commit e0625ed into main Jun 8, 2023
@efiop efiop deleted the external-data branch June 8, 2023 23:17
efiop added a commit to efiop/dvc that referenced this pull request Jun 9, 2023
efiop added a commit to efiop/dvc that referenced this pull request Jun 9, 2023
efiop added a commit to iterative/dvc that referenced this pull request Jun 9, 2023
Successfully merging this pull request may close these issues.

guide: consolidate external data mgmt guides
3 participants