Merge branch 'master' into 2020-04-08
jorgeorpinel committed Apr 16, 2020
2 parents 7279e17 + 8d9e2a3 commit fc8589f
Showing 22 changed files with 454 additions and 178 deletions.
2 changes: 2 additions & 0 deletions .eslintignore
@@ -0,0 +1,2 @@
public
.cache
4 changes: 3 additions & 1 deletion .eslintrc.json
@@ -37,9 +37,11 @@
"src/gatsby/**/*.js",
"src/components/PageWrapper/index.js",
"scripts/**/*.js",
"config/**/*.js",
"src/server/**/*.js",
"plugins/**/*.js",
"gatsby-*.js"
"gatsby-*.js",
"postcss.config.js"
],
"rules": {
"@typescript-eslint/no-var-requires": "off",
130 changes: 30 additions & 100 deletions README.md
@@ -3,119 +3,49 @@
[![Maintainability](https://api.codeclimate.com/v1/badges/5872e0a572ec8b74bd8d/maintainability)](https://codeclimate.com/github/iterative/dvc.org/maintainability)
[![CircleCI](https://circleci.com/gh/iterative/dvc.org.svg?style=svg)](https://circleci.com/gh/iterative/dvc.org)

[DVC](https://github.com/iterative/dvc) project documentation and website source
code.
[DVC](https://github.com/iterative/dvc) project website's source code.
[Documentation](https://dvc.org/doc) and [blog](https://dvc.org/blog) content.
Contributions are welcome!

## Installation
# Contributing Docs

Make sure you have the latest LTS version of [Node.js](https://nodejs.org) and
[Yarn](https://yarnpkg.com) installed.

Run `yarn`.

## Commands

- `yarn develop` - run dev server with hot reload.
- `yarn build` - build static assets to `public` folder.
- `yarn serve` - run static server over the `public` folder content to check
build results.
- `yarn lint-ts` - lint `.ts` and `.tsx` for compliance with code style and
check them for type errors.
- `yarn lint-css` - lint `.css` files for compliance with code style.

## ENV variables

- `GA_ID` – id of the Google Analytics counter.
- `ANALYZE` - boolean prop to run webpack-analyzer
- `SENTRY_DSN` - sentry dsn url for tracking errors

## Technologies

This docs engine was built using [Gatsby.js](https://www.gatsbyjs.org/).
[Algolia](https://www.algolia.com/products/search/) is used to provide full-text
search.
Please see our
[Contributing guide](https://dvc.org/doc/user-guide/contributing/docs) for more
details.

Please feel free to use it for your own sites and
[reach out to us](https://dvc.org/support) if you have any questions.
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/0)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/0)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/1)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/1)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/2)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/2)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/3)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/3)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/4)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/4)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/5)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/5)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/6)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/6)
[![](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/images/7)](https://sourcerer.io/fame/shcheklein/iterative/dvc.org/links/7)

# Contributing
# Writing a blog post

We welcome contributions to the DVC docs by the community!
Please, [contact us](mailto:[email protected]) if you'd like to write something
cool together or publish your content on our [blog](https://dvc.org/blog).

You can refer to our
[Contributing guide](https://dvc.org/doc/user-guide/contributing/docs) for more
details. Thank you!
Please see our
[Writing a Blog Post guide](https://dvc.org/doc/user-guide/contributing/blog)
for more details on how to write and submit a new blog post.

**If you need help:**
# Getting help

If you have any questions, please join the [community](https://dvc.org/chat) and
use the `#dev-docs` channel to discuss any issues in our website or docs. We are
very responsive and happy to help.

# Writing blog posts

Create a `.md` file in the `content/blog` folder. The file name will be used as
the `slug`.

Write frontmatter in the following format:

```yml
---
title: Hello World
date: '2015-05-01T22:12:03.284Z'
description: 'Hello World'
descriptionLong: |
  Some long
  multiline
  text
picture: /uploads/image.jpeg
pictureComment: Some Comment
author: ../authors/author_name.md
tags:
- Open Source
- Machine Learning
- Data Science
- Version Control
- AI
---

```

- `title` - **Required.** Title of the post.
- `date` - **Required.** Post date. Will be used to sort posts and in RSS.
- `description` - **Required.** Short description to show in the feed.
- `descriptionLong` - Optional long description to show before image on the post
page. If not set, `description` will be used instead.
- `picture` - Optional cover image.
- `pictureComment` - Optional cover image comment.
- `author` - **Required** Relative path to `.md` file with information about the
author.
- `tags` - Optional list of tags.
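
The field rules above could be checked with a small script before a post is submitted. The following is only a sketch: the function names and the plain-dict input are assumptions made for illustration, not part of the site's build.

```python
# Hypothetical helpers: the field names come from the list above, while the
# function names and the dict-based input are illustrative assumptions.
REQUIRED_FIELDS = ("title", "date", "description", "author")


def missing_required_fields(frontmatter):
    """Return the required frontmatter keys that are absent or empty."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(frontmatter.get(field) or "").strip()
    ]


def long_description(frontmatter):
    """Fall back to `description` when `descriptionLong` is not set."""
    return frontmatter.get("descriptionLong") or frontmatter.get("description", "")
```

Passing the parsed frontmatter of a post as a dictionary returns the list of required fields that still need to be filled in; an empty list means the post has everything marked **Required**.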

## Adding authors

Create `.md` file in the `content/authors` folder.

Write frontmatter in the following format:

```yml
path: ../authors/relative_path_to_file.md
name: Author's Name
avatar: /uploads/avatar.jpeg
```
- `path` - **Required** String that the CMS will insert to the author field in
the blog post. Should be equal to the path from the blog post to the author's
`.md` file.
- `name` – **Required** Author's name.
- `avatar` - **Required** relative path to the author's avatar.
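
Since `path` should equal the path from the blog post to the author's file, the value can be derived rather than typed by hand. This is a minimal sketch; the helper name and the example file names are hypothetical.

```python
import posixpath


def author_field(post_file, author_file):
    """Path from the blog post's folder to the author's `.md` file.

    Hypothetical helper: both arguments are repository-relative paths
    written with forward slashes.
    """
    return posixpath.relpath(author_file, start=posixpath.dirname(post_file))
```

For a post at `content/blog/hello-world.md` and an author file at `content/authors/author_name.md`, this yields the `../authors/author_name.md` form used in the frontmatter example.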

# Copyright

This project is distributed under the Apache license version 2.0 (see the
LICENSE file in the project root).
Source code of this project is distributed under the Apache license version 2.0
(see the LICENSE file in the project root).

By submitting a pull request for this project, you agree to license your
contribution under the Apache license version 2.0 to this project.
Except where otherwise noted, documentation, blog content, and images are licensed
under a [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) license.

If you use images, please make a reference to the original site.
By submitting a pull request for this project, you agree to license your
contribution under the Apache license 2.0 (source code) or CC BY 4.0
(documentation). Exceptions could be made to content and any other materials
that are published on our [blog](https://dvc.org/blog).
6 changes: 3 additions & 3 deletions content/blog/2019-06-26-june-19-dvc-heartbeat.md
@@ -104,10 +104,10 @@ image="/uploads/images/2019-06-26/first-impressions-of-data-science-version-cont
> this mix offering a cleaner solution, specifically targeting Data Science
> challenges.
- **[Versioning and Reproducibility with MLV-tools and DVC](https://github.com/peopledoc/mlv-tools-tutorial):
[Talk](https://peopledoc.github.io/mlv-tools-tutorial/talks/pyData/presentation.html#/)
- **[Versioning and Reproducibility with MLV-tools and DVC](https://github.com/peopledoc/mlvtools-tutorial):
[Talk](https://peopledoc.github.io/mlvtools-tutorial/talks/pyData/presentation.html#/)
and
[Tutorial](https://peopledoc.github.io/mlv-tools-tutorial/talks/workshop/presentation.html#/)
[Tutorial](https://peopledoc.github.io/mlvtools-tutorial/talks/workshop/presentation.html#/)
by [Stéphanie Bracaloni](https://github.com/sbracaloni) and
[Sarah Diot-Girard](https://github.com/SdgJlbl).**

1 change: 1 addition & 0 deletions content/blog/2020-03-31-reimagining-devops-video.md
@@ -16,6 +16,7 @@ tags:
- CI/CD
- DevOps
- MLOps
- DivOps
---

Last week, DVC was part of [DivOps](https://divops.org/), a fully remote
172 changes: 172 additions & 0 deletions content/blog/2020-04-06-april-20-dvc-heartbeat.md
@@ -0,0 +1,172 @@
---
title: April '20 DVC❤️Heartbeat
date: 2020-04-06
description: |
  Catch up on new DVC releases, talks, and projects in our community.
  This month, learn what we're up to in MLOps, CI/CD, and the
  intersection of data science and software engineering.
descriptionLong: |
  Every month we share news, findings, interesting reads, community takeaways,
  and everything else along the way.
  Look here for updates about [DVC](https://dvc.org), our journey as a startup,
  projects by our users and big ideas about best practices in ML and data
  science.
picture: ../../static/uploads/images/2020-04-06/april_header.png
pictureComment:
  A view from [Barrancas del
  Cobre](https://en.wikipedia.org/wiki/Copper_Canyon), shot by Jorge Orpinel
  Pérez. Jorge has mastered the art of working on DVC remotely.
author: ../authors/elle_obrien.md
commentsUrl: https://discuss.dvc.org/t/april-20-heartbeat/347
tags:
- Heartbeat
- Google Drive
- MLOps
- CI/CD
- Podcast
- DivOps
---

Welcome to the April Heartbeat, our
[monthly roundup of cool happenings](https://dvc-landing-april-heart-6d0onb.herokuapp.com/tags/heartbeat),
good reads and other bright spots in our community.

## News

**Adapting to the pandemic.** Although the world seems different than when we
posted last month, the DVC community is steady and strong. As a predominantly
distributed company, we've been developing our infrastructure for remote work
from the get-go. It isn't always _easy_ to schedule an all-hands meeting across
9 time zones but we make it work. This experience has prepared us well for the
COVID-19 pandemic: although there are new challenges (like caring for families
while working from home) we've been able to weather the transition to fully
remote work relatively well.

![](/static/uploads/images/2020-04-06/laptop_on_boat.jpeg)_Before social
distancing started, DVC technical writer Jorge Orpinel Pérez worked from a
canoe. Check out more photos from his workations
[on Instagram](https://www.instagram.com/workationer/)._

**DVC sponsors DivOps.** In a time when many conferences are going remote out of
necessity, we were fortunate to be part of an _intentionally_ remote conference
this month! We sponsored [DivOps](https://divops.org/), a fully-online meeting
led by women in DevOps. The DivOps lineup included speakers from GitHub,
Dropbox, Gremlin and more. DVC data scientist Elle (that's me!) gave a
ten-minute talk about MLOps and CI/CD, so
[please check out the video](https://dvc.org/blog/reimagining-devops-video).
Another very relevant talk was from Anna Petrovicheva, CEO of
[Xperience AI](http://xperience.ai/); Anna
[spoke about her team's development workflow for deep learning projects](https://youtu.be/8nwpCQufeE0)
and gave a clear overview of how they use DVC.

**DVC on the airwaves.** In early March, Elle was interviewed on an episode of
[The Data Stream podcast](https://open.spotify.com/show/5w4sAKB0fT6lGCELZAMIBh)
about a DVC data science project,
[building a public dataset of posts](https://dvc.org/blog/a-public-reddit-dataset)
from the "Am I the Asshole?" subreddit.

<external-link
href="https://open.spotify.com/episode/5JzIZLqnTF5aDh2B6UTemo?si=6LTQxq4xSDe0vhTpSLs1Jw"
title="The Data Stream #3 - Who is the A-hole? With Elle"
description="Ever wonder if it's possible to train a model to discover whether your friends are assholes or not? Today Elle comes on the show to talk about her project building a classifier to predict the results from reddit's hottest advice community: Am I the Asshole (or AITA for short)."
link="spotify.com"
image="/uploads/images/2020-04-06/data_stream.png"/>

## New releases

This month, DVC has
[released some new features](https://github.com/iterative/dvc/releases) and
updates:

- Did you know you can use Google Drive for remote storage with DVC? We've been
hard at work delivering the best performance with Google Drive and are
thrilled to invite users to try it out. Brand new
[docs](https://dvc.org/doc/user-guide/setup-google-drive-remote#setup-a-google-drive-dvc-remote)
explain how to get started.
- We're introducing the `metrics diff` functionality, which lets you compare
metrics from different commits side-by-side
([check out the docs](https://dvc.org/doc/command-reference/metrics/diff) to
learn more).
- Windows users, we are here for you. Contributor
[Charles Baynham](https://github.com/charlesbaynham) helped us get better
performance on copy operations in Windows.

## From the community

**DVC and R working together.** One of our favorite blogs this month came from
Marcel Ribeiro-Dantas, a developer and PhD student at the
[Institut Curie](https://institut-curie.org/). Marcel wrote about using DVC to
manage projects in R, particularly defining and versioning pipelines of data
processing and analysis that can be reproduced easily. While DVC is language
agnostic, much of our user content has been Python-centric, so it's exciting to
see a detailed post for the R-using data scientist (for more about R with DVC,
see
[Marija Ilić's post](https://dvc.org/blog/r-code-and-reproducible-model-development-with-dvc))!

<external-link
href="https://mribeirodantas.xyz/blog/index.php/2020/03/05/r-dvc-and-rmarkdown/"
title="Manage your Data Science Project in R"
description="A simple project tutorial with R/RMarkdown, Packrat, Git, and DVC."
link="mribeirodantas.xyz"
image="/uploads/images/2020-04-06/marcel.jpeg"/>

Also, Marcel recently gave an interview on
[The Data Hackers Podcast](https://medium.com/data-hackers/health-data-e-o-coronav%C3%ADrus-data-hackers-podcast-22-2b059d460cb1),
a Portuguese-language show. Listen for a shout-out about DVC!

**DVC is in another book!** Last month we reported that DVC is part of a Packt
book,
["Learn Python by Building Data Science Applications"](https://www.packtpub.com/programming/learn-python-by-building-data-science-applications).
This month, DVC got a mention in a just-released O'Reilly book,
["Building Machine Learning Pipelines"](https://www.oreilly.com/library/view/building-machine-learning/9781492053187/)
by Hannes Hapke and Catherine Nelson.

<external-link
href="https://www.oreilly.com/library/view/building-machine-learning/9781492053187/"
title="Building Machine Learning Pipelines"
description="Automating Model Life Cycles with TensorFlow"
link="oreilly.com"
image="/uploads/images/2020-04-06/oreilly.jpeg"/>

**Some more links we like.** Here are a few other discussions that have caught
our attention.

- **MLOps can be fun.** Jeroen France's blog, "MLOps: Not as boring as it
sounds!", reads like a "coming of age" story about embracing engineering as a
data scientist. It's part motivational, part tutorial, and definitely worth a
read. Here's a sample:

> No-one wants to baby-sit, maintain, and troubleshoot their own models once
> they are in production. Every data scientist secretly hopes they can pawn
> that job off to an engineering team, or maybe an intern, right? Well, in
> fact MLOps is going to make your data science life a lot better.
- **Leveling up your Jupyter notebooks.** In a series called
["How to Use Jupyter Notebooks in 2020"](https://ljvmiranda921.github.io/notebook/2020/03/16/jupyter-notebooks-in-2020-part-2/),
Lj Miranda discusses how to use Jupyter Notebooks in a mature software
development workflow. He makes several recommendations for tools, including
DVC.

- **Reddit discussion about CI/CD.** When we shared our DivOps conference
presentation on Reddit, some
[great discussion happened](https://www.reddit.com/r/MachineLearning/comments/fshh9p/p_a_talk_about_adapting_cicd_systems_for_ml_full/).
We chatted about how CI/CD might work for data scientists, who often begin a
project with a phase of rapid exploration, and what version control for ML
could look like without Git.

- **Smashing the data monolith.** Engineer Juan López López wrote a blog called
["A complete guide about how to break the data monolith"](https://medium.com/packlinkeng/a-complete-guide-about-how-to-break-the-data-monolith-caa2ab2d01f6),
which is a neat manifesto about treating infrastructure _and_ data as code.
It's got nice coverage of DVC, code examples, and some deeply enjoyable
artwork.

![](/static/uploads/images/2020-04-06/monolith.jpeg)_From Juan López
López's
[blog](https://medium.com/packlinkeng/a-complete-guide-about-how-to-break-the-data-monolith-caa2ab2d01f6)._

Thanks for reading. As always, let us know what you're making with DVC and what
links are catching your interest in the blog comments, on
[Twitter](https://twitter.com/DVCorg), and in our
[Discord channel](https://dvc.org/chat). Be safe and be in touch!
6 changes: 1 addition & 5 deletions content/docs/command-reference/run.md
@@ -9,7 +9,7 @@ command and execute the command.
usage: dvc run [-h] [-q | -v] [-d <path>] [-o <path>] [-O <path>]
[-p <params>] [-m <path>] [-M <path>] [-f <filename>]
[-w <path>] [--no-exec] [-y] [--overwrite-dvcfile]
[--ignore-build-cache] [--remove-outs] [--no-commit]
[--ignore-build-cache] [--no-commit]
[--outs-persist <path>] [--outs-persist-no-cache <path>]
[--always-changed]
command
@@ -167,10 +167,6 @@ data pipeline (e.g. random numbers, time functions, hardware dependency, etc.)
command's code is non-deterministic (meaning it produces different outputs
from the same list of inputs).

- `--remove-outs` (_deprecated_) - remove stage outputs before executing the
`command`. If `--no-exec` is specified, outputs are removed anyway. See
`dvc remove` as well for more details. **This is the default behavior.**

- `--no-commit` - do not save outputs to cache. A DVC-file is created and an
entry is added to `.dvc/state`, while nothing is added to the cache.
(`dvc status` will report that the file is `not in cache`.) Use `dvc commit`
4 changes: 4 additions & 0 deletions content/docs/sidebar.json
@@ -153,6 +153,10 @@
{
"label": "Docs and Website",
"slug": "docs"
},
{
"label": "Writing Blog Posts",
"slug": "blog"
}
]
},
2 changes: 1 addition & 1 deletion content/docs/understanding-dvc/resources.md
@@ -41,4 +41,4 @@

## Slides

- [From ML experiments to production: Versioning and Reproducibility with MLV-too](https://peopledoc.github.io/mlv-tools-tutorial/talks/pyData/presentation.html#/)
- [From ML experiments to production: Versioning and Reproducibility with MLV-too](https://peopledoc.github.io/mlvtools-tutorial/talks/pyData/presentation.html#/)