docs: add "machine learning pipeline" references, expand context in cmd ref, user guide [SEO] #1915

Merged 4 commits on Nov 10, 2020

Contributor

@jeremydesroches commented Nov 9, 2020

Follow-up on suggested changes from #1857:

It's a good idea to address the term "machine learning pipelines" and add context across `dag`, `repro`, `run`, and Files and Directories.

  • dag terms added
  • repro terms added
  • run terms added
  • Files and Directories terms added

Refer to #550 for follow-up on core document creation for pipeline concept.

@jeremydesroches jeremydesroches added A: docs Area: user documentation (gatsby-theme-iterative) A: website Area: website labels Nov 9, 2020
@shcheklein shcheklein temporarily deployed to dvc-landing-pipelines-s-fbrmda November 9, 2020 05:31 Inactive
@jeremydesroches
Contributor Author

@jorgeorpinel It seems to me that `run` could be the best place to expand, and maybe add or modify an example to cater to "machine learning pipeline"... what do you think?

We don't have any clicks or traffic from "machine learning pipeline" searches to guide us (since that's what we're trying to get!) but I'm trying to address that situation and flow... where would it make the most sense to land, for someone who searched for those terms?

Comment on lines 178 to 179
For simplicity, let's build a pipeline defined below. (If you want get your
hands-on something more real, see this short
[pipeline tutorial](/doc/start/data-pipelines)). It takes this `text.txt` file:
To get hands-on experience with data science and machine learning pipelines, see
[Get Started: Data Pipelines](/doc/start/data-pipelines).
Contributor

@jorgeorpinel Nov 9, 2020

This whole change is out of scope but OK. Should we move this link up somewhere in the description, actually? Not in examples

Contributor

p.s. It can stay here too if it makes less sense to have it in the desc.

Contributor Author

Yes, much better idea. Moved it up into the description and that should be better for SEO purposes too. PTAL

Contributor

Thanks. It feels to me like it's not in a natural place right now though. It interrupts the description flow. Maybe before Parallel stage execution? But again, it can stay in the top of Examples if there's no better place. It should also probably be a note (md block quote).

Contributor Author

> But again, it can stay in the top of Examples if there's no better place. It should also probably be a note (md block quote).

OK @jorgeorpinel, submitted #1937 to fix this.

Contributor

@jorgeorpinel left a comment

Other than the comments above, the repro cmd ref has 25 instances of "pipeline" but nowhere was the target term added? Same for run (9 existing instances).

@jorgeorpinel
Contributor

jorgeorpinel commented Nov 9, 2020

> which doc of above four should be core/hub

What about https://dvc.org/doc/start/data-pipelines though? It probably gets more traffic than all the other docs combined. It could even be the main hub... although I incline more for trying to use https://dvc.org/doc/command-reference/dag for now.

We need to extract important concepts like "pipelines" out of the cmd ref eventually anyway, into a new "basic concepts" section, see #550 (I even tried and failed to start it in #1655) — if you consider it very SEO-relevant you could go for it after this, @jeremydesroches

@shcheklein shcheklein temporarily deployed to dvc-landing-pipelines-s-fbrmda November 10, 2020 05:33 Inactive
@shcheklein shcheklein temporarily deployed to dvc-landing-pipelines-s-fbrmda November 10, 2020 06:02 Inactive
@jeremydesroches
Contributor Author

> Other than the comments above, the repro cmd ref has 25 instances of "pipeline" but nowhere was the target term added? Same for run (9 existing instances).

I added the target terms in one location for both of those, which is enough for these supporting docs. Check line 33 in repro and line 29 in run.

The core pipelines piece should have more mentions in appropriate context, but there isn't an SEO benefit to extraneous mentions of the target terms.

@jeremydesroches
Contributor Author

jeremydesroches commented Nov 10, 2020

> What about https://dvc.org/doc/start/data-pipelines though? I incline more for trying to use https://dvc.org/doc/command-reference/dag for now.
>
> We need to extract important concepts like "pipelines" out of the cmd ref eventually anyway, see #550

Get Started: Data Pipelines does get more traffic, but I should have put it more simply: a conceptual doc for pipelines is needed. Totally in line with your next thought... nice idea, @jorgeorpinel!

It is very SEO-relevant... there would need to be a separate doc for each concept to maximize the SEO impact. Concept docs would help with the search results for pipelines (and other topics).

Right now, we're trying to land people who made a general search for a tool/concept on either a get started or a command reference, and that's not quite right. Having an intermediate step would make the description and term insertion more natural, but more importantly it would increase the likelihood they learn more about that concept as related to DVC (and don't bounce).

I'll think on the other topics that could work and propose my ideas for a concept section.

@jorgeorpinel
Contributor

> Concept docs would help with the search results for pipelines (and other topics).
>
> we're trying to land people who made a general search for a tool/concept... increase the likelihood they learn more about that concept as related to DVC (and don't bounce).

OK we just need to figure out a good way to add these landing pages. Should it be one guide with a section per concept? A subsection under Use Cases (which are kind of landing pages right now)? Please follow up in #550

Contributor

@jorgeorpinel left a comment

@jorgeorpinel merged commit 618aae7 into master on Nov 10, 2020
@jorgeorpinel
Contributor

Oh p.s. can you set an alert to follow up on traffic/searches for this term and specific docs? That would be great.

@jeremydesroches
Contributor Author

> Oh p.s. can you set an alert to follow up on traffic/searches for this term and specific docs? That would be great.

Yep, added these to my list of changed docs for SEO. I'll check back in one week and again after two to see where it's at.

@jorgeorpinel
Contributor

> I'll check back in one week

Cool, thanks Jeremy, please let us know when/if you see a change in traffic/SRs

@jeremydesroches
Contributor Author

jeremydesroches commented Dec 6, 2020

Final check on this as requested for GSoD, @jorgeorpinel:

/command-reference/dag - Link - Improved across all metrics including 35% more clicks. Notable new impressions from general "dag", "pipeline", and "machine learning" -related terms. 👍

/command-reference/repro - Link - Improved across all metrics including 8% more clicks. Quite a few new terms appeared in impressions, some related to "pipelines" as intended and some look like the search engine "auditioning" the page for other related terms. 👍

/command-reference/run - Link - 17% more clicks, 22% improved clickthrough rate, and average position improved from 10.5 to 8.9. 5% lower impressions. This page is indexed for more than 3x the number of "long tail" terms that run and repro command references are. So there is likely more potential here outside of the terms in this PR. 👍

/user-guide/dvc-files-and-directories - Link - 13% more clicks, 15% improved clickthrough rate. Impressions down by 3% and average position dropped from 20.3 to 21.4. Very strong uptrend in clicks and clickthrough rate after Nov. 22, and impressions are following. This page also indexes for nearly 400 "long tail" terms, which may help with deciding on how to move/organize content for the files/metafiles/cache basic concepts section in #1944. 👍
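
For anyone reading the numbers above, here is a minimal sketch of how the two average-position changes translate into percentages. It assumes a plain relative change between the earlier and later value; `relative_change` is a hypothetical helper for illustration, not something taken from the search reporting tool, and the comparison periods themselves are not stated in this thread.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    if before == 0:
        raise ValueError("relative change is undefined when the baseline is 0")
    return (after - before) / before * 100.0


# Average positions quoted in this comment. For search rankings a lower
# position is better, so a negative change is an improvement.
run_position = relative_change(10.5, 8.9)
files_position = relative_change(20.3, 21.4)

print(f"/command-reference/run: {run_position:+.1f}%")
print(f"/user-guide/dvc-files-and-directories: {files_position:+.1f}%")
```

Under that assumption, the `run` page's average position moved about 15% closer to the top, while the Files and Directories page slipped by roughly 5%.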

@jorgeorpinel
Contributor

Pretty good results for command references, keeping in mind we don't really expect these to be landing pages. Also, the term "dvc pipeline" seems like an interesting one to explore.
