chore: update generate fabric doc #2214

Merged: 12 commits merged into microsoft:master on May 21, 2024

JessicaXYWang
Contributor

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Briefly describe the changes included in this Pull Request.

How is this patch tested?

  • I have written tests (not required for typo or doc fix) and confirmed the proposed feature/bug-fix/change works.

Does this PR change any dependencies?

  • No. You can skip this section.
  • Yes. Make sure the dependencies are resolved correctly, and list changes here.

Does this PR add a new feature? If so, have you added samples on website?

  • No. You can skip this section.
  • Yes. Make sure you have added samples following the steps below.
  1. Find the corresponding markdown file for your new feature in the website/docs/documentation folder.
    Make sure you choose the correct class (estimators/transformers) and namespace.
  2. Follow the pattern in the markdown file and add another section for your new API, including PySpark and Scala (and potentially .NET) samples.
  3. Make sure the DocTable points to the correct API link.
  4. Navigate to the website folder and run yarn run start to make sure the website renders correctly.
  5. Don't forget to add <!--pytest-codeblocks:cont--> before each Python code block to enable auto-tests for Python samples.
  6. Make sure the WebsiteSamplesTests job passes in the pipeline.

@JessicaXYWang
Contributor Author

@JessicaXYWang changed the title from "update generate fabric doc" to "chore: update generate fabric doc" on May 2, 2024
@JessicaXYWang
Contributor Author

/azp run

Azure Pipelines failed to run 1 pipeline(s).

self.input_dir = input_dir
self.output_dir = output_dir
self.notebooks = notebooks
self.output_structure = kwargs.get("output_structure", "hierarchy")
Collaborator

does it need to be kwargs or can we add an output structure argument explicitly here?
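
A minimal sketch of what the explicit-argument version could look like. The class name `DocChannel` is hypothetical, since the enclosing class is not shown in this excerpt, and the value "flat" is an assumed example; only the four attribute names and the "hierarchy" default come from the snippet above.

```python
class DocChannel:
    """Hypothetical stand-in for the class under review."""

    def __init__(self, input_dir, output_dir, notebooks, output_structure="hierarchy"):
        self.input_dir = input_dir
        self.output_dir = output_dir
        self.notebooks = notebooks
        # Explicit keyword argument instead of kwargs.get("output_structure", "hierarchy").
        self.output_structure = output_structure


# Omitting the argument keeps the default from the original snippet.
default_channel = DocChannel("docs", "out", ["a.ipynb"])

# "flat" is an assumed value used only for illustration.
custom_channel = DocChannel("docs", "out", ["a.ipynb"], output_structure="flat")
```

With an explicit parameter, the default is visible in the signature and a misspelled keyword raises a TypeError instead of being silently ignored.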

import shutil

class LearnDocPreprocessor(Preprocessor):
    def __init__(self, remove_tags=None, **kwargs):
Collaborator

nit: perhaps call this tags_to_remove so it doesn't get confused with a boolean variable
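
A rough sketch of the suggested rename, under stated assumptions: the `Preprocessor` class below is a stand-in stub so the example runs on its own (the real base class is not shown in this excerpt), and storing the tags as a list is an illustrative choice, not the PR's actual implementation.

```python
class Preprocessor:
    """Stand-in stub for the real base class, which is not shown here."""

    def __init__(self, **kwargs):
        self.options = kwargs


class LearnDocPreprocessor(Preprocessor):
    def __init__(self, tags_to_remove=None, **kwargs):
        super().__init__(**kwargs)
        # The plural noun phrase signals a collection of tags, not an on/off flag.
        # Copy into a new list so a None default never becomes shared mutable state.
        self.tags_to_remove = list(tags_to_remove or [])


no_tags = LearnDocPreprocessor()
with_tags = LearnDocPreprocessor(tags_to_remove=["hide-cell"])  # tag name is made up
```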

Comment on lines 61 to 62
self.input_dir = self.attributes.get("input_dir", None) # access local images
self.notebook_path = self.attributes.get("notebook_path", None) # access local images
Collaborator

nit: are these comments still relevant here?

@@ -5,7 +5,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Classification - before and after SynapseML"
+"# Classification tasks- SparkML vs SynapseML"
Collaborator

Suggested change
-"# Classification tasks- SparkML vs SynapseML"
+"# Classification - SparkML vs SynapseML"

True,
)
content = self._read_rst(full_input_file)
# TODO: Not tested yet
Collaborator

this still applicable?

@@ -44,22 +49,22 @@ channels:
 - path: Explore Algorithms/Classification/Quickstart - SparkML vs SynapseML.ipynb
 filename: classification-before-and-after-synapseml
 metadata:
-title: Classification - before and after SynapseML
+title: Classification tasks using SynapseML
Collaborator

Suggested change
-title: Classification tasks using SynapseML
+title: Classification using SynapseML

]
},
"source": [
"see our [embedding guide](./Quickstart%20-%20OpenAI%20Embedding)."
Collaborator

Suggested change
-"see our [embedding guide](./Quickstart%20-%20OpenAI%20Embedding)."
+"For more information on using `OpenAIEmbedding` see our [embedding guide](./Quickstart%20-%20OpenAI%20Embedding)."

@@ -322,7 +322,18 @@
 "source": [
 "### Generating Text Embeddings\n",
 "\n",
-"In addition to completing text, we can also embed text for use in downstream algorithms or vector retrieval architectures. Creating embeddings allows you to search and retrieve documents from large collections and can be used when prompt engineering isn't sufficient for the task. For more information on using `OpenAIEmbedding`, see our [embedding guide](./Quickstart%20-%20OpenAI%20Embedding)."
+"In addition to completing text, we can also embed text for use in downstream algorithms or vector retrieval architectures. Creating embeddings allows you to search and retrieve documents from large collections and can be used when prompt engineering isn't sufficient for the task. For more information on using `OpenAIEmbedding`."
Collaborator

Suggested change
-"In addition to completing text, we can also embed text for use in downstream algorithms or vector retrieval architectures. Creating embeddings allows you to search and retrieve documents from large collections and can be used when prompt engineering isn't sufficient for the task. For more information on using `OpenAIEmbedding`."
+"In addition to completing text, we can also embed text for use in downstream algorithms or vector retrieval architectures. Creating embeddings allows you to search and retrieve documents from large collections and can be used when prompt engineering isn't sufficient for the task."

@JessicaXYWang
Contributor Author

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter
codecov-commenter commented May 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.22%. Comparing base (5c37342) to head (09117c6).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2214      +/-   ##
==========================================
+ Coverage   86.19%   86.22%   +0.03%     
==========================================
  Files         327      327              
  Lines       16715    16715              
  Branches     1495     1495              
==========================================
+ Hits        14407    14413       +6     
+ Misses       2308     2302       -6     

☔ View full report in Codecov by Sentry.

@JessicaXYWang
Contributor Author

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@JessicaXYWang marked this pull request as ready for review on May 13, 2024 at 14:37
@JessicaXYWang
Contributor Author

@mhamilton723
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723 merged commit a8820fe into microsoft:master on May 21, 2024
65 of 68 checks passed
3 participants