Python Markdown extension for VS Code #1655

Closed
stefanuddenberg opened this issue Dec 8, 2019 · 11 comments


@stefanuddenberg

Feature Request: "Python Markdown" in Markdown cells

Description

It would be nice to have the equivalent of the "Python Markdown" extension for Jupyter in VS Code. So within a markdown cell, you could put arbitrary Python code between two curly braces which would be executed and rendered in-line. This is very useful when writing quantitative scientific reports.
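
For readers who have not used the Jupyter extension, here is a minimal sketch of the behavior being requested, assuming expressions are written between doubled curly braces. The function name `render_inline` and the regular expression are illustrative assumptions, not part of any existing VS Code or Jupyter API.

```python
import re

# Matches {{ expression }} on a single line; this pattern is an assumption.
INLINE_EXPR = re.compile(r"\{\{(.+?)\}\}")


def render_inline(markdown_text, namespace):
    """Replace each {{ expr }} with str(eval(expr)) using the given namespace."""

    def substitute(match):
        expression = match.group(1).strip()
        # eval runs arbitrary code, so only use it on documents you trust.
        return str(eval(expression, {}, namespace))

    return INLINE_EXPR.sub(substitute, markdown_text)


cell = "The sample had {{ n }} participants (mean age {{ round(total_age / n, 1) }})."
print(render_inline(cell, {"n": 4, "total_age": 98}))
```

In a notebook the namespace would presumably be the live kernel state rather than a hand-built dictionary.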

Microsoft Data Science for VS Code Engineering Team: @rchiodo, @IanMatthewHuff, @DavidKutu, @DonJayamanne, @greazer

@DavidKutu

Thanks for the suggestion! We'll look into it.

@JimCallahanOrlando

JimCallahanOrlando commented May 30, 2020

Markdown for Data Science
RStudio users use R Markdown, which appears to be Pandoc markdown
with the "Extension: fenced_code_blocks"; specifically, a fence of three backticks " ``` "
(on a US keyboard, the unshifted tilde key) followed by a language name such as "Haskell", "R" or "Python".
https://rmarkdown.rstudio.com/authoring_pandoc_markdown.html%23raw-tex#Verbatim_(code)_blocks

Within RStudio these "fenced_code_blocks" take chunk options that control behavior, such as whether to run the code and include its output (eval=TRUE) and whether to display the source code (echo=FALSE hides it).

When turning in homework for the Coursera Data Science courses we were admonished, as in math class, to "show all our work", which meant displaying all of the source code with echo=TRUE. More generally, depending on the audience, readers may or may not care how the sausage was made.

The code-chunk behavior layered on top of Pandoc markdown seems to be handled by the R package knitr.
"You write your document in markdown and embed executable R code chunks with the knitR syntax."
https://rmarkdown.rstudio.com/articles_intro.html

So, for data science, it would be helpful for VS Code to have Pandoc-compatible markdown with "fenced_code_blocks" processed by a language pre-processor/interpreter, which in the case of R would be the knitr package plus the R interpreter. I don't know if there is an equivalent of knitr for Python.

An "R-centric" way of handling Python would be to use the R knitr and reticulate packages.
"The reticulate package includes a Python engine for R Markdown that enables easy interoperability between Python and R chunks."

It is not clear what the "Python engine for R Markdown" is but there are some clues as to how it must be installed in the R Reticulate package.
https://cran.r-project.org/web/packages/reticulate/vignettes/python_dependencies.html

Here is the GitHub for Reticulate
https://github.com/rstudio/reticulate

Reticulate is just the R/Python interface; the pluggable engines seem to be provided by the knitr package, specifically knitr's "knit_engines" object:

"knit_engines: Engines of other languages
...
This object controls how to execute the code from languages other than R (when the chunk option engine is not 'R'). Each component in this object is a function that takes a list of current chunk options (including the source code) and returns a character string to be written into the output.
...
See str(knitr::knit_engines$get()) for a list of built-in language engines."

# R session
> library(knitr)
> str(knitr::knit_engines$get())
List of 41
 $ awk      :function (options)  
 $ bash     :function (options)  
 $ coffee   :function (options)  
 $ gawk     :function (options)  
 $ groovy   :function (options)  
 $ haskell  :function (options)  
 $ lein     :function (options)  
 $ mysql    :function (options)  
 $ node     :function (options)  
 $ octave   :function (options)  
 $ perl     :function (options)  
 $ psql     :function (options)  
 $ Rscript  :function (options)  
 $ ruby     :function (options)  
 $ sas      :function (options)  
 $ scala    :function (options)  
 $ sed      :function (options)  
 $ sh       :function (options)  
 $ stata    :function (options)  
 $ zsh      :function (options)  
 $ highlight:function (options)  
 $ Rcpp     :function (options)  
 $ tikz     :function (options)  
 $ dot      :function (options)  
 $ c        :function (options)  
 $ fortran  :function (options)  
 $ fortran95:function (options)  
 $ asy      :function (options)  
 $ cat      :function (options)  
 $ asis     :function (options)  
 $ stan     :function (options)  
 $ block    :function (options)  
 $ block2   :function (options)  
 $ js       :function (options)  
 $ css      :function (options)  
 $ sql      :function (options)  
 $ go       :function (options)  
 $ python   :function (options)  
 $ julia    :function (options)  
 $ sass     :function (options)  
 $ scss     :function (options)  

https://rstudio.github.io/reticulate/articles/r_markdown.html

NOTE: "RScript" is R so it is not clear how R is being handled.

So, from a data science perspective the VS Code handling of markdown needs to:

  1. Be compatible with Pandoc markdown
  2. Have a way of outputting to various formats by calling Pandoc in the background
  3. Implement the codeblock extension in Pandoc markdown
  4. Have languages in the codeblock interpreted using a switching mechanism similar to knitr
    (could be implemented in a language other than R if that would be helpful to VS Code as a whole).

Implementing the feature this way would greatly reduce "RStudio envy".
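
The switching mechanism in point 4 could look roughly like the sketch below: a registry that maps a language name to a function taking the chunk options (including the source code) and returning a string, which is how the knitr documentation quoted above describes knit_engines. The names `ENGINES`, `run_chunk`, `python_engine` and `asis_engine` are hypothetical, and the `echo`/`eval` handling is only an approximation of the knitr chunk options of the same names.

```python
import contextlib
import io


def python_engine(options):
    """Run options["code"] as Python and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(options["code"], {})  # arbitrary code: trusted input only
    return buffer.getvalue()


def asis_engine(options):
    """Pass the chunk through unchanged."""
    return options["code"]


# Hypothetical registry, loosely analogous to knitr's knit_engines object.
ENGINES = {"python": python_engine, "asis": asis_engine}


def run_chunk(options):
    """Honour the echo/eval flags and dispatch on options["engine"]."""
    parts = []
    if options.get("echo", True):
        parts.append(options["code"])
    if options.get("eval", True):
        engine = ENGINES[options["engine"]]
        parts.append(engine(options).rstrip("\n"))
    return "\n".join(parts)


print(run_chunk({"engine": "python", "code": "print(6 * 7)", "echo": False}))
```

Supporting another language would then mean adding one more entry to the registry, which is the appeal of the design.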

@JimCallahanOrlando

JimCallahanOrlando commented May 30, 2020

NOTES ON PANDOC
The underlying Pandoc engine is quite remarkable; it ranks up there with TeX and SQLite.
Pandoc was written by John G. MacFarlane, a Harvard-educated philosopher at the University of California, Berkeley.

"Pandoc is a Haskell library for converting from one markup format to another, and a
command-line tool that uses this library.
...
Pandoc has a modular design: it consists of a set of readers, which parse text in a given
format and produce a native representation of the document (an abstract syntax tree or AST),
and a set of writers, which convert this native representation into a target format. Thus,
adding an input or output format requires only adding a reader or writer. Users can also
run custom pandoc filters to modify the intermediate AST.
Because pandoc’s intermediate representation of a document is less expressive than many
of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size."
https://pandoc.org/MANUAL.html

Although I learned about Pandoc as a tool for the final conversion of R Markdown to a variety of formats, Pandoc is much more powerful. For example, Pandoc can read and write both Microsoft Word documents and Jupyter Notebooks, and it can output to Microsoft PowerPoint.
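
To make the reader/AST/writer design concrete, here is a small sketch that walks a document in the JSON form of Pandoc's AST (the form `pandoc -t json` emits and that filters receive) and lists its code blocks. The document below is hand-written to approximate that shape, and the API version number in it is only a placeholder; the helper name `code_blocks` is an assumption.

```python
import json

# Hand-written approximation of Pandoc's JSON AST; the version is a placeholder.
RAW = """
{"pandoc-api-version": [1, 22],
 "meta": {},
 "blocks": [
   {"t": "Para", "c": [{"t": "Str", "c": "Intro"}]},
   {"t": "CodeBlock", "c": [["", ["python"], []], "print(1 + 1)"]}
 ]}
"""


def code_blocks(node):
    """Yield (classes, code) for every CodeBlock element found in the AST."""
    if isinstance(node, list):
        for item in node:
            yield from code_blocks(item)
    elif isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # A CodeBlock carries [identifier, classes, key-values] plus its text.
            (identifier, classes, key_values), code = node["c"]
            yield classes, code
        else:
            for value in node.values():
                yield from code_blocks(value)


document = json.loads(RAW)
print(list(code_blocks(document)))
```

A real filter would read this JSON from standard input, replace selected elements (for instance with the output of running the code), and write the modified JSON back out.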

@JimCallahanOrlando

JimCallahanOrlando commented May 30, 2020

Reproducible Research (motivation for R Markdown)
To better understand the reproducible research workflow, see this document:

British Ecological Society
Guide-to-reproducible-code.pdf

The British Ecological Society Guide is compatible with the Coursera / Johns Hopkins University "Reproducible Research" course but more concise.

@JimCallahanOrlando

pandoc-plot is a custom Pandoc filter for including graphics. It is organized around plotting libraries rather than around languages (as knitr is).
https://github.com/LaurentRDC/pandoc-plot#readme

@JimCallahanOrlando

nbconvert
"nbconvert uses pandoc to convert between various markup languages, so pandoc is a dependency when converting to latex or reStructuredText."
https://nbconvert.readthedocs.io/en/latest/usage.html

@JimCallahanOrlando

JimCallahanOrlando commented May 31, 2020

John G. MacFarlane (jgm), in addition to being the author of Pandoc, is a co-author of the Markdown specification "CommonMark Spec" and of its JavaScript reference implementation, commonmark.js (there is also a Python port of commonmark.js called commonmark.py).

CommonMark Spec (fenced-code-block)
"4.5 Fenced code blocks
A code fence is a sequence of at least three consecutive backtick characters (`) or tildes (~). (Tildes and backticks cannot be mixed.) A fenced code block begins with a code fence, indented no more than three spaces.

The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing whitespace and called the info string.
... The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence."
https://spec.commonmark.org/0.29/#fenced-code-block

CommonMark Spec (info-string)
"The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing whitespace and called the info string.
... The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string."
https://spec.commonmark.org/0.29/#info-string
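
The fence rules quoted above are simple enough to sketch. The scanner below is a deliberately simplified reading of them: it ignores container blocks such as lists and block quotes and does not strip indentation from the content, so it is not a conforming CommonMark parser. The function name `fenced_blocks` is an assumption.

```python
import re

# Opening fence: up to three spaces of indent, then 3+ backticks or 3+ tildes.
OPENING_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def fenced_blocks(text):
    """Return (language, info_string, code) for each fenced block in text."""
    blocks = []
    lines = text.splitlines()
    index = 0
    while index < len(lines):
        opening = OPENING_FENCE.match(lines[index])
        index += 1
        if not opening:
            continue
        fence, info = opening.group(1), opening.group(2).strip()
        if fence[0] == "`" and "`" in info:
            continue  # a backtick fence may not have backticks in its info string
        # Closing fence: same character, at least as long as the opening fence.
        closing = re.compile(
            r"^ {0,3}" + re.escape(fence[0]) + "{" + str(len(fence)) + r",}\s*$"
        )
        body = []
        while index < len(lines) and not closing.match(lines[index]):
            body.append(lines[index])
            index += 1
        index += 1  # step past the closing fence (or past the end if unclosed)
        language = info.split()[0] if info else ""
        blocks.append((language, info, "\n".join(body)))
    return blocks


sample = "Intro\n~~~~ python extra\nprint('hi')\n~~~\nstill code\n~~~~\nOutro"
print(fenced_blocks(sample))
```

Note that the three-tilde line inside the sample does not close the block, because the opening fence has four tildes.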

Pandoc User's Guide (Pandoc's Markdown)
"Extension: fenced_code_blocks
In addition to standard indented code blocks, pandoc supports fenced code blocks. These begin with a row of three or more tildes (~) and end with a row of tildes that must be at least as long as the starting row. Everything between these lines is treated as code. No indentation is necessary:"
...
"Extension: backtick_code_blocks
Same as fenced_code_blocks, but uses backticks (`) instead of tildes (~).

Extension: fenced_code_attributes
Optionally, you may attach attributes to fenced or backtick code block using this syntax:

[reduced number of tildes below three so GitHub won't render away markdown -- jbc]

~~ {#mycode .haskell .numberLines startFrom="100"}
qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
               qsort (filter (>= x) xs)
~~
Here mycode is an identifier, haskell and numberLines are classes, and startFrom is an attribute with value 100. Some output formats can use this information to do syntax highlighting. Currently, the only output formats that uses this information are HTML, LaTeX, Docx, Ms, and PowerPoint."
https://pandoc.org/MANUAL.html#pandocs-markdown
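
As an illustration of how an attribute block like the one in the Pandoc example could be read by a tool, here is a rough sketch that splits it into an identifier, classes, and key/value pairs. The regular expression covers only the simple forms shown in the example, and `parse_attributes` is a hypothetical helper, not Pandoc's own parser; Pandoc stores the key/value pairs as a list, whereas a dictionary is used here for brevity.

```python
import re

# One token of the attribute block: #id, .class, key="value" or key=value.
ATTRIBUTE_TOKEN = re.compile(
    r'#(?P<id>[^\s}]+)'
    r'|\.(?P<cls>[^\s}]+)'
    r'|(?P<key>[^\s=}]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>[^\s}]+))'
)


def parse_attributes(info):
    """Parse '{#id .class key="value"}' into (identifier, classes, key_values)."""
    inner = info.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    identifier, classes, key_values = "", [], {}
    for token in ATTRIBUTE_TOKEN.finditer(inner):
        if token.group("id"):
            identifier = token.group("id")
        elif token.group("cls"):
            classes.append(token.group("cls"))
        else:
            value = token.group("quoted")
            if value is None:
                value = token.group("bare")
            key_values[token.group("key")] = value
    return identifier, classes, key_values


print(parse_attributes('{#mycode .haskell .numberLines startFrom="100"}'))
```

Executable-chunk options such as eval or echo could be carried in the key/value part in the same way.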

RStudio uses an ".Rmd" extension for "R Markdown" and pre-processes the file with the R knitr package
to output a Pandoc-compatible markdown document with an ".md" extension, which is fed to the Pandoc program for final rendering. This is R-centric; I would like VS Code to implement a language-neutral or pan-language solution (or at least one that doesn't require every user to install R regardless of whether they ever use R).
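
The final rendering step of that pipeline, handing the intermediate ".md" file to Pandoc, does not need R at all. A minimal sketch, assuming the pandoc executable is on the PATH and using hypothetical helper names and file names:

```python
import shutil
import subprocess


def pandoc_command(source, target, input_format="markdown"):
    """Build the argument list for converting source to target with pandoc."""
    # pandoc infers the output format from the target file extension.
    return ["pandoc", "--from", input_format, "--output", target, source]


def render(source, target):
    """Run pandoc if it is installed; return True when a conversion was run."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_command(source, target), check=True)
    return True


print(pandoc_command("report.md", "report.docx"))
```

An editor extension could call something like `render` in the background after the code chunks have been executed and replaced by their output.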

commonmark.js (reference implementation of CommonMark Specification)
https://github.com/commonmark/commonmark.js/

commonmark.py (port of commonmark.js to Python)
https://github.com/readthedocs/commonmark.py

@JimCallahanOrlando

JimCallahanOrlando commented Jun 1, 2020

Just to clarify for readers of this thread, VS Code already supports Markdown:

"VS Code supports Markdown files out of the box. You just start writing Markdown text, save the file with the .md extension and then you can toggle the visualization of the editor between the code and the preview of the Markdown file; obviously, you can also open an existing Markdown file and start working with it."
https://code.visualstudio.com/docs/languages/markdown

The VS Code Markdown functionality is hidden until you actually create or open a file with the ".md" extension, and then magic happens! The VS Code Command Palette has Markdown commands, including "Markdown: Open Preview to the Side". This is awesome: while your markdown tab stays open, a new tab previewing the formatted output opens to the right. It is not that the screen setup is that unusual, but that it is so easy to get to! I went from not being able to find the markdown commands in VS Code to having this beautiful side-by-side view in seconds.

If you haven't tried Markdown in VS Code, this is really easy!

  1. I created a file "test.md" and typed "# Test markdown" on the first line and pressed enter so my cursor was on the second line and saved the file.

  2. Then from the VS Code Command Palette I selected the option "Markdown: Open Preview to the Side" and "Test markdown" came booming back to me as a heading in the right-hand panel of the screen, while the markdown source was on the left (side by side).

Try it.

BTW, in the VS Code Help, "Markdown" is listed as a language (alongside C and Python); that is where the above quote came from.

What we are asking for is implementing executable code blocks within Markdown like there are for Python in Jupyter Notebooks or for R in RStudio.

@JimCallahanOrlando

JimCallahanOrlando commented Jun 1, 2020

"The Markdown Preview Mermaid Support extension demonstrates using scripts to add mermaid diagrams and flowchart support to the markdown preview. You can review the Mermaid extension's source code on GitHub."
https://code.visualstudio.com/api/extension-guides/markdown-extension

The VS Code "Mermaid" extension uses a code block fenced with three backticks (```), so it may serve as a model for programming language code blocks:

```mermaid

graph LR
fa:fa-check-->fa:fa-coffee

```

https://marketplace.visualstudio.com/items?itemName=bierner.markdown-mermaid

Mermaid extension GitHub
https://github.com/mjbvz/vscode-markdown-mermaid

Mermaid seems to link to markdown-it

Interesting comment in
Markdown Preview Extensions Exploration
#22916
davidanthoff commented on Apr 4, 2017
This looks great!
I'd be interested in a slight variation of this: in the julia extension we support julia markdown files. They have the file ending .jmd. Ideally I would like to reuse the markdown preview for jmd files, but add in some custom extensions into that markdown preview that are only used for jmd files, not any other markdown files.
microsoft/vscode#22916

The handling of the Julia .jmd extension sounds nearly identical to the way one would handle the R .rmd extension.

@matthew-brett

Having a vscode handler for .Rmd files would make a huge difference to me too, as a user and a teacher. I would much prefer to use my code editor for .Rmd files instead of RStudio; and I edit all my Jupyter notebooks in .Rmd format also - using Jupytext.

The big advantages of editing notebooks in .Rmd format, over editing .ipynb files, are:

  • Transparent version control of text / code
  • Transparent editing of important metadata in the YAML header and fenced code block annotations

I'm very happy to expand if that's of interest; for now I only want to say that adding this feature would be a big step in usability and teachability for notebooks.

@rchiodo
Contributor

rchiodo commented Feb 24, 2022

Marking this as dupe of #4946 which has more upvotes

@rchiodo rchiodo closed this as completed Feb 24, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2023