Composing Jupytext with Jupyter Nbconvert should preserve Markdown files #321

mwouts · 2019-09-01T06:36:48Z

This problem was first revealed by jupyter-book/jupyter-book#283

There, we start from a .md file, convert it to a Jupyter Notebook, and then convert it back to a .md file using jupyter nbconvert.

Following this operation, the code cells in the Markdown document that don't have an explicit language are converted to raw cells by Jupytext, and therefore the formatting of those code extracts is lost when jupyter nbconvert converts the document back to Markdown.


Jupytext Markdown format in version 1.2 - #321 Raw cells are encoded using HTML comments (`<!-- #raw -->` and `<!-- #endraw -->`) in Markdown files. Code blocks from Markdown files, when they don't have an explicit language, are displayed as Markdown cells in Jupyter

#321

Closes #321

mwouts · 2019-09-21T18:09:41Z

@choldgraf, an updated Jupytext Markdown format is available on branch 1.3.0. Now, when you open a Markdown file with Jupytext and export it to Markdown again using nbconvert, you get something close(r) to the original.

To see exactly what I mean by 'close', you can have a look at this test:

jupytext/tests/test_jupytext_nbconvert_round_trip.py

Lines 9 to 32 in f2167f1

    
    def test_markdown_jupytext_nbconvert_is_identity(md_file):
        """Test that a Markdown file, converted to a notebook, then
        exported back to Markdown with nbconvert, yields the original file"""
        with open(md_file) as fp:
            md_org = fp.read()

        nb = jupytext.reads(md_org, 'md')
        import nbconvert
        md_nbconvert, _ = nbconvert.export(nbconvert.MarkdownExporter, nb)

        # Our expectations
        md_expected = md_org.splitlines()
        # #region and #endregion comments are removed
        md_expected = [line for line in md_expected if line not in ['<!-- #region -->', '<!-- #endregion -->']]
        # language is not inserted by nbconvert
        md_expected = ['```' if line.startswith('```') else line for line in md_expected]
        # nbconvert inserts no empty line after the YAML header (which is in a Raw cell)
        md_expected = '\n'.join(md_expected).replace('---\n\n', '---\n') + '\n'
        # an extra blank line is inserted before code cells
        md_nbconvert = md_nbconvert.replace('\n\n```', '\n```')

        jupytext.compare.compare(md_nbconvert, md_expected)

You can also see the Markdown file being tested here: f2167f1 .
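To make the meaning of 'close' concrete, here is a minimal sketch of the adjustments the test applies to the original Markdown before comparing, run on a tiny made-up sample. The helper name `expected_after_round_trip` and the sample text are assumptions for illustration only; they are not part of Jupytext or of the test file, and the sketch uses only the standard library.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def expected_after_round_trip(md_org: str) -> str:
    """Apply to the original text the same adjustments as the test above."""
    lines = md_org.splitlines()
    # The region markers are not expected in the nbconvert output
    lines = [line for line in lines
             if line not in ("<!-- #region -->", "<!-- #endregion -->")]
    # The fence language is not expected either: every fence line becomes a bare fence
    lines = [FENCE if line.startswith(FENCE) else line for line in lines]
    # No empty line is expected right after the YAML header
    return "\n".join(lines).replace("---\n\n", "---\n") + "\n"


# Hypothetical sample document
sample = "\n".join([
    "---",
    "title: Demo",
    "---",
    "",
    "Some text",
    "",
    "<!-- #region -->",
    FENCE + "python",
    "1 + 1",
    FENCE,
    "<!-- #endregion -->",
])

print(expected_after_round_trip(sample))
```

On this sample the header loses its trailing blank line, the region markers disappear, and the opening fence loses its `python` language.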

Please let me know if you'd like to play with that version. If that makes it simpler for you, I could publish a dev version on PyPI.

choldgraf · 2019-09-22T21:07:44Z

Nice! It looks like in this case, code fences make the markdown file break into different cells, but those cells are still markdown cells so the formatting shouldn't change much, yeah?

This seems reasonable to me - so in this case, how would one mark in the md file that a code fence is executable?

mwouts · 2019-09-23T06:54:09Z

Thanks @choldgraf.

That is correct: with the implementation on 1.3.0-dev, code fences without a language, or with a language that is not understood by Jupytext, are mapped to isolated markdown cells (where the fences are preserved).
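The rule just described can be illustrated with a small sketch. This is not Jupytext's actual implementation: the function name `cell_type_for_fence` and the set `KNOWN_LANGUAGES` are hypothetical, and the real list of languages that Jupytext understands is not reproduced here.

```python
FENCE = "`" * 3  # three backticks

# Illustrative only: an assumed set of languages, not Jupytext's real list
KNOWN_LANGUAGES = {"python", "r", "julia", "bash"}


def cell_type_for_fence(opening_line: str, known_languages=KNOWN_LANGUAGES) -> str:
    """Return 'code' for a fence with a known language, 'markdown' otherwise."""
    info = opening_line.strip()[len(FENCE):].strip()
    language = info.split()[0].lower() if info else ""
    return "code" if language in known_languages else "markdown"


print(cell_type_for_fence(FENCE + "python"))  # fence with a known language
print(cell_type_for_fence(FENCE))             # fence without a language
print(cell_type_for_fence(FENCE + "foo"))     # fence with an unknown language
```

Only the first fence would become an executable code cell under this rule; the other two would stay in Markdown cells with their fences preserved.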

Code cells that do have a language are still mapped to executable code cells in Jupyter. Now, if you open every markdown file with 1.3.0-dev as a Notebook in Jupyter Book, you still have to decide when a particular Markdown file should be executed, or not. Maybe if it has a 'jupyter' entry in the front-matter YAML? (i.e. jupytext.notebook_metadata_filter != '-all')

choldgraf · 2019-09-23T16:43:58Z

Ah gotcha - thanks for the clarification.

I'm still a little iffy on whether we can assume that just because a code fence has a language attached to it, that it can be executed, however I don't think this'll be an issue in Jupyter Book. Currently I'm doing what you suggest - read in a markdown file, check if there was jupytext configuration in there. If not, then dump the notebook back to a markdown string and create a new notebook w/ one cell in it.
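A rough sketch of that check, using only the standard library, could look like the following. It is a simplification: it inspects the text for a top-level `jupyter:` key in the YAML front matter instead of reading the notebook metadata through Jupytext, and both helper names (`has_jupyter_front_matter`, `single_markdown_cell_notebook`) are hypothetical. It is not Jupyter Book's actual code.

```python
def has_jupyter_front_matter(md: str) -> bool:
    """True if the YAML front matter has a top-level 'jupyter:' entry."""
    lines = [line.rstrip() for line in md.splitlines()]
    if not lines or lines[0] != "---":
        return False  # no front matter at all
    try:
        end = lines.index("---", 1)  # closing delimiter of the header
    except ValueError:
        return False  # unterminated header: treat as plain Markdown
    return any(line.startswith("jupyter:") for line in lines[1:end])


def single_markdown_cell_notebook(md: str) -> dict:
    """Wrap the whole text in a notebook (nbformat 4 layout) with one Markdown cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {},
        "cells": [{"cell_type": "markdown", "metadata": {}, "source": md}],
    }


page = "---\ntitle: Demo\n---\n\nJust some text\n"
if not has_jupyter_front_matter(page):
    nb = single_markdown_cell_notebook(page)
    print(len(nb["cells"]))
```

A page without Jupytext configuration then ends up as a single Markdown cell, so none of its code fences can be executed.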


mwouts · 2019-10-13T01:25:59Z

@choldgraf, a release candidate that includes the fix for the raw cells in the Markdown format is available on PyPI:

pip install jupytext==1.3.0rc0

I'm not sure you will want to open every Markdown file as a notebook in Jupyter Book again, but in theory this becomes possible with that version. Let me know if you can give it a try!

choldgraf · 2019-10-13T18:12:50Z

Nice! I just opened it and it looks quite good. I still can't open markdown files in JB w/o jupytext metadata, as (at least in the jupyter book docs) it runs all of the jupytext pages, but there are lots of code cells that have syntax highlighting (e.g., python cells) but that aren't runnable code :-/

maybe there is some way to mark a code cell as not runnable?

mwouts · 2019-10-13T20:02:22Z

Thanks @choldgraf for having a look. Good news!

> maybe there is some way to mark a code cell as not runnable?

Well, I'm not sure I see yet why you would want this - do you want to execute only a fraction of the cells, is that it? Anyway, here is what I am thinking of:

1. encapsulate the inactive cell in a Markdown region, or
2. add an active="md" cell metadata, or an active-md cell tag to the cell, or a {"runtools": {"frozen": true}} cell metadata, to mean that it should not be active in the ipynb representation. Unfortunately, a quick test seems to suggest that this may not be working properly on Markdown files... I will confirm.

Is that something like 2. that you are looking for? Maybe in the context of a Markdown file a tag like inactive-ipynb would sound more natural?

mwouts · 2019-10-13T20:32:43Z

I confirm that the tests on active/inactive cells do not cover the .md format, see https://github.com/mwouts/jupytext/blob/bf6049b9107fef171647345f84364cdb4c5aaf5e/tests/test_active_cells.py.

Also, how do you expect that Jupytext could make the cell not runnable? Should it turn it into a Markdown cell - probably the most reasonable way to make sure the cell is not runnable in Jupyter, without breaking the nbconvert compatibility? Or should we just add a tag to the cell, and let the execution preprocessor know that it should not execute those cells?
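For the second option (tag the cell and let the execution step skip it), a filter along these lines could be enough. This is only a sketch under stated assumptions: `should_execute` is a hypothetical helper, not an nbconvert or Jupytext API; it works on plain notebook dictionaries, and it assumes that `active` holds a comma-separated list of formats and that `active-<format>` tags follow the same idea.

```python
def should_execute(cell: dict) -> bool:
    """Decide whether a notebook cell (as a plain dict) should be executed."""
    if cell.get("cell_type") != "code":
        return False
    metadata = cell.get("metadata", {})

    # active="md" (or any list of formats without ipynb) means inactive in the notebook
    active = metadata.get("active")
    if active is not None:
        formats = [fmt.strip() for fmt in active.split(",")]
        if "ipynb" not in formats:
            return False

    # Same idea with tags such as active-md
    active_tags = [tag for tag in metadata.get("tags", []) if tag.startswith("active-")]
    if active_tags and "active-ipynb" not in active_tags:
        return False

    # Frozen cells are never executed
    if metadata.get("runtools", {}).get("frozen"):
        return False

    return True


cells = [
    {"cell_type": "code", "metadata": {}, "source": "1 + 1"},
    {"cell_type": "code", "metadata": {"active": "md"}, "source": "not_runnable()"},
    {"cell_type": "code", "metadata": {"tags": ["active-md"]}, "source": "not_runnable()"},
    {"cell_type": "code", "metadata": {"runtools": {"frozen": True}}, "source": "not_runnable()"},
]
print([should_execute(cell) for cell in cells])
```

With this approach the cells stay code cells in the notebook, so syntax highlighting and nbconvert output are unchanged, and only the execution step needs to know about the metadata.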

mwouts · 2019-10-16T06:20:06Z

The missing tests were added in 55ad5a9.

With that commit, the active metadata is no longer lost in the Markdown format. But it still has no effect on the Markdown file: this is because, unlike in scripts or R Markdown, cells are never expected to be active in Markdown! We'll see how to answer that specific question at #347.

mwouts mentioned this issue Sep 1, 2019

Code cells in plain Markdown files are rendered as raw text jupyter-book/jupyter-book#283

Closed

mwouts added this to the 1.3.0 milestone Sep 7, 2019

mwouts mentioned this issue Sep 21, 2019

Capture YAML header key/values that aren't in jupyter:? #336

Closed

mwouts added a commit that referenced this issue Sep 21, 2019

Update test_read_simple_markdown.py

5b90f75

#321

mwouts added a commit that referenced this issue Sep 21, 2019

Test jupytext/nbconvert composition

f2167f1

#321

mwouts added a commit that referenced this issue Sep 21, 2019

Update the mirror files

9b79106

Closes #321

mwouts closed this as completed Sep 22, 2019

mwouts added a commit that referenced this issue Oct 12, 2019

Update test_read_simple_markdown.py

00938f4

#321

mwouts added a commit that referenced this issue Oct 12, 2019

Test jupytext/nbconvert composition

a017df1

#321

mwouts added a commit that referenced this issue Oct 12, 2019

Update the mirror files

4a76ffe

Closes #321

mwouts reopened this Oct 13, 2019

mwouts mentioned this issue Oct 14, 2019

How to specify that a code cell should be inactive in Jupyter in the Markdown format? #347

Closed

mwouts closed this as completed Oct 16, 2019

mwouts mentioned this issue Oct 27, 2019

Maintain cell metadata for RISE #66

Closed
