A myst-based 'ipynb' document structure #12

Closed
choldgraf opened this issue Feb 18, 2020 · 9 comments · Fixed by #116

Comments

@choldgraf
Member

choldgraf commented Feb 18, 2020

Per a recent conversation with @mmcky and @chrisjsewell, we came up with a proposal for a MyST-based IPYNB structure.

Here is an example notebook with the latest syntactic ideas

---
kernel_info:
    name: python3
language_info:
    name: Python
title: "My notebook title"
comment: "If any of the above aren't specified then use jupyter defaults"
---

# Markdown syntax

## Cell breaks

We can manually break markdown cells quickly with this syntax

+++

### A subsection in another markdown cell

another proposal is to use

+++

### And here would be the other markdown cell...

## Markdown metadata

We can also explicitly separate a markdown cell and configure it like so:

```{markdown} tag1, tag2
---
key: val
---
## Here is some *configured* markdown!
```

And now this would be a third markdown cell

## Executable code

Code is always executed with 'execute' blocks, like so:

```{execute}
print('this would be run by the front-matter-specified, or default, kernel')
```

You can also add metadata to these

```{execute} kernelname
:key: val
:key2: val2
# Or perhaps we want a `metadata`: field for cell metadata, and other keys for options like jupyter-sphinx does
print('some python with cell metadata')
```
and that's it!

influences

  • Use MyST syntax to define code cells, markdown cells, and their breaks
  • Don't force extra syntax when it's not needed. E.g., if there's pure markdown between two code cells, treat that as a markdown cell.
  • Let markdown's design influence our decisions: the syntax should suggest what it is doing.

constraints

Round-trip conversion

Content within cells

All content within cells, as well as the breaks between cells, should be 100% round-trippable in a lossless fashion. The markup language used in markdown cells and anything inside code cells should not be modified.

Cell-level metadata

All metadata specified in markdown will be converted into the ipynb file. Conversely, a subset of cell-level ipynb metadata will be converted into markdown. TODO: figure out what subset we want...only tags? Other publishing-specific stuff?
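
A minimal sketch of the ipynb-to-markdown direction, assuming the kept subset ends up being an allow-list of keys. The function name and the choice of `tags` as the only kept key are placeholders, since the subset is still an open question:

```python
# Hypothetical allow-list: which keys survive the ipynb -> markdown
# direction is undecided, so "tags" is only a placeholder.
KEPT_CELL_KEYS = frozenset({"tags"})


def metadata_for_markdown(cell_metadata, kept_keys=KEPT_CELL_KEYS):
    """Return only the cell-level metadata keys that would be written to markdown."""
    return {key: value for key, value in cell_metadata.items() if key in kept_keys}


ipynb_metadata = {"tags": ["tag1", "tag2"], "collapsed": False}
print(metadata_for_markdown(ipynb_metadata))
```

The markdown-to-ipynb direction needs no filter under this proposal, because everything written in markdown is carried over.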

Notebook-level metadata

The same rule applies to notebook-level metadata (and we need to figure out the subset of metadata to keep)

Proposed syntax

Notebook-level metadata

A YAML header block at the top of the document will denote notebook-level metadata

example:

---
key: value
---

# My first header
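
As a rough illustration of how a converter might find that header, here is a sketch that separates the raw header text from the rest of the document. The function name is made up for this example, and the header text would still need to go through a YAML parser afterwards:

```python
def split_front_matter(text):
    """Split a document into (header_text, body).

    header_text is the raw text between the leading '---' lines, or None
    when the document does not start with a header block.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            header = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return header, body
    # No closing '---' found: treat the whole text as body.
    return None, text


document = "---\nkey: value\n---\n\n# My first header\n"
header, body = split_front_matter(document)
print(repr(header))
print(repr(body))
```

A document without a header comes back unchanged as the body, which matches the "use jupyter defaults" fallback from the example notebook.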

Code cells

Code cells are defined with the "execute" directive, followed by the language that should be used to execute the code. If no language is specified, then the notebook-level metadata should define the default kernel to use. Configuration at the start of the code cell, written either as a YAML block or as `:key: val` lines, will be converted into cell-level metadata.

example:

```{execute} python
---
key: val
key2: val2
---
print('hi')
```

OR

```{execute} python
:key: val
:key2: val2
print('hi')
```
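
To make the two option styles concrete, here is a sketch that takes the text inside an execute block and separates the configuration from the code. It is not an existing parser: the function name is invented, only flat `key: value` pairs are handled, and values are kept as strings where a real implementation would presumably use a YAML library:

```python
import re

# Matches colon-style option lines such as ":key: val".
OPTION_LINE = re.compile(r"^:(?P<key>[^:\s]+):\s*(?P<value>.*)$")


def split_cell_options(content):
    """Split the body of an execute block into (metadata, source)."""
    lines = content.splitlines()
    metadata = {}
    if lines and lines[0].strip() == "---":
        # YAML-style block: read pairs until the closing '---'.
        end = next(
            (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
        )
        if end is None:
            return {}, content
        for line in lines[1:end]:
            key, sep, value = line.partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
        rest = lines[end + 1:]
    else:
        # Colon-style options: consume the leading ':key: value' lines.
        count = 0
        for line in lines:
            match = OPTION_LINE.match(line)
            if match is None:
                break
            metadata[match.group("key")] = match.group("value").strip()
            count += 1
        rest = lines[count:]
    return metadata, "\n".join(rest)


print(split_cell_options("---\nkey: val\nkey2: val2\n---\nprint('hi')"))
print(split_cell_options(":key: val\n:key2: val2\nprint('hi')"))
```

Both forms give the same metadata and the same source, which is the behaviour the proposal asks for.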

Markdown cells

Markdown that's in-between code cells will be treated as a single markdown cell that separates those code cells.

If a user wants to attach cell-level metadata to some markdown, then they must use the "markdown" directive. This accepts a list of tags as a short-hand input, and also accepts YAML configuration like code cells.

example:

```{markdown} tag1, tag2
:key1: val1

# This is my markdown
```
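
One way the tag short-hand could map onto a notebook cell is sketched below. Storing the tags under a `tags` key in the cell metadata follows the usual Jupyter convention, but that mapping is an assumption here, and both function names are hypothetical:

```python
FENCE = "`" * 3  # the triple-backtick fence marker


def parse_directive_line(line):
    """Parse an opening fence line into (directive_name, arguments).

    Arguments are the comma-separated values after the directive name.
    """
    stripped = line.strip().lstrip("`")
    if not stripped.startswith("{") or "}" not in stripped:
        raise ValueError(f"not a directive line: {line!r}")
    name, _, argument_text = stripped[1:].partition("}")
    arguments = [part.strip() for part in argument_text.split(",") if part.strip()]
    return name, arguments


def markdown_cell(opening_line, options, source):
    """Build an ipynb-style cell dict from a parsed markdown directive."""
    name, tags = parse_directive_line(opening_line)
    metadata = dict(options)
    if tags:
        # Assumption: short-hand tags are stored under metadata["tags"].
        metadata["tags"] = tags
    return {"cell_type": name, "metadata": metadata, "source": source}


cell = markdown_cell(
    FENCE + "{markdown} tag1, tag2",
    {"key1": "val1"},
    "# This is my markdown",
)
print(cell)
```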

Simple markdown cell splits

To define a split between two markdown cells, but without attaching extra metadata to those cells, there is a short-hand one-liner:

  • +++

This simply defines where one block of markdown content should become two markdown cells. If the author wishes to add extra metadata to one of the markdown cells, they should instead use the

```{markdown}
```

pattern
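
The `+++` split and the round-trip constraint can be sketched together. This is only an illustration with made-up function names. It assumes a break line is exactly `+++`, and it does not yet skip `+++` lines that sit inside a fenced code block:

```python
CELL_BREAK = "+++"


def split_markdown_cells(text):
    """Split markdown into cell sources on lines that contain only '+++'."""
    cells, current = [], []
    for line in text.split("\n"):
        if line.strip() == CELL_BREAK:
            cells.append("\n".join(current))
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current))
    return cells


def join_markdown_cells(cells):
    """Inverse of split_markdown_cells for text whose breaks are exactly '+++'."""
    return f"\n{CELL_BREAK}\n".join(cells)


text = "## Cell breaks\n\nSome text\n\n+++\n\n### A subsection\n"
cells = split_markdown_cells(text)
print(len(cells))
print(join_markdown_cells(cells) == text)
```

Keeping the blank lines around the break inside the neighbouring cells is what lets the joined text come back identical.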

@choldgraf
Member Author

choldgraf commented Feb 18, 2020

On that last point (shorthand for markdown cell breaks), why not use "double comment" characters?

%%

?

That would map onto the pattern used in Matlab and PyCharm, I believe. In those cases it breaks up code cells, but this seems similar (for example, see "the percent format" here: https://jupytext.readthedocs.io/en/latest/introduction.html#jupytext-formats).

@chrisjsewell
Member

That would be semi-consistent with the jupytext py:percent format. However, it would clash with the current syntax for comments (starting with %). Perhaps +++ would be more intuitive, in that you are ‘adding’ a new cell?

@choldgraf
Member Author

choldgraf commented Feb 19, 2020

Is your concern about comment clashing that it would be an annoying regex, or that users would find it cognitively confusing?

If it is the latter, I think it's worth discussing for sure, but my intuition is that it would be OK given that this is common in other languages. For example, Matlab uses a single % for comments, and %% is a special syntax for a "cell break". (god I can't believe I just used matlab as an example for inspiration)

But I think your concern is valid, so we should see what others think. I think that +++ seems reasonable too, though I also like the fact that %% wouldn't require that people remember a new character to use
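
For what it's worth, the parsing side of the clash might be small. Here is a sketch of a hypothetical line classifier, assuming a break is a line containing only `%%` and a comment is any other line starting with `%`:

```python
def classify(line):
    """Classify a line as 'break', 'comment', or 'content'.

    The break check has to run first, because '%%' also starts with '%'.
    """
    if line.strip() == "%%":
        return "break"
    if line.startswith("%"):
        return "comment"
    return "content"


for sample in ["%%", "% a comment", "%% a note", "plain text"]:
    print(repr(sample), classify(sample))
```

The ambiguous case is a line like `%% a note`, which this sketch treats as a comment. That may be the part users would find confusing.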

@chrisjsewell
Member

chrisjsewell commented Feb 19, 2020

> Is your concern about comment clashing that it would be an annoying regex, or that users would find it cognitively confusing?

A little from column A, a little from column B

@choldgraf choldgraf transferred this issue from executablebooks/meta Feb 19, 2020
@choldgraf
Member Author

Hey all - I just added a little example to the top-level-comment showing off the currently-proposed syntax for a notebook. I don't think it's too bad, what do you all think?

A few thoughts that came up:

  • If we use the execute directive, should we have a dedicated keyword for "metadata" (same for markdown cells), and then use other keys to control some behavior at run-time (e.g., how jupyter-sphinx controls "hide inputs" etc). Or, should we assume that all the YAML metadata is directly-ported to cell-level metadata?
  • Let's try to close off that conversation about short-hand for markdown cell breaks. The two main proposals right now are:

@jstac
Member

jstac commented Feb 21, 2020

+++ seems natural to me.

@choldgraf
Member Author

Hey @mmcky do you know what the state of implementing this with jupytext is? I seem to recall you and aakash were going to look into this?

@mmcky
Member

mmcky commented Mar 9, 2020

hey @choldgraf, @aakash started looking into this, but we asked him to focus on helping get the full myst/ipynb -> sphinx -> HTML Writer pipeline linked up using jupyter-cache with outputs supported, so that the pieces can then be worked on independently up and down the chain. We have some great direction from @mwouts and it doesn't seem too far off once we get back to looking at it.

@choldgraf
Member Author

sounds good...if we can unify what myst-nb and what jupyter-sphinx are doing, then we can get more-or-less markdown support using directives without needing jupytext anyway...
