Jupyter notebook to emacs' org-mode format #62

Open
mwouts opened this issue Sep 9, 2018 · 13 comments
Comments

@mwouts
Owner

mwouts commented Sep 9, 2018

Emacs' org-mode format (extension .org) has a well-documented syntax for code blocks:

#+NAME: <name>
#+BEGIN_SRC <language> <switches> <header arguments>
  <body>
#+END_SRC

We could implement a notebook to/from org converter.

Further notes:

  • The NAME option seems optional.
  • Examples are available here
  • Org mode has a header with metadata. Can it hold the Jupyter notebook metadata?
  • Also, it's not clear how cell metadata could be represented in org mode
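
For illustration, here is a minimal Python sketch of how the block syntax above could be parsed. The function name `parse_src_block` and the keys of the returned dictionary are hypothetical. The sketch assumes that keywords are matched case-insensitively and keeps the body verbatim, with no indentation handling.

```python
import re

# "#+BEGIN_SRC <language> <switches> <header arguments>"; the language is optional here.
BEGIN_SRC = re.compile(
    r"^\s*#\+BEGIN_SRC(?:\s+(?P<language>\S+))?(?P<rest>.*)$", re.IGNORECASE
)
# "#+NAME: <name>" (optional line before the block).
NAME = re.compile(r"^\s*#\+NAME:\s*(?P<name>.+?)\s*$", re.IGNORECASE)


def parse_src_block(lines):
    """Parse the first source block found in a list of lines.

    Hypothetical helper: returns a dict with the keys
    "name", "language", "arguments" and "body".
    """
    name, language, arguments = None, None, ""
    body = []
    inside = False
    for line in lines:
        if not inside:
            match = NAME.match(line)
            if match:
                name = match.group("name")
                continue
            match = BEGIN_SRC.match(line)
            if match:
                language = match.group("language")
                # Switches and header arguments are kept as one raw string.
                arguments = match.group("rest").strip()
                inside = True
            continue
        if line.strip().upper() == "#+END_SRC":
            break
        body.append(line)
    return {
        "name": name,
        "language": language,
        "arguments": arguments,
        "body": "\n".join(body),
    }


example = """#+NAME: hello
#+BEGIN_SRC python :results output
print("hello")
#+END_SRC""".splitlines()

block = parse_src_block(example)
```

The header arguments are the natural candidate for holding cell metadata, which is why the sketch returns them untouched rather than guessing at a mapping.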
@mwouts
Owner Author

mwouts commented Sep 9, 2018

The second and third links provided by Doug at #61 provide examples of code blocks with cell metadata. The first and fourth links point to ORG to Notebook converters, written in Emacs Lisp.

Interesting facts:

  • ORG is not markdown! If we want to support styling, round trip conversion is going to be tough!
  • ORG can include results. Currently Jupytext supports no format with outputs. Probably we don't want to implement the matching with Jupyter outputs.
  • More generally, ORG has support for many features that are not in Jupyter notebooks (tables, etc...). Is it acceptable for the users to have them, say, as raw cells in Jupyter?

@dsblank
Contributor

dsblank commented Sep 9, 2018

Maybe org-mode and jupytext aren't going to be a good match after all. But my friends are still on the quest for combining emacs + notebooks...

@mwouts
Owner Author

mwouts commented Sep 9, 2018

No problem! @dsblank , we'll try to make your friends happy 😄

Could you ask them to write a sample org file with text, code, a header, and a few org specific sections, and tell us how they imagine the corresponding notebook? Or even, could they contribute a test similar to test_read_simple_julia.py, but for org mode?

@srnnkls

srnnkls commented May 11, 2020

What’s the current state of this? Do you already have something to work on? I’d like to help.

@mwouts
Owner Author

mwouts commented May 11, 2020

Hello @srnnkls, thanks for reaching out!

Well, I am afraid that we've not made much progress here... As you saw above, one can use ox-ipynb, by @jkitchin, to convert org-mode documents to ipynb, but I am not aware of a tool doing the opposite conversion.

It would help a lot if you could write two functions that convert a notebook object to its text representation, and vice versa. Maybe at first we should target a limited conversion, i.e.

  • keep the content of code and markdown cells verbatim in the org-mode representation,
  • and ignore the cell and notebook metadata.

Ideally that first version should be compatible with (or should even use?) ox-ipynb.

These two functions should be called in jupytext.reads and jupytext.writes if the format name matches the name you choose for this format, e.g. "org". You should also add a description of the new format in formats.py.
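
As a starting point, the limited conversion described above could look like the sketch below. Everything in it is an assumption for illustration: the function names `cells_to_org` and `org_to_cells` are hypothetical, cells are plain dictionaries rather than notebook objects, and markdown cells are stored in `#+BEGIN_SRC markdown` blocks, which is only one possible representation. Cell and notebook metadata are ignored, and a cell whose source contains a line equal to `#+END_SRC` is not handled.

```python
def cells_to_org(cells, language="python"):
    """Write each cell as a source block, keeping its source verbatim."""
    blocks = []
    for cell in cells:
        # Assumption: markdown cells go into "markdown" blocks.
        block_language = language if cell["cell_type"] == "code" else "markdown"
        blocks.append(f"#+BEGIN_SRC {block_language}\n{cell['source']}\n#+END_SRC")
    return "\n\n".join(blocks) + "\n"


def org_to_cells(text, language="python"):
    """Read the source blocks back into a list of cells.

    Text outside of source blocks is ignored in this sketch.
    """
    cells = []
    current = None  # (cell_type, source lines) while inside a block
    for line in text.splitlines():
        stripped = line.strip()
        if current is None:
            if stripped.upper().startswith("#+BEGIN_SRC"):
                parts = stripped.split()
                block_language = parts[1] if len(parts) > 1 else language
                cell_type = "markdown" if block_language == "markdown" else "code"
                current = (cell_type, [])
        elif stripped.upper() == "#+END_SRC":
            cells.append({"cell_type": current[0], "source": "\n".join(current[1])})
            current = None
        else:
            current[1].append(line)
    return cells


cells = [
    {"cell_type": "markdown", "source": "# Title\n\nSome text"},
    {"cell_type": "code", "source": "x = 1\nprint(x)"},
]
org_text = cells_to_org(cells)
recovered = org_to_cells(org_text)
```

Keeping the two functions free of any Jupytext import makes them easy to test on their own before wiring them into `jupytext.reads` and `jupytext.writes`.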

NB: If you like, you can also provide these two functions in a separate Python package, and add that package as an optional dependency in Jupytext - like we did for the md:pandoc or the md:myst formats.

@jkitchin

You can use pandoc to convert ipynb to org right now. It is technically possible to do a round trip conversion, but you will lose things like cell metadata, and probably some formatting.

It would not be hard to use elisp to convert an ipynb to an org file containing markdown blocks and code blocks. Also not hard to do that in Python. It might be tricky either way to deal with the results.
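
A minimal sketch of the pandoc route, driven from Python, might look like this. The helper names are hypothetical. It assumes that a pandoc version with ipynb input support is available on the PATH, and it only covers the ipynb to org direction.

```python
import shutil
import subprocess


def pandoc_ipynb_to_org_command(notebook_path, org_path):
    """Build the pandoc command line for an ipynb to org conversion."""
    return [
        "pandoc",
        "--from", "ipynb",
        "--to", "org",
        "--output", org_path,
        notebook_path,
    ]


def convert_ipynb_to_org(notebook_path, org_path):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    subprocess.run(pandoc_ipynb_to_org_command(notebook_path, org_path), check=True)


command = pandoc_ipynb_to_org_command("notebook.ipynb", "notebook.org")
```

Calling `convert_ipynb_to_org("notebook.ipynb", "notebook.org")` would then write the org file next to the notebook. The file names here are placeholders.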

@mwouts
Owner Author

mwouts commented May 13, 2020

Oh, that's interesting! What we could do, then, is to use ox-ipynb on one side, and pandoc on the other side, plug this into Jupytext's collection of test notebooks, and see how well this works 😃 (or not). If time permits I will give it a try!

@srnnkls

srnnkls commented May 13, 2020

Hey, thanks for taking the time to discuss options. Another option for ipynb to org conversion is nbcorg. I'm using nbcorg and ox-ipynb at the moment. That works well, but I'd like to have an Emacs-independent solution that works from Jupyter and the command line. That is what brings me here. As an nbconvert plugin, nbcorg uses Jinja templates.

I will have a look at jupytext.reads and jupytext.writes. @jkitchin, you are right. I'm not sure how to deal with results either. nbcorg includes results as EXAMPLE blocks, which I'm not a fan of.

@srnnkls

srnnkls commented May 13, 2020

Okay, I just realised that for markdown you use pandoc under the hood as well. I will write a round trip conversion test first, then. I don't know when I'll find the time to work on this, but probably sometime within the next week.

@mwouts
Owner Author

mwouts commented May 13, 2020

@srnnkls , if that can help, I have prepared a branch with a tentative implementation of the org format based on pandoc (back and forth), see the last three commits here: https://github.com/mwouts/jupytext/commits/org_pandoc:

  • The first commit 086178b adds the org:pandoc format to Jupytext (and simply calls pandoc)
  • The second commit 93f6478 activates the round trip tests on that format
  • And the third commit b5432c7 adds the files generated by the round trip tests

Note that the round trip test does not work. Is it correct that pandoc's conversion only works in one direction (ipynb to org)? Anyway, feel free to experiment with this, and replace either converter with your favorite one.
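
To experiment with different converters, a small helper that reports what a round trip changes can be handy. The sketch below is not the test from the branch: `round_trip_differences` is a hypothetical name, cells are plain dictionaries, and the two converters used in the demonstration are JSON-based stand-ins rather than real org converters.

```python
import json


def round_trip_differences(cells, to_text, from_text):
    """Return a list of differences after cells -> text -> cells.

    Only the cell type and the source are compared, so outputs
    and metadata are ignored.
    """
    recovered = from_text(to_text(cells))
    differences = []
    if len(recovered) != len(cells):
        differences.append(f"cell count changed: {len(cells)} -> {len(recovered)}")
    for index, (before, after) in enumerate(zip(cells, recovered)):
        for key in ("cell_type", "source"):
            if before[key] != after[key]:
                differences.append(f"cell {index}: {key} changed")
    return differences


cells = [
    {"cell_type": "markdown", "source": "Some *text*"},
    {"cell_type": "code", "source": "1 + 1"},
]

# Stand-in converter pair (assumption): JSON preserves everything.
lossless = round_trip_differences(cells, json.dumps, json.loads)


def lossy_from_text(text):
    """Stand-in reader that forgets the cell type, like a one-way converter might."""
    return [{"cell_type": "code", "source": cell["source"]} for cell in json.loads(text)]


lossy = round_trip_differences(cells, json.dumps, lossy_from_text)
```

Swapping the stand-ins for a real writer and reader pair gives a quick list of what each candidate loses.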

Two additional comments:

  • Jupytext removes the outputs before calling pandoc (because they are preserved in the .ipynb file), so no headache with outputs...
  • pandoc is used for the md:pandoc format, but that is not Jupytext's default markdown format (see the example files at https://github.com/mwouts/jupytext/tree/master/demo )
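
The idea of dropping outputs before the text conversion and taking them back from the .ipynb file can be sketched as follows. This is a simplified illustration, not Jupytext's actual code: the function names are hypothetical, notebooks are plain dictionaries with the source stored as a single string, and outputs are restored only when code cells match in both position and source.

```python
import copy


def strip_outputs(notebook):
    """Return a copy of the notebook with the code cell outputs removed."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


def restore_outputs(text_notebook, ipynb_notebook):
    """Copy the outputs of the ipynb notebook into the text-derived one.

    Assumption: cells are paired by position, and outputs are copied
    only if the source of the code cell is unchanged.
    """
    restored = copy.deepcopy(text_notebook)
    for new_cell, old_cell in zip(restored["cells"], ipynb_notebook["cells"]):
        if (
            new_cell["cell_type"] == "code"
            and old_cell["cell_type"] == "code"
            and new_cell["source"] == old_cell["source"]
        ):
            new_cell["outputs"] = copy.deepcopy(old_cell.get("outputs", []))
            new_cell["execution_count"] = old_cell.get("execution_count")
    return restored


notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "Text"},
        {
            "cell_type": "code",
            "source": "1 + 1",
            "execution_count": 1,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/plain": "2"},
                    "metadata": {},
                    "execution_count": 1,
                }
            ],
        },
    ]
}

stripped = strip_outputs(notebook)
restored = restore_outputs(stripped, notebook)
```

With this split, the text format never has to represent results at all, which sidesteps the question of how org results blocks would map to Jupyter outputs.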

@srnnkls

srnnkls commented May 14, 2020

Wow, thank you! I’ll have a look. Removing results before converting to text and just adding them back to the notebook version is a nice approach; good to know about it.

@mwouts
Owner Author

mwouts commented May 16, 2020

Just a note for the people who subscribed to this thread... I have opened an issue at pandoc regarding the round trip ipynb-org-ipynb: jgm/pandoc#6367.

The issue also raises the question of how the notebook cells should be represented in org mode. Personally I think that the representation should remain as simple as possible, because the users are going to type it 😃 But obviously, it is simpler for the programmers (and maybe also safer in the long run?) to use explicit cell markers. Anyway... if you have an opinion about this, please follow the pandoc thread as well!

@dlukes

dlukes commented Sep 7, 2022

Recent developments on the Pandoc side of things: jgm/pandoc#6367 (comment)
