Import Jupyter Notebook files into Atom #1501

kylebarron · 2018-12-28T05:53:05Z

This PR adds functionality to import Jupyter Notebook files as text files in Atom.

It imports the notebook into a new TextEditor. To support Markdown blocks, I need to know the correct comment symbol, so I need to find the correct Atom Grammar for the notebook file. I first try to match the file extension in the notebook's metadata against the file extensions each Grammar applies to. But since the file extension is an optional metadata field in the notebook, I also attempt to match on the kernelspec name. These two together should work for the vast majority of notebooks.
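A rough sketch of the two-step lookup described above, not the code in this PR. The grammar objects are simplified stand-ins carrying only `name` and `fileTypes`, the function name `findGrammarForNotebook` is hypothetical, and the kernelspec rule (strip trailing digits, compare to the grammar name) is an assumption made for illustration.

```javascript
// Hypothetical helper: pick a grammar for a parsed notebook.
// `grammars` is an array of simplified grammar-like objects, e.g.
// { name: "Python", fileTypes: ["py"] }.
function findGrammarForNotebook(notebook, grammars) {
  const metadata = notebook.metadata || {};

  // Step 1: match the optional file extension (e.g. ".py") against fileTypes.
  const ext = metadata.language_info && metadata.language_info.file_extension;
  if (ext) {
    const bare = ext.replace(/^\./, "").toLowerCase();
    const byExtension = grammars.find(g =>
      (g.fileTypes || []).some(t => t.toLowerCase() === bare)
    );
    if (byExtension) return byExtension;
  }

  // Step 2: fall back to the kernelspec name (e.g. "python3" -> "python").
  // Assumption: dropping trailing digits and comparing to the grammar name
  // is good enough for this illustration.
  const kernelName = metadata.kernelspec && metadata.kernelspec.name;
  if (kernelName) {
    const base = kernelName.toLowerCase().replace(/\d+$/, "");
    const byKernel = grammars.find(g => g.name.toLowerCase() === base);
    if (byKernel) return byKernel;
  }

  // No match: the caller decides on a fallback grammar.
  return null;
}
```

Returning `null` rather than a default leaves it to the caller to decide on a fallback grammar and how to notify the user.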

Given the grammar, I get the source text for code and Markdown cells, prefixing each line of a Markdown cell with the grammar's commentStartString, right-trimmed, plus a space. I add a cell marker line before each cell, either # %% or # %% markdown (with the grammar's comment symbol in place of #). For the newlines created between cells, it uses \r\n as the line separator on Windows and \n otherwise.
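A minimal sketch of this conversion, under stated assumptions. `getCellMarker` is named in the diff, but the body here is a guess. `cellToText` and `notebookToText` are hypothetical names. For simplicity the sketch uses the same line separator inside cells as between them.

```javascript
const os = require("os");

// Guessed implementation: "# " -> "# %%" or "# %% markdown".
function getCellMarker(commentStartString, keyword) {
  const marker = commentStartString.trimEnd() + " %%";
  return keyword ? marker + " " + keyword : marker;
}

// Hypothetical helper: turn one notebook cell into marker + source lines.
function cellToText(cell, commentStartString, eol) {
  const isMarkdown = cell.cell_type === "markdown";
  const marker = getCellMarker(commentStartString, isMarkdown ? "markdown" : null);
  // nbformat 4 allows `source` to be a string or an array of strings.
  const source = Array.isArray(cell.source) ? cell.source.join("") : cell.source;
  let lines = source.split(/\r?\n/);
  if (isMarkdown) {
    // Comment out every Markdown line, including empty ones.
    const prefix = commentStartString.trimEnd() + " ";
    lines = lines.map(line => prefix + line);
  }
  return [marker, ...lines].join(eol);
}

// Hypothetical helper: only code and Markdown cells are imported.
// os.EOL is "\r\n" on Windows and "\n" elsewhere.
function notebookToText(notebook, commentStartString, eol = os.EOL) {
  return notebook.cells
    .filter(c => c.cell_type === "code" || c.cell_type === "markdown")
    .map(c => cellToText(c, commentStartString, eol))
    .join(eol);
}
```

Raw cells are dropped by the filter and outputs are never read, in line with the limitations listed further down.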

To do:

I want to add tests, especially tests that the import-export and export-import round trips through Hydrogen are idempotent, but that depends on the outcome of "Support exporting markdown cells to notebook file" (#1498) and "Include kernelspec as metadata of notebook export" (#1499). Regardless, if substantial changes to this PR are requested, I want to make them before writing the tests.

Limitations:

Currently only version 4 of the notebook format is supported.
Only imports source text of code cells and Markdown cells. Doesn't import raw cells or code cell outputs.

Python:

# %% markdown
# # Example Notebook
# 
# This is an example Python Notebook!
# %%
print('hello world!')
# %%

Bash:

# %% markdown
# # Example Notebook
# 
# This is an example Bash Notebook.
# %%
echo "hello world!"
# %%

Javascript:

// %% markdown
// # Example Notebook
// 
// This is an example Javascript Notebook.
// %%
console.log('hello world!');
// %%

R:

# %% markdown
# # Example Notebook
# 
# This is an example R Notebook!
# %%
print('hello world!')
# %%

Closes #1457, ref #1404, ref #75.

kylebarron · 2018-12-30T00:46:41Z

Can someone help with this last Flow error? My understanding of promises and callbacks is only so-so.

Cannot call readFile with loadNotebook bound to callback because:
 • Promise [1] is incompatible with undefined [2] in the return value.
 • Promise [3] is incompatible with undefined [2] in the return value.

     lib/import-notebook.js
      23│       atom.notifications.addError("Selected file must have extension .ipynb");
      24│       return;
      25│     }
      26│     readFile(filename, loadNotebook);
      27│   });
      28│ }
      29│
 [3]  30│ async function loadNotebook(err, data) {

     /private/tmp/flow/flowlib_f7461a8/core.js
 [1] 612│ declare class Promise<+R> {

     /private/tmp/flow/flowlib_f7461a8/node.js
 [2] 967│     callback: (err: ?ErrnoError, data: Buffer) => void

lgeiger · 2018-12-30T18:57:41Z

Thanks for getting this started 🎉

I guess the Flow error is because the callback passed to fs.readFile shouldn't be an async function. Wrapping it in an arrow function might work around this issue.
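A sketch of that workaround, assuming a simplified `loadNotebook` that only parses the file contents; the `onLoaded` callback is hypothetical. The arrow function has a block body and returns nothing, which matches the `=> void` callback type in the error above. The promise from the async function is handled inside it.

```javascript
const fs = require("fs");

// Simplified stand-in for the PR's loadNotebook: parse the file contents.
async function loadNotebook(data) {
  return JSON.parse(data.toString("utf8"));
}

// `onLoaded(err, notebook)` is a hypothetical Node-style callback.
function importNotebook(filename, onLoaded) {
  // The arrow function returns undefined, not a Promise.
  fs.readFile(filename, (err, data) => {
    if (err) {
      onLoaded(err);
      return;
    }
    // Handle the promise here instead of returning it to readFile.
    loadNotebook(data).then(
      notebook => onLoaded(null, notebook),
      error => onLoaded(error)
    );
  });
}
```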

I can review the PR in detail next week.

BenRussert

This is going to be great @kylebarron! Thanks for leading this effort!

I added some tests and fixed a couple of things on my fork. If you add me as a collaborator on your fork, I can push my changes to this PR. Or I can open a PR against your fork if you prefer.

kylebarron/hydrogen@import-notebook...BenRussert:import-notebook

BenRussert · 2019-01-13T01:09:44Z

lib/import-notebook.js

+  const cellType = cell.cell_type;
+  const cellMarkerKeyword = cellType === "markdown" ? "markdown" : null;
+  const cellMarker = getCellMarker(commentStartString, cellMarkerKeyword);
+  var source = cell.source;


For best practice, use let instead of var here.

BenRussert · 2019-01-13T01:13:47Z

lib/main.js

@@ -55,6 +55,7 @@ import {
 } from "./utils";

 import exportNotebook from "./export-notebook";
+import importNotebook from "./import-notebook";


I really like that you implemented this in its own file and imported it into main. This improves organization, avoids merge conflicts from unrelated PRs, and makes testing easier, to name a few benefits.

kylebarron · 2019-01-14T14:15:05Z

I added some tests and fixed a couple things on my fork. If you add me as a collaborator on your fork I can push my changes to this PR. Or, I can pr against your fork if you prefer.

Added as contributor

BenRussert · 2019-01-14T15:08:36Z

We can rebase last once we are ready to merge. Try this branch out and see if you can find anything that still needs work. I'll take another look during the week as well to see what's left.

kylebarron · 2019-01-14T15:20:22Z

lib/import-notebook.js

+  }
+  const nb = parseNotebook(data);
+  if (nb.nbformat < 4) {
+    atom.notifications.addError("Only notebook version 4 currently supported");


Might be good to check out how version 3 differs, because supporting reading version 3 would be a nice plus (probably for a future PR)

kylebarron · 2019-01-14T15:22:29Z

lib/import-notebook.js

+  const cellType = cell.cell_type;
+  const cellMarkerKeyword = cellType === "markdown" ? "markdown" : null;
+  const cellMarker = getCellMarker(commentStartString, cellMarkerKeyword);
+  var source = cell.source;


kylebarron · 2019-01-14T15:25:42Z

spec/import-notebook-spec.js

+      fail(e);
+      done();
+    });
+  };


Can't you import this from test-utils.js?

Definitely meant to 😂

kylebarron · 2019-01-14T15:35:56Z

I think all your edits make sense, though I haven't tested them in Atom yet.

kylebarron · 2019-01-14T17:36:28Z

I'd be in favor of moving importNotebook to an atom-workspace command, so that a text editor doesn't have to be currently active to use it.
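Roughly what I have in mind, as a sketch only. The command name `hydrogen:import-notebook` and the helper `registerImportCommand` are assumptions for illustration. `atomEnv` stands for the `atom` global and is passed in only so the snippet can be exercised outside Atom.

```javascript
// Register the import command on the workspace element instead of on
// text editors, so it is available even when no editor is active.
function registerImportCommand(atomEnv, importNotebook) {
  // atom.commands.add(target, commands) returns a Disposable that the
  // package can dispose of on deactivation.
  return atomEnv.commands.add("atom-workspace", {
    "hydrogen:import-notebook": () => importNotebook()
  });
}
```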

kylebarron · 2019-01-14T17:38:13Z

I also think it's important to have tests of [import then export] and [export then import], and to make sure that the .ipynb file and the text editor contents, respectively, come back identical.
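To illustrate the kind of check I mean, here is a toy round trip. It uses a hypothetical, stripped-down converter pair (`cellsToText` and `textToCells`) that handles "#"-comment languages only. These are not Hydrogen's actual import and export functions.

```javascript
// Toy exporter direction: cells -> text with "# %%" markers.
function cellsToText(cells) {
  return cells
    .map(cell => {
      const lines = cell.source.split("\n");
      return cell.cell_type === "markdown"
        ? ["# %% markdown", ...lines.map(l => "# " + l)].join("\n")
        : ["# %%", ...lines].join("\n");
    })
    .join("\n");
}

// Toy importer direction: text with "# %%" markers -> cells.
function textToCells(text) {
  const cells = [];
  for (const line of text.split("\n")) {
    if (line === "# %% markdown") {
      cells.push({ cell_type: "markdown", lines: [] });
    } else if (line === "# %%") {
      cells.push({ cell_type: "code", lines: [] });
    } else if (cells.length > 0) {
      const current = cells[cells.length - 1];
      // Strip the comment prefix that was added to Markdown lines.
      current.lines.push(
        current.cell_type === "markdown" ? line.replace(/^# ?/, "") : line
      );
    }
  }
  return cells.map(c => ({ cell_type: c.cell_type, source: c.lines.join("\n") }));
}

// Both directions should give back exactly what went in.
function roundTripsCleanly(cells) {
  const text = cellsToText(cells);
  const sameCells = JSON.stringify(textToCells(text)) === JSON.stringify(cells);
  const sameText = cellsToText(textToCells(text)) === text;
  return sameCells && sameText;
}
```

A real spec would run the same two comparisons against the package's own import and export code, using a fixture .ipynb file.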

kylebarron · 2019-01-14T17:41:22Z

I'd also be happy to add documentation to https://nteract.gitbooks.io/hydrogen/docs/Usage/NotebookFiles.html. (When #1498 was merged, the documentation was updated despite that code not being released yet).

kylebarron · 2019-01-15T01:11:09Z

I added some more documentation and moved import-notebook to be a workspace command.

kylebarron · 2019-01-22T18:36:03Z

@BenRussert
I think this is probably good to merge if you're happy with it.

I wasn't sure if this should be its own PR, since it might need more discussion, but it would also be useful to add the import notebook functionality as an opener for .ipynb files. It's a simple 10-line change.

JohnCHarrington · 2019-02-05T15:57:23Z

I wasn't sure if this should be its own PR, since it might need more discussion, but it would also be useful to add the import notebook functionality as an opener for .ipynb files. It's a simple 10-line change.

I know nothing about how this works, but the ideal for me would be to have atom open .ipynb files like this, then export them on save. Everyone around me works directly in jupyter, this way I could work with them pretty seamlessly.

On a side note, how about having this import/export to/from the rich document format? Even if it only supported importing one language and markdown cells it would be nice to have the choice of which format to work in.

kylebarron · 2019-02-05T16:05:27Z

I know nothing about how this works, but the ideal for me would be to have atom open .ipynb files like this, then export them on save. Everyone around me works directly in jupyter, this way I could work with them pretty seamlessly.

We currently don't support automatic exporting because of the potential for unintentionally overwriting data.

On a side note, how about having this import/export to/from the rich document format? Even if it only supported importing one language and markdown cells it would be nice to have the choice of which format to work in.

Do you mean markdown documents?

JohnCHarrington · 2019-02-05T17:02:46Z

Fair enough, I can add a hotkey for export anyway.

Yes I mean markdown documents, then the grammar/kernelspec would be used to set the language on the code blocks.

kylebarron · 2019-02-05T17:07:51Z

Fair enough, I can add a hotkey for export anyway.

Well, currently when you export a file to a notebook, it brings up the system file selector, so I'm not sure how much time a hotkey would save. Hydrogen has no way to know ahead of time how to name the exported notebook file.

Yes I mean markdown documents, then the grammar/kernelspec would be used to set the language on the code blocks.

Markdown documents are more prone to losing some metadata, in particular cell boundaries. It's probably not too difficult to write, but not my focus at the moment.

kylebarron · 2019-02-06T02:49:55Z

@JohnCHarrington
You may also want to check out the newest release of Pandoc https://github.com/jgm/pandoc/releases/tag/2.6

Kyle Barron added 3 commits December 27, 2018 22:12

Import Jupyter Notebook files into Hydrogen

14990e6

Require imported file to have extension .ipynb

26b551f

Raise error for notebook version < 4

ffa3291

kylebarron requested a review from lgeiger December 28, 2018 05:53

Kyle Barron added 2 commits December 27, 2018 22:55

Change python fallback grammar notice to warning

d347b7d

Fix most flow errors

83e17b3

kylebarron requested a review from BenRussert December 30, 2018 01:07

Add flow types for Notebook and Cell

19fa2b9

kylebarron mentioned this pull request Jan 5, 2019

Support exporting markdown cells to notebook file #1498

Merged

kylebarron added the work in progress 👷‍♂️ label Jan 8, 2019

Kyle Barron and others added 9 commits January 11, 2019 12:25

Merge branch 'master' into import-notebook

9de86e7

Fix flow error for possibly null string

967d36d

Move loadNotebook into anonymous function

336f4ce

Fix readFile callback

20295a2

Fix getCommentString and add spec

6d0dda8

Add import notebook spec

778b02a

Move waitAsync helper to new test-helpers file

669283c

Handle case if user cancels in file dialog

e1050ce

Add showOpenDialog ipynb file filter

f2fe5fa

BenRussert reviewed Jan 13, 2019

kylebarron commented Jan 14, 2019

Make import-notebook a workspace command

41c9edd

Add docs for importing a notebook file

1e41d6f

Kyle Barron added 2 commits January 15, 2019 10:13

Use waitAsync from test-utils.js

8603a7e

Use waitAsync from test-utils.js (again)

b655686

kylebarron mentioned this pull request Jan 17, 2019

Converting Stata magic back to Stata code kylebarron/stata_kernel#259

Closed

kylebarron removed the work in progress 👷‍♂️ label Jan 22, 2019

BenRussert force-pushed the import-notebook branch from 04f78e4 to b655686 Compare January 22, 2019 23:00

Prep for merge

0f6f169

BenRussert mentioned this pull request Jan 22, 2019

Import notebook #1514

Closed

BenRussert merged commit e69e274 into nteract:master Jan 22, 2019

kylebarron mentioned this pull request Feb 15, 2019

Support jupyter notebook files #75

Closed
