Some markup features used by book authors #11
A few thoughts on a bijective ipynb to #12 style rmd. See https://gist.github.com/jlperla/d4972c5dc1cef2e2936d8a33e7a9ab34
I have written lecture notes in LaTeX, Rmd, RST, and Weave.jl. I would say that Rmd is my favorite system. For me, the single most important feature is Rmd's chunk caching. The great thing about it is that it caches not just the output of chunks but the entire R environment, and it can automatically recognize dependencies between chunks. With caching turned on, regenerating output after editing the source file will only rerun the chunks that are absolutely necessary. It's not uncommon for me to write documents with code that takes 10 minutes to an hour to run. Without caching, it becomes very tedious to edit these documents. Weave.jl has a cache option, but it only caches output, not all the variables in the Julia session. There can be no dependency management without caching more of what's in memory.

I would also like to be able to share caches between output formats. I often generate multiple output formats from the same document (e.g. static html and jupyter notebooks, or slides and slides with extra notes in between).

An annoyance with Rmd, RST, and Weave.jl is that some things break when switching output formats. Interactive javascript figures and tables are nice to have and generally work with html output. They can't work completely in pdf, but they don't always fall back to reasonable static alternatives. More annoying is that they break in jupyter notebooks, sometimes depending on whether you are in jupyterlab, the old interface, or some private provider's custom interface.

A worse problem for me (I sort of expect javascript stuff to break) is that customization and extensibility tend to be fragile across output formats. For example, you can put tags into Rmd or Weave.jl jmd files and then add custom css to add new formatting to html output. But of course these will break if you switch to latex->pdf output. They also (at least with Weave.jl) tend to break in jupyter notebooks (although I expect I could fix this if I tried).
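To make the chunk-caching idea from this comment concrete, here is a minimal Python sketch of a cache that stores each chunk's variables and reruns a chunk only when its own source or something upstream has changed. It is not how knitr or Weave.jl implement caching; the `(name, source, depends_on)` chunk format and the function names `chunk_key` and `run_document` are assumptions made for illustration.

```python
import hashlib


def chunk_key(source, dep_keys):
    """Hash a chunk's source together with the keys of the chunks it depends on."""
    h = hashlib.sha256(source.encode("utf-8"))
    for key in dep_keys:
        h.update(key.encode("utf-8"))
    return h.hexdigest()


def run_document(chunks, cache):
    """Execute chunks in order, skipping those whose cache entry is still valid.

    chunks: list of (name, source, depends_on) tuples (hypothetical format);
            dependencies must appear earlier in the list.
    cache:  dict mapping name -> (key, variables); updated in place.
    Returns the names of the chunks that were actually executed.
    """
    keys, executed = {}, []
    for name, source, depends_on in chunks:
        keys[name] = chunk_key(source, [keys[d] for d in depends_on])
        cached = cache.get(name)
        if cached is not None and cached[0] == keys[name]:
            continue  # source and all upstream chunks are unchanged
        namespace = {}
        for dep in depends_on:
            namespace.update(cache[dep][1])  # restore variables from upstream chunks
        exec(source, namespace)
        namespace.pop("__builtins__", None)  # added by exec; not part of the chunk's state
        cache[name] = (keys[name], namespace)
        executed.append(name)
    return executed


chunks = [
    ("data", "x = list(range(5))", []),
    ("model", "total = sum(x)", ["data"]),
    ("report", "msg = 'n/a'", []),
]
cache = {}
first = run_document(chunks, cache)    # empty cache: every chunk runs
chunks[1] = ("model", "total = sum(x) * 2", ["data"])
second = run_document(chunks, cache)   # only the edited chunk runs
```

Because the key of a chunk includes the keys of its dependencies, editing `data` would also invalidate `model` while leaving `report` cached. Since the cache stores variables rather than rendered output, the same cache could in principle be reused across output formats.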
I think Weave.jl's inline code evaluation with the strategy described here is a good way to fix the fragility of customization and extensibility. E.g. instead of |
Another thought that has come to mind. I think the executable blocks (a.k.a. code chunks) should use a Model–view–controller pattern. For example, in RST syntax:

    .. exec-block:: id1
        :kernel: ipython

        print("just a note")
        plot([1, 2, 3])

    .. note::

        .. exec-view:: id1
            :format: text

    .. exec-view:: id1
        :output_index: 1
        :format: figure
        :label: fig:figure1

        This is my caption that can use **any** of the RST syntax,
        even roles like :ref:`aref`.

As you can see, some important benefits of this approach are that (a) you can format multiple outputs per block, and (b) it means you don't have to hide the caption in a metadata field. You could also use inline views like:

    .. exec-block:: id1
        :kernel: ipython

        create_variable_text()

    In my text I want to inject computed variables like :exec-view:`id1`.

It may also help to address the problem that @schrimpf noted above, e.g.

    .. exec-view:: id1
        :mimetype: application/javascript
        :only: html

    .. exec-view:: id1
        :mimetype: text/latex
        :only: latex
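As a rough illustration of the separation proposed above, the following Python sketch keeps executed outputs in a store keyed by block id (the "model") and lets each view pick an output index and mimetype for the current build target. The store layout, the `render_view` function, and its parameters are hypothetical and only mirror the option names used in the RST examples; no such directive exists in Sphinx as far as this sketch assumes.

```python
def render_view(store, block_id, output_index=0, mimetype="text/plain",
                only=None, target="html"):
    """Return one stored output of a block, or '' if the view does not apply.

    store:  dict mapping block id -> list of outputs, each a {mimetype: data} dict.
    only:   if set, the view is rendered only when it equals the build target.
    """
    if only is not None and only != target:
        return ""  # view is restricted to a different output format
    outputs = store[block_id]
    return outputs[output_index].get(mimetype, "")


# Outputs as they might be recorded after executing the block once.
store = {
    "id1": [
        {"text/plain": "just a note"},
        {"text/html": "<img src='fig1.png'>",
         "text/latex": "\\includegraphics{fig1.pdf}"},
    ]
}

note_text = render_view(store, "id1", output_index=0)
html_build = render_view(store, "id1", output_index=1,
                         mimetype="text/latex", only="latex", target="html")
latex_build = render_view(store, "id1", output_index=1,
                          mimetype="text/latex", only="latex", target="latex")
```

The block is executed once, and any number of views can then reference its outputs, which is what allows one output to appear inside a note and another as a captioned figure.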
Chipping in with another example. I found collapsible admonition blocks extremely useful:

    ??? question "How does $C$ predicted by the Einstein model behave at low $T$?"
        When $T → 0$, $T_E/T → \infty$. Therefore neglecting $1$ in the denominator we get $C \propto \left(\frac{T_E}{T}\right)^2e^{-T_E/T}$, and the heat capacity should be exponentially small!

@akhmerov do you know if rST directives already allow for this? E.g. if note admonitions had a "title" attribute, then this would just be a matter of writing CSS

I haven't seen anything similar in rST, so it would definitely require an extension.

It seems like that'd be a pretty valuable / modular extension to add though!

Pingback about collapsible admonitions: https://sphinx-collapse-admonitions.readthedocs.io/en/latest/#
I wanted to list out a couple of the markup features that I really appreciate from using Jupinx/Weave/etc. and perhaps tie them to Rmd/bookdown. Most of these are things I currently use, although some of them are things I wish I had.
I think the goal shouldn't just be about getting the functionality working, but rather making sure that the syntax is clean and easy to read/write for the end-users. In this case, the end-user I am thinking about is someone writing a serious book with PDF/HTML/Jupyter as output formats.
I will leave it up to you to see whether these map into a cell-based jupyter approach, but my intuition tells me that many of them do not. But they all match the semantics of the `Rmd/bookdown` language. If I ever say "jupyter" here, I am talking only about jupyter as an output type, not an editing front-end or intermediate format in a build pipeline.

Right now, jupinx allows equation numbering but it is a little ugly (on a syntactic level as well as the actual output when generating ipynb with numbering).
For the syntax, jupinx right now can add in a label to a full math environment
But it is hard to write normal latex, especially if you want to have multiple numbered equations. Rmd/bookdown does this in a latex-centric way: you write almost correct latex (except for having to escape the `#`, which is reasonable for python/julia/R).

Being able to write almost proper latex in the documents would be very liberating!
One other note on the equation numbering in Jupyter: I think along the time-horizon of this grant, it should be considered whether requiring MathJax automatic numbering is acceptable (through either an extension or updates to the Jupyter front-ends themselves). If so, then the generated HTML that jupinx produces for jupyter notebook output (which ends up being a layout mess, but where there seemed no other approach) could be replaced.
There are times when you want multiple languages displayed in the same document - although only one of them would be executable in a jupyter output. But that means you still want nice and language specific syntax highlighting/etc. in the PDF and html output. Ideally you would also have beautiful syntax highlighting in jupyter outputs for languages outside of the main kernel, but we could live without it.
Classic places where you want pretty syntax highlighting in HTML/PDF for something other than your core language are `yaml`, `toml`, and `bash` blocks. I don't believe this is currently in Jupinx. For example, in https://github.com/QuantEcon/lecture-source-jl/edit/master/source/rst/getting_started_julia/getting_started.rst we use

where I would prefer a `code-block:: bash`, but I don't think it is implemented.

Rmd definitely has this. For example, look at https://github.com/rstudio/bookdown/blob/master/inst/examples/04-customization.Rmd which includes `latex` and `yaml` blocks, and I suspect it could handle a `{bash, eval=false}` as well.

In Jupinx this is
In Rmd the chunk is
There are lots of places we would want to use this in designing online courses.
It is nice to be able to see output (usually figures) but where the code may not be displayed.
Currently, I don't think it is possible in jupinx but in Rmd it is done through
For figures, this would have two parts. The first is that the images need to be generated and added, and the second is that the assets need to be managed for the Jupyter deployment process (i.e. jupyter notebooks linking to an embedded online image from the generated source). See the comments below on assets.
There are many cases where the output is too ugly to be displayed in PDF/latex/distributed notebooks, but where you want the code to run. For example, we don't want the package manager output to be displayed.
In Rmd I think this is done with
There are a lot of times when you want to have a literal inclusion of text markup into the document. For example, we need to have a header in each of our notebooks with a version number that we can bump easily. To do that, we have a file like https://github.com/QuantEcon/lecture-source-jl/blob/master/source/_static/includes/deps_generic.jl
and then include it with something like
My suspicion is that Rmd has better ways to deal with this. I think it is child-documents? Something like
If this was easier, I would probably use it more often.
Unit and regression testing is a little tricky with writing online books. You don't necessarily want to have a complete regression test on the layout, but you frequently want to have the code itself tested. That way, you can rearrange the layout but if someone submits a PR that breaks a calculation, you know about it.
In jupinx, we do this by having a special `test` class for code blocks. This is only displayed in the output on a test build. See https://github.com/QuantEcon/lecture-source-jl/edit/master/source/rst/dynamic_programming/mccall_model.rst for an example.

First, you might have setup code which conditionally runs
and then embedded in the content you can have the actual code which runs during tests
I don't think that having the `test` conditionally run is really needed... it was just the easiest way to implement the feature for jupinx.

With Rmd, I believe you would do this with a block which executes but shows neither the code nor the output. It would always run (which is fine by me) but never display in the output.
The trick there is that Rmd chunks show errors by default, so if there was an assertion failure you would get that output. Then it is easy enough to write a CI tool to check for regressions by looking to see if an error occurs.
This is imperfect (e.g. tough to have regression tests for figures) but gets the job done.
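The CI check described above could be as small as a script that scans the rendered document for error lines. The Python sketch below assumes that errors from hidden chunks show up as lines starting with `Error`, optionally prefixed by a `##` output marker; that pattern and the name `find_errors` are assumptions, so adjust the regular expression to whatever your build actually emits.

```python
import re

# Assumption: an error line looks like "## Error ..." or "Error ..." at the
# start of a line in the rendered output.
ERROR_LINE = re.compile(r"^\s*(##\s*)?Error\b")


def find_errors(rendered_text):
    """Return (line_number, line) pairs for every line that looks like an error."""
    return [
        (number, line.strip())
        for number, line in enumerate(rendered_text.splitlines(), start=1)
        if ERROR_LINE.match(line)
    ]


sample = (
    "## [1] 4\n"
    "## Error: 1 + 1 == 3 is not TRUE\n"
    "Some prose about Error handling.\n"
)
problems = find_errors(sample)
```

A CI job would then fail the build whenever `find_errors` returns a non-empty list. Note that prose lines which themselves begin with the word "Error" would be flagged too, so a stricter pattern may be needed in practice.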
I should point out that an alternative approach used in Julia's markdown is to have output that needs to be tested within the markdown itself in `jldoctest`. For example, see https://juliadocs.github.io/Documenter.jl/v0.7/man/doctests.html#
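For readers unfamiliar with this style of doctest, the idea is that a block contains code, a `# output` separator line, and the expected output, and a test run executes the code and compares. The following Python sketch is only an analogue of that idea, not Documenter.jl's implementation; `split_chunk` and `check_chunk` are hypothetical names, and only printed output is compared.

```python
import contextlib
import io

MARKER = "# output"


def split_chunk(chunk):
    """Split a chunk into (code, expected_output) at the first '# output' line."""
    lines = chunk.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == MARKER:
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:]).strip()
    return chunk, None  # no expected output recorded


def check_chunk(chunk):
    """Run the code part and report whether what it prints matches the expected part."""
    code, expected = split_chunk(chunk)
    if expected is None:
        return True  # nothing to check
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue().strip() == expected


passing = "x = 2 + 2\nprint(x)\n# output\n4\n"
failing = "print(1)\n# output\n2\n"
results = (check_chunk(passing), check_chunk(failing))
```

An "update" utility in the same spirit would rerun the code part and rewrite everything after the marker with the captured output.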
To implement that feature in something like Rmd you could have a new chunk option, `test`, which looks for an output and checks it. Then you wouldn't have special hidden code chunks but rather it could check existing chunks. For example, I could imagine something like the following:

The behavior of the `test=true` chunk would be as follows: in normal builds, it would just drop anything in code chunks below `# output` completely. In a build tagged as `test` it would execute the code above the `# output` and compare the result exactly to the recorded output. It would throw an error if the comparison failed.

An important part of this feature, which the julia documenter implemented, is to automatically fill in the `# output` blocks from any `jldoctest` chunks. Basically, you can just create a bunch of code chunks as `jldoctest`, run some sort of `update_tests` utility on a file, and it adds or replaces the `# output` and output to match the current execution. This sort of functionality would make me incredibly happy and make testing much easier.

There are times when you want to generate a display block which shows a REPL session rather than having everything in a single code block.
I believe that Rmd might do this with
becoming something like
This is a variation on the previous one. I don't believe that Rmd has a distinction between code that should be executable in a notebook and ones that should just be displayed. So the following feature would require a new chunk option.
Let's say you had an option called `cell=split` or `cell=single` which would decide whether to run code line-by-line, creating cells as it goes, or to do the whole thing in the same cell. The default would be single. But if you did the split, then it would act like you had taken the code and executed a whole bunch of different cells, one for each line.

e.g.
Would be equivalent to
I would use this feature a lot as there are frequently times where you want to write the code together but would really want people using jupyter output to execute line by line.
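A converter could implement the split behavior roughly as follows. This Python sketch uses the standard `ast` module to cut a chunk at top-level statement boundaries (rather than at raw line breaks, so multi-line statements stay together) and emits one notebook-style code cell per statement. The function name `split_into_cells` is hypothetical, the cells are plain dicts shaped like nbformat v4 code cells, and comments between statements are dropped in this simplified version.

```python
import ast


def split_into_cells(source):
    """Return one notebook-style code cell (a plain dict) per top-level statement."""
    lines = source.splitlines()
    cells = []
    for node in ast.parse(source).body:
        # lineno/end_lineno are 1-based and inclusive (Python 3.8+)
        text = "\n".join(lines[node.lineno - 1:node.end_lineno])
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": text,
        })
    return cells


chunk = "x = 1\ny = [\n    x,\n    2,\n]\nprint(y)\n"
cells = split_into_cells(chunk)
sources = [cell["source"] for cell in cells]
```

With `cell=single` the converter would instead emit one cell whose `source` is the whole chunk.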
Hopefully this is clear. But it would mean you can keep output-specific stuff inside of the files rather than messy post-processing.
The issue here comes down to differences in formatting of images etc. for html, pdf, and other outputs. There are a few options in jupinx, but Rmd figures have much more control over sizing in the output. This becomes especially important if you have the features described above of having blocks of code which run but where the code is not displayed in the output.
For HTML this isn't really an issue since the assets are generated with paths relative to the generated files. No problems. But for Jupyter it is an issue since you need the assets to link somewhere online.
Right now, this is a little bit of a mess in RST since sphinx wasn't designed directly for Jupyter. Everything is relative to a static folder and the generated links are fudged after the fact given `conf.py` (e.g. https://github.com/QuantEcon/lecture-source-jl/blob/master/conf.py). The rst block itself would still look locally, pre-fudging.

I don't know how this is done in Rmd. It may not be.
I don't believe this is in jupinx. In bookdown it is beautiful and just comes from latex, i.e. put in `Some text that includes \index{Markov Chains} would end up in the index`. It only generates the index in pdf, though (see https://bookdown.org/yihui/bookdown/latex-index.html).
From the sphinx docs, this is what it looks like in jupinx
But the bookdown one is a lot easier. I think it is just
There is a way to reference footnotes as well, but that is rarely needed.
Sadly, colab, jupyterlab, and jupyter notebook might each need different outputs to be "perfect". For colab, for example, we want to generate notebooks with the colab package additions already set up, whereas we might not want to for a nbgitpuller/binderhub setup.
As an example, with our quantecon datascience lectures we want to have the following code generated for jupyter+colab output (but not executed, as should be clear from the `eval=false`)
For non-colab jupyterhub/binderhub we do not want that line of code or even unexecuted cell visible.
Sadly, does not exist in Jupinx... and I wish it did!
See https://bookdown.org/yihui/bookdown/markdown-extensions-by-bookdown.html#theorems
Again, it is very close to writing latex and in fact generates latex styling.
This seems to be an Rmd-specific feature, and a very useful one, especially for things like Julia where code may take a long time to execute.
Jupinx doesn't allow us to define our own blocks for a book, per se, but the flexibility of the directives makes it possible to define new ones with the existing syntax.
Bookdown has a very clean way to extend more generally with custom blocks. See https://bookdown.org/yihui/bookdown/custom-blocks.html