jq manual remake #3183

Open
01mf02 opened this issue Sep 26, 2024 · 8 comments

Comments


01mf02 commented Sep 26, 2024

I would like to extend the jq manual with information about diverging behaviour between different jq implementations (jq, gojq, jaq, ...). For this, I will have to go through the whole manual once, which I think would be a nice opportunity to adapt its format.

Currently, the jq manual is written as a YAML file, which mostly contains Markdown blocks and examples.
I thought about transforming it to a pure Markdown file, and then creating HTML from it via Pandoc.
The advantages of this would be:

  • syntax highlighting of JSON / jq code
  • smaller input file
  • better support for editing (e.g. spellchecking)

To show the feasibility of it, I recreated a part of the "Conditionals and Comparisons" section of the jq manual in Markdown. You can use

pandoc conds.md -s -o conds.html --section-divs --lua-filter filter.lua

to render this to the following HTML (sorry, GitHub doesn't let me upload HTML directly, so I exported the HTML to PDF and uploaded that).

This requires the file filter.lua, which I wrote today in about 2 hours (I don't know Lua ^^) and which currently looks like this:

function Header(el)
  if el.content[1].text == "Examples" then
    --print(dump(el))
    el.attr.classes:insert("examples")
  end
  return el
end

function Code(code)
  code.classes[1] = "jq"
  --print(dump(code))
  return code
end

function CodeBlock(block)
  --print(dump(block))
  --if block.classes[1] == "jq-test" then
  local rows = {}
  local categories = {"Filter", "Input", "Output"}
  local i = 1
  for line in block.text:gmatch("[^\n]+") do
    local code = pandoc.Code(line)
    local lang = "json"
    if i == 1 then lang = "jq" end
    code.classes[1] = lang
    table.insert(rows, {pandoc.Plain(categories[i] or ""), code})
    i = i + 1
  end
  local simple_table = pandoc.SimpleTable(
    "", -- caption
    {pandoc.AlignDefault, pandoc.AlignDefault},
    {0, 0}, -- let pandoc determine col widths,
    {}, -- headers
    rows
  )
  return pandoc.utils.from_simple_table(simple_table)
end

I am fairly confident that I would be able to convert the whole manual to this Markdown format.
But I'm only going to do this if it's likely that this is going to be merged into jq.
So the question is: Would you consider merging such a change?


01mf02 commented Sep 26, 2024

I've converted a few more sections now and enabled "Run" links for jqplay.org, the result is here.


itchyny commented Sep 27, 2024

How do you generate man.test, manonig.test from Markdown?

pkoppstein commented

@01mf02 - Rewriting the jq manual in the way you've described sounds like a very
ambitious project!

I was wondering whether you've considered (either
as an alternative or at least as a prelude) revising existing
documentation on the jq wiki, and in particular the
jq-Language-Description page.

I would also like to see a new page under the Tips section, which
already has a "Regarding gojq" page.

It would be really helpful to have an up-to-date "Regarding jaq" page,
and you could perhaps give a preview there of some of the content you
envision for the "official documentation".


01mf02 commented Sep 27, 2024

> How do you generate man.test, manonig.test from Markdown?

Creating the combination of man.test and manonig.test is quite simple; it involves printing all example code blocks to stdout in the Lua filter (a one-line instruction) and piping the result into man.test.
It is a bit more annoying to separate out those tests that involve regular expressions, but it can be done, in the worst case by post-processing the exported tests.


01mf02 commented Sep 27, 2024

> @01mf02 - Rewriting the jq manual in the way you've described sounds like a very ambitious project!

It might actually be a less ambitious project than I thought, for I was able to convert the whole manual to HTML/PDF.
The result is here.
As you can see, jq syntax highlighting already works in the PDF version!
(The examples look a bit wonky, but please recall that I performed zero styling of the output so far.)

I did this as follows:

I converted manual.yml to manual.json with a converter, then used jq to convert the result to Markdown:

jq -r -f convman.jq manual.json > manual.md

The jq script convman.jq is:

"---\ntitle: jq manual\n---\n\n",
.body,
(.sections[] | (
  "# \(.title)",
  .body // "",
  (.entries[]? | (
    "## \(.title)",
    .body,
    if .examples then "::: Examples\n", (.examples[] | (
      "~~~",
      (.program | gsub("\n"; " ")),
      .input,
      (.output | if . == [] then "\n" else .[] end),
      "~~~\n"
    )), ":::\n" else empty end
  ))
))

From this, I use Pandoc to convert Markdown to HTML:

pandoc manual.md -s -o manual.html --section-divs --lua-filter filter.lua > man.test

This creates an HTML file manual.html and prints on stdout the combination of man.test and manonig.test.
This uses the following Lua filter (filter.lua):

-- inline code is always written in jq
function Code(code)
  code.classes[1] = "jq"
  return code
end

-- code blocks are assumed to be in jq if no other language is given
function CodeBlock(block)
  if next(block.classes) == nil then
    block.classes[1] = "jq"
    return block
  end
end

function Div(el)
  if el.classes:includes'Examples' then
    return pandoc.walk_block(el, {CodeBlock = function(block)
      -- print example to stdout
      print(block.text .. "\n")
      return exampleTable(block.text)
    end})
  end
end

function exampleTable(test)
  local _, _, filter, input, output = test:find("([^\n]+)\n([^\n]+)\n(.*)")
  local url = "https://jqplay.org/jq?q=" .. encodeUrl(filter) .. "&j=" .. encodeUrl(input)
  local simple_table = pandoc.SimpleTable(
    "", -- caption
    {pandoc.AlignRight, pandoc.AlignLeft},
    {0, 0}, -- let pandoc determine col widths,
    {}, -- headers
    {
      {pandoc.Plain("Filter"), pandoc.Code(filter, {class = "jq"  })},
      {pandoc.Plain( "Input"), pandoc.Code( input, {class = "json"})},
      {pandoc.Plain("Output"), pandoc.Code(output, {class = "json"})},
      {pandoc.Link("Run", url), {}}
    }
  )
  return pandoc.utils.from_simple_table(simple_table)
end

function encodeUrl(str)
  str = string.gsub(str, "\n", "\r\n")
  str = string.gsub(str, "([^%w%.%- ])", function(c) return string.format("%%%02X", string.byte(c)) end)
  str = string.gsub(str, " ", "+")
  return str
end
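
To check a "Run" link by hand, the same encoding can be reproduced on the command line. The following is only a sketch: the function name `encode_url` is hypothetical and not part of the thread, it assumes ASCII input, and it mirrors the rules of `encodeUrl` above (newline becomes CRLF, everything except alphanumerics, `.`, `-` and space is percent-encoded, and spaces become `+`).

```sh
# Hypothetical helper mirroring encodeUrl from filter.lua (assumes ASCII input).
encode_url() {
  # Dump the argument as hex bytes, then re-encode each byte in awk.
  printf '%s' "$1" | LC_ALL=C od -An -v -tx1 | LC_ALL=C awk '
    BEGIN {
      # Map each two-digit hex code to the character it stands for.
      for (n = 1; n < 256; n++) chr[sprintf("%02x", n)] = sprintf("%c", n)
      safe = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.-"
    }
    {
      for (i = 1; i <= NF; i++) {
        h = $i ""                                        # force string value
        if (h == "0a")                    out = out "%0D%0A"  # newline -> CRLF
        else if (h == "20")               out = out "+"       # space -> plus
        else if (index(safe, chr[h]) > 0) out = out chr[h]    # kept verbatim
        else                              out = out "%" toupper(h)
      }
    }
    END { print out }
  '
}

# Usage (prints the query part of a jqplay link for a filter and an input):
#   echo "https://jqplay.org/jq?q=$(encode_url '.[0]')&j=$(encode_url '[1,2]')"
```

Going through `od` avoids any dependence on shell-specific string indexing, at the cost of handling the input byte by byte.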

We can also create a PDF file, by going through Typst:

pandoc manual.md -s -o manual.typ --section-divs --lua-filter filter.lua
typst c manual.typ

I used this to produce the PDF file at the beginning of this post.

> I was wondering whether you've considered (either as an alternative or at least as a prelude) revising existing documentation on the jq wiki, and in particular the jq-Language-Description page.

I have not considered this so far. In the context of the present issue, I would like to concentrate on making the format of the manual easier to modify. Revising documentation is a different issue.

> I would also like to see a new page under the Tips section, which already has a "Regarding gojq" page.
>
> It would be really helpful to have an up-to-date "Regarding jaq" page, and you could perhaps give a preview there of some of the content you envision for the "official documentation".

I think it would be best to integrate any information that would go on a "Regarding jaq" page directly into the manual. That makes it much easier for users to find the information and for us to keep it up to date.
As an example of what I imagine: currently, the manual says the following about input:

## `input`

Outputs one new input.

My idea is to enhance it as follows:

## `input`

Outputs one new input.

::: Compatibility
Available since jq 1.5.
When there is no more input,
jq returns an error, whereas
jaq returns no output (`empty`).
:::

The compatibility statement could be rendered in the documentation as a block.
I would like to do this for all the filters documented in the manual.
In the long run, this would also make it possible to have unified documentation for all jq versions on the same page, by stating explicitly the version from which each feature is available.
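
To keep track of which entries still lack such a block, something like the following could be run over the Markdown manual. This is a rough sketch, not part of the proposal: the function name `missing_compat` is hypothetical, and it assumes that entries are `## ` headings, sections are `# ` headings, compatibility notes open with `::: Compatibility`, and code blocks are fenced with `~~~`, as in the snippets in this thread.

```sh
# Hypothetical helper: print the title of every "## " entry
# that has no "::: Compatibility" block before the next heading.
missing_compat() {
  awk '
    function report() { if (title != "" && !seen) print title }
    /^~~~/               { infence = !infence; next }   # toggle on code fences
    infence              { next }                       # ignore fenced content
    /^## /               { report(); title = substr($0, 4); seen = 0; next }
    /^# /                { report(); title = ""; next } # new section ends entry
    /^::: Compatibility/ { seen = 1 }
    END                  { report() }
  ' "$1"
}

# Usage:
#   missing_compat manual.md
```

Skipping fenced content matters because jq comments inside examples also start with `#` and would otherwise be mistaken for headings.
<imports>
</imports>
<test>
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# Builtins

## `input`

Outputs one new input.

::: Compatibility
Available since jq 1.5.
:::

## `inputs`

Outputs all remaining inputs.

::: Examples
~~~
# not a heading
null
~~~
:::
EOF
[ "$(missing_compat "$tmp")" = '`inputs`' ]
rm -f "$tmp"
</test>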


01mf02 commented Sep 27, 2024

By the way, Pandoc can also create man pages, so we can use it to generate jq(1) from the Markdown source.

01mf02 added a commit to 01mf02/jq that referenced this issue Sep 27, 2024
This is needed to make the Markdown -> HTML conversion work with Pandoc, see jqlang#3183.

01mf02 commented Sep 30, 2024

@itchyny, I've found a way to generate both man.test and manonig.test: When exporting all tests (all.test) from the Markdown manual, I terminate every test with \0. That way, as a second step, we can split all tests one-by-one and write those that contain regex filters into a separate file:

mkdir -p tests
split --separator='\0' -l1 all.test tests/
REGEX="test|match|capture|scan|split|splits|sub|gsub"
grep -L -E "$REGEX" tests/* | xargs cat | sed 's/\x0/\n/g' > man.test
grep -l -E "$REGEX" tests/* | xargs cat | sed 's/\x0/\n/g' > manonig.test
rm -r tests
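
Note that `grep` here matches the function names anywhere in a test, including its input and output lines. A possible variant that only inspects the filter line is sketched below. It is not the approach used in the thread: the function name `split_tests` is hypothetical, and it assumes the tests are separated by blank lines (rather than `\0`), that the filter is the first line of each test, and that a regex function is always followed directly by `(`.

```sh
# Hypothetical helper: split blank-line-separated tests into two files,
# depending on whether the filter (first line) calls a regex function.
#   $1: input tests   $2: output without regex   $3: output with regex
split_tests() {
  : > "$2"; : > "$3"   # make sure both outputs exist, even if one stays empty
  awk -v plain="$2" -v onig="$3" \
      -v words="test|match|capture|scan|split|splits|sub|gsub" '
    BEGIN {
      RS = ""; FS = "\n"   # paragraph mode: one test per record, one line per field
      re = "(^|[^A-Za-z0-9_])(" words ")[(]"
    }
    {
      out = ($1 ~ re) ? onig : plain
      printf "%s\n\n", $0 > out
    }
  ' "$1"
}

# Usage:
#   split_tests all.test man.test manonig.test
```

This avoids the temporary directory, but it relies on no test containing a blank line in the middle, which the `\0` terminator does not require.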


01mf02 commented Sep 30, 2024

I've now performed the conversion of the manual to Markdown, see #3186.

4 participants
@pkoppstein @itchyny @01mf02 and others