Transclusion - Block level metadata in Markdown - i.e. middlematter #12

Open
jmatsushita opened this issue Feb 18, 2016 · 5 comments

Comments

@jmatsushita
Member

Especially in content reuse scenarios, keeping metadata (in particular provenance metadata) is key to keeping track of upstream changes in complex aggregated content pipelines.

In addition to (awesome) approaches such as @elationfoundation, which generates JSON-LD from Jekyll frontmatter documents, we will surely need to manage metadata for more granular blocks of content, for instance in scenarios where larger documents are made of smaller parts.

There is a scenario where the content editor aggregates smaller files, each with its own metadata, but it seems to break the model of having Markdown fit the view/perspective of the author rather than bending to technical requirements.

From a usability standpoint, the ability to add "chapters" or "sections" in this way without creating folders, subfolders and small files is important to consider.

When I think about this, it seems to lead to the possibility of adding block level metadata in Markdown. Given that adding metadata at the top is a well adopted practice across static website generators and is called frontmatter, metadata added in the middle instead of the front could therefore be called middlematter.

I'm thinking for instance of this type of syntax (with YAML single line maps) :

1.

# An H1 without metadata
## An H2 level block with metadata --- {creator: 'seamus', source: 'elationfoundation/using-tor-browser-bundle'}

or

2.

# An H1 without metadata
## An H2 level block with metadata 
--- {creator: 'seamus', source: 'elationfoundation/using-tor-browser-bundle'}

or

3.

# An H1 without metadata
## An H2 level block with metadata 

---
creator: 'seamus'
source: 'elationfoundation/using-tor-browser-bundle'

---

I prefer 2. because it looks like a byline.

A content editor (#5) should probably hide inline block level metadata (the way prose hides document level metadata), and allow it to be surfaced when needed.
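To make syntax 2 and the hide/surface idea concrete, here is a minimal sketch in Python. It assumes the metadata line directly follows its heading and that values are single-quoted strings; `parse_middlematter` and `hide_middlematter` are hypothetical names, and the regular expressions stand in for a real YAML parser.

```python
import re

# Assumed shape of syntax 2: "---" followed by a single-line YAML flow map.
BYLINE = re.compile(r"^---\s*\{(?P<body>.*)\}\s*$")
HEADING = re.compile(r"^(?P<hashes>#{1,6})\s+(?P<title>.+?)\s*$")
# Simplification: only key: 'value' pairs with single-quoted values.
PAIR = re.compile(r"(\w+)\s*:\s*'([^']*)'")


def parse_middlematter(text):
    """Return one (level, title, metadata) tuple per heading."""
    blocks = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        heading = HEADING.match(line)
        if not heading:
            continue
        metadata = {}
        # Metadata only counts when it sits on the line right after the heading.
        if i + 1 < len(lines):
            byline = BYLINE.match(lines[i + 1])
            if byline:
                metadata = dict(PAIR.findall(byline.group("body")))
        level = len(heading.group("hashes"))
        blocks.append((level, heading.group("title"), metadata))
    return blocks


def hide_middlematter(text):
    """Drop metadata lines so an editor can show only the prose."""
    return "\n".join(l for l in text.splitlines() if not BYLINE.match(l))
```

An editor could keep the parsed metadata in memory, display the output of `hide_middlematter`, and write the byline back on save.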

@jmatsushita jmatsushita changed the title Block level metadata in Markdown Block level metadata in Markdown - i.e. middlematter Feb 18, 2016
@jmatsushita
Member Author

Some interesting Markdown flavors:

MEMOFON

- _italic_
- **bold**
- [links](http://google.com)
 - images
    ![](/images/doc/grasshopper.png)
- blockquote
  > The question is, whether you can make
  >> words mean so many different things.
- code
    var test = function test() {
      return this.isTest();
    };

produces this mind map:

And a thread on the hCal microformat in Markdown, which includes some thoughts on this syntax:

(startdate-optional enddate)[description/title  <at>  location]

for example:

(23rd June 2002)[Big Meeting  <at>  Room 200, Bldg 3]
(10am-2pm)[World Cup game]
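A rough sketch of how such event lines could be parsed, assuming the literal `<at>` token exactly as it appears above and a single `-` between start and end. `parse_event` is a hypothetical helper, not something from that thread.

```python
import re

# (when)[title <at> location], where the "<at> location" part is optional.
EVENT = re.compile(
    r"^\((?P<when>[^)]+)\)\[(?P<title>.*?)(?:\s+<at>\s+(?P<location>.*))?\]$"
)


def parse_event(line):
    """Split one event line into start, end, title and location."""
    match = EVENT.match(line.strip())
    if match is None:
        return None
    # Assumption: a "-" separates start and end, so ISO dates would need
    # a different separator or a smarter date parser.
    start, sep, end = match.group("when").partition("-")
    location = match.group("location")
    return {
        "start": start.strip(),
        "end": end.strip() if sep else None,
        "title": match.group("title").strip(),
        "location": location.strip() if location else None,
    }
```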

@seamustuohy

Check out Substance. It seems to have an interface for adding custom content types that might make it a good choice for an editing environment.

@jmatsushita jmatsushita changed the title Block level metadata in Markdown - i.e. middlematter Transclusion - Block level metadata in Markdown - i.e. middlematter Mar 15, 2016
@jmatsushita
Member Author

There are several discussions here. I'll try to unpack the various topics while maybe keeping this issue to track the bigger picture.

Starting with the bigger picture, I think the actual problem that's interesting here is transclusion. As mentioned here https://www.mediawiki.org/wiki/Transclusion

Ted Nelson coined the term "transclusion," as well as "hypertext" and "hypermedia", in his 1982 book, Literary Machines.


There are questions related to a few different topics that interact in various ways and need to be well aligned together, and with the objectives of maintaining readability of source documents.

Infrastructure of transclusion

How is content that includes transcluded content kept up to date? This is fairly straightforward on a single platform (with subtleties) but harder in a distributed heterogeneous environment, where you start needing ETags, fragment caching and Fragment Identifiers. Doing this with static content generation adds another interesting challenge (including graceful degradation with simple webservers, and taking advantage of more advanced caching strategies).
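As an illustration of the ETag part, here is a minimal sketch of revalidating a cached fragment. The network call is abstracted behind a `fetch` callable (an assumption made so the example stays self-contained); a real one would send the cached ETag in an `If-None-Match` header and get back `304 Not Modified` when nothing changed. `CachedFragment` and `refresh` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# (status, etag, body); a 304 response carries no body.
Response = Tuple[int, Optional[str], Optional[str]]
# fetch(url, if_none_match) stands in for a conditional HTTP GET.
Fetch = Callable[[str, Optional[str]], Response]


@dataclass
class CachedFragment:
    etag: str
    body: str


def refresh(url: str, cache: Dict[str, CachedFragment], fetch: Fetch) -> str:
    """Return the current body for url, re-downloading only when it changed."""
    cached = cache.get(url)
    status, etag, body = fetch(url, cached.etag if cached else None)
    if status == 304 and cached is not None:
        return cached.body  # upstream unchanged, reuse the local copy
    if status == 200 and body is not None:
        if etag is not None:
            cache[url] = CachedFragment(etag, body)
        return body
    raise RuntimeError(f"unexpected status {status} for {url}")
```

A static site generator could run this at build time for every transcluded source and rebuild only the pages whose fragments came back with a new body.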

Addressing or URIs for fragments

Addressing fragments consistently within Markdown is an interesting opportunity: leveraging the AST of pandoc, but also thinking about content addressability and smart things like content defined chunking in dat, and using NLP in the process to keep track of content and do smart custom merges with git (like daff for CSV or this JSON custom Git merge driver). This matters because addressing fragments via document structure alone might not be enough to deal with moving paragraphs, inserting paragraphs, changing heading titles and so on. Looking in the direction of Operational Transforms might also be a more sustainable approach to diffing and merging (and a functional, monadic one: only describing a suite of operations might be more composable).
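A small sketch of the content addressability idea, assuming paragraphs are separated by blank lines and that whitespace and case should not affect the address. `fragment_id` and `index_fragments` are hypothetical names, and plain hashing is a much simpler technique than content defined chunking.

```python
import hashlib
import re


def fragment_id(paragraph: str) -> str:
    """Derive an address from the paragraph text, not from its position."""
    normalized = " ".join(paragraph.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def index_fragments(markdown: str) -> dict:
    """Map each fragment address to its paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", markdown) if p.strip()]
    return {fragment_id(p): p.strip() for p in paragraphs}
```

Moving or re-wrapping a paragraph keeps its address stable, but any change to the wording produces a new one, which is where the fuzzier matching mentioned above would have to take over.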


Syntax for transclusion

This was the original topic of this issue and discussed here: #25 (comment).

There are a few needs:

  • Having (invisible) metadata available in source documents (is that really needed?) and published documents.
  • Having syntax for requesting external/internal content to be in/trans-cluded in source documents.

There are a few possible implementation approaches:

  • Extend the YAML/Markdown combo.
  • Extend Markdown.

Trying to minimise the number of extensions of existing standards is obviously best. Also, it's worth noting that the YAML/Markdown combo is a bit of an ad-hoc extension which is actually not necessarily well supported outside of the SSG world.

In the minimal implementation it seems that maybe only extending markdown to do transclusion might be enough because the metadata could be implicit (i.e. either inferred from the transclusion statement - external source and so on... - or present in the transcluded content - either as published metadata at the fragment level or in the destination source content if the system is using a content as code approach).

Other topics

In a content package approach (possibly with npm publishing), all dependencies in the tree would be inherited up into each layer (each subsection might be treated as an implicit package) and up to the top package metadata.

The transitivity of transclusion dependencies might be an interesting topic.
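To illustrate the transitivity, here is a sketch that collects every source a document depends on, directly or indirectly. The `includes` mapping (document name to the documents it transcludes) is an assumed data structure, and `transitive_sources` is a hypothetical name.

```python
from typing import Dict, List, Set


def transitive_sources(doc: str, includes: Dict[str, List[str]]) -> Set[str]:
    """Collect everything doc transcludes, directly or through other documents."""
    seen: Set[str] = set()
    stack = list(includes.get(doc, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, which also protects against cycles
        seen.add(current)
        stack.extend(includes.get(current, []))
    return seen
```

The resulting set is what would be inherited up into the top package metadata, for instance to list every upstream source that needs attribution.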

Dealing with modified transcluded content is another challenge that relates to content/fragment addressing. Again it seems that an OT approach would be a good way to deal with this. In that case, where would the metadata about transformations be stored? Maybe the flow would be:

  • I copy content from an external site (the fragment metadata is copied, i.e. https://example.org/page#section-paragraph).
  • I paste it in an open document (an include statement is created [[https://example.org/page#section-paragraph]], and the editor loads the content from the remote source and stores it in memory).
  • I start modifying the content (the in-memory/or in LocalStorage version is modified).
  • I save the document (the include gets updated to point to the locally modified version of the external content, and this locally modified version has metadata pointing to the original fragment, and maybe includes an OT representation of the changes).

So in the end I would have:

  index.md                              # [[file://example.org/page#section-paragraph]]
  example.org.page#section-paragraph    # locally modified version with maybe added metadata.

Using the fragment URI as the file name might help with usability and avoid having to add metadata in the simple case. A companion .ot file could be created with modifications...
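A sketch of both pieces, under stated assumptions: `local_name` derives the file name shown above from the fragment URI, and `record_changes` uses `difflib` opcodes as a stand-in for what a companion .ot file might hold. This is a plain diff, not real operational transformation, and all three function names are hypothetical.

```python
import difflib
from urllib.parse import urlsplit


def local_name(uri: str) -> str:
    """Turn a fragment URI into a flat local file name."""
    parts = urlsplit(uri)
    path = parts.path.strip("/").replace("/", ".")
    name = ".".join(p for p in (parts.netloc, path) if p)
    return f"{name}#{parts.fragment}" if parts.fragment else name


def record_changes(original: str, modified: str) -> list:
    """List the edits that turn original into modified."""
    matcher = difflib.SequenceMatcher(None, original, modified, autojunk=False)
    return [
        (tag, i1, i2, modified[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def replay(original: str, changes: list) -> str:
    """Apply recorded edits to the upstream text to rebuild the local version."""
    out, cursor = [], 0
    for _tag, i1, i2, replacement in changes:
        out.append(original[cursor:i1])  # untouched text before this edit
        out.append(replacement)
        cursor = i2
    out.append(original[cursor:])
    return "".join(out)
```

Storing only the changes keeps the link to the upstream fragment explicit. Replaying them onto a newer upstream version would still need conflict handling, which this sketch does not attempt.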

@jmatsushita
Member Author

For block level metadata MSON looks pretty cool. The capacity to render to JSON-Schema is really interesting.

I don't think the spec suggests a way to mix Markdown and MSON though. It's not clear if the triple dash approach is really solid, but apparently it's also supported by pandoc inside documents (i.e. ignored when parsed). There's a very similar discussion to the original post about metadata in documents, which also mentions that the triple dash syntax is a bit too verbose for short metadata inclusion (http://talk.commonmark.org/t/jekyll-style-do-not-show-or-parse-sections/918).

Also, jqm mentions that it's not clear what contains what and what is an extension of what.

Interesting to note that there's no real commenting syntax in Markdown (but some really creative ways to comment still). There's a discussion on the CommonMark discussion site.

Also there's a discussion on the CommonMark site about Transclusion syntax and a great list of Markdown extensions https://github.com/jgm/CommonMark/wiki/Proposed-Extensions

I think for now I'm settling on:

  • Using the triple dash for metadata in documents, even if it's a bit verbose for a start. This might mean several passes: one for markdown, one for frontmatter (for instance with gray-matter).
  • Using the hercules syntax for transclusion :[]() which means another pass (a pre-processing one).
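The pre-processing pass could look roughly like this. It is a sketch that resolves `:[label](target)` links from an in-memory dictionary rather than from files or URLs. `expand` and `sources` are hypothetical names, and this is not how hercule itself is implemented.

```python
import re
from typing import Dict

# A transclusion link is a Markdown link prefixed with a colon: :[label](target)
TRANSCLUDE = re.compile(r":\[(?P<label>[^\]]*)\]\((?P<target>[^)\s]+)\)")


def expand(text: str, sources: Dict[str, str], depth: int = 0, max_depth: int = 10) -> str:
    """Replace each transclusion link with the (recursively expanded) target."""
    if depth > max_depth:
        raise RecursionError("transclusion nested too deeply (possible cycle)")

    def replace(match):
        target = match.group("target")
        if target not in sources:
            return match.group(0)  # leave unresolved links untouched
        return expand(sources[target], sources, depth + 1, max_depth)

    return TRANSCLUDE.sub(replace, text)
```

Running this before the metadata and Markdown passes means those passes only ever see plain, fully assembled text.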

@jmatsushita
Member Author

There's a discussion about endmatter and sectionmatter on the gray-matter issue tracker! jonschlinkert/gray-matter#20
