Content Collections #373

bholmesdev · 2022-11-03T20:27:51Z

Start Date: 2022-10-03
Status: Draft

Summary

Content Collections are a way to fetch Markdown and MDX frontmatter in your Astro projects in a consistent, performant, and type-safe way.

This is paired with Render Content, a solution to script and style bleed when importing globs of Markdown and MDX Content.

Both proposals are presented below, and are meant to be reviewed and accepted as a pair.

astro-content-smol.mp4

Links

Content Collections Full Rendered Proposal
Render Content Full Rendered Proposal

natemoo-re · 2022-11-04T14:50:46Z

Overall I am super thrilled by this, great work!

I have one very nitpicky bikeshed, which is that the ~schema.ts file probably doesn't need the ~ prefix? If we do want to use a prefix, perhaps _schema.ts is better?

JLarky · 2022-11-04T17:13:23Z

I like the changes that you described in the video. The part that is unclear for me right now is if zod stuff is optional or not.

The ~schema.ts and .astro remind me of auto-importing in Vue, so I actually wonder if you can generate the schema from frontmatter automatically? At least in a "field X exists in at least one file, let's call it x?: any" way, or like you can do with the CLI in Rails: rails g model NameOfModel column_name:datatype column_name2:datatype2 :)

bholmesdev · 2022-11-04T19:52:48Z

@JLarky, yes schema files are totally optional! You can still call fetchContent for entries in a collection, but you'll get an array of type any. You'll need a Zod schema to get type checking.

And that's a good suggestion! It was avoided since types like Date or email are hard to infer. That said, we could offer some sensible defaults for strings and numbers. I'd also love to lean into our CLI to generate schema starters / recipes for you, similar to that Ruby example you presented there.
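As a rough illustration of the inference idea and its limits, here is a minimal sketch. The `inferFields` helper and its types are hypothetical and not part of Astro; it assumes frontmatter has already been parsed into plain objects. It marks a field optional when any entry lacks it, and it only recognizes strings and numbers, which is why a date written as a string comes back as a plain string.

```typescript
// Hypothetical helper (not an Astro API): summarize frontmatter fields
// across a list of already-parsed frontmatter objects.
type InferredType = "string" | "number" | "unknown";
type InferredField = { type: InferredType; optional: boolean };

function inferFields(
  entries: Array<Record<string, unknown>>
): Record<string, InferredField> {
  const result: Record<string, InferredField> = {};

  // First pass: record the type seen for each key, widening to
  // "unknown" when two entries disagree.
  for (const entry of entries) {
    for (const [key, value] of Object.entries(entry)) {
      const seen: InferredType =
        typeof value === "string"
          ? "string"
          : typeof value === "number"
            ? "number"
            : "unknown";
      const previous = result[key];
      if (!previous) {
        result[key] = { type: seen, optional: false };
      } else if (previous.type !== seen) {
        previous.type = "unknown";
      }
    }
  }

  // Second pass: a field is optional if at least one entry lacks it.
  for (const key of Object.keys(result)) {
    const field = result[key];
    if (field) {
      field.optional = entries.some((entry) => !(key in entry));
    }
  }
  return result;
}

const inferred = inferFields([
  { title: "Hello", order: 1, publishedDate: "2022-11-04" },
  { title: "World", image: "/cover.png" },
]);
// publishedDate is inferred as "string": nothing in the raw value
// says it should become a Date, which is the hard part noted above.
```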

bholmesdev · 2022-11-04T19:55:52Z

@natemoo-re Fair point! The ~ was a loose preference more than anything, since it:

Indicates schema is "magic"
Future-proofs us if we support collections of ts or js files in the future
Sorts schema definitions to the top of the directory. It always bothered me to have a folder of posts with a schema.ts sorted alphabetically into the middle.

I understand ~ is not an Astro convention though. I may shy away from _ since it tells Astro to ignore files in src/pages. I've heard @schema.ts floated as well, which would be consistent with CMS conventions I've seen.

proposals/0028-render-content.md

louiss0 · 2022-11-30T01:37:30Z

I'm confused about what you did with schema, but more importantly, could you talk about defineCollection? What else does it do besides letting you define a schema? Does it have other configuration details? Oh, and good luck. I hope this RFC doesn't hurt performance at all or slow down SSG.

defineCollection({
  schema: {
    title: z.string(),
    slug: z.string(),
    // mark optional properties with `.optional()`
    image: z.string().optional(),
    tags: z.array(z.string()),
    // transform to another data type with `transform`
    // ex. convert date strings to Date objects
    publishedDate: z.string().transform((str) => new Date(str)),
  },
});

bholmesdev · 2022-11-30T16:51:49Z

@louiss0 Ah yes, I should clarify: we may introduce collection config options other than schema in the future. One that's been raised is a custom slug mapper if you want to compute entry slugs yourself:

// not final code
const blog = defineCollection({
  slug: ({ id, data }) => data.customSlug ?? slugify(id),
  schema: {...},
})

Nesting schema as its own key should keep doors open like this.
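To make the fallback behavior concrete, here is a small runnable sketch of a slug mapper along those lines. The `SlugArgs` type and the `slugify` helper are assumptions made for illustration only; the argument shape was explicitly not final at this point, and Astro's own slugger may behave differently.

```typescript
// Stand-in argument shape (an assumption; the proposal's shape was not final).
type SlugArgs = { id: string; data: { customSlug?: string } };

// Hypothetical slugify: drop the file extension, lowercase, and collapse
// runs of characters other than a-z, 0-9, and "/" into a single dash.
function slugify(id: string): string {
  return id
    .replace(/\.(md|mdx)$/i, "")
    .toLowerCase()
    .replace(/[^a-z0-9/]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Prefer a slug set in frontmatter, otherwise derive one from the entry id.
const slug = ({ id, data }: SlugArgs): string =>
  data.customSlug ?? slugify(id);

const derived = slug({ id: "My First Post.md", data: {} });
const overridden = slug({ id: "My First Post.md", data: { customSlug: "intro" } });
```

Here `derived` falls back to the id-based slug, while `overridden` uses the frontmatter value untouched.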

And thanks! Happy to share we're seeing perf gains from content schemas over Astro.glob if anything 👍


naiyerasif · 2022-12-11T05:45:30Z

Are there any plans to introduce a concept of relationships between collections? For example, a blog collection may have an array of authors which may be part of an author collection. Usually, maintaining such relationships manually is a huge pain and having some good DX around this might be helpful.

Another thing I'd really love is to have some search primitive akin to SQL. For example,

const allBlogPostsAfter2020 = await search(`
  blog.* from blog
  where publishedDate.year > 2020
  order by publishedDate asc
`);

where publishedDate.year gets resolved by a function defined in the schema (if the function does not exist on the primitive itself).

Furthermore, the search API can flatten the getCollection and getEntry into one API.

// this gives you all the blog posts
const allBlogPosts = await search(`blog.* from blog`)

// this gives you an entry
const firstBlogPost = await search(`blog.* from blog where title = "First Blog Post"`)

// this gives you the latest entry
const latestBlogPost = await search(`blog.* from blog order by publishedDate desc limit 1`)

This might also work nicely with relationships using joins.

const blogPostsByAstro = await search(`
  blog.* from blog, author 
  where blog.authorId = author.id 
  and author.name = "Astro"
`)

Lume does something similar using its search and relations plugins.

bholmesdev · 2022-12-11T17:16:25Z

@naiyerasif Ah, I love these ideas!

Relationships: we've definitely built schemas with relations in mind. Since you can define a type for every field, we could certainly introduce a "reference" type to refer to other collections. This was kept out of the RFC to keep our scope well-defined, but it's a feature we're very excited to explore.
Relational querying: this is an interesting thought, and matches our analogy of collections to database tables. I'd point to Nuxt's Content feature for some prior art here. They decided to use MongoDB's query language to treat content as document-based. I'd be hesitant to 100% mimic SQL querying per your example since it would be difficult to offer intellisense for a generic string vs. helper functions. Still, I'd love to find an answer here that's beginner-friendly, while still powerful enough for advanced users.
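To illustrate the intellisense point, here is a minimal sketch of a helper-function style query over an in-memory list. The `Query` class and the `BlogEntry` type are hypothetical; this is not an Astro or Nuxt API, only a way to show that callbacks are type-checked against the entry shape where a query string would not be.

```typescript
// Stand-in entry type; in practice this would come from the collection schema.
type BlogEntry = { title: string; publishedDate: Date };

// Hypothetical fluent helper over an in-memory list.
class Query<T> {
  private readonly items: readonly T[];

  constructor(items: readonly T[]) {
    this.items = items;
  }

  // Keep only the items matching the predicate.
  where(predicate: (item: T) => boolean): Query<T> {
    return new Query(this.items.filter(predicate));
  }

  // Sort by a numeric key; the selector is checked against T, so a
  // misspelled field name is a compile error rather than a runtime surprise.
  orderBy(selector: (item: T) => number, direction: "asc" | "desc" = "asc"): Query<T> {
    const sign = direction === "asc" ? 1 : -1;
    const sorted = [...this.items].sort(
      (a, b) => sign * (selector(a) - selector(b))
    );
    return new Query(sorted);
  }

  // Keep at most `count` items from the front.
  limit(count: number): Query<T> {
    return new Query(this.items.slice(0, count));
  }

  all(): T[] {
    return [...this.items];
  }
}

const posts: BlogEntry[] = [
  { title: "Old", publishedDate: new Date("2019-06-01") },
  { title: "Newer", publishedDate: new Date("2021-03-15") },
  { title: "Newest", publishedDate: new Date("2022-12-01") },
];

// Roughly "order by publishedDate desc limit 1" for posts after 2020.
const latestAfter2020 = new Query(posts)
  .where((post) => post.publishedDate.getFullYear() > 2020)
  .orderBy((post) => post.publishedDate.getTime(), "desc")
  .limit(1)
  .all();
```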

louiss0 · 2022-12-11T23:47:22Z

@naiyerasif Ah, I love these ideas!

Relationships: we've definitely built schemas with relations in mind. Since you can define a type for every field, we could certainly introduce a "reference" type to refer to other collections. This was kept out of the RFC to keep our scope well-defined, but it's a feature we're very excited to explore.

Relational querying: this is an interesting thought, and matches our analogy of collections to database tables. I'd point to Nuxt's Content feature for some prior art here. They decided to use MongoDB's query language to treat content as document-based. I'd be hesitant to 100% mimic SQL querying per your example since it would be difficult to offer intellisense for a generic string vs. helper functions. Still, I'd love to find an answer here that's beginner-friendly, while still powerful enough for advanced users.

How far have you gotten in the last few days? Did you fix the Windows problem? Is the magic layouts feature going to go away after this feature is standardized?

naiyerasif · 2022-12-12T19:02:29Z

Relationships: we've definitely built schemas with relations in mind. Since you can define a type for every field, we could certainly introduce a "reference" type to refer to other collections. This was kept out of the RFC to keep our scope well-defined, but it's a feature we're very excited to explore.

This can be a separate RFC if you think the current RFC may become too big.

Relational querying: this is an interesting thought, and matches our analogy of collections to database tables. I'd point to Nuxt's Content feature for some prior art here. They decided to use MongoDB's query language to treat content as document-based. I'd be hesitant to 100% mimic SQL querying per your example since it would be difficult to offer intellisense for a generic string vs. helper functions. Still, I'd love to find an answer here that's beginner-friendly, while still powerful enough for advanced users.

Any fluent query DSL (like Nuxt's Content) should be fine. I agree that having helper functions for such an API would be immensely helpful. I think this should be a part of this RFC since you're already planning for something similar with getCollection and getEntry.

pilcrowonpaper · 2022-12-13T00:15:23Z

Is it possible for the collection argument for getCollection() to be a route as well? So if I have content/blog/en, I can use either of these?

const blogs = await getCollection("blog");
const englishBlogs = await getCollection("blog/en");

To not introduce complexity, schemas will still be limited to top-level (blog in this case).

pilcrowonpaper · 2022-12-13T00:17:18Z

Also, why not move renderEntry() inside the entry object? Does entry have to be a POJO?

// now
const { Content } = await renderEntry(entry);
// idea
const { Content } = await entry.render();

bholmesdev · 2022-12-13T02:21:41Z

@pilcrowonpaper Good questions!

On passing a route to getCollection(): no, admittedly. Since collections are considered one level deep, you can only query for the top-level blog collection with the collection argument. However, we do offer a filter function where you can check for /en at the front of each entry slug. More on that in the blue Tip section on the landing page example.
On moving renderEntry() onto the entry: this has been suggested, and was avoided in the initial RFC due to technical limitations. But with @matthewp's recent changes to our content renderer, this may be possible! We're very close to an experimental release, so I plan to table this for future refinement. I'm glad to hear there's interest though.
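A sketch of what such a filter could look like. The `inLocale` helper, the `SluggedEntry` type, and the sample entries are hypothetical, and the sketch assumes slugs are relative to the collection (for example `en/hello`, with no leading slash). The predicate is written separately so it can be tried against plain data.

```typescript
// Minimal stand-in for a collection entry (an assumption for this sketch).
type SluggedEntry = { id: string; slug: string };

// Hypothetical helper: build a filter that keeps entries whose slug
// starts with the given top-level directory.
const inLocale =
  (locale: string) =>
  ({ slug }: Pick<SluggedEntry, "slug">): boolean =>
    slug.startsWith(`${locale}/`);

// Sample data standing in for the entries under src/content/blog.
const sampleEntries: SluggedEntry[] = [
  { id: "en/hello.md", slug: "en/hello" },
  { id: "fr/bonjour.md", slug: "fr/bonjour" },
  { id: "en/goodbye.md", slug: "en/goodbye" },
];

const englishSlugs = sampleEntries.filter(inLocale("en")).map((e) => e.slug);

// Inside an Astro file the same predicate would be passed as the filter:
// const englishBlogs = await getCollection("blog", inLocale("en"));
```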

bholmesdev · 2022-12-13T02:31:32Z

Alright everyone, thank you so much for your input and excitement over these past few weeks. We plan to discuss Content Collections during the RFC call on our discord tomorrow (2pm ET), and hope to reach consensus for an experimental release!

I'll highlight 2 final tweaks that were made:

We now support slug configuration from our src/content/config. This is useful for generating slugs based on frontmatter, or mapping your preferred directory structure (ex. /content/blog/2022-05-10/post.md) to URLs on your site (ex. /content/blog/post). You can use the slug argument like so:

import { defineCollection } from 'astro:content';

const blog = defineCollection({
  slug({ id, data }) {
    return data.slug ?? myCustomSlugify(id);
  },
  schema: {...}
});

export const collections = { blog };

@matthewp has heroically made renderEntry more powerful and stable with some new head-hoisting internals! If that jargon has your head spinning, here's the big takeaway: MDX styles and scripts are only injected when the <Content /> component is used. This means you can safely call renderEntry for headings and injectedFrontmatter without worrying about a bloated bundle (cc @andersk). Read the updated Detailed Design for full details.

And that's it! Hope to see y'all on the call tomorrow 👋

matthewp · 2022-12-13T19:47:07Z

Things to figure out before unflagging:

How should users colocate related data that is not the content from the schema? Currently suggesting _ folders are ignored.
What about relative image links? Should those be treated differently from images inside of md files outside of the content/ folder?

louiss0 · 2022-12-17T18:57:35Z

aliases: [content collections criticism,]
tags: []
note type: Main
created:
day: Friday December 16th 2022
time: 20:49:37

Content Collections Criticism

First off, I'd like to say good job on this RFC. It is good enough for me to create a project around, and maybe good enough for me to scale it. I like that you decided to put the render function on the entry instead of making us import it. I like the collectionToPaths() function; please add it! I even like that Zod was chosen for this RFC. You said that other formats could maybe be supported in the future; I hope the next one is JSON. But no tool is perfect. This is only the beginning, so I have decided to talk about some big changes to consider.

RSS Feeds

Right now, I can't seem to get RSS feeds working with content collections. Not that it's that important, but I want to have that power. The issue is "Can't use RSS with content folder", and I feel it needs to be fixed immediately. I even copied and pasted the stack trace into the issue so that the error can be addressed quickly.

Magic Properties

In Astro there are so far two magic frontmatter properties available.
The first one is draft and the second one is layout. I believe that both should be removed, since people are expected to create their blogs using Content Collections. Or they should at least not be usable in the /content folder. The reason is that these properties are just not needed anymore. Using the layout property would lead to bad design and coupling: when you use the layout key, the render function activates the mechanism responsible for rendering layouts. This RFC is better off making the developer import the layout they want inside the page where they want to use it. Having magic layouts in the /content folder can only lead to bad behavior.

I could argue for removing the magic properties completely, but then users would have to write lots of boilerplate code inside of pages. Since the /content folder exists, the only things .mdx is good for are being used as a templating language and importing components. So I think it's better to just remove them from the content folder. As far as draft is concerned, I'd argue that if someone is deciding whether or not a page should be a draft, it should just be put in the content folder instead. But draft is a minor thing. If it is going to exist as a first-class citizen, I'd suggest letting the developer know about it through types when creating each collection.

Extending and Defining a default Schema

There is a problem with the schema property of defineCollection. It only allows me to define a ZodRawShape; I can't use z.object() on it at all. This means I can't use other features of Zod to construct my schemas, which is not good. If a developer wants to reuse schema definitions like title, author, draft, updatedDate, and even pubDate, they would have to rewrite them all over again. Remember the DRY principle.

import { z, defineCollection } from 'astro:content';

// In this example the title is repeated twice 
const releases = defineCollection({
  schema: {
    title: z.string(), // here
    version: z.number(),
  },
});

const engineeringBlog = defineCollection({
  schema: {
    title: z.string(), //and  here
    tags: z.array(z.string()),
    image: z.string().optional(),
  },
});

export const collections = {
  releases: releases,
  // Don't forget 'quotes' for collection names containing dashes
  'engineering-blog': engineeringBlog,
};

I think there should be a default schema for people to use. I'd put it in the astro config but it could also exist in the /config.ts file for the content folder.

{
  contentCollections: {
    defaultSchema: {
      title: z.string().max(90),
      author: z.string().describe("Authors name").default("Authors name"),
    },
  },
}

Users would probably then have the power to extend it by importing it from astro:content

import { defaultSchema, defineCollection, z } from "astro:content"

export const collections = {
  blog: defineCollection({
    schema: defaultSchema
  })
}

I'm asking for the developer to have more access to Zod's features. The schema key expects just a plain object, so I can't use the other functions that Zod provides. If you don't intend to give developers full access to Zod's API through defineCollection, you could at least give back some of its capabilities with an extends: key and an omit: key, so that you can extend a schema or omit some keys from it.
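For what it's worth, because the schema is a plain object of per-field validators, ordinary object spread may already cover the simplest reuse case. The sketch below shows only that pattern. It uses stand-in validator functions (`isString`, `isNumber`, `isStringArray`) and a hypothetical `validate` helper so that it runs without Zod or Astro; with Zod the same spread would be applied to an object of z.string(), z.number(), and so on.

```typescript
// Stand-in validators so the sketch runs on its own (assumption: these
// play the role that z.string(), z.number(), etc. play in a real schema).
type Validator = (value: unknown) => boolean;
const isString: Validator = (value) => typeof value === "string";
const isNumber: Validator = (value) => typeof value === "number";
const isStringArray: Validator = (value) =>
  Array.isArray(value) && value.every((item) => typeof item === "string");

// Fields shared by every collection, written once.
const baseFields = { title: isString };

// Each collection spreads the shared fields and adds its own.
const releasesSchema = { ...baseFields, version: isNumber };
const engineeringBlogSchema = { ...baseFields, tags: isStringArray };

// Hypothetical checker: return the names of fields that fail validation.
function validate(
  schema: Record<string, Validator>,
  data: Record<string, unknown>
): string[] {
  return Object.entries(schema)
    .filter(([key, check]) => !check(data[key]))
    .map(([key]) => key);
}

const releaseErrors = validate(releasesSchema, { title: "v1.0", version: 1 });
const blogErrors = validate(engineeringBlogSchema, { title: 42, tags: ["astro"] });
```

This does not give access to extend or omit style helpers, so it only addresses the repetition, not the wider request above.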

Schema for injected frontmatter

At the moment, injected frontmatter is not typeable at all. I think the user should have access to types for all the frontmatter that is injected into pages. I would like to have an injectedFrontmatterSchema: key available for collections. If that is not possible, the render() function should have a generic argument that allows the user to pass in a type.

Injected Frontmatter Schema

const blog = defineCollection({
  injectedFrontmatterSchema: {
    readingTime: z.string().datetime(),
    author: z.string().default("Shelton Louis"),
  },
})

Example with the render function

render<T extends Record<string, unknown> >(): Promise<{  
Content: AstroComponentFactory;  
headings: MarkdownHeading[];  
injectedFrontmatter: T;  
}>

Lastly

I don't have many other concerns, but are you going to find a better way of generating types from collections? Generating an entry map for each individual collection seems good in the short run but bad in the long run. The map can become huge, TS may not be able to give us the answers we need, and there could be scalability issues when it comes to writing and erasing types.

I wish there was a way to specify injected frontmatter both as a default and for each collection. That way people don't have to keep reading the file where they put their remark plugins to find out what was injected. A key to specify injected frontmatter schemas would be nice.

ispringle · 2022-12-19T12:53:46Z

Would it be possible to provide a function to the collection config so that we can transform/create slugs in our own way? For example, the current slug normalizer that's creating the slug value is not dealing with unicode characters. It's pretty common that websites in languages which have characters beyond the standard ASCII alphanumerals will strip those characters out of URLs. Another example is the current setup doesn't remove whitespaces.

Seems the simplest solution is to continue to provide a simple slugifier and then allow people who want a more advanced one to provide it in the config. Of course, you could just map over the collection returned by getCollection, update the slug field, and then use that new object, but this would need to be redone in every file that uses that collection, which is less than ideal.
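For illustration, here is a sketch of the kind of slugifier and per-file workaround described above. `unicodeSlugify`, `withCustomSlugs`, and the `RawEntry` type are hypothetical names, and this is not Astro's default slugger. Note the assumption baked in: it strips diacritics from Latin text, but characters with no ASCII base (for example CJK) are dropped entirely.

```typescript
// Hypothetical slugifier: strips diacritics, lowercases, and turns
// whitespace and other separators into single dashes.
function unicodeSlugify(input: string): string {
  return input
    .normalize("NFKD") // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9/]+/g, "-") // anything else becomes a dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Minimal stand-in for a collection entry.
type RawEntry = { id: string; slug: string };

// The workaround: recompute slugs after fetching. This has to be repeated
// (or imported) in every file that reads the collection.
function withCustomSlugs<T extends RawEntry>(entries: T[]): T[] {
  return entries.map((entry) => ({
    ...entry,
    slug: unicodeSlugify(entry.id.replace(/\.(md|mdx)$/i, "")),
  }));
}

const reslugged = withCustomSlugs([
  { id: "Crème Brûlée Recipe.md", slug: "Crème Brûlée Recipe" },
]);
```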

louiss0 · 2022-12-19T13:50:00Z

Would it be possible to provide a function to the collection config so that we can transform/create slugs in our own way? For example, the current slug normalizer that's creating the slug value is not dealing with unicode characters. It's pretty common that websites in languages which have characters beyond the standard ASCII alphanumerals will strip those characters out of URLs. Another example is the current setup doesn't remove whitespaces.

Seems the simplest solution is to continue to provide a simple slugifier and then allow people who want a more advanced one to provide it in the config. Of course, you could just map over the collection returned by getCollection, update the slug field, and then use that new object, but this would need to be redone in every file that uses that collection, which is less than ideal.

The solution is built in already.

import { defineCollection } from 'astro:content';

const blog = defineCollection({
  slug({ id, data }) {
    return data.slug ?? myCustomSlugify(id);
  },
  schema: {...}
});

export const collections = { blog };

bholmesdev · 2022-12-19T16:45:00Z

@ispringle fair point! Other RFC reviewers raised slug customization as well, so we decided to ship a slug option as part of your collection config (see added section). This should address the more advanced use cases you raised.

Also curious to hear how our default slugger can be improved! We have an existing issue for handling file name spaces, but open to further ideas as well.

bholmesdev · 2022-12-19T16:47:47Z

Things to figure out before unflagging:

How should users colocate related data that is not the content from the schema? Currently suggesting _ folders are ignored.

What about relative image links? Should those be treated differently from images inside of md files outside of the content/ folder?

Both points have been addressed in the final RFC draft. With these resolved... this RFC is officially accepted and good-to-merge 🥳

Thanks again to everyone for your time and input. You can try out the experimental release with our shiny new Content Collections docs.

We'll also be marking this RFC as closed. So if you have future ideas, we encourage you to start a new discussion. Thanks again 🙌

bholmesdev added 21 commits October 14, 2022 11:18

draft: summary, OOS, example, motivation

9a4c57b

edit: refine summary, add glossary

1de6ccb

draft: detailed usage

c62aa95

draft: detailed design

981180a

edit: refine scope and usage

205394b

draft: refine .astro, add alternatives

a8accb3

draft: adoption, questions, move map to appendix

1f85f60

edit: remove extra .astro details

4c1e191

edit: refine content map sample

b966155

edit: refine adoption strat

7510904

edit: add "goals," move "motivation" to "background"

1039e7c

edit: add .astro to goals and summary

7d5e428

edit: better explain leaving the Vite pipeline

6752e5e

new: detail .astro directory

284cf94

edit: add note on id naming

39d1cdc

wip: nested directories (unclear how to query?)

7417c09

new: pull in notion draft

20803b8

new: render content proposal

11e32a9

fix: broken links

60ba183

edit: remove all notion links for render-content

c8dc106

edit: replace all notion links in content-schemas

c48d4c3

bholmesdev changed the title ~~Content schemas~~ Content Schemas Nov 3, 2022

bholmesdev added 2 commits November 3, 2022 16:29

refactor: add headings

1f40e00

edit: add CSS bleed issue reference

deb69db

bholmesdev mentioned this pull request Nov 4, 2022

🐛 BUG: CSS generated by Astro.glob is included on all pages withastro/astro#3816

Closed

1 task

matthewp reviewed Nov 14, 2022

proposals/0028-render-content.md

fix: async / await on injected fm ex

6fd4ca7

Update renderEntry with implementation details

807ae0c

bholmesdev mentioned this pull request Dec 6, 2022

Content Collections guide withastro/docs#2141

Merged

matthewp and others added 5 commits December 7, 2022 15:43

Update proposals/0028-render-content.md

fd4895c

Co-authored-by: Ben Holmes <[email protected]>

Update proposals/0028-render-content.md

1153c65

Co-authored-by: Ben Holmes <[email protected]>

Update proposals/0028-render-content.md

e9023e6

Co-authored-by: Ben Holmes <[email protected]>

Merge pull request #407 from withastro/new-render-entry

5f4d1b2

Update renderEntry with implementation details

edit: Content Schemas -> Content Collections

b5b328e

bholmesdev changed the title ~~Content Schemas~~ Content Collections Dec 9, 2022

fix: remove schema index.ts ref, collection -> configure

332fe94

bholmesdev added 3 commits December 19, 2022 11:34

new: custom slug section

480b26b

new: underscore convention

1754eea

edit: add relative images to out-of-scope

9c72b51

bholmesdev merged commit 11cf879 into main Dec 19, 2022
