sitemap.xml #1142

Open
gavinr opened this issue Apr 20, 2021 · 37 comments
Labels
feature / enhancement New feature or request

@gavinr

gavinr commented Apr 20, 2021

Is your feature request related to a problem? Please describe.
Most websites need to provide a sitemap.xml for SEO purposes.

Describe the solution you'd like
SvelteKit should provide a way to create a sitemap.xml, either automatically or via a recommended approach in the documentation.

Describe alternatives you've considered
I see sitemap.xml generation is mentioned in the sapper repo - is this the way to do it?

Also mentioned by @antony here.

How important is this feature to you?
I think this will affect a large number of SvelteKit users who are building websites.

Note

The SvelteKit homepage says:

SvelteKit doesn't compromise on SEO

Without supporting sitemap.xml, this claim is inaccurate/disingenuous.

Thank you!

@CaptainCodeman
Contributor

You can create an endpoint with a get function to provide this, e.g. sitemap.xml.js or sitemap.xml.ts, but you will have to handle the content yourself.

But if you are meaning that you want SvelteKit to provide the pages it knows about, that may be something that could be provided, similar to how the files can be imported from $service-worker?
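
A minimal sketch of what "handle the content yourself" could look like: a pure function that turns a list of absolute URLs into sitemap XML, which an endpoint such as sitemap.xml.ts could return as its body. The names `escapeForXml` and `buildSitemap` and the example URLs are illustrative assumptions, not part of SvelteKit.

```typescript
// Hypothetical helper: escapes the five characters that are special in XML.
function escapeForXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: builds a minimal sitemap from absolute URLs.
// Only <loc> is emitted; it is the only required child of <url>.
function buildSitemap(urls: string[]): string {
  const entries = urls
    .map((url) => `  <url><loc>${escapeForXml(url)}</loc></url>`)
    .join('\n');
  return `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${entries}
</urlset>`;
}

const minimalSitemap = buildSitemap([
  'https://example.com/',
  'https://example.com/about',
]);
```

The endpoint itself would then only need to return this string with a `Content-Type: application/xml` header.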

@gavinr
Author

gavinr commented Apr 20, 2021

you want SvelteKit to provide the pages it knows about

Yes, that seems along the lines that I'm thinking.

@bwbroersma

I was playing with SvelteKit in November and wanted to do just that: generate a sitemap based on the pages.
The source was still 'closed' at that time, so I used a direct dist. include:

import {create_manifest_data} from '@sveltejs/kit/dist/utils.js';
const manifest = create_manifest_data('src/routes');

Then you can use manifest.pages. Of course, you will want to cache the result, or just use adapter-static (export).

But it would be great if this would be exposed in some way. I also used the Svelte compiler to parse the pages to extract a svelte:head title, and some metadata comments, in this way you can auto-populate navigation.

@Zerotask
Contributor

I also wanted to ask for that feature. Would be a great addition for SvelteKit to provide even more automations.

@moritzebeling

moritzebeling commented May 23, 2021

It sure would be super nice if the sitemap would be generated from the page routes. But that only starts making real sense, if there was a way to define priority and changefreq on a per route basis plus lastmod and optional images per actual page.
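
To make the per-route metadata concrete, here is a rough sketch of an entry type and a renderer that only emits the optional tags that were provided. `SitemapEntry` and `renderEntry` are hypothetical names, and the image extension is left out for brevity.

```typescript
// Hypothetical per-page metadata; only `loc` is required by the sitemap protocol.
interface SitemapEntry {
  loc: string;
  lastmod?: string; // W3C datetime, e.g. "2021-05-23"
  changefreq?: 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never';
  priority?: number; // 0.0 to 1.0
}

// Renders one <url> element, skipping any optional tag that is not set.
function renderEntry(entry: SitemapEntry): string {
  const parts = [`<loc>${entry.loc}</loc>`];
  if (entry.lastmod !== undefined) parts.push(`<lastmod>${entry.lastmod}</lastmod>`);
  if (entry.changefreq !== undefined) parts.push(`<changefreq>${entry.changefreq}</changefreq>`);
  if (entry.priority !== undefined) parts.push(`<priority>${entry.priority.toFixed(1)}</priority>`);
  return `<url>${parts.join('')}</url>`;
}

const homeEntry = renderEntry({ loc: 'https://example.com/', changefreq: 'weekly', priority: 1 });
const postEntry = renderEntry({ loc: 'https://example.com/blog/hello', lastmod: '2021-05-23' });
```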

@benmccann
Member

Here's a video tutorial showing how you can create your own sitemap.xml: https://www.youtube.com/watch?v=u8n5-urtGB0

@kvn-shn
Contributor

kvn-shn commented Jun 2, 2021

I also need to create a sitemap.xml and think SvelteKit would benefit from having it built-in (as it's such a fundamental task). At minimum it should be explained in the docs how to do it correctly in "the SvelteKit way".

To be honest I didn't like the video as it left me more confused than before: where is the new (?) api endpoint coming from? How else can I get the current list of pages? What's the deal with these endpoint URLs? How can I pre-generate the sitemap? ...Maybe it's all answered when watching the whole series but I'm not planning to do that.

Edit: Because we developers (at least I ;) ) tend to make things more complicated than they have to be. For now I ended up writing the sitemap.xml manually. (Took less than 10 mins.)

@moritzebeling

The video assumes that you have an existing API endpoint (e.g. at your CMS) that knows the correct routing and urls of all pages within your frontend website.
The problem is, that you would usually use SvelteKit to manage routing independently from your backend. So this approach works, but is a workaround.

A svelte-style solution in the future could be, that during build-time, every page generated is also optionally added to the sitemap. Maybe this could be configured with an additional option returned by the load() function of each route.

@antony
Member

antony commented Jun 2, 2021

I feel like we should provide a way to expose the routing table, but that's the extent to which we should provide any sort of sitemap support. Exposing the routing table is also useful for automatic documentation generators, for example.

@Zerotask
Contributor

Zerotask commented Jun 2, 2021

That's how I do it at the moment and then I build the sitemap with a generator script via an npm prebuild script. For static content that's way better than building it on-the-fly with an endpoint.
SvelteKit could provide an optional setting config.kit.sitemap which is false by default and you can opt-in to do something similar.

@bartholomej

bartholomej commented Jun 9, 2021

I agree with @antony SvelteKit should provide a way to expose the routing table.

But meanwhile... I needed sitemap for my static website (used with adapter-static) and I didn't want to write it manually :)
So I made a simple proof-of-concept library for myself – but maybe it will come in handy for someone.

https://github.com/bartholomej/svelte-sitemap

Basically just install it and use it as prebuild script:

npm install svelte-sitemap --save-dev
{
  "name": "my-project",
  "scripts": {
    "prebuild": "svelte-sitemap --domain https://mydomain.com"
  }
}

And yes, I reckon that my library will soon become obsolete when the SvelteKit supports it natively ;) 👍

@benmccann
Member

The routing table alone wouldn't be enough because you also need to know all possible values of all parameters to generate a sitemap, the date each page was last updated, etc. These are things that SvelteKit fundamentally cannot provide in most cases. I'm curious how people suggest this would be handled or if there are other frameworks doing a good job at this
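
As an illustration of the gap described here: a route pattern like /blog/[slug] only becomes a list of URLs once something supplies the parameter values. A hedged sketch, where `expandRoute` is a hypothetical helper and the slug values stand in for data from your own database or CMS:

```typescript
// Hypothetical helper: substitutes every known value into a route pattern
// such as "/blog/[slug]". The values must come from your own data source;
// the routing table alone cannot provide them.
function expandRoute(pattern: string, param: string, values: string[]): string[] {
  const placeholder = `[${param}]`;
  if (!pattern.includes(placeholder)) {
    throw new Error(`Pattern ${pattern} has no parameter ${placeholder}`);
  }
  return values.map((value) => pattern.replace(placeholder, encodeURIComponent(value)));
}

const blogPaths = expandRoute('/blog/[slug]', 'slug', ['hello-world', 'sitemaps 101']);
```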

@bartholomej

bartholomej commented Jun 9, 2021

@benmccann Yes, <lastmod />, this is something I am also trying to solve in my library. But it's just a naive solution...
I only take the time of the last modification of each route file. I don't care about the child components modifications at all :(
However, there is a --reset-time parameter with which I can set all routes to the current time each time I deploy.
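
For readers unfamiliar with the idea, a small sketch of turning a file's modification time into a `<lastmod>` value. This is not svelte-sitemap's actual code; `toLastmod` is a hypothetical helper, and the fixed timestamp stands in for something like `fs.statSync(file).mtime`.

```typescript
// Formats a Date as the date-only W3C form (YYYY-MM-DD) accepted in <lastmod>.
// toISOString() always reports UTC, so the result does not depend on the local time zone.
function toLastmod(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// In a real script the Date would typically be fs.statSync(file).mtime;
// a fixed timestamp is used here so the example is deterministic.
const modified = new Date('2021-06-09T14:30:00Z');
const lastmodTag = `<lastmod>${toLastmod(modified)}</lastmod>`;
```

Resetting every route to the deploy time amounts to passing `new Date()` for all entries.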

Like I said, it's just a quick workaround library and a temporary solution. But for my purpose, that's ok for now :)

I'm also curious how people suggest this would be handled... 👍

@madeleineostoja

Nice tool @bartholomej! Would it make more sense to have something run as a post build so it could capture all the statically generated paths (including dynamic routes that are crawled by sveltekit)? Or am I misunderstanding how it works?

@NEO97online

Would it make more sense to have something run as a post build so it could capture all the statically generated paths (including dynamic routes that are crawled by sveltekit)?

Definitely, this should run as a postbuild by scanning the build folder. Currently @bartholomej 's package is scanning src/routes, which is fine until you start using generated pages like [slug].svelte. I opened an issue for that: bartholomej/svelte-sitemap#1

The script from this article works great as a postbuild:

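import fs from "fs";
import fg from "fast-glob";
import { create } from "xmlbuilder2";

// Read package.json from disk; importing JSON directly needs extra flags or
// import attributes when Node runs the script as an ES module.
const pkg = JSON.parse(fs.readFileSync("./package.json", "utf8"));

// Turn "build/about/index.html" into "<pkg.url>/about/".
// Assumes a custom "url" field (the site origin) in package.json.
const getUrl = (url) => {
	const trimmed = url.slice(6).replace("index.html", ""); // 6 === "build/".length
	return `${pkg.url}/${trimmed}`;
};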

async function createSitemap() {
	const sitemap = create({ version: "1.0" }).ele("urlset", {
		xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9"
	});

	const pages = await fg(["build/**/*.html"]);

	pages.forEach((page) => {
		const url = sitemap.ele("url");
		url.ele("loc").txt(getUrl(page));
		url.ele("changefreq").txt("weekly");
	});

	const xml = sitemap.end({ prettyPrint: true });

	fs.writeFileSync("build/sitemap.xml", xml);
}

createSitemap();
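
The path-to-URL step of the script above can be checked in isolation. The sketch below is a hypothetical variant of `getUrl` that takes the site origin as a parameter instead of reading it from package.json; the file paths are made-up examples of what a static build might contain.

```typescript
// Hypothetical variant of getUrl: maps a file in the build folder to its public URL.
function buildPathToUrl(origin: string, filePath: string): string {
  const relative = filePath
    .replace(/^build\//, '') // drop the output directory
    .replace(/index\.html$/, ''); // "about/index.html" -> "about/"
  return `${origin.replace(/\/$/, '')}/${relative}`;
}

const urlsFromBuild = [
  'build/index.html',
  'build/about/index.html',
  'build/blog/hello.html',
].map((file) => buildPathToUrl('https://example.com', file));
```

Note that pages emitted as `name.html` keep their extension here, so whether that matches the URLs your host actually serves is something to verify for your adapter settings.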

@bartholomej

Thank you @auderer @madeleineostoja
It was just proof-of-concept and it worked well for my case. But you're right this should run as postbuild.

So now (in v1.0.0) it already works as a postbuild (including dynamic routes) 🎉

https://github.com/bartholomej/svelte-sitemap

npm install svelte-sitemap --save-dev

Let me know how it works ;)

@benmccann benmccann added the feature / enhancement New feature or request label Jul 12, 2021
@myisaak

myisaak commented Oct 4, 2021

Although svelte-sitemap lacks many features, it's a great lightweight solution! However, I'm currently using sitemap.js due to its flexibility and open-source support and would recommend those who need complex sitemaps to do the same.

@karolis-sh

karolis-sh commented Nov 7, 2021

There's also a somewhat similar use case with robots.txt when, based on config, you'd want to build a correct file. E.g. robots.txt.js:

export async function get({ host }) {
  return {
    headers: {
      'Content-Type': 'text/plain',
    },
    body: `User-agent: *
Allow: /
Sitemap: https://${host}/sitemap.xml`,
  };
}

It would be nice, if you could have the ability to prerender GET requests, this could solve both cases.
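
A sketch of the "based on config" part, kept separate from the endpoint so it can be tested on its own. `buildRobotsTxt` and its parameters are hypothetical; the origin would come from your own configuration.

```typescript
// Hypothetical helper: builds a robots.txt body from a site origin and an
// optional list of paths that crawlers should stay out of.
function buildRobotsTxt(origin: string, disallow: string[] = []): string {
  const lines = ['User-agent: *'];
  if (disallow.length === 0) {
    lines.push('Allow: /');
  } else {
    for (const path of disallow) lines.push(`Disallow: ${path}`);
  }
  lines.push(`Sitemap: ${origin}/sitemap.xml`);
  return lines.join('\n');
}

const openRobots = buildRobotsTxt('https://example.com');
const restrictedRobots = buildRobotsTxt('https://example.com', ['/admin']);
```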

@eba8

This comment was marked as off-topic.

@Myrmod

Myrmod commented Feb 8, 2022

That's how I create a sitemap currently. If this endpoint is not linked from somewhere in the application, it won't be prerendered. Maybe there could be an option for adapter-static to do so regardless?

// sitemap.xml.ts
export async function get() {
  const response = await fetch('https://example.com/api') // fetch needs an absolute URL including the protocol

  if (!response.ok) {
    return {
      status: response.status,
      body: response.statusText,
    }
  }

  const staticPages = Object.keys(import.meta.glob('/src/routes/**/!(_)*.svelte'))
    .filter(page => {
      const filters: Array<string> = ['slug]', '_', '/src/routes/index.svelte']

      return !filters.find(filter => page.includes(filter))
    })
    .map(page => {
      return page
        .replace('/src/routes', 'https://example.com')
        .replace('/index.svelte', '')
        .replace('.svelte', '')
    })

  const body = render(staticPages)

  const headers = {
    'Cache-Control': `max-age=0, s-maxage=${600}`,
    'Content-Type': 'application/xml',
  }

  return {
    body,
    headers,
  }
}

const render = (staticPages: Array<string>) => `<?xml version="1.0" encoding="UTF-8" ?>
<urlset
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
  xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
  xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
  xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0"
  xmlns:pagemap="http://www.google.com/schemas/sitemap-pagemap/1.0"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
>
${staticPages
  .map(
    staticPage => `
  <url>
    <loc>${staticPage}</loc>
    <lastmod>${process.env.VITE_BUILD_TIME}</lastmod>
    <changefreq>monthly</changefreq>
  </url>
`,
  )
  .join('') /* join explicitly; an array inside a template literal is joined with commas */}
</urlset>
`

@benmccann
Member

I think there are a lot of people participating in this thread that don't understand that a SvelteKit-generated sitemap is likely to provide almost no value. Please read this whole comment before responding to it and don't leave comments that are just saying "this feature is important" without detailing a unique use case or imparting some technical information. I'm happy to consider things I may have overlooked, but +1 comments just add noise, slow down development, and will be hidden. You can thumbs up the issue to indicate support

The main thing that sitemaps do is help search engines figure out how to prioritize crawling very large sites. Search engines prioritize crawling based on a number of factors such as how important/popular a site is, how often the site is updated, etc. Suppose you have a very large site where meaningful changes are randomly distributed deep in the site's structure, and it isn't popular enough for Google to recrawl the whole thing frequently. With a sitemap you can hint that it should focus on the new and updated pages, so that it goes directly there instead of crawling the first pages it encounters and then giving up to do the rest later. Sitemaps do not affect the ranking of your page relative to other pages, other than ensuring search engines have the latest copy of a page.

For a sitemap to have an impact, you have to help the search engine prioritize which pages to crawl. SvelteKit cannot know which pages were recently updated and thus would have no possible way of generating a sitemap of any real value. It can't even know the list of pages in your site in most cases - except if you use adapter-static. This could potentially be a feature of adapter-static, but even then it would be so limited that it'd be practically useless. adapter-static works by crawling your site. If your site is large enough that it's difficult to crawl, then adapter-static will have as much trouble crawling your site as Google will have. But more importantly, it can't know which pages were recently updated, which defeats the entire purpose of a site map.

By definition, sitemaps must be generated in user-land and not by the framework to have any meaningful value. A framework like WordPress can provide SEO plugins because it's responsible for the content database and has a well defined set of data it's storing. But a sitemap is way outside the realm of what a frontend framework can provide. You could possibly have things like a Strapi+SvelteKit plugin, etc. But for SvelteKit in a standalone context I haven't seen anything that's both possible and valuable for it to do.

@gavinr
Author

gavinr commented Feb 16, 2022

@benmccann thank you for the thoughtful response and for your work on SvelteKit. I really appreciate what you've done and the time you put into it. Thank you.

I read your post and understand the technical and logical limitations of the ability of SvelteKit to generate a sitemap.xml. A few quick responses:

Paragraph 2:

The main thing that sitemaps do is help search engines figure out how to prioritize crawling very large sites

In paragraph 2 it seems like you're saying that sitemaps are not as essential as some people in this thread are implying. I think you're correct and would share the official Google Developers site about sitemaps. I think your comment echoes these two points from the guide:

[image: excerpt from the Google Developers sitemap guide]

Contrary though,

  1. It says you may need a sitemap if your site is really large (bullet point 1). You have responded with a technical reason of why SvelteKit might have trouble generating a sitemap.xml on really large sites, but that does not negate the fact that Google says you may need one.
  2. It says you may need a sitemap if your site is new and has few external links to it (bullet point 3). Many web sites built with SvelteKit fall into this category.
  3. It says you may need a sitemap if your site has a lot of rich media content (bullet point 4)

Given the above, you and I agree that there are many cases where sitemaps may not be necessary (small sites, comprehensively linked sites), but you must concede that certainly there are cases (numbers 1,2,3 above, for example) that websites built with SvelteKit do need a sitemap.xml. Do you agree?

Paragraph 4:

By definition, sitemaps must be generated in user-land and not by the framework to have any meaningful value ... a sitemap is way outside the realm of what a frontend framework can provide

I see your technical points here, and trust that you're correct. Given that, and my assertion (above) that sitemap.xml files are needed for some sites,

  1. If we're saying that SvelteKit will not support generating sitemap.xml files, I'm ok with that - I would just say that the phrase "SvelteKit doesn't compromise on SEO" should be removed from the homepage (see the note on my original post that started this issue), given that a site without a sitemap.xml is indeed a "compromise on SEO" for some situations (see paragraph 2 discussion above).
  2. How do other equivalent frameworks like NextJS and Gatsby handle this? I'm not very familiar with those frameworks but it looks like they "support" it via plugin? Maybe we can follow a pattern from there? Or maybe those are so different from SvelteKit that it's not relevant (if so, my apologies).

Finally, from a "developer relations" perspective, note that instead of trying to convince developers that sitemaps are unnecessary, those other frameworks have faced the fact that sitemaps are necessary for certain projects, and I think that's how we at SvelteKit should handle it. Even if it's not truly "built in" to SvelteKit, at least we can have a documentation page that explains the best way to do it (be it with a plugin, manually via a script in user world, etc etc).

Thanks again. I appreciate your time and all the work you do for SvelteKit.

@benmccann
Member

Thanks @gavinr for the thoughtful and constructive reply.

It says you may need a sitemap if your site is really large (bullet point 1). You have responded with a technical reason of why SvelteKit might have trouble generating a sitemap.xml on really large sites, but that does not negate the fact that Google says you may need one.

SvelteKit could certainly generate a sitemap.xml, but it would require the user to code it to be of any use. How would SvelteKit know what pages had new content in your database or API? I'm not saying you should never use a sitemap, but rather that it's fundamentally a problem that a frontend framework cannot solve for you because it doesn't have the necessary information to do so.

It says you may need a sitemap if your site is new and has few external links to it (bullet point 3). Many web sites built with SvelteKit fall into this category.

Yes, if there are pages that are not linked to either from your own website or another, then a sitemap will help Google discover those pages. Most sites built with SvelteKit will be fairly new, but typically the vast majority of pages on a site will be linked to from within the site. If a page has no links to it, it's not going to rank well, so there's very little value in having a search engine crawl it because it's highly unlikely to drive any amount of traffic.

It says you may need a sitemap if your site has a lot of rich media content (bullet point 4)

I'm assuming this is referring to Google's video extension of the sitemap standard. I'm less familiar with this than sitemaps in general, but I believe the reason Google requests it is because video embeds are often in iframes, which are difficult for crawlers to deal with. It's hard for me to see what exactly SvelteKit could offer in this situation. If there's some specific request, I'm happy to consider it. I don't have experience with video SEO, so I'm willing to be educated. My first reaction though is that this seems like something that fundamentally has to be dealt with in user land.

If we're saying that SvelteKit will not support generating sitemap.xml files, I'm ok with that - I would just say that the phrase "SvelteKit doesn't compromise on SEO" should be removed from the homepage (see the note on my original post that started this issue), given that a site without a sitemap.xml is indeed a "compromise on SEO" for some situations (see paragraph 2 discussion above).

SvelteKit provides a number of other features that are useful for SEO. E.g. it has been optimized to get really great Core Web Vitals scores out-of-the-box. It also does SSR by default and is maybe the only framework I've seen that can do dynamic rendering in just a few lines of code. These types of things are far more impactful, so I do think it's fair that we say we offer SEO benefits. In terms of the specific wording, I think that a generated sitemap.xml that isn't aware of last modified times would itself be a compromise and would contradict the claim, so having the user code it is necessary for uncompromising SEO. But anyway, we have at least one draft for a totally new homepage design and will do a homepage refresh in the future that will update this content. That will be a bit down the road though, as right now we've chosen to put that on hold to finish the core features.

How do other equivalent frameworks like NextJS and Gatsby handle this?

NextJS appears to support it in exactly the same manner as SvelteKit currently does based on this link.
Gatsby isn't too different and still requires quite a bit of user code based on their example. Personally, I'm not a big fan of their API. The NextJS approach seems a lot more straightforward to me. With Gatsby, I'd have to learn their API, which doesn't really do much for you. With NextJS I can use the sitemap spec and there's one less layer of indirection.

Even if it's not truly "built in" to SvelteKit, at least we can have a documentation page that explains the best way to do it (be it with a plugin, manually via a script in user world, etc etc).

Yes, totally agree with this. I'll take a stab at putting some starter docs up and people can add to them from there

@spences10
Member

Hey @gavinr I've created sitemaps for Gatsby, NextJS and SvelteKit projects.

Gatsby does have a plugin for creating sitemaps; you can check out the documentation over on the Gatsby repo. I haven't used this in a while now, but from my understanding it will crawl the file structure once the site is built and generate a sitemap.xml file.

For NextJS and SvelteKit the approach is similar: create a route/endpoint for the sitemap and generate the XML there. This is useful if the data for the site is dynamically generated.

I documented how to make a sitemap with SvelteKit with help from @davidwparker's YouTube video.

For reference here's my notes on creating a sitemap in NextJS

Apologies for adding to the noise here @benmccann. The documentation from me can help until the starter docs have been added.

@moritzebeling

Regardless of whether sitemaps are good or bad, there should be the possibility to have one, which is perfectly possible through the sitemap.xml.js endpoint as proposed above.

And sure, all the page info and metadata (date, priority) comes from the content structure (MD files, API, CMS). So as long as your content knows the URL under which it will be visible, the best strategy would be to retrieve a combined list of pages and metadata from your data source from within your sitemap.xml.js.
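
A rough sketch of that strategy, assuming each content record already carries its public path and a modification timestamp. `ContentRecord`, `recordsToUrlTags`, and the sample records are hypothetical; in practice the records would be loaded from your Markdown files, API, or CMS.

```typescript
// Hypothetical shape of a record from a CMS, API, or Markdown front matter.
interface ContentRecord {
  path: string; // URL path under which the page is published
  updatedAt: string; // ISO 8601 timestamp in UTC
  draft?: boolean;
}

// Maps published records to <url> elements, newest first.
function recordsToUrlTags(origin: string, records: ContentRecord[]): string[] {
  return records
    .filter((record) => !record.draft) // filter() copies, so sort() below leaves the input untouched
    .sort((a, b) => (a.updatedAt < b.updatedAt ? 1 : a.updatedAt > b.updatedAt ? -1 : 0))
    .map(
      (record) =>
        `<url><loc>${origin}${record.path}</loc><lastmod>${record.updatedAt.slice(0, 10)}</lastmod></url>`
    );
}

const urlTags = recordsToUrlTags('https://example.com', [
  { path: '/a', updatedAt: '2022-01-01T00:00:00Z' },
  { path: '/b', updatedAt: '2022-02-16T10:00:00Z' },
  { path: '/c', updatedAt: '2022-03-01T00:00:00Z', draft: true },
]);
```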

When that is not the case, and routing is only implied by SvelteKit, you could create a writable store and from every {endpoint}.json.js add one entry to that list of pages. The sitemap.xml.js endpoint would have to be rendered at the very end of the process, read the store and render the XML.
But this strategy only works for every route that triggers some SvelteKit endpoint during prerendering.

@benmccann
Member

benmccann commented Feb 16, 2022

Not noise at all! Thanks for sharing @spences10!

Here's a draft of the SEO docs for folks who are interested: #3946

Also, I saw Astro has a sitemap plugin, so we could look at what they do for inspiration: https://github.com/withastro/astro/tree/main/packages/integrations/sitemap

@Rich-Harris Rich-Harris added this to the post-1.0 milestone Apr 7, 2022
@justingolden21

Hey I'm just chiming in to see if there's been any progress on this since Feb. Thanks Ben for all your hard work and great discussions thus far.

@gavinr
Author

gavinr commented Apr 19, 2022

Docs on how to generate a sitemap.xml were added in #3946: https://kit.svelte.dev/docs/seo#manual-setup-sitemaps (thank you @benmccann!)

In that PR, @Rich-Harris suggested we leave this issue open:

we could probably do something to e.g. make prerendered pages available to a sitemap generator. It wouldn't be trivial, and it wouldn't cover all cases, but there's room for Kit to provide value

@spences10
Member

I have a tangentially related question regarding what would be safe to exclude from robots...

Specifically around if _app/chunks and _app/pages files can be ignored? Do .js files need to be indexed?

If anyone thinks this should be a separate issue I can log it.

I did ask on the Svelte Discord with no answers

@CaptainCodeman
Contributor

A sitemap is really for end-user navigable routes, ie. the URLs that you want indexed, the content pages. You wouldn't list .js or .css files in it
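
If the candidate list comes from scanning build output, a filter along these lines could drop assets before the sitemap is written. This is only a sketch: `isIndexablePage` and the extension list are assumptions to adapt to your own build.

```typescript
// Hypothetical filter: keeps navigable pages and drops build assets.
const ASSET_PATTERN = /\.(js|mjs|css|map|json|png|jpe?g|svg|ico|woff2?)$/i;

function isIndexablePage(path: string): boolean {
  if (path.startsWith('/_app/')) return false; // generated client assets live under _app
  return !ASSET_PATTERN.test(path);
}

const candidatePaths = ['/', '/about', '/_app/chunks/vendor.js', '/styles.css'];
const sitemapPaths = candidatePaths.filter(isIndexablePage);
```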

@justingolden21

This is a good tutorial on sitemaps with SvelteKit:

https://scottspence.com/posts/make-a-sitemap-with-sveltekit

You can make a js file that returns a sitemap at: http://localhost:3000/sitemap.xml

Sitemap protocol: https://www.sitemaps.org/protocol.html

Here's some code I came up with (following the article linked above) for my site:

const website = 'https://example.com';

const pages = [
	{
		url: '',
		priority: 0.8
	},
	{
		url: 'mypage'
	}
];

export async function get() {
	return {
		headers: {
			'Cache-Control': 'max-age=0, s-maxage=3600',
			'Content-Type': 'application/xml'
		},
		body: `<?xml version="1.0" encoding="UTF-8" ?>
	<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
		${pages
			.map(
				(page) =>
					`<url>
			<loc>${website}/${page.url}</loc>
			<changefreq>monthly</changefreq>
			<priority>${page.priority ?? 0.5}</priority>
		  </url>`
			)
			.join('')}
	</urlset>`
	};
}

Last thing while I'm sharing everything I found useful that might help others, encoding XML:

const encodeXML = (str) =>
	str
		.replace(/&/g, '&amp;')
		.replace(/</g, '&lt;')
		.replace(/>/g, '&gt;')
		.replace(/"/g, '&quot;')
		.replace(/'/g, '&apos;');
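
A short usage sketch for the helper above, repeated here with type annotations so the block stands alone. The URL is a made-up example; query strings are the most common place where an unescaped `&` breaks a sitemap.

```typescript
// Same helper as above, with type annotations added.
// The ampersand is replaced first so the entities produced by the later
// replacements are not escaped a second time.
const encodeXML = (str: string): string =>
  str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');

const encodedLoc = encodeXML('https://example.com/search?q=svelte&page=2');
const locTag = `<loc>${encodedLoc}</loc>`;
// locTag is "<loc>https://example.com/search?q=svelte&amp;page=2</loc>"
```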

@peterpeterparker

Here's my variation for the current routing of the above neat solution shared by @Myrmod

const url = "https://ipi2f-uqaaa-aaaad-aabza-cai.ic0.app";

const staticPages = Object.keys(
  import.meta.glob("/src/routes/**/+page.(svelte|md)")
)
  .filter(
    (page) =>
      !["/src/routes/+page.svelte"].find((filter) => page.includes(filter))
  )
  .map((page) =>
    page
      .replace("/src/routes", url)
      .replace("/+page.svelte", ".html")
      .replace("/+page.md", ".html")
  );

export const prerender = true;

export const GET = async (): Promise<Response> => {
  const headers: Record<string, string> = {
    "Cache-Control": "max-age=3600",
    "Content-Type": "application/xml",
  };

  return new Response(
    `<?xml version="1.0" encoding="UTF-8" ?>
    <urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
      xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
    >
      <url>
        <loc>${url}</loc>
        <changefreq>weekly</changefreq>
        <priority>0.7</priority>
        <lastmod>${`${process.env.VITE_BUILD_TIME}`}</lastmod>
      </url>
      ${staticPages
        .map(
          (url: string) => `<url>
        <loc>${url}</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
        <lastmod>${`${process.env.VITE_BUILD_TIME}`}</lastmod>
      </url>`
        )
        .join("")}
    </urlset>`,
    { headers: headers }
  );
};

@beynar

beynar commented Jan 4, 2023

Hey, I needed a way to generate sitemaps on the fly (not statically) for some Shopify storefronts, so I built this hook/plugin to do just that. It's really helpful when dealing with dynamic routes; here is an example of the API. (As a plus, it also helps you generate robots.txt.) => https://github.com/beynar/sveltekit-sitemap
let me know what you think ✌️

sitemapHook(sitemap, {
  //...
  getRoutes: async (event) => {
    const blogs = await event.locals.api.getBlogsForSitemap();
    // ^-- make async api call to get fresh data

    return {
      "/about": {
        path: "/",
        priority: "0.8"
      },
      // ^-- Static routes are automatically added to the sitemap. But if you want to customize them, you can return a route definition object.
      "blogs/[handle]": blogs,
      "/products/[id]": [
        { path: "/products/test-1" },
        { path: "/products/test-2" },
        {
          path: "/products/test-3",
          changeFreq: "Monthly",
          priority: "0.8",
          lastMod: "2023-01-01",
          image: {
            url: "https://picsum.photos/200/300",
            title: "test-1",
            altText: "image-product-test-1"
          }
        }
      ]
      // ^-- For dynamic routes you have to return an array of route definitions
    };
  }
});

@Wolverine971

Wolverine971 commented Mar 5, 2023

Manually you can generate your sitemap.xml however you want, then just move it inside the /static

@jasongitmail

Exposing a routing table would solve 90% of the remaining pain points, imo:

Currently:

  • Parameterized routes -- easy to collect by creating a data class and exposing a method like blog.getSitemapUrls(). ✅
  • Non-parameterized routes -- keeping a list by hand is more maintenance (Scanning the build for static files isn't an elegant solution b/c the rendering method shouldn't matter. Building my own routing table by parsing the routes dir is the solution, but it's something others need too.) 😬

A routing table would allow iterating over it to: 1.) automatically include non-parameterized routes, and 2.) throw a build warning when forgetting to add data for a parameterized route, ensuring no paths are forgotten.
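
A sketch of the second point, assuming the routing table is available as a plain list of route patterns. `findUncoveredRoutes`, `assertRoutesCovered`, and the shape of `paramData` are hypothetical, not an existing SvelteKit API.

```typescript
// Hypothetical check: every parameterized route must either have data or be
// explicitly excluded, so that no path is silently left out of the sitemap.
function findUncoveredRoutes(
  routes: string[],
  paramData: Record<string, string[]>,
  excluded: string[] = []
): string[] {
  return routes.filter(
    (route) =>
      route.includes('[') &&
      !Object.prototype.hasOwnProperty.call(paramData, route) &&
      excluded.indexOf(route) === -1
  );
}

function assertRoutesCovered(
  routes: string[],
  paramData: Record<string, string[]>,
  excluded: string[] = []
): void {
  const missing = findUncoveredRoutes(routes, paramData, excluded);
  if (missing.length > 0) {
    throw new Error(`Sitemap: no data provided for ${missing.join(', ')}`);
  }
}

const routeTable = ['/about', '/blog/[slug]', '/products/[id]'];
const uncovered = findUncoveredRoutes(routeTable, { '/blog/[slug]': ['hello-world'] });
```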

(Side note: Google ignores priority and changefreq.)

@CaptainCodeman
Contributor

CaptainCodeman commented Sep 12, 2023

This is what I use to get non-parameterized routes, with parameterized ones expected to be provided (e.g. they would come from a database)

import type { RequestHandler } from '@sveltejs/kit'
import { getProductURLs } from './store'

export const GET: RequestHandler = async ({}) => {
	const [pages, productURLs] = await Promise.all([
		import.meta.glob('/src/routes/**/+page.svelte'),
		getProductURLs(),
	])

	const routes = Object.keys(pages)
		.map((x) => x.substring(11)) // remove /src/routes prefix
		.map((x) => x.substring(0, x.length - 13)) // remove /+page.svelte suffix
		.map((x) => x.replaceAll(/\/\(\w+\)/g, '')) // remove (groups)
		.filter((x) => !x.includes('[')) // filter out parameterized routes
		.sort() // satisfy OCD

	const urls = routes.concat(productURLs)

	const sitemap = `<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urls.map((url) => `  <url><loc>https://example.com${url}</loc></url>`).join('')}
</urlset>`

	return new Response(sitemap, {
		headers: {
			'Content-Type': 'application/xml',
			'Cache-Control': 'public, max-age=3600',
		},
	})
}
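
The path transformations in the handler above can also be written as one pure function, which makes them easy to test without Vite. This is a sketch, not the author's code: `fileToRoute` is a hypothetical name, and the group pattern is deliberately looser than `\w+` so that group names containing hyphens are removed too.

```typescript
// Hypothetical helper: converts a +page.svelte file path into a route path,
// or returns null for parameterized routes that need data from elsewhere.
function fileToRoute(file: string): string | null {
  const route = file
    .replace(/^\/src\/routes/, '') // remove the /src/routes prefix
    .replace(/\/\+page\.svelte$/, '') // remove the /+page.svelte suffix
    .replace(/\/\([^/]+\)/g, ''); // remove (group) segments
  if (route.includes('[')) return null; // parameterized route
  return route === '' ? '/' : route;
}

const staticRoutes = [
  '/src/routes/+page.svelte',
  '/src/routes/(marketing)/about/+page.svelte',
  '/src/routes/blog/[slug]/+page.svelte',
]
  .map(fileToRoute)
  .filter((route): route is string => route !== null);
```

Unlike the handler above, the root page maps to "/" rather than an empty string, so take care not to end up with a double slash when prefixing the origin.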

@jasongitmail

jasongitmail commented Sep 14, 2023

I just published a package that I think creates a near ideal DX, for me at least. Thanks to @CaptainCodeman for the inspo.

Super Sitemap (npm) - SvelteKit sitemap focused on ease of use and making it impossible to forget to add your paths.

npm i -D super-sitemap

Features

  • 🤓 Supports any rendering method.
  • 🪄 Routes automatically found from /src/routes using Vite + data for route parameters provided by you.
  • 🧠 Easy maintenance–accidental omission of data for parameterized routes throws an error and requires the developer to either explicitly exclude the route pattern or provide an array of data for that param value.
  • 👻 Exclude specific routes or patterns using regex patterns (e.g. ^/dashboard.*, paginated URLs, etc).
  • 🚀 Defaults to 1h CDN cache, no browser cache.
  • 💆 Set custom headers, by passing an object as the 2nd argument to sitemap.response({...}, {'cache-control': '...'}).
  • 🫡 Uses SvelteKit's recommended sitemap XML structure.
  • 🤷 Note: Currently, uses priority 0.7 and changefreq daily for each item. Google ignores priority and changefreq and these could be excluded to save KB, but I kept them for now in case it improves compatibility by dumber bots.
  • 🧪 Well tested.
  • 🫶 Built with TypeScript.

The Github README has JS & TS examples. Happy to get feedback, issues, etc!
