Commit
doc: controlling web crawlers article
harlan-zw committed Nov 3, 2024
1 parent e0a99fd commit 552673c
Showing 7 changed files with 15 additions and 60 deletions.
2 changes: 1 addition & 1 deletion .playground/server/plugins/robots.ts
@@ -1,4 +1,4 @@
-import { defineNitroPlugin } from 'nitropack/runtime/plugin'
+import { defineNitroPlugin } from '#imports'

export default defineNitroPlugin((nitroApp) => {
  if (import.meta.dev) {
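
An aside on the hunk above: pulling `defineNitroPlugin` from `#imports` uses Nuxt's auto-import alias instead of reaching into `nitropack` internals, keeping the playground aligned with the generated server types. A minimal sketch of how the truncated plugin might continue — only the import and the `import.meta.dev` guard appear in this diff; the body is illustrative:

```ts
// .playground/server/plugins/robots.ts — hedged sketch, not the file's actual body
import { defineNitroPlugin } from '#imports'

export default defineNitroPlugin((nitroApp) => {
  if (import.meta.dev) {
    // assumed dev-only behaviour for illustration
    console.log('[playground] robots dev plugin registered')
  }
})
```
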
3 changes: 3 additions & 0 deletions .playground/server/tsconfig.json
@@ -0,0 +1,3 @@
+{
+  "extends": "../.nuxt/tsconfig.server.json"
+}
5 changes: 4 additions & 1 deletion docs/content/1.getting-started/0.introduction.md
@@ -13,7 +13,10 @@ The core feature of the module is:
- Telling [crawlers](https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers) which paths they can and cannot access using a [robots.txt](https://developers.google.com/search/docs/crawling-indexing/robots/intro) file.
- Telling [search engine crawlers](https://developers.google.com/search/docs/crawling-indexing/googlebot) what they can show in search results from your site using a `<meta name="robots" content="index">`{lang="html"} meta tag or `X-Robots-Tag` HTTP header.

-It's important to [learn the difference between](/docs/robots/guides/robots-txt-vs-meta-tag) the two to avoid common SEO pitfalls.
+New to robots or SEO? Check out the [Conquering Web Crawlers](/learn/controlling-crawlers) guide to learn more about why you might
+need these features.
+
+:LearnLabel{label="Conquering Web Crawlers" to="/learn/controlling-crawlers" icon="i-ph-robot-duotone"}

While it's simple to create your own robots.txt file, the module also makes sure your non-production environments are blocked from indexing. This avoids duplicate-content issues and keeps search engines from serving your development or staging content to users.

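It's worth restating the distinction the new copy points readers to: robots.txt controls crawling (whether a bot may fetch a URL at all), while the robots meta tag and `X-Robots-Tag` header control indexing (whether a fetched page may appear in results). A hand-rolled sketch of the header mechanism using plain h3 utilities — the middleware file name and `/drafts/` prefix are assumptions for illustration; the module automates this for you:

```ts
// server/middleware/noindex-drafts.ts — illustrative only
import { defineEventHandler, setHeader } from 'h3'

export default defineEventHandler((event) => {
  // ask engines not to index (or follow links on) anything under /drafts/
  if (event.path.startsWith('/drafts/'))
    setHeader(event, 'X-Robots-Tag', 'noindex, nofollow')
})
```
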
4 changes: 3 additions & 1 deletion docs/content/1.getting-started/1.installation.md
@@ -43,7 +43,9 @@ any routes which require authentication should be ignored.
- [Disabling Site Indexing](/docs/robots/guides/disable-indexing) - If you have non-production environments you should disable indexing for them;
while this works out-of-the-box for most providers, it's good to verify it's working as expected.

-Make sure you understand the differences between [robots.txt vs robots meta tag](/docs/robots/guides/robots-txt-vs-meta-tag).
+Make sure you understand the differences between robots.txt and the robots meta tag with the [Conquering Web Crawlers](/learn/controlling-crawlers) guide.
+
+:LearnLabel{label="Conquering Web Crawlers" to="/learn/controlling-crawlers" icon="i-ph-robot-duotone"}

## Next Steps

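On the "Disabling Site Indexing" step referenced above: one way to verify the behaviour is to set the site-config flag explicitly per environment. A hedged sketch — `site.indexable` is the flag used across the Nuxt SEO ecosystem, but confirm the exact key against the module docs, and `DEPLOY_ENV` is a hypothetical variable:

```ts
// nuxt.config.ts — sketch under the assumptions stated above
export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  site: {
    // only the real production deployment should be indexable;
    // staging/preview builds then get a blanket `Disallow: /`
    indexable: process.env.DEPLOY_ENV === 'production',
  },
})
```
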
54 changes: 0 additions & 54 deletions docs/content/2.guides/0.robots-txt-vs-meta-tag.md

This file was deleted.

4 changes: 3 additions & 1 deletion docs/content/2.guides/1.disable-page-indexing.md
@@ -11,7 +11,9 @@ The best options to choose are either:
- [Robots.txt](#robotstxt) - Great for blocking robots from accessing specific pages that haven't been indexed yet.
- [useRobotsRule](#userobotsrule) - Controls the `<meta name="robots" content="...">` meta tag and `X-Robots-Tag` HTTP Header. Useful for dynamic pages where you may not know if it should be indexed at build time and when you need to remove pages from search results. For example, a user profile page that should only be indexed if the user has made their profile public.

-If you're still unsure about which option to choose, see the [Robots.txt vs Robots Meta Tag](/docs/robots/guides/robots-txt-vs-meta-tag) guide.
+If you're still unsure about which option to choose, make sure you read the [Conquering Web Crawlers](/learn/controlling-crawlers) guide.
+
+:LearnLabel{label="Conquering Web Crawlers" to="/learn/controlling-crawlers" icon="i-ph-robot-duotone"}

[Route Rules](#route-rules) and [Nuxt Config](#nuxt-config) are also available for more complex scenarios.

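To ground the `useRobotsRule` option described in the hunk above (the user-profile example), a hedged sketch — the API route and `isPublic` field are invented for illustration, and the exact composable signature should be checked against its docs:

```ts
// pages/profile/[id].vue — <script setup lang="ts"> body
const route = useRoute()
const { data: profile } = await useFetch(`/api/profiles/${route.params.id}`)

// index the page only when the user opted into a public profile;
// this drives both the robots meta tag and the X-Robots-Tag header
useRobotsRule(profile.value?.isPublic ? 'index, follow' : 'noindex, nofollow')
```
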
3 changes: 1 addition & 2 deletions docs/content/3.nitro-api/0.get-path-robot-config.md
@@ -28,7 +28,6 @@ interface GetPathRobotResult {
}
```

-
### Arguments

- `e: H3Event`{lang="ts"}: The request event object, used to determine the current path.
@@ -46,7 +45,7 @@ interface GetPathRobotResult {
## Example

```ts twoslash [server/plugins/strip-og-tags-maybe.ts]
-import { getPathRobotConfig, defineNitroPlugin } from '#imports'
+import { defineNitroPlugin, getPathRobotConfig } from '#imports'

export default defineNitroPlugin((nitroApp) => {
  // strip og meta tags if the site is not indexable
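
The hunk above only shows the example's imports. A hedged completion of the og-tag-stripping plugin, assuming the documented result exposes an `indexable` boolean and using Nuxt's `render:html` nitro hook — treat the body as a sketch, not the file's actual contents:

```ts
// server/plugins/strip-og-tags-maybe.ts — sketch under the assumptions above
import { defineNitroPlugin, getPathRobotConfig } from '#imports'

export default defineNitroPlugin((nitroApp) => {
  nitroApp.hooks.hook('render:html', (html, { event }) => {
    // strip og meta tags if the site is not indexable
    const { indexable } = getPathRobotConfig(event)
    if (!indexable) {
      html.head = html.head.map(
        chunk => chunk.replace(/<meta property="og:[^>]*>/g, ''),
      )
    }
  })
})
```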
