feat: crawl URLs in <meta> tags #9900

Merged
Rich-Harris merged 13 commits into sveltejs:master on May 17, 2023

LorisSigrist
Contributor

As described in #5228

Up until now, URLs in <meta> tags have not been crawled. This has made programmatic social images tricky.

This PR crawls URLs in <meta> tags, depending on whether they have a name or property attribute that is whitelisted.
Not all meta tags can contain URLs, so a whitelist is needed to crawl only the ones that can. I added all the usual social-media tags to the whitelist.

This did require some minor reorganisation of prepublish/crawl.js, since this is the first time that more than one attribute has needed to be evaluated to crawl a single tag.
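
For readers skimming the thread, here is a minimal sketch of the whitelist idea described above. It is not the code from this PR: the whitelist entries are only a small illustrative subset, `crawl_meta` is a hypothetical helper name, and `new URL(content, base)` stands in for the crawler's own `resolve(base, href)` helper.

```javascript
// Illustrative subset only; the PR's actual whitelist is longer.
const CRAWLABLE_META_NAME_ATTRS = new Set(['og:url', 'og:image', 'twitter:image']);

/**
 * Hypothetical helper: returns the URLs worth crawling for a single <meta> tag.
 * @param {string} base absolute URL of the page being crawled
 * @param {{ name?: string, property?: string, content?: string }} attributes
 * @returns {string[]}
 */
function crawl_meta(base, attributes) {
	const { name, property, content } = attributes;
	if (!content) return [];

	// A tag qualifies if either its name or its property is whitelisted.
	const crawlable = [name, property].some(
		(key) => key !== undefined && CRAWLABLE_META_NAME_ATTRS.has(key)
	);
	if (!crawlable) return [];

	// Assumption: the standard URL constructor stands in for resolve(base, content).
	return [new URL(content, base).href];
}
```

With this sketch, `<meta property="og:image" content="/og/post.png">` on `https://example.com/blog/post` yields one URL to crawl, while `<meta name="description" content="...">` yields none, because `description` is not in the whitelist.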

Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

  • It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
  • This message body should clearly illustrate what problems it solves.
  • Ideally, include a test that fails without this PR but passes with it.

Tests

  • Run the tests with pnpm test and lint the project with pnpm lint and pnpm check

Changesets

  • If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

@changeset-bot

changeset-bot bot commented May 10, 2023

🦋 Changeset detected

Latest commit: 774bc04

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@sveltejs/kit Minor

@benmccann benmccann changed the title feat: Crawl urls in <meta> tags feat: crawl URLs in <meta> tags May 11, 2023
Rich Harris added 2 commits May 16, 2023 17:48
Member

@Rich-Harris Rich-Harris left a comment

thank you! had a question about .trim().toLowerCase(), inline

Comment on lines 230 to 236
	if (name && CRAWLABLE_META_NAME_ATTRS.has(name)) {
		hrefs.push(resolve(base, content.trim().toLowerCase()));
	}

	if (property && CRAWLABLE_META_NAME_ATTRS.has(property.trim().toLowerCase())) {
		hrefs.push(resolve(base, content));
	}
Member

why are we trimming/lowercasing the content attribute when name is present, but the property attribute when property is present?

in fact do we need to do any trimming or lowercasing at all? this doesn't happen with any other attribute, i'm curious why these would be different?

Contributor Author

Whoops, I left that in there accidentally 😅

At one point I was worried about the case where someone would leave some whitespace in an attribute or capitalise some letters wrongly, so during the refactor I trimmed and lowercased all attributes before the checks. After some experimenting, I decided that it wasn't really required and would make this PR dual-purpose (addressing something that wasn't in the issue), so I removed it.

I must have missed that one, since it doesn't break anything. Sorry about that.
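
As a side note on the question above, one general reason to leave the content value untouched is that URL paths are case-sensitive, so lowercasing can point the crawler at a different resource. The sketch below only illustrates that point with the standard `URL` constructor; the base URL, the path and the `resolve_as_written` / `resolve_normalised` helpers are made up for the example and are not taken from the PR.

```javascript
// Hypothetical stand-ins for resolving a <meta> content value against a page URL.
function resolve_as_written(base, content) {
	return new URL(content, base).href;
}

function resolve_normalised(base, content) {
	// Mirrors the leftover .trim().toLowerCase() from the snippet under review.
	return new URL(content.trim().toLowerCase(), base).href;
}

const base = 'https://example.com/';
const content = '/Images/Cover.PNG';

const as_written = resolve_as_written(base, content); // path casing preserved
const normalised = resolve_normalised(base, content); // path casing lost
```

Here `as_written` keeps `/Images/Cover.PNG`, while `normalised` becomes `/images/cover.png`, which a case-sensitive server may treat as a different file.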

@Rich-Harris Rich-Harris merged commit ab9f577 into sveltejs:master May 17, 2023
@github-actions github-actions bot mentioned this pull request May 17, 2023
leonardoadame pushed a commit to leonardoadame/Affiliate-tech that referenced this pull request May 17, 2023
* feat: Add a speedier script tag for prerendered redirects (sveltejs#9911)

* feat: Add a script redirect to prerendered pages

* changeset

* Update .changeset/hungry-rocks-hunt.md

Co-authored-by: Simon H <[email protected]>

* fix test

* feat: Update string escaping

* Update .changeset/hungry-rocks-hunt.md

Co-authored-by: Conduitry <[email protected]>

---------

Co-authored-by: Simon H <[email protected]>
Co-authored-by: Conduitry <[email protected]>

* fix: avoid inlining raw/url CSS imports (sveltejs#9925)

* fix: use transformRequest for CSS modules

* just avoid raw or url

* test and changeset

* use parsed query

* remove only

* Update packages/kit/test/apps/basics/test/cross-platform/test.js

* drive by test speed up

* feat: prerender & analyse in worker rather than subprocess to support Deno (sveltejs#9919)

* feat(fork): use workers

* Create hot-actors-hope.md

* Update packages/kit/src/utils/fork.js

Co-authored-by: Rich Harris <[email protected]>

---------

Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* avoid using isMainThread, since it interacts poorly with vitest (sveltejs#9941)

Co-authored-by: Rich Harris <[email protected]>

* chore: bump vite and devalue (sveltejs#9933)

* bump vite and devalue

* update templates

* merge master

* fix test

---------

Co-authored-by: Rich Harris <[email protected]>

* chore: uvu -> vitest for create-svelte tests (sveltejs#9910)

* chore: uvu -> vitest for create-svelte tests

* format

* concurrency (sveltejs#9921)

* realised we werent typechecking this file, found some errors. fixed

* Wait for beforeAll hook to complete

* simplify

* exclude create-svelte/template files from prettier, so that we can emit correctly formatted templates

* remove unused file

* Revert "exclude create-svelte/template files from prettier, so that we can emit correctly formatted templates"

This reverts commit aa188d4.

---------

Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* feat: unshadow `form` and `data` in `enhance` (sveltejs#9902)

* feat: Un-shadow `data` and `form` in `enhance`, warn about future deprecation in dev

* changeset

* snek

* Update .changeset/odd-crews-own.md

Co-authored-by: Ben McCann <[email protected]>

* Update packages/kit/test/apps/dev-only/package.json

Co-authored-by: Ben McCann <[email protected]>

* am not smart

* still not smart

* oops

* oof

* add deprecation notice

---------

Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* fix: Set loader: { '.wasm': 'copy' } in esbuild config in `adapter-cloudflare-workers` (sveltejs#9940)

* fix: Set loader: { '.wasm': 'copy' } in esbuild config in `adapter-cloudflare-workers`

Copies WASM files in Cloudflare instead of trying to load them.

Related to sveltejs#9909

* Create brave-peaches-buy.md

* Update packages/adapter-cloudflare-workers/index.js

* format

---------

Co-authored-by: Rich Harris <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* fix: Flaky test (sveltejs#9947)

* fix: Flaky test

* reuse locator

* add semi

* drive by test speed up

* another classic

* oh my god brain, y u so bad

---------

Co-authored-by: gtmnayan <[email protected]>
Co-authored-by: gtmnayan <[email protected]>

* remove envVarsInUse (sveltejs#9942)

Co-authored-by: Rich Harris <[email protected]>

* fix: type `vitePlugin` in config (sveltejs#9946)

* fix: type `vitePlugin` in config

* changeset

* fix: Set `loader: { '.wasm': 'copy' }` in esbuild config in `adapter-vercel` (sveltejs#9944)

* fix: Enable wasm copy in adapter-vercel

* Create olive-rings-eat.md

---------

Co-authored-by: Rich Harris <[email protected]>

* feat: crawl URLs in `<meta>` tags (sveltejs#9900)

* Crawl social-image urls during prerender

* Formatting & Linting

* Format changeset & added exhaustive list of crawlable urls

* Changed severity to minor as described in sveltejs#5228

* Added support for `property` attribute & limited valid names to just social tags

* More tests

* Better changeset message - I'm indecisive

* Update .changeset/thirty-garlics-tan.md

Co-authored-by: Ben McCann <[email protected]>

* simplify

* simplify

* Removed redundant data-sanitation

* DRY out

---------

Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* Version Packages (sveltejs#9893)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: support AWS via SST in adapter-auto (sveltejs#9874)

* [feat] support AWS via SST in adapter-auto

* Sync

* Delete 95-adapter-aws-sst.md

* Update .changeset/rotten-ducks-tan.md

---------

Co-authored-by: Rich Harris <[email protected]>

* Version Packages (sveltejs#9953)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* only cache response if response has cache-control header (sveltejs#9885)

* only cache response if response has cache-control header

* add changeset

* Version Packages (sveltejs#9955)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: ensure styles are loaded in dev mode for routes containing special characters (sveltejs#9894)

* Fix loading styles for routes containing special characters in dev mode.

SvelteKit doesn't decode special characters in pathnames when loading CSS modules in dev mode, resulting in an error:

Internal server error: Failed to load url /src/routes/(special)/hinnap%C3%A4ring/+page.svelte?svelte=&type=style&lang.css=&inline= (resolved id: /src/routes/(special)/hinnap%C3%A4ring/+page.svelte?svelte&type=style&lang.css). Does the file exist?

Actual path:

/src/routes/(special)/hinnapäring/

Fix by using decodeURI on the url.pathname when loading CSS modules.

* Create stale-houses-yell.md

* decodeURL inside if

* add test

---------

Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* feat: Warn users when submitting forms with files but no `enctype="multipart/form-data"` (sveltejs#9888)

* fix: Package name keeps me from filtering with pnpm

* feat: Warn users when submitting a form containing a file without the correct enctype

* changeset

* Update packages/kit/src/runtime/app/forms.js

Co-authored-by: gtmnayan <[email protected]>

* style

* moar style tweaks

* better test skip

* Update .changeset/tasty-llamas-relate.md

Co-authored-by: Ben McCann <[email protected]>

* DRY out

* only warn once per submit

* Update .changeset/tasty-llamas-relate.md

Co-authored-by: Rich Harris <[email protected]>

---------

Co-authored-by: gtmnayan <[email protected]>
Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>
Co-authored-by: Rich Harris <[email protected]>

* Version Packages (sveltejs#9957)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* security: Stop automatically adding URLs from server-side `load` `fetch` calls to dependencies (sveltejs#9945)

* feat: Add `dangerZone` config

* breaking: Don't implicitly track deps in server-side fetch

* changeset

* bein dumb

* fix: Server load invalidation

* fix: Write config for server

* fix: test

* unsurprisingly i am dumb

* docs: Clarify difference between server `fetch` and universal `fetch`

* tests

* rename to trackServerFetches

---------

Co-authored-by: Rich Harris <[email protected]>

* Version Packages (sveltejs#9963)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: S. Elliott Johnson <[email protected]>
Co-authored-by: Simon H <[email protected]>
Co-authored-by: Conduitry <[email protected]>
Co-authored-by: gtmnayan <[email protected]>
Co-authored-by: Fernando López Guevara <[email protected]>
Co-authored-by: Ben McCann <[email protected]>
Co-authored-by: Rich Harris <[email protected]>
Co-authored-by: Rich Harris <[email protected]>
Co-authored-by: Rich Harris <[email protected]>
Co-authored-by: Conner <[email protected]>
Co-authored-by: gtmnayan <[email protected]>
Co-authored-by: Loris Sigrist <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Frank <[email protected]>
Co-authored-by: Frank Dumont <[email protected]>
Co-authored-by: Reio Remma <[email protected]>