Redefine disablePathToLower setting for the section structure #9171

Closed
bep opened this issue Nov 15, 2021 · 10 comments

Comments

@bep
Member

bep commented Nov 15, 2021

Edit, 14 Oct 2023: I'm about to do this. After this change, you can still preserve mixed-case URLs in some places:

  • URLs built from the permalinks config that take their values from tokens that are not part of the canonical path, e.g. :title, :slug, etc.
  • URLs defined as url in front matter can be whatever you like.
  • But all URL parts coming from the (new) canonical content path will always be lowercased.

Some examples of such canonical paths:

/ (home page)
/mysection (a section)
/mysection/mypage (a page/bundle)
/categories (a taxonomy)
/categories/mycategori (a term)
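
The rule in the edit above can be sketched in a few lines. This is an illustrative Python sketch, not Hugo's implementation: the function `build_url` and its parameters are hypothetical, and the assumption that a permalink token value replaces the last path element is a simplification.

```python
from typing import Optional


def build_url(canonical_path: str,
              permalink_value: Optional[str] = None,
              front_matter_url: Optional[str] = None) -> str:
    """Hypothetical helper (not Hugo's API) showing which URL parts keep their case."""
    # A url set in front matter is used as-is, whatever its case.
    if front_matter_url is not None:
        return front_matter_url
    # Parts that come from the canonical content path are always lowercased.
    parts = [p.lower() for p in canonical_path.strip("/").split("/") if p]
    # A value taken from a permalink token such as :title or :slug replaces
    # the last path element here and keeps its case.
    if permalink_value is not None and parts:
        parts[-1] = permalink_value
    return "/" + "/".join(parts) + ("/" if parts else "")


print(build_url("/MySection/MyPage"))
print(build_url("/Posts/my-post", permalink_value="My-Cool-Blog-Post"))
```

Under these assumptions the section part is always lowercased, while the title/slug part and any front matter url pass through unchanged.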

I have never used disablePathToLower in any of my many Hugo sites, but I assume some do and will get mad at me if I just remove it without discussion ...

I'm doing some long-needed simplifications/improvements in this area to pave the way for some features that I think many would like/want. To do that we need unified path normalization (to be able to quickly find stuff, but also to be able to quickly merge content from different sources etc.) -- and preserving this case information in the file paths just seems too expensive to me (implementation/test cases and performance/memory).

I assume people who want this want it because it ... looks prettier? I cannot imagine any SEO argument.

I have thought about a middle ground that would allow mixed-case URLs for the title/slug part of the URL, e.g:

https://example.com/posts/My-Cool-Blog-Post/

Would be OK, but

https://example.com/MyBlog/My-Cool-Blog-Post/

Would not be possible.

But that would make the construct below untrue (or: it would be hard to make it true):

{{ $url := "https://example.com/posts/My-Cool-Blog-Post/" }}
Should be true: {{ eq $url ($url | urlize) }}
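
The round-trip concern can be illustrated outside Hugo. The Python sketch below uses a hypothetical `lowercase_path` function as a stand-in for a normalizer that lowercases the URL path; it is not Hugo's `urlize`, which does more than this.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_path(url: str) -> str:
    """Hypothetical stand-in for a normalizer that lowercases only the URL path."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.lower()))


mixed = "https://example.com/posts/My-Cool-Blog-Post/"
lower = "https://example.com/posts/my-cool-blog-post/"

# A mixed-case URL is changed by the normalizer, so the equality check fails.
print(lowercase_path(mixed) == mixed)
# An all-lowercase URL is a fixed point of the normalizer, so the check holds.
print(lowercase_path(lower) == lower)
```

In other words, the check only stays true if every URL the site emits is already in the normalizer's output form.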

Thoughts?

@bep bep added the Proposal label Nov 15, 2021
@bep bep pinned this issue Nov 15, 2021
@jmooring jmooring unpinned this issue Nov 17, 2021
@jmooring
Member

jmooring commented Nov 17, 2021

> Thoughts?

Start throwing a deprecation warning to get a feel for how many sites this would affect. The scenario I imagine is, "Hey, Google and Bing have indexed hundreds of pages of my Very Important Blog. I've spent years optimizing SEO. Now, because my web server is case sensitive, all those links on Google and Bing are broken."

@davidsneighbour
Contributor

I have also done years of SEO and always used lowercase URLs, because Windows is case-insensitive and Linux is case-sensitive, and the two (hosting options) are always clashing with each other. I never understood people using URLs with uppercase characters. This will end in some form of discussion of paradigms. It should have been lowercased over the full URL from the beginning ;)

For the "hey, I have done years of work on SEO" use case I have the "hey, add a proper sitemap" response. Most search engines understand lower/uppercase confusion these days anyway and are able to move "Google juice" to new URLs.

The important thing is to have proper, complete documentation about it; that's all, in my opinion.

@jmooring
Member

I really hope someone isn't generating JSON at https://example.org/API/FOO on a case sensitive server. That seems like a self-inflicted wound with deferred pain.

@jmooring jmooring pinned this issue Nov 17, 2021
@bep
Member Author

bep commented Nov 22, 2021

> Start throwing a deprecation warning to get a feel for how many sites this would affect.

Sure, but I cannot do this without doing it, so any deprecation warning would warn about something that already happened.

I have thought further about this, and preserving the case in these URLs AND adding new features to Hugo comes at a fairly high cost (mostly developer cost, but also a performance penalty). Even then, we would most likely end up in a half-assed state with lots of ambiguity and unclear documentation ... I don't think I'm prepared to go the extra mile and spend, say, 20 hours to preserve a setting that I have never used and never will use and (I hope) didn't add myself in the first place.

@bep
Member Author

bep commented Nov 22, 2021

An added note: I will release one or more beta versions of Hugo 0.90.0, which may allow us to get a feel for this before we pass it out to the wolf pack.

@gohugoio gohugoio deleted a comment from DevBeatDad Nov 22, 2021
@bep bep added this to the v0.91.0 milestone Dec 8, 2021
@sergio-carlavilla

We use this approach in FreeBSD as you can see here https://www.freebsd.org/releases/13.0R/announce/

When you remove this option, we will no longer be able to use this kind of URL, am I correct?

@bep bep unpinned this issue Dec 14, 2021
@bep bep modified the milestones: v0.91.0, v0.92.0 Dec 20, 2021
@bep bep modified the milestones: v0.92.0, v0.93.0 Jan 12, 2022
bep added a commit to bep/hugo that referenced this issue Jan 22, 2022
TODO(bep) improve commit message.

Hugo has always been an active user of in-memory caches, but before this commit we did nothing to control the memory usage.

One failing example would be loading lots of big JSON data files and unmarshaling them via `transform.Unmarshal`.

This commit consolidates all these caches into one single LRU cache with an eviction strategy that also considers used vs. available memory.

Hugo will try to limit its memory usage to 1/4 of total system memory, but this can be controlled with the `HUGO_MEMORYLIMIT` environment variable (a float value representing gigabytes).

A natural next step after this would be to use this cache for `.Content`.

Fixes gohugoio#8307
Fixes gohugoio#8498
Fixes gohugoio#8927
Fixes gohugoio#9192
Fixes gohugoio#9189
Fixes gohugoio#7425
Fixes gohugoio#7437
Fixes gohugoio#7436
Fixes gohugoio#7882
Updates gohugoio#7544
Fixes gohugoio#9224
Fixes gohugoio#9324
Fixes gohugoio#9352
Fixes gohugoio#9343
Fixes gohugoio#9171
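
The cache described in the commit message above can be sketched roughly as follows. This is a minimal Python illustration, not Hugo's code: `SizeBoundedLRU` and `memory_limit_bytes` are hypothetical names, entry sizes are supplied by the caller rather than measured, and treating a gigabyte as 1024³ bytes is an assumption.

```python
import os
from collections import OrderedDict


def memory_limit_bytes(total_system_bytes, env=os.environ):
    """Budget in bytes: HUGO_MEMORYLIMIT (float, gigabytes) or 1/4 of system memory."""
    raw = env.get("HUGO_MEMORYLIMIT")
    if raw:
        # Assumption: one gigabyte is 1024**3 bytes.
        return int(float(raw) * 1024 ** 3)
    return total_system_bytes // 4


class SizeBoundedLRU:
    """Hypothetical LRU cache that evicts least recently used entries over a byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._items = OrderedDict()  # key -> (value, size in bytes)

    def get(self, key):
        if key not in self._items:
            return None
        # Mark the entry as most recently used.
        self._items.move_to_end(key)
        return self._items[key][0]

    def put(self, key, value, size):
        # Replace any existing entry so its size is not counted twice.
        if key in self._items:
            self.used -= self._items.pop(key)[1]
        self._items[key] = (value, size)
        self.used += size
        # Evict from the least recently used end until within budget.
        while self.used > self.max_bytes and self._items:
            _, (_, evicted_size) = self._items.popitem(last=False)
            self.used -= evicted_size
```

A real implementation would compare against live memory statistics rather than caller-supplied sizes; the sketch only shows the eviction order.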
@bep bep modified the milestones: v0.93.0, v0.94.0 Mar 1, 2022
@bep bep modified the milestones: v0.94.0, v0.95.0, v0.96.0 Mar 9, 2022
@bep bep modified the milestones: v0.96.0, v0.97.0 Mar 24, 2022
@bep bep modified the milestones: v0.97.0, v0.98.0 Apr 13, 2022
@bep bep removed this from the v0.98.0 milestone Apr 28, 2022
@bep bep added this to the v0.116.0 milestone Jun 30, 2023
bep added a commit to bep/hugo that referenced this issue Jul 18, 2023
TODO(bep) improve commit message.

Hugo has always been an active user of in-memory caches, but before this commit we did nothing to control the memory usage.

One failing example would be loading lots of big JSON data files and unmarshaling them via `transform.Unmarshal`.

This commit consolidates all these caches into one single LRU cache with an eviction strategy that also considers used vs. available memory.

Hugo will try to limit its memory usage to 1/4 of total system memory, but this can be controlled with the `HUGO_MEMORYLIMIT` environment variable (a float value representing gigabytes).

A natural next step after this would be to use this cache for `.Content`.

Fixes gohugoio#10386
Fixes gohugoio#8307
Fixes gohugoio#8498
Fixes gohugoio#8927
Fixes gohugoio#9192
Fixes gohugoio#9189
Fixes gohugoio#7425
Fixes gohugoio#7437
Fixes gohugoio#7436
Fixes gohugoio#7882
Updates gohugoio#7544
Fixes gohugoio#9224
Fixes gohugoio#9324
Fixes gohugoio#9352
Fixes gohugoio#9343
Fixes gohugoio#9171
Fixes gohugoio#10104
Fixes gohugoio#10380
@bep bep modified the milestones: v0.116.0, v0.117.0 Aug 1, 2023
@DominoPivot
Contributor

Removing this feature would break all mixed-case links on an existing website hosted on a case-sensitive server. This could break links to existing Hugo sites as well as internal links within the site, and also prevent anyone from migrating an existing site with mixed-case URLs to Hugo, something I've done before.

Sure, in an ideal world, your server is configured to normalize casing so any request gets redirected to the canonical, lowercase version, but people use static site generators specifically in situations where they don't control the server that hosts their site.
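
For hosts that do allow redirect rules, the normalization described above amounts to a redirect map from each mixed-case path to its lowercase form. A minimal Python sketch, assuming a plain list of previously published paths; `lowercase_redirects` and `case_collisions` are hypothetical helper names, not part of Hugo:

```python
from collections import defaultdict


def lowercase_redirects(old_paths):
    """Map each mixed-case path to its lowercase form; lowercase paths need no redirect."""
    return {p: p.lower() for p in old_paths if p != p.lower()}


def case_collisions(old_paths):
    """Find groups of distinct paths that collapse to the same lowercase path."""
    groups = defaultdict(set)
    for p in old_paths:
        groups[p.lower()].add(p)
    return {low: sorted(variants) for low, variants in groups.items() if len(variants) > 1}


paths = ["/MyBlog/My-Post/", "/posts/a/", "/API/FOO", "/api/foo"]
print(lowercase_redirects(paths))
print(case_collisions(paths))
```

Collisions are worth checking first: two pages that differ only by case cannot both survive lowercasing.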

@sergio-carlavilla

Ah, no problem, we handle it another way now, so we no longer need this :)

@bep bep modified the milestones: v0.117.0, v0.118.0 Aug 30, 2023
@bep bep modified the milestones: v0.118.0, v0.119.0 Sep 15, 2023
@bep bep modified the milestones: v0.119.0, v0.120.0 Oct 5, 2023
@bep bep added Enhancement and removed Proposal labels Oct 14, 2023
@bep bep self-assigned this Oct 14, 2023
@bep bep changed the title from "Remove or redefine the disablePathToLower setting" to "Ignore disablePathToLower setting for directories" Oct 14, 2023
@bep bep changed the title from "Ignore disablePathToLower setting for directories" to "Redefine disablePathToLower setting for the section structure" Oct 14, 2023
@bep bep modified the milestones: v0.120.0, v0.121.0 Oct 31, 2023
@bep bep added the Breaking label Nov 1, 2023
@bep bep modified the milestones: v0.121.0, v0.122.0 Dec 6, 2023
@bep
Member Author

bep commented Dec 20, 2023

I'm closing this. Working my way through all of this I found a simple way to preserve the old behaviour. I still don't think this config option is a great idea, but I will close.

@bep bep closed this as completed Dec 20, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 11, 2024
5 participants