Option to generate offline static HTML files usable without server #3825
Comments
You can use electron to freeze it I believe |
That'd be massive overkill. Even if you have just one webpage, Electron will make it 80-100 MB, putting the whole browser rendering and scripting engines in it. |
@ohkimur mentioned he built a postprocessing step to enable support of local browsing, using the That doesn't look like a bad idea to build this as a postprocessing step. A Docusaurus plugin could be built in userland to solve this problem. Plugins have a Note: such a plugin should take into account the |
Note: for some Docusaurus features (particularly SEO metas such as social image, i18n hreflang...), URLs in the HTML files MUST be fully qualified absolute URLs (domain + absolute path). Building a site for local offline usage does not prevent you from setting site URL and baseUrl in the config file, otherwise, the build output would not be suitable for online hosting. For these reasons, it's very unlikely we'll add support for using "relative baseUrl" in Docusaurus, such as |
Moving my conversation from #448 to this thread @ohkimur - your suggestion works for the most part of it, but Webpack configurations are still being difficult to resolve @slorber - my use case isn't for offline usage. I am trying to put together a simplistic developer workflow which involves publishing documentation to GitHub pages. At my workplace, we are using GitHub enterprise. The use case is as follows,
Given that I understand that this is not how most people work. This is part of an exercise where I am trying to encourage my team to get into the habit of documenting their software. The pandemic has made things worse because reviewing the UI / UX now requires a meeting instead of being able to just view the documentation against their repositories. Any ideas that you might have to improve this workflow / process are most welcome. I'm more of a Java/JVM guy .. which isn't helping and making the hacking process that much more challenging. Any help is greatly appreciated. |
@slorber I created the docusaurus-plugin-relative-paths to solve the issue. I used the same post-processing approach using Docusaurus |
@roguexz , if you used a modern Jamstack tool like Netlify or Vercel (both much better than GH pages), you'd get a much better experience and all PRs would have a "deploy preview" link that includes the changes from the PR, ensuring they are valid (docusaurus can build and you can check the end result a merge would lead to before doing that merge). See this Docusaurus PR, the Netlify bot added a link so that the PR can be reviewed more easily: #5462 (comment) This is very easy to set up. @ohkimur thanks! hope people will like this solution. One interesting idea could be to have 2 mods:
|
@slorber I think this is a great idea. If you want, you can open an issue here and I will work on it. 🐱👤 |
Picking up on @RDIL's comment: I added the build output files to an electron app and encountered a few issues. After specifying each of the files in package.json and with |
A very rough way to fix script references is Also, not sure if this is documented anywhere but to dev with npm run start, I had to deactivate Docusaurus creates quite a few js files to keep track of if you work in an environment that requires you to list every single file. I'm used to react-static, and its builds consist of far fewer files. |
@larissa-n Thank you for your observations. I know about the issue you mentioned, but I didn't fix it since I didn't find an elegant approach to do it. If you already have a potential solution (even though it's messy) I invite you to make a pull request in the plugin's repo. I can extend it later if necessary. Also, can you describe the problem you had when you tried |
Why must they be fully qualified, @slorber? I've just started to use docusaurus and I find the Is this necessity documented? |
It's not just coupling to a There are multiple things in Docusaurus relying on that, in particular SEO metadata like the canonical URL:
<link data-rh="true" rel="canonical" href="https://docusaurus.io/docs/myDoc">
What Google says: https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls
Although relative URLs seem supported (maybe only by Google?), it's not recommended.
Similarly, meta hreflang headers for i18n sites: https://developers.google.com/search/docs/advanced/crawling/localized-versions (including the transport method means you also can't switch from HTTP to HTTPS without a Docusaurus config change)
Similarly for
<meta property="og:image" content="https://docusaurus.io/img/socialcard.png"/>
Using a relative URL can lead to failures to display the card and does not respect the spec: https://ogp.me/#data_types
It's not a Docusaurus-side constraint, it's a constraint that comes from outside. You really have to build your site for a specific protocol/domain/baseUrl. Now I understand in some cases you don't care about the features above and prefer to have more "deployment flexibility", but for now we don't support that. |
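To illustrate why these tags depend on the site's protocol, domain and baseUrl, here is a minimal sketch of how a fully qualified URL could be assembled from the config values. `absoluteUrl` is a hypothetical helper written for this example, not a Docusaurus API, and it assumes that `baseUrl` ends with a slash.

```javascript
// Hypothetical helper (not part of Docusaurus): joins the site `url`, the
// `baseUrl` and a route into the fully qualified URL that tags such as
// rel="canonical" or og:image expect.
// Assumption: baseUrl ends with "/", otherwise its last segment is dropped.
function absoluteUrl(siteUrl, baseUrl, routePath) {
  // Resolve baseUrl against the domain, e.g. https://docusaurus.io/
  const base = new URL(baseUrl, siteUrl);
  // Strip the leading slash so the route resolves under baseUrl
  // instead of under the domain root.
  return new URL(routePath.replace(/^\//, ''), base).href;
}
```

With `absoluteUrl('https://docusaurus.io', '/', '/docs/myDoc')` the result is `https://docusaurus.io/docs/myDoc`. None of the three inputs can be left out, which is why a build that knows nothing about its final domain cannot emit these tags correctly.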
Fantastic answers. Thank you. |
@ohkimur it looks like you completely deleted your docusaurus-plugin-relative-paths project? |
@jeacott1 Yeah. I did. I want to invest my time into something different. |
@ohkimur I appreciate your OSS work! Could you please put your docusaurus repos up as 'Archived'? Even temporarily? (I'm trying to help a former student make sense of some README notes left by a previous dev. It has permalinks to your If you'd rather not deal with it at all, I do understand. I hope your new focus is rewarding. Best wishes, Dan. |
This comment was marked as off-topic.
Hey, I'm sorry but all these discussions are off-topic and I'm hiding them. This is not the place to ask about what Docusaurus is. I'm still going to answer briefly.
Docusaurus is a React-based static site generator. It generates static HTML pages using React, and then hydrates React client-side. When navigating, we don't navigate to another HTML document, but we render the next page with client-side JavaScript using soft navigations and the This kind of navigation is what permits Docusaurus to feel fast when clicking on a link, and also preserves the state of UI elements on the current page (for example the collapsible state of sidebar categories).
This is a very different model from Jekyll, Hugo, Eleventy, MkDocs, Sphinx, and many other SSG tools that do not use client-side navigation and use a more traditional/old-school approach, but are usually less "interactive". Docusaurus v1 also worked that way, using React only during the build process and not loading React on the client side.
If you open your Chrome DevTools network tab on v1.docusaurus.io VS docusaurus.io, you will notice a big difference when navigating. v1 will request a new HTML page, while now v2+ will request JS to render locally the new page.
If you don't understand what Docusaurus, React, hydration, SPA, history API, and all these things are, then it is unlikely that you will be able to help us solve this issue. |
Investigations
I've investigated 2 approaches so far:
I have also investigated using external tooling such as
mkdir wget-test
cd wget-test
cp -R ../projects/docusaurus/website/build/ .
wget -mpEknH http://localhost:3000/
This kind of works, but it is somehow the same solution as the first one (SSG) where each page has a dedicated static html file.
SSG
To me, the SSG approach is quite challenging. Notably, I'm not even sure dependencies such as React-Router can do routing using the However, it could work decently if you are ok with opting out of the SPA mode of Docusaurus and are ok with not hydrating React on the client. This means that things we implement with interactive React code will not work (tabs, theme switch, category collapse button, mobile drawer...). We try to make things work without any JS (#3030) but there are still a few things that require JS and/or React. This also makes it impossible for you to include React interactive inside docs (through MDX) however non-interactive React elements (such as admonitions) are perfectly fine. If you want to give this mode a try, I'd suggest to run this command on your computer. This is kind-of the equivalent of the HTML post-processing scripts that people shared earlier, the links and assets will use relative paths.
wget -mpEk https://tutorial.docusaurus.io/
(there are JS loading console errors, but that's kind of on purpose: if JS succeeds in loading, then you'll get a 404 page being rendered after React hydration because React-Router does not know what to render for
Hash Router
The Hash Router solution looks easier to implement, and I'm almost able to make it work on our website, apart from a few linking edge cases to investigate. However, I'm not sure if it's the solution the community is looking for considering there would be a single HTML file emitted, and that file would initially be empty. Here's the deploy preview online demo of the Hash Router based app: The local app using You will notice that we have a "loading..." screen before the content appears. This is because the initial html file is empty and all the app is rendered with JS. |
+1 for the Hash Router. We use Docusaurus for interactive teaching-websites in a highschool and we'd like to give our students the ability to get a snapshot of the website when they completed their grade and leave the school. Important points for our usecase:
Thanks a lot - this really would be a huge thing for our school-department :) |
@slorber hashrouter sounds like an excellent solution to me. Definitely preferable to SSG imo, and leaves more features available over the long term. |
Thanks for your feedback. I'll focus on implementing proper support for the Hash Router then. It doesn't mean that we won't eventually support other "modes" later, but at least this one is a good starting point. Other alternatives to be considered:
Of course, I'm not a fan of those approaches (using a bazooka to kill a fly), but afaik you could implement them in userland today if you really need to solve this problem right now and be able to package/distribute your docs for offline usage. |
Could you confirm that plain static html files for each page will continue to be supported for the foreseeable future? |
This would be a new build mode you enable through a CLI option. So yes everything else will remain retrocompatible and Docusaurus will remain a static site generator |
@slorber ... thank you for working on this. Just to confirm another use case, I'm hosting documentation in this fashion within an ERP platform (which I won't mention) due to abysmal support for doing anything remotely useful or flexible for this purpose. It's basically a static resource defined within the ERP environment, which gives me authentication by default for users already authenticated into the ERP. So now I can have a flexible git-controlled documentation process and keep my docs private and secure. I could even build a CI/CD process to load updates into the ERP if I wanted. Right now, I'm using the post-process solution created by @andrigamerita. I have to wrap it using a supported technique, but as long it works completely offline it works. Thanks! |
Wow, looking forward to try it! Thanks for implementing this option 😍🥳 |
Hey 👋 The Hash Router PR has been merged: #9859 The hash router is useful in rare cases, and will:
You can try this new experimental site option:
export default {
  future: {
    experimental_router: 'hash', // defaults to "browser"
  }
}
If you need to switch conditionally between normal/browser router and hash router, you can use a Node env variable. We don't provide any To dogfood this feature, make it easier to review and ensure it keeps working over time, we build our own website with the hash router and:
An example artifact you can download is available here: https://github.com/facebook/docusaurus/actions/runs/9159577535 This will download a Unzipping it gives you a static deployment. You can open it and browse locally without a web server by simply clicking the
EXPERIMENTAL FEATURE: The hash router is experimental. It will be released in Docusaurus v3.4, but can already be tried in canary releases. We strongly discourage you from using a Otherwise, there may be unhandled edge cases that we missed, so please report here any issue you have by providing a repro. Remember that third-party plugin authors may also need to adjust their code to support this new router. Although it should work out-of-the-box for most plugins, we can't guarantee that it will. Thanks and please let us know if this feature works well for you. |
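The comment above mentions switching between the browser router and the hash router with a Node env variable, without showing how. A minimal sketch of one way to do it follows. The variable name `DOCUSAURUS_ROUTER` and the helper `chooseRouter` are made up for this example; Docusaurus does not read that variable itself.

```javascript
// Sketch only: DOCUSAURUS_ROUTER is a hypothetical variable name chosen for
// this example. Anything other than "hash" falls back to the default router.
function chooseRouter(env) {
  return env.DOCUSAURUS_ROUTER === 'hash' ? 'hash' : 'browser';
}

// Same shape as the config object shown above, with the router picked at
// build time. In a real config file this object would be the default export.
const config = {
  future: {
    experimental_router: chooseRouter(process.env),
  },
};
```

Under these assumptions, running the build with `DOCUSAURUS_ROUTER=hash` set in the environment would produce the hash router variant, while a plain build keeps the regular static output.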
In my case, I use the browser route, and Traefik uses the
Due to the
I can't modify the configurations of traefik and nginx, and |
@pfdgithub your comment is quite hard for me to understand. So far I am not even sure if it is even related to the current issue because none of the URLs you share have a hash, and the hash part of the URL shouldn't affect routing and redirects in any way. If you want help, make sure it's relevant to the current issue, create a smaller repro, and try to explain better including fully qualified urls because the way you share urls right now does not even make it clear which router config you use. |
Sorry, this comment is not about hash route. It is a further discussion of the following comment. #448 |
Awesome work! But it's a pity that some local search plugins are not compatible with this feature yet, for example https://github.com/easyops-cn/docusaurus-search-local |
@dingbo8128 unfortunately all search plugins I know crawl the static HTML files. Since we now emit a single empty HTML file and use client-side JS to display the actual content, it's not possible to crawl the HTML files anymore for search engines to index your content. The community will have to provide a different implementation for this new hash router mode. Since we can't read the HTML files directly, it will likely require using a headless browser to run the HTML pages and extract the rendered content out of it. Maybe external search engines like Algolia would keep working, considering they run an external crawler. I don't know, if someone gives it a try I'm curious. Although, it's not ideal since it would require network access to get the search results. Note that our sitemap does not emit a |
I'm running into an issue with the generated links returning a 404. Here's an example from the deployed Docusaurus site after navigating to Blog > Docusaurus 3.4 > Hash Router - Experimental (from right side menu). https://facebook.github.io/docusaurus/#/blog/releases/3.4%23hash-router---experimental The link should direct users to the |
@dspatoulas how did you obtain that link? The GitHub UI doesn't show it this way, but your link is:
The same link using https://facebook.github.io/docusaurus/#/blog/releases/3.4#hash-router---experimental And afaik nowhere in our UI we use To be honest I'm surprised it doesn't work in |
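The two links differ only in `%23` versus `#`. The sketch below shows one plausible reason that matters: a router that splits the URL fragment on a literal `#` would see the encoded link as a single long route. `parseHashLocation` is an illustration written for this example; it is not how Docusaurus or React Router actually parse locations, which this thread does not show.

```javascript
// Illustrative only: mimics how a hash-based router could split the URL
// fragment into a route and an in-page anchor. Not actual Docusaurus code.
function parseHashLocation(href) {
  // URL.hash includes the leading "#" and keeps percent-escapes as written.
  const fragment = new URL(href).hash.slice(1);
  const anchorStart = fragment.indexOf('#');
  if (anchorStart === -1) {
    return { route: fragment, anchor: '' };
  }
  return {
    route: fragment.slice(0, anchorStart),
    anchor: fragment.slice(anchorStart + 1),
  };
}
```

For the link ending in `3.4#hash-router---experimental`, this yields the route `/blog/releases/3.4` and the anchor `hash-router---experimental`. For the `%23` variant the whole string stays in `route`, so a lookup against known routes would find nothing.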
Hello, I might have a use case + solution: I tried your solution to produce a static website, and it seems to work in the network with limited user rights. If I don't use baseURL, it works perfectly. I tried to use baseURL, and here is what seems to change, in my opinion:
I didn't notice anything else in the static website. Mine is a simple one, so I must have skipped some problems, but the ones I had weren't hard ones. PS: English is not my native language, sorry if my English is a bit rusty |
@hellfiremaga thanks for sharing your use case but I can't do anything with this unfortunately
This is very vague and I'm not sure you "need" to use a baseURL. Please prove it. Show me concrete details of your setup, including very concrete examples of the other docs in your "network", how you open those, what are the concrete browser URLs of each docs sites, what are the exact file system locations of these docs sites. I'm not even sure what you mean by "network". If there's no web server allowed, there's no network, you open the site locally using the For these reasons, in its current state, I cannot take your feedback into consideration and decide it's worth it to add support for hash router baseUrl. |
@slorber Actually, you're right, I agree with you: I don't really need this feature. I was able to make a working static website without it, only by making a few changes in my Docusaurus configuration. Thank you for it, a really cool feature! But, in my opinion, supporting the base URL doesn't seem like hard work either. I'll explain my use case a bit more; maybe I'm just using Docusaurus wrong or I explained it wrong. For professional reasons, I'm using Docusaurus for the documentation of an internally developed piece of software (let's call it Msoft). I was inspired by colleagues who already use Docusaurus for another piece of software (let's call it Osoft) that I work with but don't develop. I'm sure we will have more projects like this in the future, but let's stick to the present. In my company, we have multiple networks (I'm not sure it's the right term, we use "information system") and many of them are disconnected from the internet. Here are some examples:
Sorry for the length of my explanation, I wanted you to have enough details about my use case. This is why I don't need a baseURL, but I'm pretty sure I will forget to revert the baseURL after a static build and lose time tracking down the problem. Interaction between docs is not mandatory; if we have two static websites for two sets of documentation, that's OK. |
@hellfiremaga thanks for the explanation, but this is still not concrete enough. I don't see any concrete file path or URLs being shared here, as I asked.
When you use the hash router, you don't need to use a baseUrl to avoid conflicts. The conflicts are already avoided by opening different files on the file system: These won't conflict:
Using a baseUrl will only lead to a longer URL:
There's no real benefit to using hash router + baseUrl, unless proven otherwise.
You don't need to "hardcode" the baseUrl in your config file, you can pass it dynamically with an env variable for example. You could have scripts in package.json that enable you to conveniently choose the variant you want to build:
"build": "docusaurus build",
"build:withBaseURL": "BASE_URL='/baseUrl' docusaurus build",
"build:withDifferentConfig": "docusaurus build --config docusaurus.config.different.ts" |
@slorber I guess you found the solution I was looking for 👍 I agree that baseURL brings no benefit in a static Docusaurus website, for all the reasons you gave. My point was more about the build and the use of baseURL, which "breaks" the main page. From a simple user's point of view: they double-click the index, get "page not found", and think it's broken. Even if I give them the "good" URL (like "file:///network2/osoft/index.html/#/osoft/"), at some point they will try to get back to the main page via the top-left logo, get "page not found", and think it's broken. But your solution of using the environment variable is clever. I have to admit I didn't even know we could do such a thing. Thanks a lot for that! I still think it could be a good idea to ignore the base URL for static sites. Or, as you proposed, treat it as an error. Other users would have the same problem and, if someone finds a real use case, it can still be reverted :) Oh, and thanks for your quick answer and help, highly appreciated :) |
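For readers who have not used the env-variable approach discussed in the last two comments, here is a minimal sketch of the config side. `BASE_URL` mirrors the variable set in the `build:withBaseURL` script; the helper `resolveBaseUrl` and the fallback to `/` are assumptions made for this example.

```javascript
// Sketch: reads the BASE_URL variable set by the "build:withBaseURL" script.
// The helper name and the "/" fallback are assumptions for illustration.
function resolveBaseUrl(env) {
  const raw = env.BASE_URL ?? '/';
  // Normalise to a leading and a trailing slash, e.g. "/baseUrl" -> "/baseUrl/".
  return '/' + raw.split('/').filter(Boolean).map((segment) => segment + '/').join('');
}

// In a config file, the result would be used as the `baseUrl` field.
const baseUrl = resolveBaseUrl(process.env);
```

A plain `docusaurus build` then keeps `/`, so the offline hash router build never picks up a base URL by accident, while `BASE_URL='/baseUrl' docusaurus build` produces `/baseUrl/` for the hosted variant.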
Thanks. Let's wait for more feedback. If nobody has a use-case for hash router + baseUrl we can force it to |
@hellfiremaga why not differentiate based on file path instead of hash route? or, you said there was a webserver and gitlab, so either multiple ports or a reverse proxy I assume?, why not just host your docusaurus as a website? |
🚀 Feature
docusaurus build will build a local production version that has to be docusaurus serve'd to be usable. Can we add an option to build offline static HTML files that are usable completely without any server, so users can just open index.html with a browser to read the whole documentation?
It's about calculating relative (instead of absolute) URLs, and appending "index.html" at the end of the URLs. Algolia search will have to be removed, and any online cloud assets will have to be put in local folders.
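The rewriting described here (relative instead of absolute URLs, with "index.html" appended) can be sketched as a small path function. `toRelativeUrl` is a hypothetical helper for illustration, not an existing Docusaurus feature, and it assumes that any path whose last segment has no dot is a page served from a directory's index.html.

```javascript
// Hypothetical helper, not part of Docusaurus: rewrites an absolute site path
// so it resolves from a given page when the build is opened via file://.
function toRelativeUrl(fromPagePath, targetPath) {
  // Assumption: a last segment without a dot is a page directory that
  // contains an index.html; anything with a dot is treated as a file.
  const withIndex = (p) => {
    if (p.endsWith('/')) return p + 'index.html';
    const last = p.split('/').pop();
    return last.includes('.') ? p : p + '/index.html';
  };
  const from = withIndex(fromPagePath).split('/').filter(Boolean);
  const to = withIndex(targetPath).split('/').filter(Boolean);
  from.pop(); // relative URLs resolve against the page's directory

  // Skip the leading directories that both paths share.
  let shared = 0;
  while (shared < from.length && shared < to.length - 1 && from[shared] === to[shared]) {
    shared++;
  }
  const up = from.slice(shared).map(() => '..');
  return [...up, ...to.slice(shared)].join('/');
}
```

From the page `/docs/intro`, a link to `/docs/api/` becomes `../api/index.html`, and a link to the home page `/` becomes `../../index.html`. A real post-processing step would also have to rewrite script and stylesheet references, which this sketch does not attempt.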
Have you read the Contributing Guidelines on issues?
Yes
Comment, Motivation, Pitch
What about other static site generators and libraries?
The build commands of Gatsby, React, etc. all do a similar thing: they all need a server.
Gatsby has this feature request for an option to build such an offline static HTML site: gatsbyjs/gatsby#4610, which was closed without the issue being solved. Users keep asking for the feature and for reopening the issue. According to one comment, Gatsby v1 could actually generate such a static site; it is in v2 that it doesn't work.
React serves general purposes and Gatsby is made for any website. But Docusaurus is primarily made for documentation, so it may need offline version generation more than React and Gatsby do.
PDF and ebook formats
There is already a feature request, #969, that asks for an option to create an offline version in PDF format. It would obviously be brilliant to be able to make PDF and maybe also EPUB, MOBI, AZW. PDF and these ebook formats may raise fewer security concerns than HTML. But the downsides are that the PDF feature may be a little time-consuming to achieve, and the interactive navs, TOCs, colorful website design and layout will have to be removed in PDF and other ebook formats. Offline static HTML is easier to make. If the PDF feature is in the long-term plan, then offline static HTML could be on a shorter-term to-do list.
Compressed web file format
The offline static web files usable without server, could be simply compressed as a zip or in other common archive formats. User will need to uncompress the file and click index.html in the root folder to use it.
They can also be compiled in CHM (Microsoft Compiled HTML Help), problem is it is a bit old and it does not have native support in non-Windows OS. It's a little surprising there's no standard or universally accepted file format similar to CHM. Perhaps it's due to security concerns.