Improve search data by refining the article contents metadata #425
Comments
@cderv Could you offer your opinion on this? When building the rjournal distill website with >1000 articles, this file gets very large, causing a substantial slowdown. My fork helps with this, but I expect to need to reduce the scope of the Alternatively, is there some way of suppressing the generation of
Hi, thanks for pinging! With such a large website, I can see how this can be problematic. We can definitely refine the search.json file. We already do that in bookdown. Stripping I'll have a closer look at what is done in
You can deactivate the search feature in distill using Then you would need to come up with your own processing to create the search. This is also feasible if you want something quite specific.
Thanks! I did start by trying to strip out Would it be possible to add a condition on Lines 90 to 91 in d5545b7
Oh indeed, it would make sense to not write the json in that case. We should probably check using Lines 370 to 381 in d5545b7
and only write the file when activated. Do you want to make a PR and test it on your website?
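As a rough illustration of the idea above (skip writing the file when search is off), here is a minimal sketch in Python rather than distill's own R code. The function name `write_search_index`, the `search_enabled` flag, and the JSON layout are all hypothetical and only stand in for whatever check the package actually performs.

```python
import json
import tempfile
from pathlib import Path


def write_search_index(articles, output_dir, search_enabled):
    """Write search.json only when search is enabled.

    Returns the written path, or None when nothing was written.
    All names and the JSON layout here are hypothetical.
    """
    if not search_enabled:
        # Search is deactivated: skip generating the (potentially large) file.
        return None
    path = Path(output_dir) / "search.json"
    path.write_text(json.dumps({"articles": articles}), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    articles = [{"path": "posts/a/", "title": "A", "contents": "Some text"}]
    skipped = write_search_index(articles, tmp, search_enabled=False)
    written = write_search_index(articles, tmp, search_enabled=True)
    print(skipped, written.name)
```

Returning `None` when nothing is written lets the caller tell the two cases apart without checking the file system.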
@mitchelloharawild #449 should solve part of the issue by allowing you to not write the I'll also look at how to correctly remove some content from the document in the search file.
Great, thanks!
Very nice to see this being implemented, many thanks.
Oh yes, this definitely seems related. I'll take that into account. Thanks a lot!
* Do not write search file if search is deactivated in config Related to #425
I have made the change in the dev version so that content in This should make your website with htmlwidgets like plotly a lot quicker to load.
Great, thanks!
Would it be possible to modify the xpath expressions to exclude non-text content?
Having a few large interactive outputs results in search.json and collections being bloated and slow. I've made a small change (mitchelloharawild@153f8f9) as a quick fix to only use text within paragraphs for the article contents, but I'm sure this can be improved to only avoid JS within the article's body.
I've attached a minimal example that contains a page with an interactive plot (distill-test.zip).
You can see that the plot data is picked up in the article's search contents.
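To make the request concrete, here is a minimal sketch of keeping an article's text while dropping script content. It is written in Python with the standard `html.parser` module, not with XPath and not with distill's R code, so it only illustrates the filtering idea. The class `VisibleTextExtractor`, the function `article_search_text`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def article_search_text(html):
    """Return the text of an HTML fragment without script/style content."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical article body with widget data embedded in a script tag.
sample = """
<article>
  <p>An interactive plot follows.</p>
  <script type="application/json">{"x": [1, 2, 3], "y": [4, 5, 6]}</script>
  <p>Closing remarks.</p>
</article>
"""
print(article_search_text(sample))
```

Unlike the paragraph-only quick fix, this keeps text from headings, lists, and other elements, and drops only the embedded widget data.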
Created on 2021-11-24 by the reprex package (v2.0.0)