
rustdoc: Implement cross-crate searching #12946

Closed
wants to merge 1 commit

Conversation

alexcrichton (Member)

A major discoverability issue with rustdoc is that all crates have their
documentation built in isolation, so it's difficult when looking at the
documentation for libstd to learn that there's a libcollections crate with a
HashMap in it.

This commit moves rustdoc a little closer to improving the multiple crate
experience. This unifies all search indexes for all crates into one file so all
pages share the same search index. This allows searching to work across crates
in the same documentation directory (as the standard distribution is currently
built).

This strategy involves updating a shared file amongst many rustdoc processes, so
I implemented a simple file locking API for handling synchronization for updates
to the shared files.

cc #12554

huonw (Member) commented Mar 16, 2014

The crates column should have a consistent order (alphabetical?).

I really like this, but I'm slightly nervous about it: it makes the documentation index a huge(r) blob (especially with syntax and rustc, like the online docs have) that is loaded unconditionally: #12597 is a bigger priority with this change IMO.

alexcrichton (Member, Author)

The search index is loaded asynchronously, so the page is usable before it's loaded. Additionally, it's super-cacheable because all crates and pages share the exact same index. I agree that it's a little large. With rustc/syntax, it's 1.7MB, 264K compressed. This is what we're buying into, though, if we want to be able to search across crates.

For comparison, Google's homepage downloaded over a megabyte of JS.

sfackler (Member)

There should probably be some indication that you're going to a different crate. I'd be confused if I found Timespec in the libstd docs but couldn't use std::time::Timespec.

huonw (Member) commented Mar 16, 2014

> The search index is loaded asynchronously, so the page is usable before it's loaded

The search isn't usable immediately, and on even just mildly slow internet connections, the delay between the page loading and search working is annoying.

> For comparison, Google's homepage downloaded over a megabyte of JS.

1 MB after decompressing it, and I'm sure their mobile site is significantly smaller.

> Additionally, it's super-cacheable because all crates and pages share the exact same index

Do we actually set the appropriate caching headers so that it can be cached?

alexcrichton (Member, Author)

> There should probably be some indication that you're going to a different crate.

Each page has an accurate "breadcrumb trail" at the top; do you think it should be more obvious than that? Everything is prefixed with the crate it's from.

> even just mildly slow internet connections, the delay between page loading and search working is annoying

Our options here are to abandon this completely or to remove search altogether; otherwise, I think this is basically a fact of how it works.

> Do we actually set the appropriate caching headers so that it can be cached?

It's all hosted on S3, which apparently does the right thing. I'm getting lots of 304s on the homepage.

huonw (Member) commented Mar 16, 2014

> Our options here are abandon this completely or remove search altogether, otherwise I think this is basically a fact of how it works?

With compression, the transfer time would be much smaller; or, if we had the option of a dynamic backend, the main website wouldn't need to transfer anything. (The former is the solution I'm suggesting we should try to adopt as soon as possible.)

> It's all hosted on S3, which apparently does the right thing. I'm getting lots of 304s on the homepage.

Ok, awesome.

The code under discussion in the inline review comments below:

    };
    if ret == -1 {
        unsafe { libc::close(fd); }
        fail!()
Inline comment (Member):
fail!("could not lock {}", p.display())? (And similarly below.)

Inline comment (Member):

Also, won't this cause rustdoc to just fail when another rustdoc process is running? Should it be sleeping and then retrying a few times? (My reading of the man-page is that fcntl doesn't block and just returns immediately?)

Inline reply (Member, Author):

Apparently F_SETLKW is special (this code is all from online examples, I'm not intimately familiar with file locking).

From the manpage:

    F_SETLKW
        This command is the same as F_SETLK except that if a shared
        or exclusive lock is blocked by other locks, the process
        waits until the request can be satisfied. If a signal that
        is to be caught is received while fcntl is waiting for a
        region, the fcntl will be interrupted if the signal handler
        has not specified the SA_RESTART (see sigaction(2)).

Which means that this should probably handle EINTR, but rustdoc usage does not currently generate any signals.

Inline reply (Member):

Ah, of course, I was reading the section for F_SETLK; my mistake.

huonw (Member) commented Mar 17, 2014

r=me after comments (especially the file locking stuff).

DaGenix commented Mar 17, 2014

FYI: it compresses from 838,669 bytes down to 135,868.

ehsanul (Contributor) commented Mar 18, 2014

It shouldn't be too bad getting main.js compressed. S3 doesn't automatically handle the Accept-Encoding header unfortunately, but it can be made to work in the following (hacky) manner: http://stackoverflow.com/a/5502390/127219

huonw (Member) commented Mar 18, 2014

@ehsanul see #12597 too.

@bors bors closed this in 87e72c3 Mar 19, 2014
@alexcrichton alexcrichton deleted the discover branch March 19, 2014 04:53