The final stage of indexing is to output a static HTML file for:
- Every source file, via `tools/src/bin/output-file.rs`.
- Every directory, linking to subdirectories and source files, via `scripts/output-dir.js`.
- The search file template used by `router/router.py`, via `scripts/output-template.js`. (The template is little more than the HTML UI boilerplate, a place to inline the JSON-style results object, and a "load" listener to trigger the JS logic to render the results.)
- The `help.html` file at the root of the output tree. `scripts/output-help.html` wraps the contents of the config tree's `help.html` in the HTML boilerplate of the UI so that the standard search bar is at the top of the page.

Because output logic is currently split between Rust and JS code, any structural
changes will require changes to both `scripts/output.js` and
`tools/src/output.rs`.

The source file output code lives in `tools/src/bin/output-file.rs` and
`tools/src/output.rs`. The main formatting loop is in `tools/src/format.rs`.
The inputs to this process are:
- The original source code, either from the file system (for the current version) or from version control (for historical versions).
- Blame information from the blame repository.
- Analysis records generated for the given file.
- Jump information generated by the cross referencer.

The original code is tokenized using one of two hand-coded tokenizers
(both in `tools/src/tokenize.rs`). One tokenizer recognizes C-like
languages (JS, C++, IDL, Python) and the other recognizes tag-based
languages (HTML, XML).
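
As a rough illustration of the choice between the two tokenizers, the sketch below picks a tokenizer family from a file extension. The `TokenizerKind` enum, the `tokenizer_for` function, the extension table, and the `Plain` fallback are all hypothetical names and assumptions made for this example; the actual selection logic in `tools/src/tokenize.rs` may differ.

```rust
use std::path::Path;

/// Hypothetical tag for the tokenizer families described above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum TokenizerKind {
    /// JS, C++, IDL, Python
    CLike,
    /// HTML, XML
    TagBased,
    /// Assumed fallback: anything else is emitted without tokenization.
    Plain,
}

/// Choose a tokenizer family from the file extension.
/// The extension list is an assumption made for illustration only.
fn tokenizer_for(path: &str) -> TokenizerKind {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .unwrap_or("");
    match ext {
        "js" | "cpp" | "h" | "idl" | "py" => TokenizerKind::CLike,
        "html" | "xml" => TokenizerKind::TagBased,
        _ => TokenizerKind::Plain,
    }
}
```

With this table, `tokenizer_for("dom/base/nsDocument.cpp")` yields `TokenizerKind::CLike` and `tokenizer_for("index.html")` yields `TokenizerKind::TagBased`.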

The central loop in `format.rs` iterates over tokens. When it finds
an identifier token, it outputs markup for all text between the
previous identifier and this one. Then it checks whether there is an
analysis source record for the token's location. If there is,
it adds `data-` attributes to the markup that describe what the
context menu should do for that identifier. Regardless, the markup
colors the identifier based on whether it's a reserved word as well as
the `syntax` property on the source record (if there is one).
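
The loop described above can be sketched roughly as follows. This is a simplified model, not the code in `format.rs`: the `Token`, `SourceRecord`, and `format_line` names, the byte-offset keyed record map, the `syn_reserved` class, and the `data-symbol` attribute name are all assumptions chosen for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Identifier,
    Other,
}

/// Hypothetical token: a kind plus a byte range into the source text.
struct Token {
    kind: TokenKind,
    start: usize,
    end: usize,
}

/// Hypothetical analysis source record for one identifier.
struct SourceRecord {
    syntax: String,
    symbol: String,
}

/// Escape text so it can be embedded in HTML content or attributes.
fn escape_html(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(c),
        }
    }
    out
}

/// Walk the identifier tokens, emitting the text between identifiers as-is
/// and wrapping identifiers that are reserved words or have a source record.
/// `records` is keyed by the token's starting byte offset (an assumption).
fn format_line(
    src: &str,
    tokens: &[Token],
    records: &HashMap<usize, SourceRecord>,
    reserved: &[&str],
) -> String {
    let mut out = String::new();
    let mut last = 0;
    for tok in tokens.iter().filter(|t| t.kind == TokenKind::Identifier) {
        // Everything between the previous identifier and this one.
        out.push_str(&escape_html(&src[last..tok.start]));
        let text = &src[tok.start..tok.end];
        let mut classes: Vec<&str> = Vec::new();
        let mut attrs = String::new();
        if reserved.contains(&text) {
            classes.push("syn_reserved");
        }
        if let Some(rec) = records.get(&tok.start) {
            // The record contributes both coloring and context-menu data.
            classes.push(&rec.syntax);
            attrs.push_str(&format!(" data-symbol=\"{}\"", escape_html(&rec.symbol)));
        }
        if classes.is_empty() {
            out.push_str(&escape_html(text));
        } else {
            out.push_str(&format!(
                "<span class=\"{}\"{}>{}</span>",
                classes.join(" "),
                attrs,
                escape_html(text)
            ));
        }
        last = tok.end;
    }
    // Trailing text after the last identifier.
    out.push_str(&escape_html(&src[last..]));
    out
}
```

In this sketch an identifier with neither a record nor reserved-word status is emitted as plain text, which keeps the output small for the common case.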

The output code can also show annotated commit diffs. The web server
generates these diffs dynamically when the user requests an annotated
diff, by running `git diff -U100000`. The very large context value means
that, in practice, the whole file appears in the diff. All the lines
forming the "new" version of the file are also run through `format.rs`
to syntax highlight them (although there are no analysis records
available). The "old" `-` lines are then merged in at the right
locations, and the appropriate blame information is fetched for
unchanged and `-` lines.
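
The merge step might look roughly like the sketch below. It assumes the diff has already been reduced to the body lines that follow the `@@` hunk header, and that the "new" version has already been formatted into one HTML string per line. `Marker`, `DiffLine`, and `merge_diff` are hypothetical names for this example, not the web server's actual types.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Marker {
    Unchanged,
    Added,
    Removed,
}

#[derive(Debug, PartialEq, Eq)]
struct DiffLine {
    marker: Marker,
    /// Formatted HTML for unchanged/added lines, raw text for removed lines.
    content: String,
    /// Per the text above, blame is fetched for unchanged and `-` lines.
    needs_blame: bool,
}

/// Merge removed lines into the formatted "new" file.
/// `diff_body` holds only the lines after the `@@` hunk header (assumed).
fn merge_diff(diff_body: &[&str], formatted_new: &[String]) -> Vec<DiffLine> {
    let mut new_lines = formatted_new.iter();
    let mut out = Vec::new();
    for line in diff_body {
        let (marker, rest) = match line.chars().next() {
            Some('+') => (Marker::Added, &line[1..]),
            Some('-') => (Marker::Removed, &line[1..]),
            Some(' ') => (Marker::Unchanged, &line[1..]),
            // Skip anything else, e.g. "\ No newline at end of file".
            _ => continue,
        };
        let content = match marker {
            // Removed lines are not in the new file, so keep the raw text.
            Marker::Removed => rest.to_string(),
            // Other lines consume the next formatted line of the new file.
            _ => new_lines
                .next()
                .cloned()
                .unwrap_or_else(|| rest.to_string()),
        };
        out.push(DiffLine {
            marker,
            content,
            needs_blame: marker != Marker::Added,
        });
    }
    out
}
```

Because unchanged and added lines each consume exactly one formatted line, removed lines land at the right position without any explicit line-number bookkeeping.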