fix(compartment-mapper): Stabilize hashes in face of layout changes
Previously, compartment names in an archive always included a sequence
number, which allowed for the possibility of two packages with the same
name and version.
With this change, the sequence number is replaced with a duplicate
number, added only when there are multiple packages with the same name
and version.

Please see the detailed comments for the rationale for this change.
The result should be that archive hashes are much less sensitive
to differences in the layout of the dependency graph.

Fixes #919
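
To illustrate the sensitivity the message describes, here is a minimal sketch of the previous naming scheme. It is not the package's code: `oldNames` is a hypothetical helper, and the labels and the two layouts are invented. It only assumes, as the removed code shows, that the old name was the label plus the compartment's position in enumeration order.

```javascript
// Hypothetical illustration: labels and layouts are invented for this sketch.
// Old scheme: every compartment gets a sequence number based on its position.
const oldNames = labels => labels.map((label, n) => `${label}-n${n}`);

// The same application, enumerated with and without an extra package.
const production = ['app-v1.0.0', 'lib-v2.0.0'];
const development = ['app-v1.0.0', 'dev-tool-v3.0.0', 'lib-v2.0.0'];

console.log(oldNames(production));
// [ 'app-v1.0.0-n0', 'lib-v2.0.0-n1' ]
console.log(oldNames(development));
// [ 'app-v1.0.0-n0', 'dev-tool-v3.0.0-n1', 'lib-v2.0.0-n2' ]
```

The name of `lib-v2.0.0` shifts from `-n1` to `-n2` although the package itself is unchanged, so anything derived from the name, such as a file path inside the archive, changes with it.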
kriskowal authored and naugtur committed May 16, 2022
1 parent 556a448 commit 8b044a5
Showing 11 changed files with 166 additions and 21 deletions.
70 changes: 63 additions & 7 deletions packages/compartment-mapper/src/archive.js
@@ -26,7 +26,11 @@ import parserArchiveCjs from './parse-archive-cjs.js';
import parserArchiveMjs from './parse-archive-mjs.js';
import { parseLocatedJson } from './json.js';
import { unpackReadPowers } from './powers.js';
import { assertCompartmentMap } from './compartment-map.js';
import {
assertCompartmentMap,
stringCompare,
pathCompare,
} from './compartment-map.js';

const textEncoder = new TextEncoder();

@@ -51,17 +55,69 @@ const resolveLocation = (rel, abs) => new URL(rel, abs).toString();
const { keys, entries, fromEntries } = Object;

/**
* We attempt to produce compartment maps that are consistent regardless of
* whether the packages were originally laid out on disk for development or
* production, and other trivia like the fully qualified path of a specific
* installation.
*
* Naming compartments for the self-ascribed name and version of each Node.js
* package is insufficient because they are not guaranteed to be unique.
* Dependencies do not necessarily come from the npm registry and may be,
* for example, derived from fully qualified URLs or GitHub org and project
* names.
* Package managers are also not required to fully deduplicate the hard
* copy of each package even when they are identical resources.
* Duplication is undesirable, but we elect to defer that problem to solutions
* in the package managers, as the alternative would be to consistently hash
* the original sources of the packages themselves, which may not even be
* available much less pristine for us.
*
* So, instead, we use the lexically least path of dependency names,
* represented as an array of names.
* The compartment maps generated by the ./node-modules.js tooling pre-compute
* these traces for our use here.
* We sort the compartments lexically on their self-ascribed name and version,
* and use the lexically least dependency name path as a tie-breaker.
* The dependency path is logical and orthogonal to the package manager's
* actual installation location, so should be orthogonal to the vagaries of the
* package manager's deduplication algorithm.
*
* @param {Record<string, CompartmentDescriptor>} compartments
* @returns {Record<string, string>} map from old to new compartment names.
*/
const renameCompartments = compartments => {
/** @type {Record<string, string>} */
const renames = Object.create(null);
let n = 0;
for (const [name, compartment] of entries(compartments)) {
const { label } = compartment;
renames[name] = `${label}-n${n}`;
n += 1;
let index = 0;
let prev = '';

// The sort below combines two comparators to avoid depending on sort
// stability, which became standard as recently as 2019.
// If that date seems quaint, please accept my regards from the distant past.
// We are very proud of you.
const compartmentsByPath = Object.entries(compartments)
.map(([name, compartment]) => ({
name,
path: compartment.path,
label: compartment.label,
}))
.sort((a, b) => {
if (a.label === b.label) {
assert(a.path !== undefined && b.path !== undefined);
return pathCompare(a.path, b.path);
}
return stringCompare(a.label, b.label);
});

for (const { name, label } of compartmentsByPath) {
if (label === prev) {
renames[name] = `${label}-n${index}`;
index += 1;
} else {
renames[name] = label;
prev = label;
index = 1;
}
}
return renames;
};
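
As a worked example of the renaming rule, here is a simplified, self-contained sketch. It is not the module itself: `rename` and `simplePathCompare` are hypothetical stand-ins (the real `pathCompare` also compares cumulative term length), and the locations, labels and paths are invented.

```javascript
// Simplified sketch with invented data; not the compartment-mapper module.
const stringCompare = (a, b) => (a === b ? 0 : a < b ? -1 : 1);

// Stand-in for pathCompare: shorter paths first, then term by term.
const simplePathCompare = (a, b) => {
  if (a.length !== b.length) return a.length - b.length;
  for (let i = 0; i < a.length; i += 1) {
    const comparison = stringCompare(a[i], b[i]);
    if (comparison !== 0) return comparison;
  }
  return 0;
};

const rename = compartments => {
  const renames = Object.create(null);
  let index = 0;
  let prev;
  const sorted = Object.entries(compartments)
    .map(([name, { label, path }]) => ({ name, label, path }))
    // One comparator handles both keys, so sort stability does not matter.
    .sort((a, b) =>
      a.label === b.label
        ? simplePathCompare(a.path, b.path)
        : stringCompare(a.label, b.label),
    );
  for (const { name, label } of sorted) {
    if (label === prev) {
      // Second and later duplicates of a label get -n1, -n2, and so on.
      renames[name] = `${label}-n${index}`;
      index += 1;
    } else {
      // The first (or only) compartment with a label keeps the bare label.
      renames[name] = label;
      prev = label;
      index = 1;
    }
  }
  return renames;
};

const renames = rename({
  'file:///app/': { label: 'app-v1.0.0', path: [] },
  'file:///app/node_modules/a/node_modules/dup/': {
    label: 'dup-v1.0.0',
    path: ['a', 'dup'],
  },
  'file:///app/node_modules/dup/': { label: 'dup-v1.0.0', path: ['dup'] },
});

console.log(renames['file:///app/']); // app-v1.0.0
console.log(renames['file:///app/node_modules/dup/']); // dup-v1.0.0
console.log(renames['file:///app/node_modules/a/node_modules/dup/']); // dup-v1.0.0-n1
```

Only the second copy of `dup-v1.0.0` receives a suffix, and which copy that is depends on the logical dependency path rather than on the installation location.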
@@ -73,7 +129,7 @@ const renameCompartments = compartments => {
*/
const translateCompartmentMap = (compartments, sources, renames) => {
const result = Object.create(null);
for (const compartmentName of keys(compartments).sort()) {
for (const compartmentName of keys(renames)) {
const compartment = compartments[compartmentName];
const { name, label, retained } = compartment;
if (retained) {
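
The switch from `keys(compartments).sort()` to `keys(renames)` appears to rely on JavaScript's property enumeration order: string keys that are not array indices are listed in insertion order. Iterating `renames` would therefore visit compartments in the label-and-path order in which `renameCompartments` inserted them. A minimal sketch with invented locations:

```javascript
// Hypothetical locations; keys are inserted in an already-chosen order.
const renames = Object.create(null);
renames['file:///z/app/'] = 'app-v1.0.0';
renames['file:///a/lib/'] = 'lib-v2.0.0';

// Insertion order, independent of how the locations themselves sort.
console.log(Object.keys(renames));
// [ 'file:///z/app/', 'file:///a/lib/' ]

// Sorting the keys instead ties the order to the on-disk locations.
console.log(Object.keys(renames).sort());
// [ 'file:///a/lib/', 'file:///z/app/' ]
```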
43 changes: 43 additions & 0 deletions packages/compartment-mapper/src/compartment-map.js
@@ -13,6 +13,49 @@ const moduleLanguages = [
'pre-cjs-json',
];

/** @type {(a: string, b: string) => number} */
// eslint-disable-next-line no-nested-ternary
export const stringCompare = (a, b) => ((a === b ? 0 : a < b ? -1 : 1));
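
A brief sketch of what this comparator does, using invented labels. The `<` operator on strings compares UTF-16 code units, so the result does not depend on the host's locale the way `String.prototype.localeCompare` can, which is presumably why it suits a reproducible sort key.

```javascript
// Same definition as above, repeated so the sketch is self-contained.
const stringCompare = (a, b) => (a === b ? 0 : a < b ? -1 : 1);

// Invented labels: uppercase letters sort before lowercase by code unit.
const labels = ['b-v1.0.0', 'a-v1.0.0', 'B-v1.0.0'];
const sorted = [...labels].sort(stringCompare);

console.log(sorted);
// [ 'B-v1.0.0', 'a-v1.0.0', 'b-v1.0.0' ]
```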

/**
* @param {number} length
* @param {string} term
*/
const cumulativeLength = (length, term) => {
return length + term.length;
};

/**
* @param {Array<string>} a
* @param {Array<string>} b
*/
export const pathCompare = (a, b) => {
// Prefer the shortest dependency path.
if (a.length !== b.length) {
return a.length - b.length;
}
// Otherwise, favor the shortest cumulative length.
const aSum = a.reduce(cumulativeLength, 0);
const bSum = b.reduce(cumulativeLength, 0);
if (aSum !== bSum) {
return aSum - bSum;
}
// Otherwise, compare terms lexically.
assert(a.length === b.length); // Reminder
// This loop guarantees that if any pair of terms is different, including the
// case where one is a prefix of the other, we will return a non-zero value.
for (let i = 0; i < a.length; i += 1) {
const comparison = stringCompare(a[i], b[i]);
if (comparison !== 0) {
return comparison;
}
}
// If all pairs of terms are the same respective lengths, we are guaranteed
// that they are exactly the same or one of them is lexically distinct and would
// have already been caught.
return 0;
};
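
To make the three tiers of `pathCompare` concrete, here is a self-contained sketch that restates the comparator in compact form and sorts a few invented paths. The helper name `totalLength` is an assumption of this sketch, standing in for the `reduce` with `cumulativeLength`.

```javascript
const stringCompare = (a, b) => (a === b ? 0 : a < b ? -1 : 1);

// Hypothetical helper: sum of the lengths of all terms in a path.
const totalLength = terms => terms.reduce((sum, term) => sum + term.length, 0);

const pathCompare = (a, b) => {
  // 1. Fewer terms wins.
  if (a.length !== b.length) return a.length - b.length;
  // 2. Then fewer characters overall wins.
  const difference = totalLength(a) - totalLength(b);
  if (difference !== 0) return difference;
  // 3. Then compare term by term.
  for (let i = 0; i < a.length; i += 1) {
    const comparison = stringCompare(a[i], b[i]);
    if (comparison !== 0) return comparison;
  }
  return 0;
};

// Invented dependency paths.
const paths = [['a', 'dup'], ['dup'], ['zz'], ['b', 'dup'], ['long-name']];
const sorted = [...paths].sort(pathCompare);

console.log(JSON.stringify(sorted));
// [["zz"],["dup"],["long-name"],["a","dup"],["b","dup"]]
```

Note that `['zz']` sorts before `['dup']` because cumulative length is checked before lexical order.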

/**
* @template T
* @param {Iterable<T>} iterable
43 changes: 35 additions & 8 deletions packages/compartment-mapper/src/node-modules.js
@@ -23,6 +23,7 @@
* @typedef {Object} Node
* @property {string} label
* @property {string} name
* @property {Array<string>} path
* @property {boolean} explicit
* @property {Record<string, string>} exports
* @property {Record<string, string>} dependencies - from module name to
@@ -37,7 +38,7 @@ import { inferExports } from './infer-exports.js';
import { searchDescriptor } from './search.js';
import { parseLocatedJson } from './json.js';
import { unpackReadPowers } from './powers.js';
import { assertCompartmentMap } from './compartment-map.js';
import { pathCompare } from './compartment-map.js';

const { assign, create, keys, values } = Object;

@@ -286,6 +287,7 @@ const graphPackage = async (

Object.assign(result, {
name,
path: undefined,
label: `${name}${version ? `-v${version}` : ''}`,
explicit: exports !== undefined,
exports: inferExports(packageDescriptor, tags, types),
@@ -412,6 +414,33 @@ const graphPackages = async (
return graph;
};

/**
* Compute the lexically shortest path from the entry package to each
* transitive dependency package.
* The path is an array of dependency names.
* An undefined path is a sentinel for a path that has not been computed.
*
* The shortest path serves as a suitable sort key for generating archives that
* are consistent even when the package layout on disk changes, as the package
* layout tends to differ between installations with and without
* development-time dependencies.
*
* @param {Graph} graph
* @param {string} location
* @param {Array<string>} path
*/
const trace = (graph, location, path) => {
const node = graph[location];
if (node.path !== undefined && pathCompare(node.path, path) <= 0) {
return;
}
node.path = path;
for (const name of keys(node.dependencies)) {
trace(graph, node.dependencies[name], [...path, name]);
}
};
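
A small runnable sketch of how `trace` behaves on a graph with two routes to the same package and a dependency cycle. The graph, its locations, and the compact restatement of `pathCompare` are assumptions made for illustration; only the body of `trace` mirrors the code above.

```javascript
const stringCompare = (a, b) => (a === b ? 0 : a < b ? -1 : 1);
const totalLength = terms => terms.reduce((sum, term) => sum + term.length, 0);
const pathCompare = (a, b) => {
  if (a.length !== b.length) return a.length - b.length;
  const difference = totalLength(a) - totalLength(b);
  if (difference !== 0) return difference;
  for (let i = 0; i < a.length; i += 1) {
    const comparison = stringCompare(a[i], b[i]);
    if (comparison !== 0) return comparison;
  }
  return 0;
};

const trace = (graph, location, path) => {
  const node = graph[location];
  // Stop if this node already has a path at least as good as the new one.
  if (node.path !== undefined && pathCompare(node.path, path) <= 0) {
    return;
  }
  node.path = path;
  for (const name of Object.keys(node.dependencies)) {
    trace(graph, node.dependencies[name], [...path, name]);
  }
};

// Invented graph: "shared" is reachable directly and through "a",
// and "a" depends back on the entry package, forming a cycle.
const app = 'file:///app/';
const a = 'file:///app/node_modules/a/';
const shared = 'file:///app/node_modules/shared/';
const graph = {
  [app]: { path: undefined, dependencies: { a, shared } },
  [a]: { path: undefined, dependencies: { shared, app } },
  [shared]: { path: undefined, dependencies: {} },
};

trace(graph, app, []);

console.log(JSON.stringify(graph[app].path)); // []
console.log(JSON.stringify(graph[a].path)); // ["a"]
console.log(JSON.stringify(graph[shared].path)); // ["shared"]
```

The walk first reaches `shared` through `a` and records `["a","shared"]`, then replaces it with the shorter `["shared"]`. The cycle back to the entry package terminates because the longer path `["a","app"]` never beats the existing `[]`.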

/**
* translateGraph converts the graph returned by graph packages (above) into a
* compartment map.
@@ -442,7 +471,7 @@ const translateGraph = (
// package and is a complete list of every external module that the
// corresponding compartment can import.
for (const packageLocation of keys(graph).sort()) {
const { name, label, dependencies, parsers, types } = graph[
const { name, path, label, dependencies, parsers, types } = graph[
packageLocation
];
/** @type {Record<string, ModuleDescriptor>} */
@@ -478,6 +507,7 @@ const translateGraph = (
compartments[packageLocation] = {
label,
name,
path,
location: packageLocation,
modules,
scopes,
@@ -524,18 +554,15 @@ export const compartmentMapForNodeModules = async (
packageDescriptor,
dev,
);

trace(graph, packageLocation, []);

const compartmentMap = translateGraph(
packageLocation,
moduleSpecifier,
graph,
tags,
);

// Cross-check:
// We assert that we have constructed a valid compartment map, not because it
// might not be, but to ensure that the assertCompartmentMap function can
// accept all valid compartment maps.
assertCompartmentMap(compartmentMap);

return compartmentMap;
};
2 changes: 2 additions & 0 deletions packages/compartment-mapper/src/types.js
@@ -38,6 +38,8 @@ export {};
*
* @typedef {Object} CompartmentDescriptor
* @property {string} label
* @property {Array<string>} [path] - shortest path of dependency names to this
* compartment
* @property {string} name - the name of the originating package suitable for
* constructing a sourceURL prefix that will match it to files in a developer
* workspace.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions packages/compartment-mapper/test/test-integrity.js
@@ -52,7 +52,7 @@ test('extracting an archive with a missing file', async t => {
const reader = new ZipReader(validBytes);
const writer = new ZipWriter();
writer.files = reader.files;
writer.files.delete('app-v1.0.0-n0/main.js');
writer.files.delete('app-v1.0.0/main.js');
const invalidBytes = writer.snapshot();

await t.throwsAsync(
@@ -65,7 +65,7 @@ test('extracting an archive with a missing file', async t => {
}),
{
message:
'Failed to load module "./main.js" in package "app-v1.0.0-n0" (1 underlying failures: Cannot find file app-v1.0.0-n0/main.js in Zip file missing.zip',
'Failed to load module "./main.js" in package "app-v1.0.0" (1 underlying failures: Cannot find file app-v1.0.0/main.js in Zip file missing.zip',
},
);

@@ -85,7 +85,7 @@ test('extracting an archive with an inconsistent hash', async t => {
writer.files = reader.files;

// Add a null byte to one file.
const node = writer.files.get('app-v1.0.0-n0/main.js');
const node = writer.files.get('app-v1.0.0/main.js');
const content = new Uint8Array(node.content.byteLength + 1);
content.set(node.content, 0);
node.content = content;
@@ -101,7 +101,7 @@ test('extracting an archive with an inconsistent hash', async t => {
},
}),
{
message: `Failed to load module "./main.js" in package "app-v1.0.0-n0" (1 underlying failures: Module "main.js" of package "app-v1.0.0-n0" in archive "corrupt.zip" failed a SHA-512 integrity check`,
message: `Failed to load module "./main.js" in package "app-v1.0.0" (1 underlying failures: Module "main.js" of package "app-v1.0.0" in archive "corrupt.zip" failed a SHA-512 integrity check`,
},
);

