Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix validation of modified declarations #1066

Merged
merged 14 commits into from
Apr 8, 2024
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,15 @@

All changes that impact users of this module are documented in this file, in the [Common Changelog](https://common-changelog.org) format with some additional specifications defined in the CONTRIBUTING file. This codebase adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased [patch]

> Development of this release was supported by the [French Ministry for Foreign Affairs](https://www.diplomatie.gouv.fr/fr/politique-etrangere-de-la-france/diplomatie-numerique/) through its ministerial [State Startups incubator](https://beta.gouv.fr/startups/open-terms-archive.html) under the aegis of the Ambassador for Digital Affairs.

### Fixed

- Fix validation of modified declarations when new services were added
- Fix validation of modified declarations when filters were modified

## 1.1.1 - 2024-03-28

_Full changeset and discussions: [#1065](https://github.com/OpenTermsArchive/engine/pull/1065)._
Expand Down
27 changes: 4 additions & 23 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,6 @@
"croner": "^4.3.6",
"cross-env": "^7.0.3",
"datauri": "^4.1.0",
"deep-diff": "^1.0.2",
"dotenv": "^10.0.0",
"eslint": "^8.53.0",
"eslint-config-airbnb-base": "^15.0.0",
Expand Down
2 changes: 1 addition & 1 deletion scripts/declarations/lint/index.mocha.js
Original file line number Diff line number Diff line change
Expand Up @@ -31,12 +31,12 @@
if (options.modified) {
const declarationUtils = new DeclarationUtils(instancePath);

({ services: servicesToValidate } = await declarationUtils.getModifiedServiceTermsTypes());
({ services: servicesToValidate } = await declarationUtils.getModifiedServicesTermsTypes());
}

const lintFile = lintAndFixFile(options.fix);

describe('Service declarations lint validation', async function () {

Check warning on line 39 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (ubuntu-20.04)

Async function has no 'await' expression

Check warning on line 39 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (windows-latest)

Async function has no 'await' expression

Check warning on line 39 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (macos-latest)

Async function has no 'await' expression
this.timeout(30000);

servicesToValidate.forEach(serviceId => {
Expand All @@ -46,8 +46,8 @@
const filtersFilePath = path.join(declarationsPath, `${serviceId}.filters.js`);
const filtersHistoryFilePath = path.join(declarationsPath, `${serviceId}.filters.history.js`);

context(serviceId, async () => {

Check warning on line 49 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (ubuntu-20.04)

Async arrow function has no 'await' expression

Check warning on line 49 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (windows-latest)

Async arrow function has no 'await' expression

Check warning on line 49 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (macos-latest)

Async arrow function has no 'await' expression
before(async function () {

Check warning on line 50 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (ubuntu-20.04)

Async function has no 'await' expression

Check warning on line 50 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (windows-latest)

Async function has no 'await' expression

Check warning on line 50 in scripts/declarations/lint/index.mocha.js

View workflow job for this annotation

GitHub Actions / test (macos-latest)

Async function has no 'await' expression
if (!service) {
console.log(' (Tests skipped as declaration has been archived)');
this.skip();
Expand Down
13 changes: 13 additions & 0 deletions scripts/declarations/utils/fixtures/serviceA.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
{
"name": "Service",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body"
},
"Privacy Policy": {
"fetch": "https://domain.example/tos",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"fetch": "https://domain.example/tos",
"fetch": "https://domain.example/privacy",

"select": "body"
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{
"name": "Service",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body",
"remove": "footer"
},
"Privacy Policy": {
"fetch": "https://domain.example/tos",
"select": "body",
"remove": "footer"
}
}
}
17 changes: 17 additions & 0 deletions scripts/declarations/utils/fixtures/serviceATermsAdded.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"name": "Service",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body"
},
"Privacy Policy": {
"fetch": "https://domain.example/tos",
"select": "body"
},
"Imprint": {
"fetch": "https://domain.example/tos",
"select": "body"
}
}
}
9 changes: 9 additions & 0 deletions scripts/declarations/utils/fixtures/serviceATermsRemoved.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
"name": "Service",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body"
}
}
}
14 changes: 14 additions & 0 deletions scripts/declarations/utils/fixtures/serviceATermsUpdated.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{
"name": "Service",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body",
"remove": "footer"
},
"Privacy Policy": {
"fetch": "https://domain.example/tos",
"select": "body"
}
}
}
9 changes: 9 additions & 0 deletions scripts/declarations/utils/fixtures/serviceB.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
"name": "Service B",
"documents": {
"Terms of Service": {
"fetch": "https://domain.example/tos",
"select": "body"
}
}
}
74 changes: 42 additions & 32 deletions scripts/declarations/utils/index.js
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
import path from 'path';

import DeepDiff from 'deep-diff';
import simpleGit from 'simple-git';

export default class DeclarationUtils {
Expand All @@ -9,13 +8,17 @@ export default class DeclarationUtils {
this.defaultBranch = defaultBranch;
}

static filePathToServiceId = filePath => path.parse(filePath.replace(/\.history|\.filters/, '')).name;
static getServiceIdFromFilePath(filePath) {
return path.parse(filePath.replace(/\.history|\.filters/, '')).name;
}

async getJSONFile(path, ref) {
async getJSONFromFile(ref, filePath) {
try {
return JSON.parse(await this.git.show([`${ref}:${path}`]));
} catch (e) {
return {};
const fileContent = await this.git.show([`${ref}:${filePath}`]);

return JSON.parse(fileContent);
} catch (error) {
return undefined;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return undefined;
return;

}
}

Expand All @@ -24,7 +27,7 @@ export default class DeclarationUtils {

const modifiedFilePaths = (modifiedFilePathsAsString ? modifiedFilePathsAsString.split('\0') : []).filter(str => str !== ''); // split on \0 rather than \n due to the -z option of git diff

return { modifiedFilePaths, modifiedServicesIds: Array.from(new Set(modifiedFilePaths.map(DeclarationUtils.filePathToServiceId))) };
return { modifiedFilePaths, modifiedServicesIds: Array.from(new Set(modifiedFilePaths.map(DeclarationUtils.getServiceIdFromFilePath))) };
}

async getModifiedServices() {
Expand All @@ -33,48 +36,55 @@ export default class DeclarationUtils {
return modifiedServicesIds;
}

async getModifiedServiceTermsTypes() {
const { modifiedFilePaths, modifiedServicesIds } = await this.getModifiedData();
async getModifiedServicesTermsTypes() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If, as I understand from the tests, this actually returns { services: ['serviceId' ], servicesTermsTypes: { serviceId: [ termsTypes] } }, it should be named

Suggested change
async getModifiedServicesTermsTypes() {
async getModifiedServicesAndTermsTypes() {

(but also, why should it duplicate information that can be obtained with Object.keys??!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I not changed the previous API

const { modifiedFilePaths } = await this.getModifiedData();
const servicesTermsTypes = {};

await Promise.all(modifiedFilePaths.map(async modifiedFilePath => {
const serviceId = DeclarationUtils.filePathToServiceId(modifiedFilePath);
const serviceId = DeclarationUtils.getServiceIdFromFilePath(modifiedFilePath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract the block in map to new function, it is too long to be declared this way in an iteration 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand what you mean here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole anonymous function that is defined and called as the map parameter is huge. I believe it would help with readability if it was extracted into its own named function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the code simplification, I no longer find it useful.


if (modifiedFilePath.endsWith('.filters.js')) {
const declaration = await this.getJSONFromFile(this.defaultBranch, `declarations/${serviceId}.json`);

if (!modifiedFilePath.endsWith('.json')) {
// Here we should compare AST of both files to detect on which function
// change has been made, and then find which terms type depends on this
// function.
// As this is a complicated process, we will just send back all terms types
const declaration = await this.getJSONFile(`declarations/${serviceId}.json`, this.defaultBranch);
// TODO: Implement AST comparison between original and modified "filters" files to detect changes in functions, enabling identification of terms whose types depend on modified functions.
// Due to the complexity involved, temporarily returning all term types.
Ndpnt marked this conversation as resolved.
Show resolved Hide resolved
servicesTermsTypes[serviceId] = Object.keys(declaration.documents);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we map, why use an outside variable instead of using the return value of map?


return Object.keys(declaration.documents);
return;
}

const defaultFile = await this.getJSONFile(modifiedFilePath, this.defaultBranch);
const modifiedFile = await this.getJSONFile(modifiedFilePath, 'HEAD');
const originalService = await this.getJSONFromFile(this.defaultBranch, modifiedFilePath);
const modifiedService = await this.getJSONFromFile('HEAD', modifiedFilePath);
const modifiedServiceTermsTypes = Object.keys(modifiedService.documents);

const diff = DeepDiff.diff(defaultFile, modifiedFile);
if (!originalService) {
servicesTermsTypes[serviceId] = modifiedServiceTermsTypes;

if (!diff) {
// This can happen if only a lint has been applied to a document
return;
}

const modifiedTermsTypes = diff.reduce((acc, { path }) => {
if (modifiedFilePath.includes('.history')) {
acc.add(path[0]);
} else if (path[0] == 'documents') {
acc.add(path[1]);
}
const originalServiceTermsTypes = Object.keys(originalService.documents);
const addedTermsTypes = modifiedServiceTermsTypes.filter(termsType => !originalServiceTermsTypes.includes(termsType));

return acc;
}, new Set());
if (addedTermsTypes.length) {
servicesTermsTypes[serviceId] = addedTermsTypes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, why do we assign instead of inserting? Couldn't there be both additions and changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope it's not possible, you can't both declare a new service and have terms that changed for this service. If it's a new service you can only have all terms added.
This cannot be an override of a previous value as there is an early exit inside the previous if statement.

}

const potentiallyModifiedTermsTypes = modifiedServiceTermsTypes.filter(termsType => originalServiceTermsTypes.includes(termsType));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use two filters in a row to partition a space in two, this is inefficient and unclear.
Suggested alternative: something along the lines of

servicesTermsTypes[serviceId] = modifiedServiceTermsTypes.filter(termsType => {
   !originalServiceTermsTypes.includes(termsType)) ||
   JSON.stringify(originalService.documents[termsType]) != JSON.stringify(modifiedService.documents[termsType]);  // that comparison could quite well be extracted to function to encapsulate the implementation, be more understandable, and hint at future performance optimisation
}):


potentiallyModifiedTermsTypes.forEach(termsType => {
const areTermsModified = JSON.stringify(originalService.documents[termsType]) != JSON.stringify(modifiedService.documents[termsType]);

if (!areTermsModified) {
return;
}

servicesTermsTypes[serviceId] = Array.from(new Set([ ...servicesTermsTypes[serviceId] || [], ...modifiedTermsTypes ]));
servicesTermsTypes[serviceId] = [termsType].concat(servicesTermsTypes[serviceId] || []);
Ndpnt marked this conversation as resolved.
Show resolved Hide resolved
});
}));

return {
services: modifiedServicesIds,
services: Object.keys(servicesTermsTypes),
servicesTermsTypes,
};
}
Expand Down
Loading
Loading