Add script to remove duplicate issues on declarations repository #1115

Merged on Oct 29, 2024 (8 commits)
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@

All changes that impact users of this module are documented in this file, in the [Common Changelog](https://common-changelog.org) format with some additional specifications defined in the CONTRIBUTING file. This codebase adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased [minor]
Member

I would say this is a no-release considering there is strictly no change in behavior exposed to reusers.

Member Author

I considered that initially, but if we ask our partners to update to the latest version to access this script, they won’t be able to do so if there’s no official release available.

Member

Ah yes, true 😅


> Development of this release was supported by the [French Ministry for Foreign Affairs](https://www.diplomatie.gouv.fr/fr/politique-etrangere-de-la-france/diplomatie-numerique/) through its ministerial [State Startups incubator](https://beta.gouv.fr/startups/open-terms-archive.html) under the aegis of the Ambassador for Digital Affairs.

### Added

- Add script to remove duplicate issues in GitHub reports

## 2.4.0 - 2024-10-24

_Full changeset and discussions: [#1114](https://github.com/OpenTermsArchive/engine/pull/1114)._
37 changes: 37 additions & 0 deletions scripts/reporter/duplicate/README.md
@@ -0,0 +1,37 @@
# Duplicate issues removal script

This script helps remove duplicate issues from a GitHub repository by closing issues that have the same title as any older issue.

## Prerequisites

1. Set up environment variables:
   - Create a `.env` file in the root directory
   - Add the GitHub personal access token of the bot that manages issues on your collection, with `repo` permissions:

```shell
OTA_ENGINE_GITHUB_TOKEN=your_github_token
```

2. Configure the target repository in your chosen configuration file within the `config` folder:

```json
{
  "@opentermsarchive/engine": {
    "reporter": {
      "githubIssues": {
        "repositories": {
          "declarations": "owner/repository"
        }
      }
    }
  }
}
```
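As an illustration of the `owner/repository` format expected above, here is a minimal sketch of a validator. The helper name `parseRepository` is hypothetical and not part of this pull request; it is also stricter than the check in `index.js`, since it rejects any value with more than one slash.

```javascript
// Hypothetical helper, not part of the PR: validates and splits an "owner/repository" string.
function parseRepository(value) {
  // Exactly one slash, with non-empty, whitespace-free parts on both sides.
  const match = /^([^/\s]+)\/([^/\s]+)$/.exec(String(value));

  if (!match) {
    throw new Error(`Expected a string in the format <owner>/<repo>, but received: "${value}"`);
  }

  const [ , owner, repo ] = match;

  return { owner, repo };
}

console.log(parseRepository('owner/repository')); // { owner: 'owner', repo: 'repository' }
```

A full URL such as `https://github.com/owner/repository` contains several slashes, so this sketch throws instead of silently splitting it into wrong parts.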

## Usage

Run the script using:

```shell
node scripts/reporter/duplicate/index.js
```
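The script reads the token from `OTA_ENGINE_GITHUB_TOKEN` but does not check that it is set. The following sketch shows one way a pre-flight check could fail fast before any request is made. The function name `requireToken` is an assumption for illustration and does not exist in the engine.

```javascript
// Hypothetical pre-flight check, not part of the PR.
// Accepts an env-like object so it can be exercised without touching process.env.
function requireToken(env = process.env) {
  const token = env.OTA_ENGINE_GITHUB_TOKEN;

  if (!token || !token.trim()) {
    throw new Error('OTA_ENGINE_GITHUB_TOKEN is not set; add it to the .env file before running the script');
  }

  return token.trim();
}
```

With such a check, a missing token would surface as one explicit error at startup rather than as an API failure later in the run.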
73 changes: 73 additions & 0 deletions scripts/reporter/duplicate/index.js
@@ -0,0 +1,73 @@
import 'dotenv/config';
import config from 'config';
import { Octokit } from 'octokit';

async function removeDuplicateIssues() {
  const repository = config.get('@opentermsarchive/engine.reporter.githubIssues.repositories.declarations');

  if (!repository.includes('/') || repository.includes('https://')) {
    throw new Error(`Configuration entry "reporter.githubIssues.repositories.declarations" is expected to be a string in the format <owner>/<repo>, but received: "${repository}"`);
  }

  const [ owner, repo ] = repository.split('/');

  const octokit = new Octokit({ auth: process.env.OTA_ENGINE_GITHUB_TOKEN });

  console.log(`Getting issues from repository ${repository}…`);

  const issues = await octokit.paginate('GET /repos/{owner}/{repo}/issues', {
    owner,
    repo,
    state: 'open',
    per_page: 100,
  });

  const onlyIssues = issues.filter(issue => !issue.pull_request);
  const issuesByTitle = new Map();
  let counter = 0;

  console.log(`Found ${onlyIssues.length} issues`);

  for (const issue of onlyIssues) {
    if (!issuesByTitle.has(issue.title)) {
      issuesByTitle.set(issue.title, [issue]);
    } else {
      issuesByTitle.get(issue.title).push(issue);
    }
  }

  for (const [ title, duplicateIssues ] of issuesByTitle) {
    if (duplicateIssues.length === 1) continue;

    const originalIssue = duplicateIssues.reduce((oldest, current) => (new Date(current.created_at) < new Date(oldest.created_at) ? current : oldest));

    console.log(`\nFound ${duplicateIssues.length - 1} duplicates for issue #${originalIssue.number} "${title}"`);

    for (const issue of duplicateIssues) {
      if (issue.number === originalIssue.number) {
        continue;
      }

      await octokit.request('PATCH /repos/{owner}/{repo}/issues/{issue_number}', { /* eslint-disable-line no-await-in-loop */
        owner,
        repo,
        issue_number: issue.number,
        state: 'closed',
      });

      await octokit.request('POST /repos/{owner}/{repo}/issues/{issue_number}/comments', { /* eslint-disable-line no-await-in-loop */
        owner,
        repo,
        issue_number: issue.number,
        body: `This issue is detected as duplicate as it has the same title as #${originalIssue.number}. It most likely was created accidentally by an engine older than [v2.3.2](https://github.com/OpenTermsArchive/engine/releases/tag/v2.3.2). Closing automatically.`,
      });

      counter++;
      console.log(`Closed issue #${issue.number}: ${issue.html_url}`);
    }
  }

  console.log(`\nDuplicate removal process completed; ${counter} issues closed`);
}

removeDuplicateIssues();
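
The selection logic in the script (group open issues by title, keep the oldest, close the rest) can be tried out without calling the GitHub API. Below is a minimal sketch that mirrors it as a pure function over plain objects. The name `findDuplicates`, its return shape, and the sample data are assumptions made for illustration, not part of this pull request.

```javascript
// Hypothetical pure helper mirroring the grouping logic of the script; makes no API calls.
function findDuplicates(issues) {
  const issuesByTitle = new Map();

  for (const issue of issues) {
    if (issue.pull_request) continue; // the issues endpoint also returns pull requests

    const group = issuesByTitle.get(issue.title) ?? [];

    group.push(issue);
    issuesByTitle.set(issue.title, group);
  }

  const result = [];

  for (const group of issuesByTitle.values()) {
    if (group.length === 1) continue;

    // The oldest issue of each group is kept as the original.
    const original = group.reduce((oldest, current) => (new Date(current.created_at) < new Date(oldest.created_at) ? current : oldest));

    result.push({
      original: original.number,
      duplicates: group.filter(issue => issue.number !== original.number).map(issue => issue.number),
    });
  }

  return result;
}

// Made-up sample data: three entries share the title "A", one of them being a pull request.
const sample = [
  { number: 3, title: 'A', created_at: '2024-10-03T00:00:00Z' },
  { number: 1, title: 'A', created_at: '2024-10-01T00:00:00Z' },
  { number: 2, title: 'B', created_at: '2024-10-02T00:00:00Z' },
  { number: 4, title: 'A', created_at: '2024-10-04T00:00:00Z', pull_request: {} },
];

console.log(JSON.stringify(findDuplicates(sample))); // [{"original":1,"duplicates":[3]}]
```

Separating the selection from the API calls in this way would also make it possible to print the list of issues that would be closed before closing anything.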