Add script to remove duplicate issues on declarations repository #1115
Changes from 3 commits
636c49b
4899a46
24a8c01
1eace8e
5f02583
8ea50a7
317eaf6
903f399
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Duplicate issues removal script | ||
|
||
This script removes duplicate issues from a GitHub repository: when several open issues share the same title, it keeps the oldest one and closes the newer ones. | ||
|
||
|
||
## Prerequisites | ||
|
||
1. Set up environment variables: | ||
- Create a `.env` file in the root directory | ||
- Add the GitHub personal access token of the bot that manages issues on your collection, with repo permissions: | ||
|
||
``` | ||
OTA_ENGINE_GITHUB_TOKEN=your_github_token | ||
``` | ||
|
||
2. Configure the target repository in `config/development.json`: | ||
Review comment: It could be in any config file
Reply: Updated |
||
```json | ||
{ | ||
"@opentermsarchive/engine": { | ||
"reporter": { | ||
"githubIssues": { | ||
"repositories": { | ||
"declarations": "owner/repository" | ||
} | ||
} | ||
} | ||
} | ||
} | ||
``` | ||
|
||
## Usage | ||
|
||
Run the script using: | ||
|
||
``` | ||
node scripts/reporter/duplicate/index.js | ||
``` |
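The selection rule this README describes (open issues with the same title are duplicates, and the oldest one is kept) can be sketched as a pure function with no network calls. This is only an illustration under stated assumptions: `findDuplicatesToClose` and the sample data are hypothetical and not part of the script; the `title`, `number`, `created_at` and `pull_request` fields are the ones the script itself reads from the GitHub API response.

```javascript
// Hypothetical helper, not part of the PR: decides which issues would be closed.
function findDuplicatesToClose(issues) {
  const issuesByTitle = new Map();

  for (const issue of issues) {
    if (issue.pull_request) continue; // the issues endpoint also returns pull requests

    const group = issuesByTitle.get(issue.title) ?? [];

    group.push(issue);
    issuesByTitle.set(issue.title, group);
  }

  const toClose = [];

  for (const group of issuesByTitle.values()) {
    if (group.length < 2) continue; // a title used once has no duplicates

    // The oldest issue of the group is treated as the original and stays open
    const original = group.reduce((oldest, current) => (new Date(current.created_at) < new Date(oldest.created_at) ? current : oldest));

    for (const issue of group) {
      if (issue.number !== original.number) {
        toClose.push({ number: issue.number, originalNumber: original.number });
      }
    }
  }

  return toClose;
}

// Made-up sample data for illustration only
const sample = [
  { number: 1, title: 'A', created_at: '2024-01-01T00:00:00Z' },
  { number: 2, title: 'B', created_at: '2024-01-02T00:00:00Z' },
  { number: 3, title: 'A', created_at: '2024-01-03T00:00:00Z' },
  { number: 4, title: 'A', created_at: '2024-01-04T00:00:00Z', pull_request: {} },
];

console.log(findDuplicatesToClose(sample));
```

Keeping the selection separate from the API calls would make it possible to preview what would be closed before running the script against a real repository.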
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,77 @@ | ||||||
import 'dotenv/config'; | ||||||
import config from 'config'; | ||||||
import { Octokit } from 'octokit'; | ||||||
|
||||||
async function removeDuplicateIssues() { | ||||||
try { | ||||||
const repository = config.get('@opentermsarchive/engine.reporter.githubIssues.repositories.declarations'); | ||||||
if (!repository) { | ||||||
throw new Error('Repository configuration is not set'); | ||||||
} | ||||||
|
||||||
const [ owner, repo ] = repository.split('/'); | ||||||
|
||||||
|
||||||
const octokit = new Octokit({ auth: process.env.OTA_ENGINE_GITHUB_TOKEN }); | ||||||
|
||||||
console.log(`Getting issues from repository ${repository}…`); | ||||||
|
||||||
const issues = await octokit.paginate('GET /repos/{owner}/{repo}/issues', { | ||||||
owner, | ||||||
repo, | ||||||
state: 'open', | ||||||
per_page: 100, | ||||||
}); | ||||||
|
||||||
const onlyIssues = issues.filter(issue => !issue.pull_request); | ||||||
const issuesByTitle = new Map(); | ||||||
let counter = 0; | ||||||
|
||||||
console.log(`Found ${onlyIssues.length} issues`); | ||||||
|
||||||
for (const issue of onlyIssues) { | ||||||
if (!issuesByTitle.has(issue.title)) { | ||||||
issuesByTitle.set(issue.title, [issue]); | ||||||
} else { | ||||||
issuesByTitle.get(issue.title).push(issue); | ||||||
} | ||||||
} | ||||||
|
||||||
for (const [ title, duplicateIssues ] of issuesByTitle) { | ||||||
if (duplicateIssues.length === 1) continue; | ||||||
|
||||||
const originalIssue = duplicateIssues.reduce((oldest, current) => (new Date(current.created_at) < new Date(oldest.created_at) ? current : oldest)); | ||||||
|
||||||
console.log(`\nFound ${duplicateIssues.length - 1} duplicates for issue #${originalIssue.number} "${title}"`); | ||||||
|
||||||
for (const issue of duplicateIssues) { | ||||||
if (issue.number === originalIssue.number) { | ||||||
continue; | ||||||
} | ||||||
|
||||||
await octokit.request('PATCH /repos/{owner}/{repo}/issues/{issue_number}', { /* eslint-disable-line no-await-in-loop */ | ||||||
Review comment: Why do we
Reply: It could, but I find logs much easier to read when they’re sequential. Since we must not send our requests in parallel, to avoid hitting GitHub’s rate limit, I opted for a setup that maintains clear, readable output. |
||||||
owner, | ||||||
repo, | ||||||
issue_number: issue.number, | ||||||
state: 'closed', | ||||||
}); | ||||||
Review comment: Couldn't we use
Reply: I guess it is not what you think:
Reply: Indeed, sorry.
Reply: We could still set |
||||||
|
||||||
await octokit.request('POST /repos/{owner}/{repo}/issues/{issue_number}/comments', { /* eslint-disable-line no-await-in-loop */ | ||||||
owner, | ||||||
repo, | ||||||
issue_number: issue.number, | ||||||
body: `Closing duplicate issue. Original issue: #${originalIssue.number}`, | ||||||
|
||||||
}); | ||||||
|
||||||
counter++; | ||||||
console.log(`Closed issue #${issue.number}: ${issue.html_url}`); | ||||||
} | ||||||
} | ||||||
|
||||||
console.log(`\nDuplicate removal process completed; ${counter} issues closed`); | ||||||
} catch (error) { | ||||||
Review comment: Why do a
Reply: Out of habit. But it can be removed. |
||||||
console.log(`Failed to remove duplicate issues: ${error.stack}`); | ||||||
|
||||||
process.exit(1); | ||||||
} | ||||||
} | ||||||
|
||||||
removeDuplicateIssues(); |
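The review thread on the `PATCH` call explains that requests are sent one at a time, both to avoid hitting GitHub's rate limit and to keep the logs in order. A minimal sketch of that pattern follows, assuming a hypothetical `closeSequentially` helper and a fake `closeIssue` function that stands in for the real API call; neither exists in the PR.

```javascript
// Hypothetical helper: `closeIssue` stands in for the real API call.
async function closeSequentially(issueNumbers, closeIssue) {
  const closed = [];

  for (const number of issueNumbers) {
    await closeIssue(number); // wait for each request to finish before starting the next
    closed.push(number);
    console.log(`Closed issue #${number}`);
  }

  return closed;
}

// Fake API call for illustration: resolves after a short delay, sends nothing.
const fakeCloseIssue = () => new Promise(resolve => { setTimeout(resolve, 10); });

closeSequentially([ 12, 15, 18 ], fakeCloseIssue)
  .then(closed => console.log(`${closed.length} issues closed`));
```

With `Promise.all` over the same list, all requests would start at once and the log lines could interleave, which is the trade-off the thread describes.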
Review comment: I would say this is a no-release considering there is strictly no change in behavior exposed to reusers.
Reply: I considered that initially, but if we ask our partners to update to the latest version to access this script, they won’t be able to do so if there’s no official release available.
Reply: Ah yes, true 😅