This scraper tool goes through GitHub organizations, collects their repositories and topics, and outputs them in a format that the UI can display.
NOTE: This process is manual and not interactive with the UI. To see your changes in the UI, you need to run the scraper to update what the UI shows.
- The scraper requires a read-only GitHub OAuth token in the GITHUB_AUTH_TOKEN environment variable.
- The scraper reads the organizations.json file in the public folder.
- It also configures what to ignore based on the ignored-repositories.json and ignored-topics.json files in the public folder.
- It loops through all organizations and all repositories to generate the structure the UI needs to display the data.
JSON example:
{
  "repos": [
    {
      "org": "redhat-performance",
      "name": "aimlperf_reg_tests",
      "description": "",
      "url": "https://github.com/redhat-performance/aimlperf_reg_tests",
      "labels": []
    }
  ]
}
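As a rough illustration, the output above could be modeled in Go like this. The type and function names (Repo, Output, render) are hypothetical and not taken from the scraper's source; the sketch only mirrors the fields visible in the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repo mirrors one entry of the "repos" array in the example above.
// The type name is an assumption made for this sketch.
type Repo struct {
	Org         string   `json:"org"`
	Name        string   `json:"name"`
	Description string   `json:"description"`
	URL         string   `json:"url"`
	Labels      []string `json:"labels"`
}

// Output is the top-level document with its single "repos" field.
type Output struct {
	Repos []Repo `json:"repos"`
}

// render serializes the repositories. Nil slices are replaced with empty
// ones so they are written as [] rather than null, matching the example.
func render(repos []Repo) ([]byte, error) {
	out := Output{Repos: make([]Repo, 0, len(repos))}
	for _, r := range repos {
		if r.Labels == nil {
			r.Labels = []string{}
		}
		out.Repos = append(out.Repos, r)
	}
	return json.MarshalIndent(out, "", "  ")
}

func main() {
	data, err := render([]Repo{{
		Org:  "redhat-performance",
		Name: "aimlperf_reg_tests",
		URL:  "https://github.com/redhat-performance/aimlperf_reg_tests",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Normalizing nil slices is a design choice for this sketch: encoding/json writes a nil slice as null, which a UI expecting an array might not handle.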
First, download the dependencies:
go mod download
To build:
go build main.go
To run:
go run main.go
To configure the tool, add the organizations and filters you want to apply.
The organizations.json file is a simple JSON string array. Add or remove organizations as you need.
Example
["cloud-bulldozer", "redhat-performance"]
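A minimal sketch of how such a file could be loaded in Go is shown here. The function name parseOrganizations and the relative path public/organizations.json are assumptions for illustration; the scraper's actual code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// parseOrganizations decodes a JSON string array such as the one in
// organizations.json. The function name is hypothetical.
func parseOrganizations(data []byte) ([]string, error) {
	var orgs []string
	if err := json.Unmarshal(data, &orgs); err != nil {
		return nil, fmt.Errorf("parsing organizations: %w", err)
	}
	return orgs, nil
}

func main() {
	// Assumed location: the "public" folder relative to the working directory.
	data, err := os.ReadFile("public/organizations.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	orgs, err := parseOrganizations(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, org := range orgs {
		fmt.Println(org)
	}
}
```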
The ignored-repositories.json file contains a JSON object with three fields:
- global: An array of strings. Repositories with these names are ignored in all organizations.
- ignoreArchived: A boolean. If set to true, all archived repos are ignored.
- orgs: A map keyed by organization name, where each value is an object with three fields:
  - skip-global-ignore: If true, the global ignored repo list is not applied to this organization.
  - skip-global-archived: If true, the global ignoreArchived value is not applied to this organization.
  - repos: An array of strings. Repositories with these names are ignored in this organization.
Example
{
  "global": [
    ".github"
  ],
  "orgs": {
    "cloud-bulldozer": {
      "skip-global-ignore": true,
      "skip-global-archived": true,
      "repos": []
    },
    "redhat-performance": {
      "repos": [
        "some-repo"
      ]
    }
  },
  "ignoreArchived": true
}
Hierarchy
The more granular the rule, the higher its priority.
- Repos listed in orgs["org"].repos always get ignored.
- skip-global-ignore and skip-global-archived take priority over the global configs.
- The global repos list and ignoreArchived have the lowest priority.
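The precedence rules could be sketched in Go as follows. The type and method names (IgnoredRepositories, OrgRules, ShouldIgnore) are hypothetical, and treating skip-global-ignore and skip-global-archived as "do not apply the global setting to this organization" is an interpretation of the field descriptions, not confirmed behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OrgRules holds the per-organization settings (hypothetical type name).
type OrgRules struct {
	SkipGlobalIgnore   bool     `json:"skip-global-ignore"`
	SkipGlobalArchived bool     `json:"skip-global-archived"`
	Repos              []string `json:"repos"`
}

// IgnoredRepositories mirrors the layout of ignored-repositories.json.
type IgnoredRepositories struct {
	Global         []string            `json:"global"`
	IgnoreArchived bool                `json:"ignoreArchived"`
	Orgs           map[string]OrgRules `json:"orgs"`
}

func contains(names []string, name string) bool {
	for _, n := range names {
		if n == name {
			return true
		}
	}
	return false
}

// ShouldIgnore applies the rules from most to least granular.
func (c IgnoredRepositories) ShouldIgnore(org, repo string, archived bool) bool {
	rules := c.Orgs[org] // zero value when the organization has no entry
	// 1. Repos listed for the organization are always ignored.
	if contains(rules.Repos, repo) {
		return true
	}
	// 2. The global list applies unless the organization opts out.
	if !rules.SkipGlobalIgnore && contains(c.Global, repo) {
		return true
	}
	// 3. The global archived setting applies unless the organization opts out.
	return !rules.SkipGlobalArchived && c.IgnoreArchived && archived
}

func main() {
	raw := []byte(`{
	  "global": [".github"],
	  "orgs": {
	    "cloud-bulldozer": {"skip-global-ignore": true, "skip-global-archived": true, "repos": []},
	    "redhat-performance": {"repos": ["some-repo"]}
	  },
	  "ignoreArchived": true
	}`)
	var cfg IgnoredRepositories
	if err := json.Unmarshal(raw, &cfg); err != nil {
		panic(err)
	}
	fmt.Println(cfg.ShouldIgnore("redhat-performance", "some-repo", false)) // true
	fmt.Println(cfg.ShouldIgnore("redhat-performance", ".github", false))   // true
	fmt.Println(cfg.ShouldIgnore("cloud-bulldozer", ".github", true))       // false
}
```

With the example configuration, cloud-bulldozer opts out of both global settings, so its .github repository is kept even when archived, while redhat-performance still inherits the global list.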
The ignored-topics.json file has the same format as the organizations.json file: a JSON string array.
Add the topics that you want to exclude from your results.
Example
["topic-one", "topic-two"]
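One way the ignored topics could be applied to a repository's topic list is sketched below in Go. The function name filterTopics is hypothetical, and exact, case-sensitive matching is an assumption.

```go
package main

import "fmt"

// filterTopics returns the topics that are not present in the ignored list,
// preserving their original order. Matching is exact (an assumption).
func filterTopics(topics, ignored []string) []string {
	skip := make(map[string]struct{}, len(ignored))
	for _, t := range ignored {
		skip[t] = struct{}{}
	}
	kept := make([]string, 0, len(topics))
	for _, t := range topics {
		if _, found := skip[t]; !found {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	ignored := []string{"topic-one", "topic-two"}
	fmt.Println(filterTopics([]string{"topic-one", "benchmark", "topic-two"}, ignored))
	// prints: [benchmark]
}
```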
For the moment, the config files are committed to the repository to persist your changes.
If you add new ignore rules or new organizations, make sure to run the scraper
to get the latest repository list in this repo, and then commit all files.