Broken links on SIG-Contribex Charter #5975
Comments
/sig contributor-experience
/good-first-issue
@Debanitrkl: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign
It would be interesting to catch these issues well before PRs that change or add links get merged, assuming PRs are the only way content gets added. Here's an issue detailing the simple idea: kubernetes/contributor-site#236. The same idea can be applied to any data or content. The point is to catch issues sooner rather than later: prevention instead of cure (this issue's PR would be the cure). In other words, fail fast and catch bugs early.
@karuppiah7890 thanks, it's indeed a good suggestion, but I'm not sure whether any automated solution is possible for this. If you could provide some examples or elaborate on the methods you suggested, that would be more helpful.
I think it's very much possible. I saw some examples years ago where links are parsed and checked to see if they are valid. It's a standard linting idea for links (URLs) in docs (markdown, webpages) or just any website.
At a high level the algorithm is simple: process all the content you want to lint, look for links, and check whether they are valid.

For example, for markdown files: collect all the markdown files and process them one by one, looking for links in each file. We are only talking about web links here, so just HTTP and HTTPS ones. A simple approach is to search for http:// and https://, and I think there might be libraries to help with this. Then check each link. First, run it through a URL parser in any language, which should surface any basic errors in the URL. Next, use an HTTP client library to send a request to the link. A GET request should work, since these are all links in markdown files or in the webpages generated from them. Check that it returns a successful response, generally a 200 status code. If it returns a redirect, follow it and check that a 200 eventually comes back. Actually, to keep it lightweight, I guess even a HEAD request can be used instead of GET, which would return the full response body too.

When the input is a website rather than markdown files, one can crawl the whole site by recursively opening every link and checking that it gives a 200 response status code. For websites, GET has to be used, because the program needs the response body to keep following links recursively. Ideally this is for websites where every resource is an HTML page with links, not a single page app (SPA), which does not render outside a browser and uses JavaScript for navigation instead of anchor tags with href links.
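The markdown-file steps described above could be sketched roughly like this in Python, using only the standard library. This is an illustration under assumptions, not something this repo ships: the function names (`extract_links`, `is_well_formed`, `check_link`, `find_broken_links`), the regex, and the 10-second timeout are all made up for the example, and the link search is the naive "look for http:// and https://" approach rather than a real markdown parser.

```python
import http.client
import re
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Naive search for web links: anything starting with http:// or https://
# up to whitespace or a bracket-like delimiter (an assumption, not a full parser).
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]`]+")


def extract_links(text):
    """Return the unique http(s) URLs found in text, in order of appearance."""
    links = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in links:
            links.append(url)
    return links


def is_well_formed(url):
    """Basic URL-parser check: http(s) scheme and a non-empty host part."""
    try:
        parts = urlparse(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_link(url, timeout=10):
    """Send a HEAD request and return (ok, detail).

    urllib follows redirects by default, so the status seen here is the
    status of the final response.
    """
    if not is_well_formed(url):
        return False, "malformed URL"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        return False, f"HTTP {err.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError) as err:
        return False, str(err)


def find_broken_links(text, checker=check_link):
    """Return (url, detail) for every link in text that fails the checker."""
    broken = []
    for url in extract_links(text):
        ok, detail = checker(url)
        if not ok:
            broken.append((url, detail))
    return broken
```

A CI job could read each markdown file, pass its contents to `find_broken_links`, and fail the build if the result is non-empty. Some servers reject HEAD requests, so a real check would probably want to retry with GET before reporting a link as broken.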
@karuppiah7890 thanks for the explanation. I think you should start a thread about this in the sig-contribex channel on Slack. As k8s is a huge repository, and given GitHub's new security policies, I'm not sure crawling would work. Also, I'm not very well versed in this.
FYI: (this might not be exactly what is being discussed here, but is definitely the right place to begin)
@Debanitrkl I was only suggesting the lint check for this repo as part of this issue. Doing it on other repos in the k8s orgs (k8s, k8s sigs, etc.), including the k8s/k8s repo, would surely be a big task, and I'm not sure about the status of those projects, what lints they run, or whether a link check is already being done. So I won't comment on that until I know the details. For this repo, I assumed it didn't have the lint after seeing this issue. It looks like there are related issues too, mentioned above in #5975 (comment). I'm not going to start any threads in the sig-contribex Slack channel. It was just an idea based on what I had seen before in some places on the Internet, and I'm not sure I have the bandwidth to see it through. It's also a good first issue, so I'll step back and let someone looking for good-first-issues pick it up.
@karuppiah7890 I really liked your suggestion. I just wasn't aware of how that would be possible in k8s, so I suggested the Slack discussion. With the reference Madhav gave and your explanation, I now see it as more viable. And no worries about your constraints on initiating and moving it forward. I will try to look into it myself and loop in a few other contributors as well.
While going through the charter, I found that the links on lines 34-38 aren't working properly. The lines in the source of the doc:
cc @parispittman @nikhita
/sig contributor-experience