Reimplementation of website-validator-dotnet5 using new features from C# 10 and .NET 6

CleverCodeCravers/website-validator
website-validator

A .NET application that crawls a website and checks for HTTP 404 responses. Further checks may be added later.

Intended usage:

websitevalidator -u https://www.yourdomain.whatever -c [--limit xxx] -o structure.json

Output:

A single large JSON file containing, among other information, the structure of the website. It is meant for further analysis and is easy to consume from tools such as PowerShell: read the file and process it however you like.

Next tasks

basic functionality

  • convert relative URLs to absolute ones
  • return the output either as human-readable text or as JSON (is there a generic approach?); maybe default to JSON and add a --human switch for the more readable output
  • return only distinct results
  • enable some basic crawling activity
    • remember the result of each URL, so every URL is only crawled once
    • check external URLs for reachability only, and do not feed links from them back into the system (it is important that they are reachable, but we do not want to check their pages, too)
  • also crawl resource files such as linked images, CSS, and JavaScript
  • add an option for a final human-readable report?
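
The crawling tasks above can be sketched in a few lines. This is only an illustration in Python, not the project's C# code: the names LinkCollector, absolute_links, and crawl are hypothetical, and fetch is an assumed callback that returns a status code and the page text, so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlsplit


class LinkCollector(HTMLParser):
    """Collects href/src attribute values (links and resource files)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def absolute_links(page_url, html):
    """Return the distinct absolute http(s) URLs referenced by a page."""
    collector = LinkCollector()
    collector.feed(html)
    seen, result = set(), []
    for raw in collector.links:
        # urljoin resolves relative references; urldefrag drops "#fragment".
        url = urldefrag(urljoin(page_url, raw)).url
        if urlsplit(url).scheme in ("http", "https") and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def crawl(start_url, fetch, limit=100):
    """Crawl from start_url; fetch(url) must return (status_code, html_text).

    Every URL is fetched at most once. External URLs are checked for
    reachability, but links found on them are not followed.
    """
    site = urlsplit(start_url).netloc
    results = {}                      # url -> status code
    queue = deque([start_url])
    while queue and len(results) < limit:
        url = queue.popleft()
        if url in results:
            continue                  # already crawled
        status, html = fetch(url)
        results[url] = status
        if urlsplit(url).netloc != site:
            continue                  # external: checked, not expanded
        for link in absolute_links(url, html):
            if link not in results:
                queue.append(link)
    return results
```

Passing fetch in as a parameter keeps the traversal logic separate from the HTTP client, so the same loop can be exercised with canned pages. The limit parameter here is an assumption about what --limit might mean (a cap on the number of URLs checked).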

validations

  • validations should be configurable without the need for a recompilation
  • group results by HTTP status code, create error messages for 404s and other problems
  • pages shall not contain "Error", "Warning", or anything else that looks like a PHP problem
  • provide an overview of which pages are mentioned in the sitemap and which are not
  • provide an overview of pages that are possibly disallowed by robots.txt
  • we need something that allows us to mute known validation messages that we want to ignore
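
A rough sketch of how these validations might fit together, again in Python for illustration only and not taken from the project. The rule format (a JSON list of objects with "id" and "pattern"), the muting scheme (pairs of rule id and URL), and all function names are assumptions.

```python
import json
import re
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Hypothetical rule file content; keeping rules in JSON means they can be
# changed without recompiling anything.
DEFAULT_RULES = '[{"id": "php-problem", "pattern": "\\\\b(Error|Warning)\\\\b"}]'


def load_rules(json_text=DEFAULT_RULES):
    """Parse rules from JSON text: a list of {"id": ..., "pattern": ...}."""
    return json.loads(json_text)


def group_by_status(results):
    """results: dict url -> status code. Returns status -> sorted URLs."""
    groups = defaultdict(list)
    for url, status in results.items():
        groups[status].append(url)
    return {status: sorted(urls) for status, urls in groups.items()}


def validate(results, bodies, rules, muted=()):
    """Return (rule_id, url, message) tuples, minus muted (rule_id, url) pairs."""
    messages = []
    for url, status in sorted(results.items()):
        if status >= 400:
            messages.append((f"http-{status}", url, f"{url} returned HTTP {status}"))
    for url, text in sorted(bodies.items()):
        for rule in rules:
            match = re.search(rule["pattern"], text)
            if match:
                messages.append((rule["id"], url, f"{url} contains {match.group(0)!r}"))
    muted = set(muted)
    return [m for m in messages if (m[0], m[1]) not in muted]


def sitemap_overview(crawled_urls, sitemap_urls):
    """Split crawled URLs into those listed in the sitemap and those not."""
    crawled, listed = set(crawled_urls), set(sitemap_urls)
    return {
        "in_sitemap": sorted(crawled & listed),
        "not_in_sitemap": sorted(crawled - listed),
    }


def disallowed_by_robots(robots_txt, urls, agent="*"):
    """Return the URLs that the given robots.txt text disallows for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]
```

Muting by (rule id, URL) pair is one simple option; a real implementation might prefer patterns or a dedicated ignore file.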
