Respect gitignore #1500

Merged
merged 11 commits into from
Sep 22, 2024
41 changes: 30 additions & 11 deletions Cargo.lock


42 changes: 24 additions & 18 deletions README.md
Original file line number Diff line number Diff line change
@@ -315,7 +315,7 @@ Arguments:
Options:
-c, --config <CONFIG_FILE>
Configuration file to use

[default: lychee.toml]

-v, --verbose...
@@ -333,7 +333,7 @@ Options:

--max-cache-age <MAX_CACHE_AGE>
Discard all cached requests older than this duration

[default: 1d]

--dump
@@ -344,33 +344,33 @@ Options:

--archive <ARCHIVE>
Specify the use of a specific web archive. Can be used in combination with `--suggest`

[possible values: wayback]

--suggest
Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`

-m, --max-redirects <MAX_REDIRECTS>
Maximum number of allowed redirects

[default: 5]

--max-retries <MAX_RETRIES>
Maximum number of retries per request

[default: 3]

--max-concurrency <MAX_CONCURRENCY>
Maximum number of concurrent network requests

[default: 128]

-T, --threads <THREADS>
Number of threads to utilize. Defaults to number of cores available to the system

-u, --user-agent <USER_AGENT>
User agent

[default: lychee/x.y.z]

-i, --insecure
@@ -420,46 +420,46 @@ Options:
Test the specified file extensions for URIs when checking files locally.
Multiple extensions can be separated by commas. Extensions will be checked in
order of appearance.

Example: --fallback-extensions html,htm,php,asp,aspx,jsp,cgi

--header <HEADER>
Custom request header

-a, --accept <ACCEPT>
A List of accepted status codes for valid links

The following accept range syntax is supported: [start]..[=]end|code. Some valid
examples are:

- 200..=204
- 200..204
- ..=204
- ..204
- 200

Use "lychee --accept '200..=204, 429, 500' <inputs>..." to provide a comma-
separated list of accepted status codes. This example will accept 200, 201,
202, 203, 204, 429, and 500 as valid status codes.

[default: 100..=103,200..=299]

--include-fragments
Enable the checking of fragments in links

-t, --timeout <TIMEOUT>
Website timeout in seconds from connect to response finished

[default: 20]

-r, --retry-wait-time <RETRY_WAIT_TIME>
Minimum wait time in seconds between retries of failed requests

[default: 1]

-X, --method <METHOD>
Request method

[default: get]

-b, --base <BASE>
@@ -470,12 +470,18 @@ Options:

--github-token <GITHUB_TOKEN>
GitHub API token to use when checking github.com links, to avoid rate limiting

[env: GITHUB_TOKEN]

--skip-missing
Skip missing input files (default is to error if they don't exist)

--no-ignore
Do not skip files that would otherwise be ignored by '.gitignore', '.ignore', or the global ignore file

--hidden
Do not skip hidden directories and files

--include-verbatim
Find links in verbatim sections like `pre`- and `code` blocks

@@ -487,13 +493,13 @@ Options:

--mode <MODE>
Set the output display mode. Determines how results are presented in the terminal

[default: color]
[possible values: plain, color, emoji]

-f, --format <FORMAT>
Output format of final status report

[default: compact]
[possible values: compact, detailed, json, markdown, raw]

3 changes: 2 additions & 1 deletion examples/collect_links/collect_links.rs
@@ -4,7 +4,6 @@ use std::path::PathBuf;
use tokio_stream::StreamExt;

#[tokio::main]
- #[allow(clippy::trivial_regex)]
async fn main() -> Result<()> {
// Collect all links from the following inputs
let inputs = vec![
@@ -24,6 +23,8 @@ async fn main() -> Result<()> {

let links = Collector::new(None) // base
.skip_missing_inputs(false) // don't skip missing inputs? (default=false)
.skip_hidden(false) // skip hidden files? (default=true)
.skip_ignored(false) // skip files that are ignored by git? (default=true)
.use_html5ever(false) // use html5ever for parsing? (default=false)
.collect_links(inputs) // base url or directory
.collect::<Result<Vec<_>>>()
1 change: 1 addition & 0 deletions fixtures/hidden/.hidden/file.md
@@ -0,0 +1 @@
https://wikipedia.org
1 change: 1 addition & 0 deletions fixtures/ignore/.ignore
@@ -0,0 +1 @@
ignored-file.md
1 change: 1 addition & 0 deletions fixtures/ignore/ignored-file.md
@@ -0,0 +1 @@
https://archlinux.org
File renamed without changes.
File renamed without changes.
2 changes: 2 additions & 0 deletions lychee-bin/src/main.rs
@@ -292,6 +292,8 @@ async fn run(opts: &LycheeOptions) -> Result<i32> {

let mut collector = Collector::new(opts.config.base.clone())
.skip_missing_inputs(opts.config.skip_missing)
.skip_hidden(!opts.config.hidden)
.skip_ignored(!opts.config.no_ignore)
.include_verbatim(opts.config.include_verbatim)
// File a bug if you rely on this envvar! It's going to go away eventually.
.use_html5ever(std::env::var("LYCHEE_USE_HTML5EVER").map_or(false, |x| x == "1"));
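The two negations above are easy to misread: the new CLI flags are opt-ins (`--hidden`, `--no-ignore`), while the `Collector` builder methods are phrased as skips. A minimal sketch of that mapping follows. The `Flags` and `WalkSettings` types are hypothetical stand-ins for `opts.config` and the collector's internal state; lychee does not define types with these names.

```rust
/// Hypothetical stand-in for the two new CLI flags.
/// Both default to `false`, matching `#[serde(default)]` on a `bool`.
#[derive(Debug, Default, Clone, Copy)]
struct Flags {
    hidden: bool,
    no_ignore: bool,
}

/// Hypothetical stand-in for what the collector is told to skip.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct WalkSettings {
    skip_hidden: bool,
    skip_ignored: bool,
}

/// Invert the opt-in flags into "skip" settings, as `run` does above.
fn walk_settings(flags: Flags) -> WalkSettings {
    WalkSettings {
        skip_hidden: !flags.hidden,
        skip_ignored: !flags.no_ignore,
    }
}
```

With no flags set, both skips are enabled, so hidden and ignored files are left out unless the user asks for them.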
11 changes: 11 additions & 0 deletions lychee-bin/src/options.rs
@@ -438,6 +438,17 @@ separated list of accepted status codes. This example will accept 200, 201,
#[serde(default)]
pub(crate) skip_missing: bool,

/// Do not skip files that would otherwise be ignored by
/// '.gitignore', '.ignore', or the global ignore file.
#[arg(long)]
#[serde(default)]
pub(crate) no_ignore: bool,

/// Do not skip hidden directories and files.
#[arg(long)]
#[serde(default)]
pub(crate) hidden: bool,

/// Find links in verbatim sections like `pre`- and `code` blocks
#[arg(long)]
#[serde(default)]
42 changes: 40 additions & 2 deletions lychee-bin/tests/cli.rs
@@ -467,6 +467,44 @@ mod cli {
cmd.arg(&filename).arg("--skip-missing").assert().success();
}

#[test]
fn test_skips_hidden_files_by_default() {
main_command()
.arg(fixtures_path().join("hidden/"))
.assert()
.success()
.stdout(contains("0 Total"));
}

#[test]
fn test_include_hidden_file() {
main_command()
.arg(fixtures_path().join("hidden/"))
.arg("--hidden")
.assert()
.success()
.stdout(contains("1 Total"));
}

#[test]
fn test_skips_ignored_files_by_default() {
main_command()
.arg(fixtures_path().join("ignore/"))
.assert()
.success()
.stdout(contains("0 Total"));
}

#[test]
fn test_include_ignored_file() {
main_command()
.arg(fixtures_path().join("ignore/"))
.arg("--no-ignore")
.assert()
.success()
.stdout(contains("1 Total"));
}
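The hidden-file tests rely on the fixture `fixtures/hidden/.hidden/file.md` being treated as hidden because one of its path components starts with a dot. The sketch below shows that rule in isolation, using only the standard library. It is a simplified assumption for illustration: the actual traversal is delegated to the `ignore` crate, and `is_hidden` is a hypothetical helper, not part of lychee or of that crate.

```rust
use std::path::{Component, Path};

/// Simplified assumption: a path counts as hidden if any normal
/// component (directory or file name) starts with a dot.
/// `.` and `..` are not normal components and are ignored here.
fn is_hidden(path: &Path) -> bool {
    path.components().any(|component| match component {
        Component::Normal(name) => name.to_str().map_or(false, |s| s.starts_with('.')),
        _ => false,
    })
}
```

Under this rule `fixtures/hidden/.hidden/file.md` is hidden, while `fixtures/ignore/ignored-file.md` is not. The latter is excluded by the `.ignore` file next to it, not by its name.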

#[tokio::test]
async fn test_glob() -> Result<()> {
// using Result to be able to use `?`
@@ -755,7 +793,7 @@ mod cli {
#[test]
fn test_lycheeignore_file() -> Result<()> {
let mut cmd = main_command();
- let test_path = fixtures_path().join("ignore");
+ let test_path = fixtures_path().join("lycheeignore");

let cmd = cmd
.current_dir(test_path)
@@ -776,7 +814,7 @@ mod cli {
#[test]
fn test_lycheeignore_and_exclude_file() -> Result<()> {
let mut cmd = main_command();
- let test_path = fixtures_path().join("ignore");
+ let test_path = fixtures_path().join("lycheeignore");
let excludes_path = test_path.join("normal-exclude-file");

cmd.current_dir(test_path)
32 changes: 18 additions & 14 deletions lychee-bin/tests/usage.rs
@@ -20,6 +20,22 @@ mod readme {
fs::read_to_string(readme_path).unwrap()
}

/// Remove line `[default: lychee/x.y.z]` from the string
fn remove_lychee_version_line(string: &str) -> String {
string
.lines()
.filter(|line| !line.contains("[default: lychee/"))
.collect::<Vec<_>>()
.join("\n")
}

fn trim_empty_lines(str: &str) -> String {
str.lines()
.map(|line| if line.trim().is_empty() { "" } else { line })
.collect::<Vec<_>>()
.join("\n")
}
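The two helpers normalize text so that the `--help` output and the README usage block can be compared line by line. The standalone sketch below copies them with the parameter renamed to `text`. The sample strings in the usage note are made up, and the reading that only the help output carries whitespace-only lines is an inference from this diff, not something the PR states.

```rust
/// Drop the version-dependent `[default: lychee/x.y.z]` line so the
/// comparison does not break on every release.
fn remove_lychee_version_line(text: &str) -> String {
    text.lines()
        .filter(|line| !line.contains("[default: lychee/"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Replace whitespace-only lines with truly empty lines; all other
/// lines are kept unchanged, including their indentation.
fn trim_empty_lines(text: &str) -> String {
    text.lines()
        .map(|line| if line.trim().is_empty() { "" } else { line })
        .collect::<Vec<_>>()
        .join("\n")
}
```

For a help-like input `"User agent\n    \n[default: lychee/0.0.0]\n-i, --insecure"`, applying both helpers yields `"User agent\n\n-i, --insecure"`. That equals the result of `remove_lychee_version_line` alone on a README-like input `"User agent\n\n[default: lychee/x.y.z]\n-i, --insecure"`, whose blank lines are already empty.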

/// Test that the USAGE section in `README.md` is up to date with
/// `lychee --help`.
/// Only unix: might not work with windows CRLF line-endings returned from
@@ -37,13 +53,7 @@ mod readme {
.ok_or("Usage not found in help")?;
let usage_in_help = &help_output[usage_in_help_start..];

- // Remove line `[default: lychee/0.1.0]` from the help output
- let usage_in_help = usage_in_help
- .lines()
- .filter(|line| !line.contains("[default: lychee/"))
- .collect::<Vec<_>>()
- .join("\n");
-
+ let usage_in_help = trim_empty_lines(&remove_lychee_version_line(usage_in_help));
let readme = load_readme_text();
let usage_start = readme
.find(USAGE_STRING)
@@ -52,13 +62,7 @@ mod readme {
.find("\n```")
.ok_or("End of usage not found in README")?;
let usage_in_readme = &readme[usage_start..usage_start + usage_end];
-
- // Remove line `[default: lychee/0.1.0]` from the README
- let usage_in_readme = usage_in_readme
- .lines()
- .filter(|line| !line.contains("[default: lychee/"))
- .collect::<Vec<_>>()
- .join("\n");
+ let usage_in_readme = remove_lychee_version_line(usage_in_readme);

assert_eq!(usage_in_readme, usage_in_help);
Ok(())
2 changes: 1 addition & 1 deletion lychee-lib/Cargo.toml
@@ -24,8 +24,8 @@ html5ever = "0.28.0"
html5gum = "0.5.7"
http = "1.0.0"
hyper = "1.3.1"
+ ignore = "0.4.23"
ip_network = "0.4.1"
- jwalk = "0.8.1"
linkify = "0.10.0"
log = "0.4.22"
octocrab = "0.39.0"