sectxt update 0.9.2 (#68)
* fix leap year

Update the version and fix the leap year issue by improving the "one year later" check.

* fix issue with bom error

The BOM error message was shown every time, even when the file did not start with the byte order mark.

* adding local file test

Adding the option to test local files using the `is_local` parameter.

* bom replace first occurrence

---------

Co-authored-by: SanderKools-Ordina <[email protected]>
DigitalTrustCenter and SanderKools-Ordina authored Mar 7, 2024
1 parent 27d8524 commit ad85c74
Showing 3 changed files with 206 additions and 159 deletions.
79 changes: 43 additions & 36 deletions README.md
@@ -15,7 +15,6 @@ The package is available on pypi. It can be installed using pip:
## Usage

```python

>>> from sectxt import SecurityTXT
>>> s = SecurityTXT("www.example.com")
>>> s.is_valid()
@@ -26,7 +25,6 @@ True
## Validation

```python

>>> from sectxt import SecurityTXT
>>> s = SecurityTXT("www.example.com")
>>> s.errors
@@ -46,38 +44,38 @@ a dict with three keys:

### Possible errors

| code | message |
|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| "no_security_txt" | "security.txt could not be located." |
| "location" | "security.txt was located on the top-level path (legacy place), but must be placed under the '/.well-known/' path." |
| "invalid_uri_scheme" | "Insecure URI scheme HTTP is not allowed. The security.txt file access must use the HTTPS scheme" |
| "invalid_cert" | "security.txt must be served with a valid TLS certificate." |
| "no_content_type" | "HTTP Content-Type header must be sent." |
| "invalid_media" | "Media type in Content-Type header must be 'text/plain'." |
| "invalid_charset" | "Charset parameter in Content-Type header must be 'utf-8' if present." |
| "utf8" | "Content must be utf-8 encoded." |
| "no_expire" | "'Expires' field must be present." |
| "multi_expire" | "'Expires' field must not appear more than once." |
| "invalid_expiry" | "Date and time in 'Expires' field must be formatted according to ISO 8601." |
| "expired" | "Date and time in 'Expires' field must not be in the past." |
| "no_contact" | "'Contact' field must appear at least once." |
| "no_canonical_match" | "Web URI where security.txt is located must match with a 'Canonical' field. In case of redirecting either the first or last web URI of the redirect chain must match." |
| "multi_lang" | "'Preferred-Languages' field must not appear more than once." |
| "invalid_lang" | "Value in 'Preferred-Languages' field must match one or more language tags as defined in RFC5646, separated by commas." |
| "no_uri" | "Field '{field}' value must be a URI." |
| "no_https" | "Web URI must begin with 'https://'." |
| "prec_ws" | "There must be no whitespace before the field separator (colon)." |
| "no_space" | "Field separator (colon) must be followed by a space." |
| "empty_key" | "Field name must not be empty." |
| "empty_value" | "Field value must not be empty." |
| "invalid_line" | "Line must contain a field name and value, unless the line is blank or contains a comment." |
| "no_line_separators" | "Every line, including the last one, must end with either a carriage return and line feed characters or just a line feed character" |
| "signed_format_issue" | "Signed security.txt must start with the header '-----BEGIN PGP SIGNED MESSAGE-----'. " |
| "data_after_sig" | "Signed security.txt must not contain data after the signature." |
| "no_csaf_file" | "All CSAF fields must point to a provider-metadata.json file." |
| "pgp_data_error" | "Signed message did not contain a correct ASCII-armored PGP block." |
| "pgp_error" | "Decoding or parsing of the pgp message failed." |
| "bom_in_file" | "The Byte-Order Mark was found at the start of the file. Security.txt must be encoded using UTF-8 in Net-Unicode form, the BOM signature must not appear at the beginning." |
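
The `bom_in_file` error concerns a byte order mark at the very start of the file. As a rough illustration only (this is not the package's actual implementation, and `check_bom` is a hypothetical helper name), a leading UTF-8 BOM could be detected and stripped like this, leaving any later occurrence of the same bytes untouched:

```python
import codecs
from typing import Tuple


def check_bom(raw: bytes) -> Tuple[bool, bytes]:
    """Hypothetical helper: report whether the content starts with a UTF-8 BOM.

    Returns (bom_found, content) where content has the leading BOM removed.
    Only a BOM at position 0 counts; the same bytes elsewhere are left alone.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return True, raw[len(codecs.BOM_UTF8):]
    return False, raw


# A file that starts with the BOM would be flagged.
found, content = check_bom(codecs.BOM_UTF8 + b"Contact: mailto:security@example.com\n")
print(found)
```

Checking only the start of the content avoids reporting the error for files that do not begin with the byte order mark.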


### Possible recommendations
@@ -102,13 +100,22 @@ a dict with three keys:
The scraper attempts to find the security.txt of the given domain in the correct location, `/.well-known/security.txt`. It also looks in the old top-level location and over the insecure `http` scheme, both of which result in validation errors. To reduce the chance of errors when fetching the file from the domain, a user agent is added to the request headers. The user agent is `Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0`, which mimics Firefox running on Windows 7.
If a security.txt file is found, that file is then parsed. Any errors, recommendations or notifications that are found are returned.
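
The lookup described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the package's code: `candidate_urls` and `build_request` are hypothetical names, and the order in which the locations are tried is a guess.

```python
import urllib.request

# User agent quoted in the README.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0"
)


def candidate_urls(domain):
    """Hypothetical helper: list the locations a scraper might try.

    Assumed order: the correct HTTPS /.well-known/ location first, then the
    legacy top-level path, then the same two paths over insecure HTTP.
    """
    paths = ["/.well-known/security.txt", "/security.txt"]
    schemes = ["https", "http"]
    return [f"{scheme}://{domain}{path}" for scheme in schemes for path in paths]


def build_request(url):
    """Build (but do not send) a request carrying the browser-like user agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


for url in candidate_urls("www.example.com"):
    print(url)
```

Only the first of these locations is valid; a file found at any of the others would be reported with errors such as `location` or `invalid_uri_scheme`.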

### Test security.txt files locally

A local path can be passed as the url parameter. To enable this behaviour, set the `is_local` parameter to `True`.
In this mode only the contents of the given file are validated.

```python
>>> from sectxt import SecurityTXT
>>> s = SecurityTXT("/home/example/security.txt", is_local=True)
```
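
For experimenting with local validation, a small test file can be generated first. The sketch below is only an illustration: `write_sample_security_txt` and `validate_local` are hypothetical helper names, the contact address is a placeholder, and the 180-day expiry is an arbitrary choice. The only `sectxt` calls used are the ones shown in this README (`SecurityTXT(..., is_local=True)`, `is_valid()` and `errors`).

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def write_sample_security_txt(path):
    """Hypothetical helper: write a minimal security.txt for local testing."""
    # Arbitrary assumption: expire 180 days from now, in ISO 8601 form.
    expires = datetime.now(timezone.utc) + timedelta(days=180)
    lines = [
        "Contact: mailto:security@example.com",  # placeholder address
        f"Expires: {expires.strftime('%Y-%m-%dT%H:%M:%SZ')}",
    ]
    # Every line, including the last one, ends with a line feed.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")
    return Path(path)


def validate_local(path):
    """Validate a local file; requires the sectxt package to be installed."""
    from sectxt import SecurityTXT

    s = SecurityTXT(str(path), is_local=True)
    return s.is_valid(), s.errors
```

Calling `validate_local(write_sample_security_txt("security.txt"))` would then return the validity flag together with any reported errors.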

---

[1] The security.txt parser will check for the addition of the digital signature, but it will not verify the validity of the signature.

[2] Regarding code "unknown_field": According to RFC 9116 section 2.4, any fields that are not explicitly supported must be ignored. This parser does add a notification for unknown fields by default. This behaviour can be turned off using the parameter `recommend_unknown_fields`:
```python

>>> from sectxt import SecurityTXT
>>> s = SecurityTXT("www.example.com", recommend_unknown_fields=False)
```