Feature: Export (brief) analysis to XML
rfc-st committed Dec 5, 2024
1 parent a3ac2e4 commit a71236d
Showing 6 changed files with 119 additions and 62 deletions.
78 changes: 42 additions & 36 deletions README.md
@@ -6,7 +6,7 @@
<a target="_blank" href="https://www.python.org/downloads/" title="Minimum Python version required to run this tool"><img src="https://img.shields.io/badge/Python-%3E%3D3.8-blue?labelColor=343b41"></a>
<a target="_blank" href="LICENSE" title="License of this tool"><img src="https://img.shields.io/badge/License-MIT-blue.svg?labelColor=343b41"></a>
<a target="_blank" href="https://github.com/rfc-st/humble/releases" title="Latest release of this tool"><img src="https://img.shields.io/github/v/release/rfc-st/humble?display_name=release&label=Latest%20Release&labelColor=343b41"></a>
<a target="_blank" href="https://github.com/rfc-st/humble/commits/master" title="Latest commit of this tool"><img src="https://img.shields.io/badge/Latest_Commit-2024--12--04-blue.svg?labelColor=343b41"></a>
<a target="_blank" href="https://github.com/rfc-st/humble/commits/master" title="Latest commit of this tool"><img src="https://img.shields.io/badge/Latest_Commit-2024--12--05-blue.svg?labelColor=343b41"></a>
<a target="_blank" href="https://github.com/rfc-st/humble/actions?query=workflow%3ACodeQL" title="Results of the last analysis of this tool with CodeQL"><img src="https://github.com/rfc-st/humble/workflows/CodeQL/badge.svg"></a>
<a target="_blank" href="https://pkg.kali.org/pkg/humble" title="Official tool in Kali Linux"><img src="https://img.shields.io/badge/Kali%20Linux-Tool-blue?labelColor=343b41"></a>
<br />
@@ -61,7 +61,7 @@
:heavy_check_mark: Browser support references for enabled HTTP security headers: provided by https://caniuse.com/.<br />
:heavy_check_mark: Two types of analysis: brief and detailed, along with HTTP response headers.<br />
:heavy_check_mark: Can exclude specific HTTP response headers from the analysis.<br />
:heavy_check_mark: Can export each analysis to CSV, HTML5, JSON, PDF 1.4 and TXT (and in a filename and path of your choice).<br />
:heavy_check_mark: Can export each analysis to CSV, HTML5, JSON, PDF 1.4, TXT and XML (and in a filename and path of your choice).<br />
:heavy_check_mark: Can analyze '_raw response files_': text files with HTTP response headers and values. Ex: curl option '<a href="https://curl.se/docs/manpage.html" target="_blank">--dump-header</a>'.<br />
:heavy_check_mark: Highlights <a href="https://developer.mozilla.org/en-US/docs/MDN/Writing_guidelines/Experimental_deprecated_obsolete" target="_blank">experimental</a> headers in each analysis.<br />
:heavy_check_mark: Each detailed analysis may include up to dozens of official links, references and technical articles.<br />
@@ -148,6 +148,12 @@ Options used: -f -g -p -U -s --hints
<img src="https://github.com/rfc-st/humble/blob/master/screenshots/humble_json_s.PNG" alt="(Linux) - Brief analysis saved as JSON" width=70% height=70%>
</p>
<br />
.: (Linux) - Brief analysis saved as XML. <a href="https://github.com/rfc-st/humble/raw/master/samples/humble_https_facebook_com_20241205_200353_en.xml">Example.</a><br />
<p></p>
<p align="center">
<img src="https://github.com/rfc-st/humble/blob/master/screenshots/humble_xml_s.PNG" alt="(Linux) - Brief analysis saved as XML" width=70% height=70%>
</p>
<br />
.: (Linux) - Analysis history file: Date, URL, Enabled, Missing, Fingerprint, Deprecated/Insecure, Empty headers & Total warnings (the four previous totals).<br />
<p></p>
<p align="center">
@@ -247,45 +253,45 @@ $ docker rmi humble:1.42
(Linux) $ python3 humble.py
(macOS) $ python3 humble.py

usage: humble.py [-h] [-a] [-b] [-df] [-e [TESTSSL_PATH]] [-f [FINGERPRINT_TERM]] [-g] [-grd] [-if INPUT_FILE] [-l {es}] [-lic] [-o {csv,html,json,pdf,txt}] [-of OUTPUT_FILE]
[-op OUTPUT_PATH] [-r] [-s [SKIP_HEADERS ...]] [-u URL] [-ua USER_AGENT] [-v]
usage: humble.py [-h] [-a] [-b] [-df] [-e [TESTSSL_PATH]] [-f [FINGERPRINT_TERM]] [-g] [-grd] [-if INPUT_FILE] [-l {es}] [-lic] [-o {csv,html,json,pdf,txt,xml}]
[-of OUTPUT_FILE] [-op OUTPUT_PATH] [-r] [-s [SKIP_HEADERS ...]] [-u URL] [-ua USER_AGENT] [-v]

'humble' (HTTP Headers Analyzer) | https://github.com/rfc-st/humble | v.2024-12-03
'humble' (HTTP Headers Analyzer) | https://github.com/rfc-st/humble | v.2024-12-05

options:
-h, --help show this help message and exit
-a Shows statistics of the performed analysis; if the '-u' parameter is omitted they will be global
-b Shows overall findings; if omitted detailed ones will be shown
-df Do not follow redirects; if omitted the last redirection will be the one analyzed
-e [TESTSSL_PATH] Shows TLS/SSL checks; requires the PATH of https://testssl.sh/
-f [FINGERPRINT_TERM] Shows fingerprint statistics; if 'FINGERPRINT_TERM' (e.g., 'Google') is omitted the top 20 results will be shown
-g Shows guidelines for enabling security HTTP response headers on popular frameworks, servers and services
-grd Shows the checks to grade an analysis, along with advice for improvement
-if INPUT_FILE Analyzes 'INPUT_FILE': must contain HTTP response headers and values separated by ': '; E.g. 'server: nginx'.
-l {es} Defines the language for displaying analysis, errors and messages; if omitted, will be shown in English
-lic Shows the license for 'humble', along with permissions, limitations and conditions.
-o {csv,html,json,pdf,txt} Exports analysis to 'humble_scheme_URL_port_yyyymmdd_hhmmss_language.ext' file; csv/json will have a brief analysis
-of OUTPUT_FILE Exports analysis to 'OUTPUT_FILE'; if omitted the default filename of the parameter '-o' will be used
-op OUTPUT_PATH Exports analysis to 'OUTPUT_PATH'; must be absolute. If omitted the PATH of 'humble.py' will be used
-r Shows HTTP response headers and a detailed analysis; '-b' parameter will take priority
-s [SKIP_HEADERS ...] Skips 'deprecated/insecure' and 'missing' checks for the indicated 'SKIP_HEADERS' (separated by spaces)
-u URL Scheme, host and port to analyze. E.g. https://google.com
-ua USER_AGENT User-Agent ID from 'additional/user_agents.txt' file to use. '0' will show all and '1' is the default
-v, --version Checks for updates at https://github.com/rfc-st/humble
-h, --help show this help message and exit
-a Shows statistics of the performed analysis; if the '-u' parameter is omitted they will be global
-b Shows overall findings; if omitted detailed ones will be shown
-df Do not follow redirects; if omitted the last redirection will be the one analyzed
-e [TESTSSL_PATH] Shows TLS/SSL checks; requires the PATH of https://testssl.sh/
-f [FINGERPRINT_TERM] Shows fingerprint statistics; if 'FINGERPRINT_TERM' (e.g., 'Google') is omitted the top 20 results will be shown
-g Shows guidelines for enabling security HTTP response headers on popular frameworks, servers and services
-grd Shows the checks to grade an analysis, along with advice for improvement
-if INPUT_FILE Analyzes 'INPUT_FILE': must contain HTTP response headers and values separated by ': '; E.g. 'server: nginx'.
-l {es} Defines the language for displaying analysis, errors and messages; if omitted, will be shown in English
-lic Shows the license for 'humble', along with permissions, limitations and conditions.
-o {csv,html,json,pdf,txt,xml} Exports analysis to 'humble_scheme_URL_port_yyyymmdd_hhmmss_language.ext' file; csv/json/xml will have a brief analysis
-of OUTPUT_FILE Exports analysis to 'OUTPUT_FILE'; if omitted the default filename of the parameter '-o' will be used
-op OUTPUT_PATH Exports analysis to 'OUTPUT_PATH'; must be absolute. If omitted the PATH of 'humble.py' will be used
-r Shows HTTP response headers and a detailed analysis; '-b' parameter will take priority
-s [SKIP_HEADERS ...] Skips 'deprecated/insecure' and 'missing' checks for the indicated 'SKIP_HEADERS' (separated by spaces)
-u URL Scheme, host and port to analyze. E.g. https://google.com
-ua USER_AGENT User-Agent ID from 'additional/user_agents.txt' file to use. '0' will show all and '1' is the default
-v, --version Checks for updates at https://github.com/rfc-st/humble

examples:
-u URL -a Shows statistics of the analysis performed against the URL
-u URL -b Analyzes URL and reports overall findings
-u URL -b -o csv Analyzes URL and exports overall findings to CSV format
-u URL -l es Analyzes URL and reports (in Spanish) detailed findings
-u URL -o pdf Analyzes URL and exports detailed findings to PDF format
-u URL -o html -of test Analyzes URL and exports detailed findings to HTML format and 'test' filename
-u URL -o pdf -op D:/Tests Analyzes URL and exports detailed findings to PDF format and 'D:/Tests' path
-u URL -r Analyzes URL and reports detailed findings along with HTTP response headers
-u URL -s ETag NEL Analyzes URL and skips 'deprecated/insecure' and 'missing' checks for 'ETag' and 'NEL' headers
-u URL -ua 4 Analyzes URL using the fourth User-Agent of 'additional/user_agents.txt' file
-a -l es Shows statistics (in Spanish) of the analysis performed against all URLs
-f Google Shows HTTP fingerprint headers related to the term 'Google'
-u URL -a Shows statistics of the analysis performed against the URL
-u URL -b Analyzes URL and reports overall findings
-u URL -b -o csv Analyzes URL and exports overall findings to CSV format
-u URL -l es Analyzes URL and reports (in Spanish) detailed findings
-u URL -o pdf Analyzes URL and exports detailed findings to PDF format
-u URL -o html -of test Analyzes URL and exports detailed findings to HTML format and 'test' filename
-u URL -o pdf -op D:/Tests Analyzes URL and exports detailed findings to PDF format and 'D:/Tests' path
-u URL -r Analyzes URL and reports detailed findings along with HTTP response headers
-u URL -s ETag NEL Analyzes URL and skips 'deprecated/insecure' and 'missing' checks for 'ETag' and 'NEL' headers
-u URL -ua 4 Analyzes URL using the fourth User-Agent of 'additional/user_agents.txt' file
-a -l es Shows statistics (in Spanish) of the analysis performed against all URLs
-f Google Shows HTTP fingerprint headers related to the term 'Google'
```

## Advanced usage (Linux)
58 changes: 50 additions & 8 deletions humble.py
@@ -56,6 +56,7 @@
import sys
import contextlib
import concurrent.futures
import xml.etree.ElementTree as ET

# Third-Party imports
from colorama import Fore, Style, init
@@ -75,6 +76,12 @@
CSV_SECTION = ('0section', '0headers', '1enabled', '2missing', '3fingerprint',
'4depinsecure', '5empty', '6compat', '7result')
DELETED_LINES = '\x1b[1A\x1b[2K\x1b[1A\x1b[2K\x1b[1A\x1b[2K'
DTD_CONTENT = '''<!ELEMENT analysis (section+)>
<!ELEMENT section (item*)>
<!ATTLIST section name CDATA #REQUIRED>
<!ELEMENT item (#PCDATA)>
<!ATTLIST item name CDATA #IMPLIED>
'''
# https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers
EXP_HEADERS = ('activate-storage-access', 'critical-ch', 'document-policy',
'nel', 'no-vary-search', 'observe-browsing-topics',
@@ -126,7 +133,7 @@
URL_STRING = ('rfc-st', ' URL : ', 'caniuse')

current_time = datetime.now().strftime("%Y/%m/%d - %H:%M:%S")
local_version = datetime.strptime('2024-12-04', '%Y-%m-%d').date()
local_version = datetime.strptime('2024-12-05', '%Y-%m-%d').date()


class SSLContextAdapter(requests.adapters.HTTPAdapter):
@@ -758,7 +765,7 @@ def print_basic_info(export_filename):
def print_extended_info(args, reliable, status_code):
if args.skip_headers:
print_skipped_headers(args)
if args.output in ('csv', 'json'):
if args.output in ('csv', 'json', 'xml'):
print(get_detail('[limited_analysis_note]', replace=True))
if (status_code is not None and 400 <= status_code <= 451) or reliable or \
args.redirects or args.skip_headers:
@@ -1392,6 +1399,39 @@ def format_html_enabled(ln, sub_d):
return ln, ln_enabled


def generate_xml(temp_filename, final_filename):
    dtd_declaration = f'<!DOCTYPE analysis [\n{DTD_CONTENT}]\n>'
    root = ET.Element('analysis')
    with open(temp_filename, 'r', encoding='utf8') as txt_source:
        section = None
        stripped_txt = (line.strip() for line in txt_source)
        parse_xml(root, section, stripped_txt)
    xml_content = ET.tostring(root, encoding='unicode', xml_declaration=False)
    with open(final_filename, 'wb') as xml_final:
        xml_final.write(b'<?xml version="1.0" encoding="utf-8"?>\n')
        xml_final.write(dtd_declaration.encode('utf-8'))
        xml_final.write(xml_content.encode('utf-8'))
    print_export_path(final_filename, reliable)
    remove(temp_filename)
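
A minimal sketch of how a consumer might read a file written by `generate_xml`, assuming the layout that `DTD_CONTENT` declares: an `analysis` root, `section` elements with a required `name` attribute, and `item` children with an optional `name`. The sample document, its values, and the `summarize` helper are hypothetical and not part of humble.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that follows the structure declared in DTD_CONTENT;
# the section names and item values are invented for illustration.
SAMPLE = '''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE analysis [
<!ELEMENT analysis (section+)>
<!ELEMENT section (item*)>
<!ATTLIST section name CDATA #REQUIRED>
<!ELEMENT item (#PCDATA)>
<!ATTLIST item name CDATA #IMPLIED>
]
>
<analysis><section name="[0. Info]"><item name="URL">https://example.com</item></section><section name="[2. Missing]"><item>Permissions-Policy</item></section></analysis>'''


def summarize(xml_text):
    """Map each section name to a list of (item name, item text) pairs."""
    # Bytes input lets the parser honor the encoding in the XML declaration.
    root = ET.fromstring(xml_text.encode('utf-8'))
    summary = {}
    for section in root.findall('section'):
        items = [(item.get('name'), item.text)
                 for item in section.findall('item')]
        summary[section.get('name')] = items
    return summary


if __name__ == '__main__':
    for name, items in summarize(SAMPLE).items():
        print(name, items)
```

Note that `xml.etree.ElementTree` parses the inline DTD but does not validate against it, so a consumer that needs validation would have to use a validating parser instead.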


def parse_xml(root, section, stripped_txt):
    for line in stripped_txt:
        if not line:
            continue
        if line.startswith('['):
            section = ET.SubElement(root, 'section', {'name': line})
        elif section is not None:
            item = ET.SubElement(section, 'item')
            if ': ' in line:
                key, value = line.split(': ', 1)
                item.set('name', key.strip())
                item.text = value.strip()
            else:
                item.text = line
    return section
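
The parsing rules in `parse_xml` can be tried in isolation with a sketch like the following. It assumes, as the code above suggests, that the temporary text file consists of `[...]` section headers followed by either `key: value` lines or bare lines. `BRIEF_TEXT` and `text_to_xml` are hypothetical names, and the sample text is invented rather than real humble output.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the temporary text file of a brief analysis.
BRIEF_TEXT = """
orphan line before any section
[0. Info]
 URL: https://example.com
[2. Missing]
 Permissions-Policy
"""


def text_to_xml(lines):
    """Build an <analysis> tree from sectioned text, following parse_xml."""
    root = ET.Element('analysis')
    section = None
    for line in (raw.strip() for raw in lines):
        if not line:
            continue  # blank lines carry no information
        if line.startswith('['):
            # The header is stored verbatim, brackets included.
            section = ET.SubElement(root, 'section', {'name': line})
        elif section is not None:
            item = ET.SubElement(section, 'item')
            if ': ' in line:
                # Split only on the first ': ' so values may contain more.
                key, value = line.split(': ', 1)
                item.set('name', key.strip())
                item.text = value.strip()
            else:
                item.text = line
    return root


root = text_to_xml(BRIEF_TEXT.splitlines())
print(ET.tostring(root, encoding='unicode'))
```

Two consequences of these rules are visible here: lines that appear before the first section header are silently dropped, and a bare line becomes an `item` without a `name` attribute, which is why the DTD marks that attribute as `#IMPLIED`.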


def print_http_exception(exception_id, exception_v):
delete_lines()
print("")
@@ -1544,7 +1584,7 @@ def manage_http_request(status_code, reliable, body):


def custom_help_formatter(prog):
return RawDescriptionHelpFormatter(prog, max_help_position=30)
return RawDescriptionHelpFormatter(prog, max_help_position=34)


# Main functionality for argparse
@@ -1581,9 +1621,9 @@ def custom_help_formatter(prog):
parser.add_argument("-lic", dest='license', action="store_true", help="Shows \
the license for 'humble', along with permissions, limitations and conditions.")
parser.add_argument("-o", dest='output', choices=['csv', 'html', 'json', 'pdf',
'txt'], help="Exports \
analysis to 'humble_scheme_URL_port_yyyymmdd_hhmmss_language.ext' file; \
csv/json will have a brief analysis")
'txt', 'xml'], help="Exports\
analysis to 'humble_scheme_URL_port_yyyymmdd_hhmmss_language.ext' file; \
csv/json/xml will have a brief analysis")
parser.add_argument("-of", dest='output_file', type=str, help="Exports \
analysis to 'OUTPUT_FILE'; if omitted the default filename of the parameter \
'-o' will be used")
@@ -1664,8 +1704,8 @@ def custom_help_formatter(prog):
args.URL_A is None):
print_error_detail('[args_several]')

if args.output in ['csv', 'json'] and not args.brief:
print_error_detail('[args_csv_json]')
if args.output in ['csv', 'json', 'xml'] and not args.brief:
print_error_detail('[args_brief_filetype]')

skip_list, unsupported_headers = [], []
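
The brief-only rule in this hunk (CSV, JSON and now XML exports require `-b`) can be sketched on its own as below. Only the `-b` and `-o` options, their `dest` names and the `-o` choices come from the diff; `BRIEF_ONLY`, `build_parser` and `needs_brief_error` are hypothetical names, and the real tool prints a localized error via `print_error_detail` rather than returning a boolean.

```python
import argparse

# Export formats that, per this commit, only support a brief analysis.
BRIEF_ONLY = ('csv', 'json', 'xml')


def build_parser():
    """Reduced parser with just the two options involved in the check."""
    parser = argparse.ArgumentParser(prog='sketch')
    parser.add_argument('-b', dest='brief', action='store_true')
    parser.add_argument('-o', dest='output',
                        choices=['csv', 'html', 'json', 'pdf', 'txt', 'xml'])
    return parser


def needs_brief_error(argv):
    """Return True when a brief-only format is requested without '-b'."""
    args = build_parser().parse_args(argv)
    return args.output in BRIEF_ONLY and not args.brief


if __name__ == '__main__':
    print(needs_brief_error(['-o', 'xml']))        # '-b' is missing
    print(needs_brief_error(['-o', 'xml', '-b']))  # accepted combination
```

Keeping the formats in a single tuple means a later format with the same restriction needs one change instead of edits in both `print_extended_info` and the argument validation.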

@@ -2524,6 +2564,8 @@ def custom_help_formatter(prog):
generate_csv(tmp_filename, final_filename)
elif args.output == 'json':
generate_json(tmp_filename, final_filename)
elif args.output == 'xml':
generate_xml(tmp_filename, final_filename)
elif args.output == 'pdf':
# Optimized the loading of third-party dependencies and relevant logic
# for 'fpdf2', improving analysis speed for tasks that do not involve PDF
30 changes: 15 additions & 15 deletions l10n/details.txt
@@ -836,7 +836,7 @@ HTTP Response Headers
Note : The analysis may not be reliable because of the time it took for the URL to respond.

[limited_analysis_note]
Note : Exporting to CSV/JSON is currently limited to a brief analysis
Note : Exporting to CSV/JSON/XML is currently limited to a brief analysis

[analysis_redirects_note]
Note : The exact URL will be analyzed, without following redirects.
@@ -1501,8 +1501,8 @@ Error: The parameters '-b', '-df', '-'o', '-r' and '-s' require the parameter '-
[args_customfile]
Error: The parameter '-of' requires the parameters '-u' and '-o'.

[args_csv_json]
Error: The parameters '-o csv' and '-o json' require the parameter '-b'.
[args_brief_filetype]
Error: The parameters '-o csv', '-o json' and '-o xml' require the parameter '-b'.

[notestssl_file]
Error: 'testssl.sh' is not found in that PATH.
@@ -1536,18 +1536,18 @@ Values

[epilog_content]
examples:
-u URL -a Shows statistics of the analysis performed against the URL
-u URL -b Analyzes URL and reports overall findings
-u URL -b -o csv Analyzes URL and exports overall findings to CSV format
-u URL -l es Analyzes URL and reports (in Spanish) detailed findings
-u URL -o pdf Analyzes URL and exports detailed findings to PDF format
-u URL -o html -of test Analyzes URL and exports detailed findings to HTML format and 'test' filename
-u URL -o pdf -op D:/Tests Analyzes URL and exports detailed findings to PDF format and 'D:/Tests' path
-u URL -r Analyzes URL and reports detailed findings along with HTTP response headers
-u URL -s ETag NEL Analyzes URL and skips 'deprecated/insecure' and 'missing' checks for 'ETag' and 'NEL' headers
-u URL -ua 4 Analyzes URL using the fourth User-Agent of 'additional/user_agents.txt' file
-a -l es Shows statistics (in Spanish) of the analysis performed against all URLs
-f Google Shows HTTP fingerprint headers related to the term 'Google'
-u URL -a Shows statistics of the analysis performed against the URL
-u URL -b Analyzes URL and reports overall findings
-u URL -b -o csv Analyzes URL and exports overall findings to CSV format
-u URL -l es Analyzes URL and reports (in Spanish) detailed findings
-u URL -o pdf Analyzes URL and exports detailed findings to PDF format
-u URL -o html -of test Analyzes URL and exports detailed findings to HTML format and 'test' filename
-u URL -o pdf -op D:/Tests Analyzes URL and exports detailed findings to PDF format and 'D:/Tests' path
-u URL -r Analyzes URL and reports detailed findings along with HTTP response headers
-u URL -s ETag NEL Analyzes URL and skips 'deprecated/insecure' and 'missing' checks for 'ETag' and 'NEL' headers
-u URL -ua 4 Analyzes URL using the fourth User-Agent of 'additional/user_agents.txt' file
-a -l es Shows statistics (in Spanish) of the analysis performed against all URLs
-f Google Shows HTTP fingerprint headers related to the term 'Google'

[fng_value]
Value: