⚠️ This project is no longer actively maintained: feel free to contact me if you need more information.
LoggyLogs is a parser for server logs. Its goal is to help detect vulnerabilities in HTTP requests. It supports NGINX and Apache logs. This tool was created during a professional engagement and is shared for educational purposes only.
For each HTTP request, LoggyLogs is capable of detecting:
- If the host is valid
- If the IP address is valid
- If the HTTP status code is valid
- If HTTPS is properly enforced, to ensure the traffic is encrypted
- If the user agent is a recognized web browser
- If endpoints are actually related to the API, to detect potential brute-force attacks
- If the body size is not too big, as those could lead to denial-of-service (DoS) attacks
- If non-secure HTTP methods are used (e.g. TRACE or DEBUG), as those could lead to information disclosure
Note: All the tests mentioned above can be run at once!
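To make the checks above more concrete, here is a minimal sketch of how one log entry could be split into fields before being tested. It assumes the NGINX/Apache "combined" log format; the regular expression and the names `COMBINED_LOG`, `parse_line` and `uses_risky_method` are illustrative and are not taken from the LoggyLogs source.

```python
import re
from typing import Optional

# Assumption: entries follow the "combined" access log format.
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Methods named in the feature list above.
RISKY_METHODS = {"TRACE", "DEBUG"}


def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one log entry, or None if the line does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


def uses_risky_method(entry: dict) -> bool:
    """Tell whether the request used one of the non-secure HTTP methods."""
    return entry["method"].upper() in RISKY_METHODS


# Made-up log entry, for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2021:13:55:36 +0000] '
          '"TRACE /1/status HTTP/1.1" 405 0 "-" "curl/7.68.0"')
entry = parse_line(sample)
print(entry["ip"], entry["method"], uses_risky_method(entry))
```

Each of the other checks (status code, body size, user agent, and so on) could then work on the same dictionary of fields.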
- Getting started: This section provides everything that is required to install LoggyLogs.
- Usage: This section shows how to configure LoggyLogs to get the best possible output. It includes:
  - First run: How to run LoggyLogs for the first time
  - Tweak parameters: How to modify several values in the source code
- Examples: A few examples of outputs are shown in this section.
- Going further: LoggyLogs has been designed as an open source project since day 1. This section includes:
  - Code documentation: Clarifies the tool's internals and explains how to generate the source code documentation
  - Improvements: Additional features that could be implemented to make LoggyLogs even more efficient
- License: The license of the project.
- Clone this repository:
git clone https://github.com/sljrobin/LoggyLogs
- Go to the LoggyLogs directory:
cd LoggyLogs/
- Create and activate a Python virtual environment:
python3 -m venv venv
source ./venv/bin/activate
- Install the requirements with pip:
pip install -r requirements.txt
- First, make sure the Python virtual environment is activated by running:
source ./venv/bin/activate
- Run LoggyLogs for the first time:
python loggylogs.py --help
% python loggylogs.py --help
usage: loggylogs.py [-h] [--all] [--bodysize] [--host] [--ip] [--httpmethod] [--httpprotocol] [--statuscode] [--url] [--useragent]
optional arguments:
-h, --help show this help message and exit
--all Perform all the available tests
--bodysize Check the body size is not too important
--host Check the host is valid
--ip Check the IP address is valid
--httpmethod Check which HTTP method is being used
--httpprotocol Check if HTTP connections (i.e. without encryption) are made to the server
--statuscode Check the HTTP status code
--url Check the URL is an endpoint available within the API documentation
--useragent Check the user agent is a recognized web browser
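The flags listed above behave like independent switches that can be combined. As a rough sketch, a command line like this could be declared with argparse as follows. The flag names and help strings are copied from the help output; `TESTS`, `build_parser` and `selected_tests` are hypothetical names, and the real loggylogs.py may be organized differently.

```python
import argparse

# Flag names and help strings copied from the --help output above.
TESTS = {
    "bodysize": "Check the body size is not too important",
    "host": "Check the host is valid",
    "ip": "Check the IP address is valid",
    "httpmethod": "Check which HTTP method is being used",
    "httpprotocol": "Check if HTTP connections (i.e. without encryption) are made to the server",
    "statuscode": "Check the HTTP status code",
    "url": "Check the URL is an endpoint available within the API documentation",
    "useragent": "Check the user agent is a recognized web browser",
}


def build_parser() -> argparse.ArgumentParser:
    """Declare one on/off switch per test, plus --all."""
    parser = argparse.ArgumentParser(prog="loggylogs.py")
    parser.add_argument("--all", action="store_true",
                        help="Perform all the available tests")
    for name, text in TESTS.items():
        parser.add_argument(f"--{name}", action="store_true", help=text)
    return parser


def selected_tests(args: argparse.Namespace) -> list:
    """Return the names of the tests requested on the command line."""
    if args.all:
        return list(TESTS)
    return [name for name in TESTS if getattr(args, name)]


# Equivalent to: python loggylogs.py --statuscode --ip
args = build_parser().parse_args(["--statuscode", "--ip"])
print(selected_tests(args))
```

With this layout, `--all` simply selects every entry of the table, so adding a new test only requires one new entry.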
Under the data/ directory:
- Put the logs to analyze here:
  - An example file named samples.log is already present
  - Feel free to replace it
  - If other log files need to be analyzed, change the LOGS_PATH constant in loggylogs.py (e.g. LOGS_PATH = './data/new-samples.log')
- If need be, add more user agents:
  - The user-agents.txt file already contains an extensive list of user agents
  - Other lists are available on the Internet, like this one
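The two data files described above could be consumed along the following lines. This is only a sketch under stated assumptions: it supposes that user-agents.txt holds one user agent per line and that a user agent is recognized when it matches an entry exactly. The helper names (`load_user_agents`, `is_known_user_agent`, `iter_log_lines`) are hypothetical, and the demo works on temporary files rather than on the real data/ directory.

```python
import tempfile
from pathlib import Path


def load_user_agents(path) -> set:
    """Read one user agent per line, ignoring blank lines (assumed file layout)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def is_known_user_agent(user_agent: str, known: set) -> bool:
    """Exact-match lookup; the real tool may compare differently."""
    return user_agent.strip() in known


def iter_log_lines(paths):
    """Yield (path, line) pairs from several log files, one file after another."""
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                yield str(path), line.rstrip("\n")


# Demo on throwaway files so that the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    ua_file = Path(tmp) / "user-agents.txt"
    ua_file.write_text("Mozilla/5.0 (X11; Linux x86_64)\n\ncurl/7.68.0\n",
                       encoding="utf-8")
    known = load_user_agents(ua_file)

    (Path(tmp) / "a.log").write_text("first\n", encoding="utf-8")
    (Path(tmp) / "b.log").write_text("second\nthird\n", encoding="utf-8")
    log_files = sorted(Path(tmp).glob("*.log"))
    lines = [line for _, line in iter_log_lines(log_files)]

print(len(known), lines)
```

Iterating over a list of paths is one way to analyze several files without editing LOGS_PATH between runs.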
In the lib/scanner.py file:
- Depending on which hosts and endpoints are valid, modify the self.__hosts and self.__urls variables
- An example is shown below:
self.__hosts = ['https://www.google.com/', 'https://status.google.com/']
self.__urls = ['/1/indexing', '/1/infrastructure', '/1/inventory', '/1/latency', '/1/reachability', '/1/status']
- In addition, it might be useful to modify the self.__limit_body_size variable, which sets the size above which LoggyLogs raises an alert because an HTTP request is considered too large
- The default size has been set to 1,500 bytes, as shown below:
self.__limit_body_size = 1500
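To illustrate how these three values might be used, the sketch below applies them as an allowlist for hosts and URLs and as an upper bound for the body size. It reuses the example values shown above, but the module-level constants and function bodies are assumptions: in LoggyLogs these values live as private attributes of the Scanner class, and its comparison logic may differ (for instance in how the host is normalized).

```python
# Example values taken from the snippets above.
HOSTS = ['https://www.google.com/', 'https://status.google.com/']
URLS = ['/1/indexing', '/1/infrastructure', '/1/inventory',
        '/1/latency', '/1/reachability', '/1/status']
LIMIT_BODY_SIZE = 1500  # bytes


def check_host(host: str) -> bool:
    """Allowlist approach: only hosts listed in HOSTS are considered valid."""
    return host in HOSTS


def check_url(url: str) -> bool:
    """Compare the path only, ignoring any query string (an assumption)."""
    path = url.split("?", 1)[0]
    return path in URLS


def check_body_size(size: int) -> bool:
    """Return True when the request body stays within the limit."""
    return size <= LIMIT_BODY_SIZE


print(check_host('https://status.google.com/'))
print(check_url('/1/status?verbose=1'))
print(check_url('/admin'))
print(check_body_size(4096))
```

A request for an endpoint outside URLS, such as /admin here, is the kind of entry that could indicate someone probing the API.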
🎉 Everything is now ready!
Once LoggyLogs is set up, it can parse HTTP requests and detect potentially suspicious activity. To run it, use the following command:
python loggylogs.py --<test>
Example of output with the --bodysize test:
python loggylogs.py --bodysize
Example of output with the --host test:
python loggylogs.py --host
Example of output with the --statuscode and --ip tests:
python loggylogs.py --statuscode
python loggylogs.py --ip
As a project never ends, the following might help the most curious.
- The source code of LoggyLogs has been thoroughly documented to help people add new features or simply improve the code
- Because the code is commented, generating documentation is easy
- Among the most popular solutions, we recommend pydoc for generating the documentation
- Examples:
  - Snippet of the documentation for the Scanner class in lib/scanner.py:
NAME
scanner
CLASSES
builtins.object
Scanner
class Scanner(builtins.object)
| Scanner(display)
|
| Methods defined here:
|
| __init__(self, display)
| Initialize the Scanner object.
|
| :param Display display: a Display object from the internal library.
|
| check_all(self)
| Perform all the checks previously described.
|
| check_body_size(self)
| Check the size (in bytes) of the HTTP request. Large requests might be used during denial of service (DoS)
| attacks to exhaust the server. If the request appears too large, display an error.
|
| check_host(self)
| Check the host used to perform the request is related to the API, following a whitelist approach. If not,
| display an error.
[...]
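Documentation like the snippet above can also be produced from Python itself. The sketch below uses `pydoc.render_doc` on a stand-in class, so that it runs without the LoggyLogs sources; the class body is a placeholder whose docstrings are copied from the snippet, not the real implementation.

```python
import pydoc


class Scanner:
    """Stand-in for the real class, only here to have something to document."""

    def __init__(self, display):
        """Initialize the Scanner object.

        :param Display display: a Display object from the internal library.
        """
        self.display = display

    def check_all(self):
        """Perform all the checks previously described."""


# plaintext avoids the terminal bold sequences of the default renderer.
text = pydoc.render_doc(Scanner, renderer=pydoc.plaintext)
print(text)
```

From a shell, `python -m pydoc lib.scanner` should display the same kind of page for the real module, and adding `-w` writes it as HTML, assuming the command is run from the repository root so that lib is importable.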
Improvements could be added to LoggyLogs to make it even more efficient. Several thoughts are shared below:
- For the IP address, a DNS lookup could be implemented. This, used in conjunction with a blacklist of domains, could be used to detect potential DoS or DDoS attacks
- Requests could be grouped by source IP address, and the timestamps of consecutive entries in each group could then be inspected (note that a method to extract the date of each log entry has already been created). This could help detect potential DoS attacks
- For each URL, a search for specific characters could help detect potential injection attacks. For instance, characters like <, {, ;, or ' might be injected to test for cross-site scripting (XSS) attacks or SQL injections
- Still with the URL, lists of known endpoints could be banned. For example, the lists of DirBuster or other web wordlists could be used as blacklists of endpoints. On the server side, it would be necessary to ensure that all the proper checks have been implemented so that those resources cannot be accessed without proper authorization
- Instead of using a file, regular expressions for user agents could be added, just like this one
- Log analysis could be optimized with specialized libraries. The advertools library looks like an interesting option: with it, the logs are compressed and the analysis appears to be much faster
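Three of the ideas above (grouping requests by IP address and inspecting timestamps, searching URLs for suspicious characters, and matching user agents with a regular expression) can be sketched as follows. None of this exists in LoggyLogs: the function names, the thresholds, the timestamp format (taken from the "combined" log format) and the `BROWSER_UA` pattern are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from urllib.parse import unquote

# Characters named in the list above.
SUSPICIOUS_CHARS = set("<{;'")

# Hypothetical pattern, far from complete.
BROWSER_UA = re.compile(r"^Mozilla/5\.0 \(.+\) .*(Firefox|Chrome|Safari|Edg)/[\d.]+")


def suspicious_chars(url: str) -> list:
    """Return the suspicious characters found in the URL once percent-decoded."""
    return sorted(set(unquote(url)) & SUSPICIOUS_CHARS)


def looks_like_browser(user_agent: str) -> bool:
    """Regex-based alternative to a user agent file."""
    return BROWSER_UA.match(user_agent) is not None


def parse_timestamp(raw: str) -> datetime:
    """Parse a timestamp such as 10/Oct/2021:13:55:36 +0000 (assumed format)."""
    return datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")


def find_bursts(entries, window=timedelta(seconds=10), threshold=100) -> set:
    """Return the IPs that sent at least `threshold` requests within `window`."""
    by_ip = defaultdict(list)
    for ip, raw_time in entries:
        by_ip[ip].append(parse_timestamp(raw_time))

    flagged = set()
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Shrink the window until it spans at most `window`.
            while current - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(ip)
                break
    return flagged


# Made-up entries: five requests in five seconds versus two an hour apart.
entries = [("203.0.113.7", f"10/Oct/2021:13:55:{s:02d} +0000") for s in range(5)]
entries += [("198.51.100.2", "10/Oct/2021:13:55:00 +0000"),
            ("198.51.100.2", "10/Oct/2021:14:55:00 +0000")]

print(find_bursts(entries, threshold=5))
print(suspicious_chars("/1/status?id=1%27%3B--"))
print(looks_like_browser("curl/7.68.0"))
```

The window and threshold would need tuning against real traffic, since a burst that is normal for one API could be an attack on another.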