OSINT tool for estimating when a web page was written.
From Wikipedia:
Radiocarbon dating (also referred to as carbon dating or carbon-14 dating) is a method for determining the age of an object containing organic material by using the properties of radiocarbon (14C), a radioactive isotope of carbon.
While performing digital forensics or OSINT, it might be crucial to determine
when a certain blog post was written. Common CMSs make it easy to change
the displayed date of content, which affects both websites and RSS feeds. Moreover,
because most web pages are generated dynamically, investigators cannot rely on
the page's own Last-Modified HTTP header.
However, most users do not alter the timestamps of static resources that are
uploaded while writing articles. The Last-Modified header of linked images can
be leveraged to estimate the time period spent by the writer while preparing a
blog post. This period can be compared to what the CMS shows in order to detect
notable differences.
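The idea can be illustrated with a minimal sketch. This is not Carbon14's actual code: the names ImageCollector, parse_last_modified and fetch_last_modified are hypothetical, and the sketch assumes Python 3 with only the standard library. It collects the image URLs referenced by a page and turns a Last-Modified value into a UTC timestamp.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class ImageCollector(HTMLParser):
    """Collect absolute URLs of the <img> tags found in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.urls.append(urljoin(self.base_url, src))


def parse_last_modified(value):
    """Convert a Last-Modified header value to a timezone-aware UTC datetime."""
    stamp = parsedate_to_datetime(value)
    if stamp.tzinfo is None:
        # Assumption: a value without a usable offset is treated as UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    return stamp.astimezone(timezone.utc)


def fetch_last_modified(url):
    """Send a HEAD request and return the parsed header, or None if it is absent."""
    with urlopen(Request(url, method="HEAD")) as response:
        value = response.headers.get("Last-Modified")
    return parse_last_modified(value) if value else None


# Offline demonstration on an inline HTML fragment (no network access needed).
collector = ImageCollector("https://example.org/blog/post/")
collector.feed('<p><img src="/uploads/a.png"><img src="b.jpg"><img alt="x"></p>')
print(collector.urls)
print(parse_last_modified("Mon, 06 Mar 2017 14:27:17 GMT"))
```

In practice, fetch_last_modified would be called once per collected URL, and the resulting timestamps sorted to see when the images were uploaded. Servers that omit the header, or that rewrite it (for example behind a CDN), would make the estimate unreliable.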
Carbon14 accepts the target URL and an optional author name. It runs on both Python 3 and Python 2.
usage: carbon14.py [-h] [-a name] url

Date images on a web page.

positional arguments:
  url                   URL of the page

optional arguments:
  -h, --help            show this help message and exit
  -a name, --author name
                        author to be included in the report
The tool prints a simple report in Pandoc's extended Markdown format, which
can then be redirected to a file or piped through tee. For example:
carbon14.py 'https://eforensicsmag.com/extracting-data-damaged-ntfs-drives-andrea-lazzarotto/' > report.md
Here's a snippet of the output:
# Internal images
--------------------------------------------------------------------------------
Date (UTC)           Date (Europe/Rome)   URL
-------------------- -------------------- --------------------------------------
[...]
2017-03-06 14:27:17  2017-03-06 15:27:17  <https://eforensicsmag.com/wp-content/uploads/2017/03/image06-1.png>
2017-03-06 14:43:04  2017-03-06 15:43:04  <https://eforensicsmag.com/wp-content/uploads/2017/03/image04-1.png>
2017-03-06 14:48:22  2017-03-06 15:48:22  <https://eforensicsmag.com/wp-content/uploads/2017/03/image02-1.png>
We can infer that work on that article began on March 6, 2017.
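As a rough illustration of that inference, the sketch below parses the three rows shown in the snippet and reports the earliest and latest upload times. The helper names parse_rows and upload_window are hypothetical and not part of Carbon14. Because the snippet is elided, the window covers only the rows shown here.

```python
from datetime import datetime

# Rows copied from the report snippet: UTC date, local date, URL.
REPORT_ROWS = """\
2017-03-06 14:27:17 2017-03-06 15:27:17 <https://eforensicsmag.com/wp-content/uploads/2017/03/image06-1.png>
2017-03-06 14:43:04 2017-03-06 15:43:04 <https://eforensicsmag.com/wp-content/uploads/2017/03/image04-1.png>
2017-03-06 14:48:22 2017-03-06 15:48:22 <https://eforensicsmag.com/wp-content/uploads/2017/03/image02-1.png>
"""


def parse_rows(text):
    """Return (utc_datetime, url) pairs for every data row in the text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has five whitespace-separated fields:
        # UTC date, UTC time, local date, local time, <url>.
        if len(fields) != 5:
            continue
        try:
            stamp = datetime.strptime(" ".join(fields[:2]), "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # header or separator line, not a data row
        rows.append((stamp, fields[4].strip("<>")))
    return rows


def upload_window(rows):
    """Return the earliest and latest timestamps among the parsed rows."""
    stamps = [stamp for stamp, _ in rows]
    return min(stamps), max(stamps)


rows = parse_rows(REPORT_ROWS)
first, last = upload_window(rows)
print("First upload (UTC):", first)
print("Last upload (UTC): ", last)
print("Span:", last - first)
```

The earliest timestamp can then be set against the publication date shown by the CMS. A displayed date that precedes the first image upload would be one example of a notable difference.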
The Markdown syntax is text-based and lightweight. This means that the report can be used or printed as-is, like a normal text file. Optionally, examiners might want to convert it to a different format such as HTML, ODT or DOCX.
This optional step can be performed with Pandoc:
pandoc -s report.md -o report-web.html
pandoc report.md -o report-libreoffice.odt
pandoc report.md -o report-msword.docx
Pandoc can also be used to generate HTML reports with custom CSS files, PDF reports and several other outputs. Please refer to its documentation for further details.