A project to design and implement a general approach to file inspection.

punytroll/inspection

What?

inspection is a project to design and implement a general approach to file inspection.

An inspector from the inspection project provides a hierarchically structured view of a file's content, if the formats of that file or parts thereof are known. It is the express goal of this project to account for each and every bit in the relevant part of the file.
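To make the "every bit accounted for" goal concrete, here is a minimal Python sketch of a hierarchical result tree with a coverage check. It is only an illustration: the `Node` class and the `uncovered_bits` function are hypothetical names and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One named part of a file, covering bits [start, start + length)."""
    name: str
    start: int   # offset in bits from the beginning of the file
    length: int  # length in bits
    children: list = field(default_factory=list)


def uncovered_bits(node):
    """Return (start, length) gaps inside `node` that no child accounts for.

    Leaves count as fully accounted for; children are assumed not to overlap.
    """
    if not node.children:
        return []
    gaps = []
    position = node.start
    for child in sorted(node.children, key=lambda c: c.start):
        if child.start > position:
            gaps.append((position, child.start - position))
        gaps.extend(uncovered_bits(child))
        position = max(position, child.start + child.length)
    end = node.start + node.length
    if position < end:
        gaps.append((position, end - position))
    return gaps
```

An empty result means that every bit of the inspected part is covered by some named leaf.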

The tools are currently designed to present clear output on the command line. New file formats can be defined to extend the reach of the project. Specialized inspectors are provided to filter for certain file contents.

File formats

This project started out as an attempt to analyze ID3 tags in audio files. In that vein, support for more audio and other media formats has been added, in part only partially.

  • ID3 tags (ID3v1, ID3v1.1, ID3v2.2, ID3v2.3, ID3v2.4)
  • FLAC (full support, right down to the bits of the residual, including Vorbis comments)
  • APE tags (APEv2)
  • MPEGv1 (all frame headers in a stream)
  • RIFF
  • ASF
  • PWG Raster
  • URF Raster
  • Vorbis comments and file structure
  • WavPack
  • AppleSingle
  • BMP, ICO

Inspectors

Note: Inspectors don't consider the file extension when inspecting a file - instead, they always investigate the file's content!
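What content-based detection means can be illustrated with a short Python sketch that looks at well-known leading signatures instead of the file name. This is not the project's detection logic; the table and the function name `guess_leading_format` are assumptions made for this example.

```python
from typing import Optional

# Well-known byte signatures found at the very start of some formats.
LEADING_SIGNATURES = {
    b"ID3": "ID3v2 tag",
    b"fLaC": "FLAC stream",
    b"RIFF": "RIFF chunk",
    b"OggS": "Ogg page",
    b"wvpk": "WavPack block",
    b"BM": "BMP image",  # only two bytes, so this is a weak hint
}


def guess_leading_format(data: bytes) -> Optional[str]:
    """Return a format guess based on the first bytes, or None if unknown."""
    for signature, name in LEADING_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None
```

A matching signature is only a first hint. A real inspector still has to parse the complete structure to confirm the guess.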

generalinspector

First and foremost is the generalinspector. This program is meant to parse any file and makes a best effort to recognize its content. Of course, this is limited by the number of known formats. For example, an MP3 file may be structured in any number of different ways (using the type names from the type library and the pipe symbol to denote sequential parts):

  • MPEG.1.Stream
  • MPEG.1.Stream | ID3.v1.Tag
  • MPEG.1.Stream | APE.Tag
  • MPEG.1.Stream | APE.Tag | ID3.v1.Tag
  • ID3.v2_Tag | MPEG.1.Stream
  • ID3.v2_Tag | MPEG.1.Stream | ID3.v1.Tag
  • ID3.v2_Tag | MPEG.1.Stream | APE.Tag
  • ID3.v2_Tag | MPEG.1.Stream | APE.Tag | ID3.v1.Tag
  • ...

At the moment the generalinspector has a predefined and hardcoded list of possible parts in a file. It is the intention of further development to ease this restriction and become more generic.

The generalinspector has two powerful options to help investigate a file:

--types=<type>,...

With this option, you can force the generalinspector to interpret the file as a specific sequence of types as taken from the type library.

This is the option to choose when the generalinspector fails to read a file that is supposed to be supported. In most cases, this happens because support for the format is not complete and your file contains something that is not yet supported. In that case, the generalinspector fails to parse the file using the expected format, falls back to other formats and fails there as well. In the end, no format seems to be the right one and the generalinspector finishes without meaningful output.

However, if you use the --types=<type>,... option and pin the generalinspector to the type you know the file to be, parsing will go as far as it can and then stop, hopefully providing some hint as to why parsing cannot continue.
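The difference between guessing and a forced type sequence can be sketched in Python as follows. This is a toy model and not the generalinspector's implementation: `parse_forced`, `PARSERS` and `parse_id3v1_tag` are hypothetical names, and only the type name ID3.v1.Tag is taken from the list above.

```python
def parse_id3v1_tag(data: bytes, offset: int) -> int:
    """Return the offset just past an ID3v1 tag, or raise ValueError."""
    if data[offset:offset + 3] != b"TAG":
        raise ValueError("expected 'TAG' identifier")
    if len(data) - offset < 128:
        raise ValueError("fewer than 128 bytes available")
    return offset + 128


# Hypothetical registry that maps type names to parser functions.
PARSERS = {"ID3.v1.Tag": parse_id3v1_tag}


def parse_forced(data: bytes, type_names):
    """Parse `data` as exactly the given sequence of types.

    Returns (parsed_type_names, hint). Parsing stops at the first failure and
    the hint says where and why; the hint is None if everything matched.
    """
    offset = 0
    parsed = []
    for type_name in type_names:
        try:
            offset = PARSERS[type_name](data, offset)
        except ValueError as error:
            return parsed, f"{type_name} failed at byte {offset}: {error}"
        parsed.append(type_name)
    if offset != len(data):
        return parsed, f"{len(data) - offset} trailing bytes not accounted for"
    return parsed, None
```

Because no fallback to other types takes place, the caller gets the partial result together with the reason parsing stopped, instead of no output at all.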

--query=<query>

With this option, you select a certain part of the output and only display that. This allows you to drill down to the level of individual data pieces and tags, query for the existence of fields or tags and select fields based on their properties.

id3inspector

The id3inspector can be used to display only the ID3 parts of a file and skip all other audio data. For this to work, the ID3v2 tag must be at the beginning of the file and the ID3v1 tag must be at the end, as per specification. Output can be restricted to one ID3 version (--id3v1-only and --id3v2-only).
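The two fixed positions can be checked with very little code. The following Python sketch is based on the ID3 specifications rather than on the id3inspector's source; `locate_id3` and `synchsafe_to_int` are names chosen for this example.

```python
def synchsafe_to_int(four_bytes: bytes) -> int:
    """Decode a synchsafe integer (7 significant bits per byte)."""
    value = 0
    for byte in four_bytes:
        value = (value << 7) | (byte & 0x7F)
    return value


def locate_id3(data: bytes):
    """Return (id3v2_length, id3v1_offset); an entry is None if that tag is absent."""
    id3v2_length = None
    if len(data) >= 10 and data[:3] == b"ID3":
        # 10 header bytes plus the synchsafe size stored in bytes 6..9.
        id3v2_length = 10 + synchsafe_to_int(data[6:10])
        # ID3v2.4 may append a 10 byte footer, signalled by flag bit 0x10.
        if data[3] == 4 and data[5] & 0x10:
            id3v2_length += 10
    id3v1_offset = None
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        id3v1_offset = len(data) - 128
    return id3v2_length, id3v1_offset
```

Everything between the end of the ID3v2 tag and the start of the ID3v1 tag is what an ID3-only inspector would skip.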

flacinspector

The flacinspector displays the FLAC stream's metadata blocks, including the Vorbis comment. All audio data is skipped by default, unless requested (--with-frames - prepare for a LONG wait).
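Walking the metadata blocks without touching the audio frames is cheap, because every block starts with a four byte header. The Python sketch below follows the FLAC format description; it is not the flacinspector's code and `list_metadata_blocks` is a name made up for this example.

```python
BLOCK_TYPES = {
    0: "STREAMINFO", 1: "PADDING", 2: "APPLICATION", 3: "SEEKTABLE",
    4: "VORBIS_COMMENT", 5: "CUESHEET", 6: "PICTURE",
}


def list_metadata_blocks(data: bytes):
    """Return ([(block_type_name, length), ...], offset_of_first_frame)."""
    if data[:4] != b"fLaC":
        raise ValueError("missing 'fLaC' stream marker")
    offset = 4
    blocks = []
    while True:
        if offset + 4 > len(data):
            raise ValueError(f"truncated block header at byte {offset}")
        first = data[offset]
        is_last = bool(first & 0x80)  # top bit: last metadata block flag
        block_type = first & 0x7F     # lower 7 bits: block type
        length = int.from_bytes(data[offset + 1:offset + 4], "big")  # 24 bits
        blocks.append((BLOCK_TYPES.get(block_type, f"type {block_type}"), length))
        offset += 4 + length
        if is_last:
            return blocks, offset
```

The returned offset is where the audio frames begin, which is the part that is skipped by default.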

mpeginspector

The mpeginspector displays the header information of all MPEG frames in an MPEG stream. By default, this program is very strict and doesn't allow any data outside of frames (i.e. tags of any kind). It can be instructed to search for all MPEG frames though (--seek).
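As an illustration of what a frame header contains, here is a Python sketch that decodes the four header bytes of an MPEG-1 Layer III frame. It covers only that one layer, follows the commonly documented bit layout and is not taken from the mpeginspector; the function name is an assumption.

```python
# Index 0 means "free format" and index 15 is invalid; both map to None here.
LAYER3_BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                        128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_mpeg1_layer3_header(four_bytes: bytes) -> dict:
    """Decode an MPEG-1 Layer III frame header or raise ValueError."""
    if len(four_bytes) != 4:
        raise ValueError("a frame header is exactly 4 bytes long")
    word = int.from_bytes(four_bytes, "big")
    if word >> 21 != 0x7FF:
        raise ValueError("frame sync not found")
    if (word >> 19) & 0x3 != 0b11:
        raise ValueError("not MPEG version 1")
    if (word >> 17) & 0x3 != 0b01:
        raise ValueError("not Layer III")
    bitrate = LAYER3_BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate is None or sample_rate is None:
        raise ValueError("free format, invalid bitrate or reserved sampling rate")
    padding = (word >> 9) & 0x1
    return {
        "has_crc": ((word >> 16) & 0x1) == 0,  # protection bit 0 means CRC present
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0x3],
        # Frame length in bytes, header included.
        "frame_length": 144 * bitrate * 1000 // sample_rate + padding,
    }
```

In a strict walk, the next header is expected exactly `frame_length` bytes after the current one, which is why tags between frames make strict parsing fail.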

apeinspector

The apeinspector searches for all APE tags in a file and displays them. All other content is skipped with a comment.
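Searching for APE tags can be sketched in Python by scanning for the APETAGEX preamble and decoding the 32 byte header or footer that follows it. This follows the APEv2 specification and is not the apeinspector's implementation; `find_ape_tag_markers` is a hypothetical name.

```python
import struct


def find_ape_tag_markers(data: bytes):
    """Return one dict per complete 32 byte APE tag header or footer found."""
    results = []
    offset = data.find(b"APETAGEX")
    while offset != -1:
        if offset + 32 <= len(data):
            # After the 8 byte preamble: version, tag size, item count, flags
            # (all little-endian 32 bit), followed by 8 reserved bytes.
            version, size, item_count, flags = struct.unpack_from("<4I", data, offset + 8)
            results.append({
                "offset": offset,
                "version": version,  # 2000 for APEv2
                "size": size,        # items plus footer, excluding the header
                "item_count": item_count,
                "is_header": bool(flags & (1 << 29)),
            })
        offset = data.find(b"APETAGEX", offset + 1)
    return results
```

A plain byte search like this can produce false positives inside audio data, so a real inspector would validate the size and the items before accepting a match.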

asfinspector

The asfinspector tries to interpret the input file as one ASF file and displays all metadata. The file might be a .wmv or a .wma. The actual video or audio data is not yet inspected.
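An ASF file starts with a header object that is identified by a GUID. The Python sketch below reads just that first object, based on the ASF specification and not on the asfinspector's code; `read_asf_header_object` is a name chosen for this example.

```python
import struct
import uuid

# GUID of the ASF Header Object as given in the ASF specification.
ASF_HEADER_OBJECT = uuid.UUID("75b22630-668e-11cf-a6d9-00aa0062ce6c")


def read_asf_header_object(data: bytes):
    """Return (object_size, number_of_header_objects) or raise ValueError."""
    if len(data) < 30:
        raise ValueError("an ASF header object is at least 30 bytes long")
    # GUIDs are stored with their first three fields in little-endian order.
    if uuid.UUID(bytes_le=data[:16]) != ASF_HEADER_OBJECT:
        raise ValueError("file does not start with the ASF header object GUID")
    # 64 bit object size, then 32 bit count of contained header objects.
    object_size, object_count = struct.unpack_from("<QI", data, 16)
    return object_size, object_count
```

The contained header objects follow the same pattern of GUID plus 64 bit size, which is what makes a generic walk over the metadata possible.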

riffinspector

The riffinspector tries to interpret the input file as one RIFF chunk with all content and displays all metadata. Typically, a RIFF file has the extension .avi or .wav. The actual video or audio data is not yet inspected.
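The chunk structure of a RIFF file is simple enough to walk in a few lines. This Python sketch lists the top-level chunks inside the outer RIFF chunk; it is an illustration based on the RIFF format and not the riffinspector's code, and `list_riff_chunks` is a made-up name.

```python
import struct


def list_riff_chunks(data: bytes):
    """Return (form_type, [(chunk_id, chunk_size), ...]) for one RIFF chunk."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("data does not start with a RIFF chunk")
    (riff_size,) = struct.unpack_from("<I", data, 4)  # size of everything after byte 8
    form_type = data[8:12].decode("ascii", "replace")
    end = min(8 + riff_size, len(data))
    chunks = []
    offset = 12
    while offset + 8 <= end:
        chunk_id = data[offset:offset + 4].decode("ascii", "replace")
        (chunk_size,) = struct.unpack_from("<I", data, offset + 4)
        chunks.append((chunk_id, chunk_size))
        # Chunk data is padded to an even number of bytes.
        offset += 8 + chunk_size + (chunk_size & 1)
    return form_type, chunks
```

For a .wav file, the form type is WAVE and the list typically contains a "fmt " chunk and a "data" chunk.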

vorbisinspector

The vorbisinspector tries to interpret the input file as one Ogg Stream and displays the file's structure, as well as any Vorbis comments contained therein. It does not interpret audio data.
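The file structure in question is a sequence of Ogg pages. The following Python sketch lists the pages of a stream without decoding the packets inside; it follows the Ogg page layout, does not verify the page checksum and is not the vorbisinspector's code. The function name `list_ogg_pages` is an assumption.

```python
import struct

PAGE_HEADER = struct.Struct("<4sBBqIIIB")  # 27 bytes, followed by the segment table


def list_ogg_pages(data: bytes):
    """Return a list of dicts describing each Ogg page in `data`."""
    pages = []
    offset = 0
    while offset + PAGE_HEADER.size <= len(data):
        (magic, version, header_type, granule_position, serial_number,
         sequence_number, checksum, segment_count) = PAGE_HEADER.unpack_from(data, offset)
        if magic != b"OggS":
            raise ValueError(f"no 'OggS' capture pattern at byte {offset}")
        table_start = offset + PAGE_HEADER.size
        segment_table = data[table_start:table_start + segment_count]
        if len(segment_table) != segment_count:
            raise ValueError(f"truncated segment table at byte {table_start}")
        payload_length = sum(segment_table)  # each entry is one segment's length
        pages.append({
            "offset": offset,
            "serial_number": serial_number,
            "sequence_number": sequence_number,
            "begins_stream": bool(header_type & 0x02),
            "ends_stream": bool(header_type & 0x04),
            "payload_length": payload_length,
        })
        offset = table_start + segment_count + payload_length
    return pages
```

In a Vorbis stream, the comment header is one of the first packets carried by these pages, so the comments can be reached without interpreting any audio data.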

Technical

Getting the code

Simply do:

git clone https://github.com/punytroll/inspection.git

Building

TL;DR

meson setup build
meson compile -C build
meson test -C build
meson compile check -C build

For more information, please consult the file documentation/BUILDING.md.
