GitHub - rayl/headergraphs: Perl/graphviz analysis of kernel header files

Repository files:

  Gather
  AUTHORS
  Analysis.pm
  Analysis2.pm
  COPYING
  Dot.pm
  Dot2.pm
  Graph.pm
  README
  Report.pm
  SOURCEME
  TODO
  graph.pl

A set of scripts to visualize header inclusion trees.


Quickstart
----------

Install Perl, Graphviz, and kghostview.

Configure your Linux kernel and run 'make prepare'.

Adjust the Gather::Linux line near the top of graph.pl.  Only
tested on x86_64 so far.

mkdir tmp

./graph.pl
  - do_it (this takes 30-120 seconds)
  - save_it

./graph.pl
  - load_it (this takes about 10 seconds)
  - x "linux/types.h"
  - x "linux/list.h"
  - x "linux/sched.h"
  - x "fs/dcache.c"



Concepts
--------

tsize
  Transitive size: a measure of how many header files are included,
  directly or indirectly, from a given top-level root file.

unique tsize
  The number of different files included from a given root file.

total tsize
  The number of files that would be included from a given root file
  if include guards were not used.  This number is useful when
  evaluating header file partitioning schemes which involve splitting
  up a header file.  Doing that will tend to slightly increase the
  unique tsize while drastically lowering the total tsize.
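
The difference between the two counts can be sketched on a toy include
graph.  This is an illustration in Python, not the project's Perl code.
The file names, the INCLUDES table, and both function names are
hypothetical, and it assumes the root itself is not counted and that
the graph has no inclusion cycles.

```python
# Hypothetical include graph: each file maps to the files it directly includes.
INCLUDES = {
    "a.c": ["x.h", "y.h"],
    "x.h": ["types.h"],
    "y.h": ["types.h", "x.h"],
    "types.h": [],
}

def unique_tsize(root, includes):
    """Count the distinct files reachable from root (root not counted)."""
    seen = set()
    stack = list(includes.get(root, []))
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        stack.extend(includes.get(f, []))
    return len(seen)

def total_tsize(root, includes):
    """Count every inclusion as if include guards did not exist.

    Assumes the graph is acyclic; a cycle would recurse without end.
    """
    memo = {}

    def walk(f):
        if f not in memo:
            # Each direct include counts once, plus everything it drags in.
            memo[f] = sum(1 + walk(child) for child in includes.get(f, []))
        return memo[f]

    return walk(root)

print(unique_tsize("a.c", INCLUDES), total_tsize("a.c", INCLUDES))
```

For this graph the unique tsize of a.c is 3 (x.h, y.h, types.h), while
the total tsize is 6, because x.h and types.h are each reached along
more than one path.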

backbone
  A contiguous subgraph starting from the root and composed of the
  nodes with "large" unique tsizes, for some arbitrary value of
  "large".  Child nodes below this size are considered "regular
  headers" and are not part of the backbone.  The threshold is
  currently hardcoded to a value useful for the Linux kernel headers.
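
One way to read the backbone definition is as a walk from the root that
only follows children whose unique tsize reaches the threshold.  The
Python sketch below is an illustration under that reading, not the
project's implementation.  The graph, the function names, and the
threshold value 2 are all hypothetical, and it assumes the root always
belongs to the backbone.

```python
# Hypothetical include graph: each file maps to the files it directly includes.
INCLUDES = {
    "root.c": ["big.h", "small.h"],
    "big.h": ["mid.h", "t1.h", "t2.h"],
    "mid.h": ["t1.h", "t2.h"],
    "small.h": ["t1.h"],
    "t1.h": [],
    "t2.h": [],
}

def unique_tsize(root, includes):
    """Count the distinct files reachable from root (root not counted)."""
    seen = set()
    stack = list(includes.get(root, []))
    while stack:
        f = stack.pop()
        if f not in seen:
            seen.add(f)
            stack.extend(includes.get(f, []))
    return len(seen)

def backbone(root, includes, large):
    """Return the root plus every file connected to it through nodes
    whose unique tsize is at least `large`."""
    result = {root}
    stack = [root]
    while stack:
        f = stack.pop()
        for child in includes.get(f, []):
            if child not in result and unique_tsize(child, includes) >= large:
                result.add(child)
                stack.append(child)
    return result

print(sorted(backbone("root.c", INCLUDES, large=2)))
```

With a threshold of 2, the backbone here is root.c, big.h, and mid.h;
small.h and the two leaf headers stay "regular headers".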

 
Description
-----------

Nodes in the graph represent files, edges represent inclusions.


Node Shapes:

To help relax the graph and clean up the layout, some child edges
may be snipped.  The node shapes and labels indicate where and when
this has occurred.

  The house-shaped pentagon is the root of the inclusion tree.

  Ellipses are normal header files.

  Boxes are header files which have had one or more links to child
  nodes snipped in order to relax the graph.

  Circles mark the places where children of the root node were
  snipped, which happens when the root has more than 3 children.
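
The shape rules map directly onto Graphviz shape names (house, ellipse,
box, and circle are all standard Graphviz shapes).  The Python sketch
below is illustrative only.  The flag names, the helper functions, and
the precedence between the rules are assumptions, not taken from Dot.pm.

```python
def node_shape(is_root=False, children_snipped=False, root_snip_point=False):
    """Pick a Graphviz shape name from the rules above.

    The order of the checks is an assumption; the README does not say
    which rule wins when more than one applies.
    """
    if is_root:
        return "house"
    if root_snip_point:
        return "circle"
    if children_snipped:
        return "box"
    return "ellipse"

def dot_node(name, **flags):
    """Format one DOT node statement for the given file name."""
    return f'"{name}" [shape={node_shape(**flags)}];'

print(dot_node("fs/dcache.c", is_root=True))
print(dot_node("linux/sched.h", children_snipped=True))
print(dot_node("linux/types.h"))
```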


Node Colors:

Different colors are used to flag interesting nodes.

  Orange represents the root node of the graph.

  Blue represents the "backbone" of the inclusion hierarchy.  This is
  the set of files with the largest unique tsizes. The theory is that
  these files represent the "most important" concepts used by the
  root node.

  Yellow represents a popular inclusion target with small unique tsize.

  Red represents a popular inclusion target with a large unique tsize.

  Pale green represents normal header files.

The more saturated a red, yellow or blue colored object is, the larger
the unique tsize.

Conceptually, colors are applied in the following order:

  - The entire graph is painted green.
  - Backbone nodes are then painted blue.
  - Cutpoint targets (the popular inclusion targets) are painted yellow.
  - Yellow nodes with very large unique tsizes are repainted red.
  - Finally, the root node is painted orange.
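
The painting order above can be sketched as a series of overwrites, so
that later steps win over earlier ones.  This Python version is an
illustration, not the project's code.  The function name, the input
sets, the sample tsizes, and the very_large threshold are hypothetical,
and it ignores the saturation scaling.

```python
def paint(nodes, root, backbone, cutpoint_targets, unique_tsize, very_large):
    """Assign one color name per node, applying the steps in order."""
    color = {n: "palegreen" for n in nodes}      # everything starts green
    for n in backbone:
        color[n] = "blue"                        # backbone nodes
    for n in cutpoint_targets:
        color[n] = "yellow"                      # popular inclusion targets
    for n in nodes:
        if color[n] == "yellow" and unique_tsize[n] >= very_large:
            color[n] = "red"                     # popular and very large
    color[root] = "orange"                       # the root always wins
    return color

# Hypothetical inputs for a five-node graph.
sizes = {"root.c": 50, "big.h": 30, "pop.h": 2, "huge.h": 40, "leaf.h": 0}
colors = paint(
    nodes=list(sizes),
    root="root.c",
    backbone={"root.c", "big.h", "huge.h"},
    cutpoint_targets={"pop.h", "huge.h"},
    unique_tsize=sizes,
    very_large=25,
)
print(colors)
```

Here huge.h is first blue (backbone), then yellow (cutpoint target),
and finally red, while root.c ends up orange even though it is also in
the backbone.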


Node Labels:

  All nodes are labelled with the file name. The unique and total
  tsizes are placed underneath the name and separated by a dash.

  Nodes which have had some incoming edges snipped have a third number
  indicating the total number of parents for that file, including any
  incoming edges that weren't actually snipped.

  Nodes with outgoing edges snipped also include a list of the children
  that have been detached.  The information for each child (name, unique
  tsize, total tsize, and parent count) is placed on a single line per
  child.
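
The label layout can be sketched as a small formatting function.  This
Python version is an illustration only.  The function name, the exact
spacing around the dashes, and the sample numbers are assumptions; the
real label text produced by the scripts may differ.

```python
def node_label(name, unique, total, parents=None, detached=()):
    """Build a multi-line node label.

    parents  -- total parent count, given only when incoming edges
                were snipped
    detached -- (name, unique, total, parents) tuples, one per child
                whose edge was snipped
    """
    sizes = f"{unique} - {total}"
    if parents is not None:
        sizes += f" - {parents}"
    lines = [name, sizes]
    for child, c_unique, c_total, c_parents in detached:
        lines.append(f"{child} {c_unique} - {c_total} - {c_parents}")
    return "\n".join(lines)

# Hypothetical numbers, for illustration only.
print(node_label("linux/list.h", 5, 12))
print(node_label("linux/sched.h", 80, 900, parents=4,
                 detached=[("linux/types.h", 3, 3, 20)]))
```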