Releases: eXascaleInfolab/StaTIX

Fast Type Inference

09 Apr 14:46
d4bbed4
  • Imported the updated DAOC library, whose clustering is orders of magnitude faster while producing exactly the same clusters
  • Description and evaluation results updated

Links Cutting On Preprocessing

19 Jan 17:12
  • Optional link cutting (similarity matrix reduction) implemented at graph construction to reduce memory consumption (note that it affects the accuracy)
  • Link reduction policies (on clustering) refined
  • Network serialization refined to account for the filtering
  • Cluster labels output added
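
As a rough illustration of what reducing a similarity matrix can look like, the sketch below keeps, for one row of the matrix, only the links whose weight reaches a given fraction of the row maximum. The class name, method name and the `ratio` rule are hypothetical and chosen for illustration only; the actual cutting policy used by StaTIX is not described in these notes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the StaTIX implementation.
final class LinkCutSketch {
    // Keeps only the positive links of a similarity row whose weight is at
    // least `ratio` times the largest weight in that row (assumed rule).
    static Map<Integer, Double> cutRow(double[] row, double ratio) {
        double max = 0;
        for (double w : row) {
            max = Math.max(max, w);
        }
        Map<Integer, Double> kept = new LinkedHashMap<>();
        for (int j = 0; j < row.length; j++) {
            if (row[j] > 0 && row[j] >= ratio * max) {
                kept.put(j, row[j]); // column index -> retained weight
            }
        }
        return kept;
    }
}
```

Storing only the retained links saves memory, but the discarded weak links no longer contribute to the clustering, which is why such a reduction can affect the accuracy.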

The executable is built on Linux Ubuntu 16.04 x64, Java OpenJDK 1.8

Optional Features Added

20 Dec 22:04
  • Links reduction policy parameterized
  • Optional weighting of the input instances besides their relations
  • Similarity function parameterized

The executable is built on Ubuntu x64 16.04, Java OpenJDK 1.8

Representative Types

14 Dec 22:05
  • Synced with the updated version of the DAOC clustering library, which provides refined identification of the representative clusters (inferred types)
  • Added an option to output the clustering input network without performing the type inference itself
  • Fixed the filtered-out ids in the output .rcg network (clustering input)

The libdaoc executable is built on Ubuntu 16.04 x64; the StaTIX jar is built with OpenJDK Java 1.8 x64.

Property Occurrences per Type

16 Oct 20:23
  • Property occurrences are now evaluated more accurately: the number of occurrences of each property in each type is counted over all instances, instead of the binary presence of properties in the types
  • Brief hints strategy slightly refined (early termination dropped)
  • N-Triples format parsing refined: comments are handled, malformed files are parsed more reliably
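
The difference between counting occurrences and recording binary presence can be sketched as follows. This is a minimal illustration under assumed data structures (each instance as an array of property names, with repeats allowed); the class and method names are hypothetical and do not come from the StaTIX sources.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the StaTIX implementation.
final class OccurrenceSketch {
    // Counts how many times each property occurs across all instances of one
    // type. A binary-presence scheme would instead record each property once.
    static Map<String, Integer> countOccurrences(String[][] instances) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] properties : instances) {
            for (String property : properties) {
                counts.merge(property, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

For two instances with properties `{name, name, age}` and `{name}`, the counts are 3 for `name` and 1 for `age`, whereas binary presence would treat both properties identically.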

The build is made using Java OpenJDK 1.8 x64 on Linux Ubuntu 16.04 x64 with the default GCC (important for the linked native libdaoc clustering library).

Brief Hints Supervision

16 Oct 20:16
  • Brief Hints lightweight semi-supervision implemented
  • Property weights estimated more accurately for the [semi-]supervised type inference

Property Weights Normalization Refined

28 Sep 00:53
  • Property weights normalization refined => accuracy improved for both non-supervised and semi-supervised clustering
  • Refined multi-level output with outliers filtering and custom output step
  • Build fixed for Java 1.8 (previously it worked only on Java 1.9)
  • Various minor optimizations performed (lower memory consumption, higher speed)

Built on Ubuntu 16.04 x64, Java 1.8

Note: --brief-hints is not implemented in this release; only the stub API is present.

Benchmarked Release

04 Sep 01:05
  • Execution options extended (input links reduction policies added, multi-level output of the representative clusters, etc.)
  • Some bugs fixed (NaN values could occur in the similarity matrix in the semi-supervised mode when some properties of the input dataset were not present in the prelabeled one)
  • Updated DAOC library linked
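
NaN values in a similarity matrix typically come from a 0/0 division, for example when a vector of property weights is all zeros. The sketch below shows a generic guard for a cosine similarity. It is an illustration only: the class and method names are hypothetical, and neither the similarity function nor the actual fix applied in StaTIX is specified in these notes.

```java
// Hypothetical sketch, not the StaTIX implementation.
final class SimilaritySketch {
    // Cosine similarity that returns 0 instead of NaN when either vector has
    // zero norm (e.g. all of its property weights are zero).
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0 ? 0 : dot / denominator;
    }
}
```

Without the guard, `cosine(new double[]{0, 0}, new double[]{1, 2})` would evaluate 0.0 / 0.0, which is NaN in Java floating-point arithmetic.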

Benchmark results

A build made on Linux Ubuntu 16.04 x64 with Java 1.9 is attached.

Initial Release of the Statistical Type Inference

17 Aug 11:04

Performs both non-supervised (fully automatic) and semi-supervised (using a hinting sample dataset) statistical type inference for an RDF dataset (N3 format). The output contains the sequential ids of the unique subjects grouped by #type (clusters) in the .cnl format (space-separated lists of member ids).
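
A minimal sketch of reading such clusters is given below. It assumes that each non-empty line holds one cluster as space-separated member ids and that lines starting with `#` are comments; both points are assumptions, since the notes above only state that the member ids are space separated. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader sketch under the stated assumptions about .cnl files.
final class CnlSketch {
    // Parses text into clusters, each being a list of member ids kept as strings.
    static List<List<String>> parseClusters(String text) {
        List<List<String>> clusters = new ArrayList<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and assumed comment lines
            }
            List<String> members = new ArrayList<>();
            for (String id : line.split("\\s+")) {
                members.add(id);
            }
            clusters.add(members);
        }
        return clusters;
    }
}
```

The ids are kept as strings so that the sketch does not depend on their exact numeric form.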

The linked native clustering library (DAOC; contact Artem for details) is built on Linux Ubuntu 16.04 x64 and might also work in the Ubuntu console of Windows 10 x64. On other platforms, only the native library needs to be substituted to run the app.