A set of Hadoop, Spark and Storm based tools for web and customer analytics

nattachai305/visitante

 
 

Introduction

The original goal of visitante was to calculate various web analytics metrics, as defined by Avinash Kaushik (http://www.kaushik.net/avinash/), on the Hadoop, Spark and Storm platforms. However, it has evolved into a general purpose log analytics and mining solution that goes beyond web server logs.

It also includes a customer (or marketing) analytics solution. Since customer behavior data is mostly captured in logs, there is a close relationship between customer analytics and log analytics.

Philosophy

  • Simple, easy-to-use batch and real-time web analytics
  • Highly configurable

Blogs

The following blog posts of mine are a good source of details on visitante

Solutions

  • Hadoop-based batch analytics for

    • Number of pages visited
    • Total time spent
    • Last page visited
    • Flow status (e.g., whether checkout flow was entered, entered but not completed or completed)
    • Incident detection
    • Pattern based event detection with context
    • Customer lifetime value
  • Storm-based real-time analytics for

    • Bounce rate
    • Visit depth distribution
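
To make a few of the metrics listed above concrete, here is a minimal Python sketch of how they could be derived from click records. It is only an illustration and not visitante's actual implementation, which runs as Hadoop and Storm jobs. The record layout (session id, timestamp, page), the checkout page markers and all function names are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical click records: (session_id, epoch_seconds, page).
# This layout is assumed for illustration; it is not visitante's input format.
CLICKS = [
    ("s1", 100, "/home"),
    ("s1", 130, "/cart"),
    ("s1", 190, "/checkout/start"),
    ("s1", 250, "/checkout/done"),
    ("s2", 105, "/home"),
    ("s3", 200, "/home"),
    ("s3", 260, "/checkout/start"),
]

# Hypothetical pages that mark entering and completing the checkout flow.
FLOW_ENTRY = "/checkout/start"
FLOW_EXIT = "/checkout/done"


def session_metrics(clicks):
    """Group clicks by session and derive per-session metrics."""
    by_session = defaultdict(list)
    for session_id, ts, page in clicks:
        by_session[session_id].append((ts, page))

    metrics = {}
    for session_id, hits in by_session.items():
        hits.sort()  # order the session's clicks by timestamp
        pages = [page for _, page in hits]
        entered = FLOW_ENTRY in pages
        if entered and FLOW_EXIT in pages:
            flow = "completed"
        elif entered:
            flow = "entered_not_completed"
        else:
            flow = "not_entered"
        metrics[session_id] = {
            "num_pages": len(pages),
            # Time between first and last click; time on the last page is unknown.
            "time_spent": hits[-1][0] - hits[0][0],
            "last_page": pages[-1],
            "flow_status": flow,
        }
    return metrics


def bounce_rate(metrics):
    """Fraction of sessions that viewed exactly one page."""
    if not metrics:
        return 0.0
    bounces = sum(1 for m in metrics.values() if m["num_pages"] == 1)
    return bounces / len(metrics)


def visit_depth_distribution(metrics):
    """Map each visit depth (pages per session) to the number of sessions."""
    return Counter(m["num_pages"] for m in metrics.values())


if __name__ == "__main__":
    per_session = session_metrics(CLICKS)
    for session_id, m in sorted(per_session.items()):
        print(session_id, m)
    print("bounce rate:", bounce_rate(per_session))
    print("visit depth:", dict(visit_depth_distribution(per_session)))
```

In this sketch a bounce is a session with a single page view, and time spent ignores the time on the final page because no later click bounds it. Both are common simplifications and may differ from how visitante defines these metrics.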

Build

For Hadoop 1

  • mvn clean install

For Hadoop 2 (non-YARN)

  • git checkout nuovo
  • mvn clean install

For Hadoop 2 (YARN)

  • git checkout nuovo
  • mvn clean install -P yarn

For Spark

  • Build chombo first in master branch with
    • mvn clean install
    • sbt publishLocal
  • Build chombo-spark in chombo/spark directory
    • sbt clean package

Need help?

Please feel free to email me at [email protected]

Contribution

Contributors are welcome. Please email me at [email protected]

Languages

  • Java 79.6%
  • Python 6.6%
  • Ruby 5.5%
  • Scala 4.7%
  • Shell 3.6%