nikodemin/spark-social-network

Social network feed analyzing app

This is a sample app that demonstrates Spark SQL by analyzing social network feed data.

Data format

This project contains a feeds_show.json file with the following schema:

root
 |-- durationMs: long (nullable = true)
 |-- owners: struct (nullable = true)
 |    |-- group: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |    |-- user: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |-- platform: string (nullable = true)
 |-- position: long (nullable = true)
 |-- resources: struct (nullable = true)
 |    |-- GROUP_PHOTO: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |    |-- MOVIE: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |    |-- POST: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |    |-- USER_PHOTO: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |-- timestamp: long (nullable = true)
 |-- userId: long (nullable = true)

The input schema is described in the Main.scala object.
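
For illustration, the schema printed above could be written as a Spark SQL StructType roughly as follows. This is a minimal sketch derived only from the printed tree; the object name FeedSchema and the helper longArray are hypothetical, and the actual definition in Main.scala may differ.

```scala
import org.apache.spark.sql.types._

// Hypothetical holder object; not taken from the repository.
object FeedSchema {
  // Array of longs, matching "element: long (containsNull = true)".
  private val longArray = ArrayType(LongType, containsNull = true)

  val schema: StructType = StructType(Seq(
    StructField("durationMs", LongType, nullable = true),
    StructField("owners", StructType(Seq(
      StructField("group", longArray, nullable = true),
      StructField("user", longArray, nullable = true)
    )), nullable = true),
    StructField("platform", StringType, nullable = true),
    StructField("position", LongType, nullable = true),
    StructField("resources", StructType(Seq(
      StructField("GROUP_PHOTO", longArray, nullable = true),
      StructField("MOVIE", longArray, nullable = true),
      StructField("POST", longArray, nullable = true),
      StructField("USER_PHOTO", longArray, nullable = true)
    )), nullable = true),
    StructField("timestamp", LongType, nullable = true),
    StructField("userId", LongType, nullable = true)
  ))
}
```

With a SparkSession named spark in scope, such a schema could be passed to the reader as spark.read.schema(FeedSchema.schema).json("feeds_show.json"), which skips schema inference over the file.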

Metrics

The app collects and displays the following metrics:

  • Count of views and users grouped by platform
  • Total count of views and users
  • Daily unique authors and content
  • Views grouped by session, with per-session statistics (count, average duration, viewing depth)
  • Views grouped by user
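
A few of these metrics can be sketched with the Spark SQL DataFrame API as shown below. This is not the code from Main.scala. The object name FeedMetricsSketch and its method names are hypothetical, and the session rule is an assumption: the source does not define a session, so the sketch treats a gap of more than 30 minutes between a user's events as the start of a new session, reads timestamp as milliseconds, and takes viewing depth to be the maximum position reached.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Hypothetical helper object; names and session rule are assumptions.
object FeedMetricsSketch {
  // Assumption: more than 30 minutes of inactivity starts a new session.
  val SessionGapMs: Long = 30L * 60 * 1000

  /** Views and distinct users per platform. */
  def byPlatform(feeds: DataFrame): DataFrame =
    feeds.groupBy("platform")
      .agg(count(lit(1)).as("views"), countDistinct("userId").as("users"))

  /** Total views and distinct users over the whole data set. */
  def totals(feeds: DataFrame): DataFrame =
    feeds.agg(count(lit(1)).as("views"), countDistinct("userId").as("users"))

  /** One row per (userId, sessionId) with view count, total duration and depth. */
  def sessions(feeds: DataFrame): DataFrame = {
    val byUser = Window.partitionBy("userId").orderBy("timestamp")
    val prevTs = lag("timestamp", 1).over(byUser)
    // 1 marks the first event of a new session, 0 a continuation.
    val isNew = when(prevTs.isNull || (col("timestamp") - prevTs) > SessionGapMs, 1).otherwise(0)
    feeds
      .withColumn("isNew", isNew)
      // Running sum of the markers yields a per-user session number.
      .withColumn("sessionId", sum("isNew").over(byUser))
      .groupBy("userId", "sessionId")
      .agg(
        count(lit(1)).as("views"),
        sum("durationMs").as("durationMs"),
        max("position").as("depth"))
  }
}
```

Averages over sessions (for example the average duration) could then be taken with a further agg(avg("durationMs")) on the result of sessions.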

How to run

You can run the app in either of the following ways:

  • Run the Main.scala object from IntelliJ IDEA or Eclipse
  • Execute sbt run
