DeepHawkes: Popularity Prediction of Information Cascades

This repository is an implementation of the DeepHawkes model proposed in the following paper:

Qi Cao, Huawei Shen, Keting Cen, Wentao Ouyang, Xueqi Cheng. 2017. DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades. In Proceedings of CIKM'17, Singapore, November 6-10, 2017, 11 pages.

For more details, you can download the paper from the ACM Digital Library: https://dl.acm.org/citation.cfm?id=3132973&CFID=1005695721&CFTOKEN=57128415

DataSet

We publish the Sina Weibo dataset used in our paper, i.e., dataset_weibo.txt. It contains 119,313 messages posted on June 1, 2016. Each line contains the information of one message, in the following format:

```
<message_id>\tab<root_user_id>\tab<publish_time>\tab<retweet_number>\tab<retweets>

<message_id>:     the unique id of each message, ranging from 1 to 119,313.
<root_user_id>:   the unique id of the root user. User ids range from 1 to 6,738,040.
<publish_time>:   the publish time of the message, recorded as a unix timestamp.
<retweet_number>: the total number of retweets of the message within 24 hours.
<retweets>:       the retweets of the message, separated by " ". Each retweet records the
                  entire path of that retweet, in the format <user1>/<user2>/.../<user n>:<retweet_time>.
```
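The line format above can be parsed with a short Python sketch. This is not the repository's official loader; field names follow the spec above, and the handling of an empty `<retweets>` field (for messages with zero retweets) is an assumption.

```python
def parse_cascade(line):
    """Parse one line of dataset_weibo.txt into a dict (illustrative sketch).

    Fields are tab-separated; retweets are separated by spaces, and each
    retweet is a path "<user1>/<user2>/.../<userN>:<retweet_time>".
    """
    message_id, root_user_id, publish_time, retweet_number, retweets = \
        line.rstrip("\n").split("\t")
    paths = []
    for rt in retweets.split(" "):
        if not rt:  # assumed: zero-retweet messages may leave the field empty
            continue
        path, _, t = rt.rpartition(":")  # split on the last ":" only
        paths.append((path.split("/"), int(t)))
    return {
        "message_id": message_id,
        "root_user_id": root_user_id,
        "publish_time": int(publish_time),
        "retweet_number": int(retweet_number),
        "paths": paths,
    }

# Hypothetical example line (not taken from the real dataset):
sample = "1\t100\t1464710400\t2\t100/101:10 100/102/103:25"
cascade = parse_cascade(sample)
print(cascade["retweet_number"], len(cascade["paths"]))  # prints: 2 2
```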

    This dataset is for research use only. If you use it, please cite our paper as listed above.

Download link: https://pan.baidu.com/s/1c2rnvJq

password: ijp6

Steps to run DeepHawkes

1. Split the data into training, validation and test sets:

```shell
cd gen_sequence
python gen_sequence.py
# parameters and file paths can be configured in config.py
```

2. Transform the datasets into ".pkl" format:

```shell
cd deep_learning
python preprocess.py
# parameters and file paths can be configured in config.py
```

3. Train DeepHawkes:

```shell
cd deep_learning
python run_sparse.py learning_rate learning_rate_for_embeddings l2 dropout
# example: python -u run_sparse.py 0.005 0.0005 0.05 0.8
```
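Since run_sparse.py takes its four hyperparameters positionally, a grid search is easy to script. The sketch below only echoes each command as a dry run (remove the `echo` to actually launch training); the value grids are illustrative, not the settings used in the paper.

```shell
# Hypothetical grid over learning_rate and learning_rate_for_embeddings,
# with l2 and dropout held fixed.
for lr in 0.005 0.01; do
  for emb_lr in 0.0005 0.001; do
    echo python -u run_sparse.py "$lr" "$emb_lr" 0.05 0.8
  done
done
```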
