Skip to content

Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding

Notifications You must be signed in to change notification settings

shuaiOKshuai/ProxEmbed

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

This directory contains the source code of ProxEmbed model.
========================================================================================================
@inproceedings{LiuZZZCWY17,
 author = {Liu, Zemin and Zheng, Vincent W. and Zhao, Zhou and Zhu, Fanwei and Chang, Kevin Chen-Chuan and Wu, Minghui and Ying, Jing},
 title = {Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding},
 booktitle = {Proc. of the 31st AAAI Conference on Artificial Intelligence},
 series = {AAAI '17},
 year = {2017}
} 

Please cite the above reference for using our code.

For inqueries, please contact:
Zemin Liu ([email protected])
Vincent Zheng ([email protected])
========================================================================================================

This experiment consists of two parts, symmetric and asymmetric. 
The symmetric part is used to compute facebook and linkedin dataset, whose relation is symmetric.
While the asymmetric part is used to compute dblp dataset, whose relation is asymmetric.

In each part, such as this symmetric directory, there are two directories, "java - prepare data for model" and "python - model".
The "java - prepare data for model" directory is used to prepare input data for ProxEmbed model, while "python - model" directory is used to generate and assess ProxEmbed model.

In "java - prepare data for model", 
Main.java is the entry class. And the program will read parameters from file javaParams.properties.

In "python - model", the entry file is experimentForOneFileByParams.py. This file provides us a method to get parameters from file pythonParamsConfig, and then trains the model and then tests it.
For instance, in file pythonParamsConfig, if we set 
	root_dir = ./toy_data/,
	dataset_name = linkedin,
	suffix = 4,
	class_name = school,
	index = 1,
then this model will use ./toy_data/linkedin.splits/train.4/train_school_1 as the training data, 
	and will use ./toy_data/linkedin.splits/test/test_school_1 as test data,
	and will use ./toy_data/linkedin.splits/ideal/ideal_school_1 as ideal data (ground-truth).
	
So at last, this model then will output the results of NDCG and MAP.

The "java - prepare data for model" will read data from main folder of the dataset (which contains graph.node and graph.edge), such as "linkedin" folder, and then output intermediate dataset into the same folder.
Then "python - model" will take these data as one part of the input.

About

Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 61.9%
  • Java 38.1%