
This is the repository for the Gender Differences in AI Scholar


causalNLP/ai-scholar-gender


AI Scholar Gender

Overview

This is the (private) repo for AI Scholar (gender project).

File Structure

.
├── README.md
├── code
│   ├── basic_scholar_profiles_1.ipynb
│   ├── basic_scholar_profiles_2.ipynb
│   ├── citation_by_academic_age.ipynb
│   ├── domain_analysis.ipynb
│   ├── gs_scholar_analysis.ipynb
│   ├── organization_clustering.ipynb
│   ├── paper_centric_analysis.ipynb
│   ├── prepare_gs_feature_data.ipynb
│   ├── stylish_title_detector.ipynb
│   └── time_series_clustering.ipynb
└── data
    ├── AIScholars78k_samp1000.csv
    └── Papers100k_samp1000.csv

code

  • prepare_gs_feature_data.ipynb: Obtain necessary data features for later analysis.
  • basic_scholar_profiles_1.ipynb and basic_scholar_profiles_2.ipynb: Analyze basic features in GS scholar profiles.
  • gs_scholar_analysis.ipynb: Analyze GS scholar features from different perspectives.
  • paper_centric_analysis.ipynb: Perform the paper-centric analysis corresponding to the section of the same name in the paper.
  • citation_by_academic_age.ipynb: Direct analysis of GS scholars' academic age time series.
  • time_series_clustering.ipynb: Clustering analysis of GS scholars' academic age time series.
  • organization_clustering.ipynb: Clustering analysis of GS scholars' organizations.
  • domain_analysis.ipynb: Analyze GS scholars' domain tags.
  • stylish_title_detector.ipynb: Implementation of the stylish title detector, with sample stylish titles.

data

  • AIScholars78k_samp1000.csv: 1000 samples from the 78k AI scholar dataset. The full dataset can be accessed via the Google Drive link.
  • Papers100k_samp1000.csv: 1000 samples from the 100k paper dataset. The full dataset can be accessed via the Google Drive link.
  • Further statistics on the full data are described in the GitHub repo causalNLP/AI-Scholar.
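Before opening the notebooks, it can help to check which columns a sample file actually contains. The snippet below is a minimal sketch using only the Python standard library. The helper name `preview_csv` is hypothetical and not part of this repo, and the relative path assumes the script is run from the repository root. No column names are assumed; the code only reports what it finds in the file.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=5):
    """Return (column_names, first n_rows rows as dicts) for a CSV file.

    Hypothetical helper for illustration; not part of the repository.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Read at most n_rows data rows; every value is kept as a string.
        rows = list(islice(reader, n_rows))
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    # Assumed location, following the file structure listed above.
    sample = Path("data/AIScholars78k_samp1000.csv")
    if sample.exists():
        columns, rows = preview_csv(sample)
        print("columns:", columns)
        for row in rows:
            print(row)
    else:
        print(f"{sample} not found; run this from the repository root.")
```

The same helper works unchanged on Papers100k_samp1000.csv or on the full datasets once they are downloaded.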

Alternatively, you can download the full datasets from the command line with the following commands:

pip install gdown
cd <path_to_store_data>
gdown "https://drive.google.com/uc?id=1sfNLH549c0IMp-hojnpmskBftsW5jB7a" # full AI scholar dataset (78k)
gdown "https://drive.google.com/uc?id=16cmOlJ-8--7vqIXY-hP0JXtRwqaPoOfh" # full paper dataset (100k)
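The same download can also be run from inside a notebook through gdown's Python interface. The following is a rough sketch: the names `drive_url`, `FILE_IDS`, and `download_all` are hypothetical, and the mapping of each file ID to a dataset follows the order of the commands above, so treat it as an assumption. The `gdown` import sits inside the function so the URL helper can be used without the package installed.

```python
# File IDs copied from the Google Drive URLs above; the labels are assumptions
# based on the order in which the two links are listed.
FILE_IDS = {
    "AI scholar dataset": "1sfNLH549c0IMp-hojnpmskBftsW5jB7a",
    "paper dataset": "16cmOlJ-8--7vqIXY-hP0JXtRwqaPoOfh",
}


def drive_url(file_id):
    """Build the direct-download URL form used in the commands above."""
    return f"https://drive.google.com/uc?id={file_id}"


def download_all():
    """Download every file in FILE_IDS into the current working directory."""
    import gdown  # requires `pip install gdown`

    for label, file_id in FILE_IDS.items():
        print(f"Downloading {label} ...")
        # With no output argument, gdown keeps the file name stored on Drive.
        gdown.download(drive_url(file_id), quiet=False)


# Uncomment to start the download (needs network access):
# download_all()
```

Files are written to the current working directory, so change into the target data folder first, as in the shell version.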
