sAbakumoff/gh-emoji

Emoji Analysis of Github Commit Messages

| Language | Most common emojis |
| --- | --- |
| c | :octocat: 🎉 🎪 📝 🎊 👾 💄 🍭 :neckbeard: 🐛 |
| c# | :octocat: :neckbeard: 🍭 🎊 👾 🎉 🎪 🎨 🚧 🐛 |
| c++ | 📝 🎉 🐛 👾 :octocat: :neckbeard: 💄 🍭 🎪 🎊 |
| clojure | 😋 🔧 😒 🙀 🐛 ⬆️ 🍯 📝 🚿 🎨 |
| coffeescript | ⬆️ 📝 🎨 🐛 💄 🔥 🆕 ✅ 🎁 💚 |
| css | 📝 🎉 💃 🐛 🏀 🎨 ✨ :octocat: 🎪 👾 |
| go | ⬆️ 🐛 🎨 📝 💄 🎉 🔥 🚀 ✨ 🚧 |
| html | 🎨 :octocat: 👾 🎉 💥 :neckbeard: 🙈 🚀 🎪 🍭 |
| java | 🐛 :octocat: 🎉 👾 🎊 🎪 ⚡ 💄 :neckbeard: 🍭 |
| javascript | ⬆️ 🐛 🎨 📝 🎉 🆕 🔧 ✨ 💄 🔥 |
| makefile | 🔫 📝 💄 :neckbeard: 🎉 📖 🐛 🎪 🚮 🎈 |
| objective-c | 🎨 🔥 🐟 📝 ⭕ :shipit: 🎉 🐠 🐛 :trollface: |
| perl | 🎨 😄 🔧 🐸 🍰 🌈 🐜 🐛 💩 🌱 |
| php | 🐛 🎉 📝 :octocat: 👾 :neckbeard: 🆙 🎊 🎪 🍭 |
| python | ⬆️ 🎉 💄 🐛 👾 🚧 :octocat: 🎊 🆕 💪 |
| ruby | 💎 🔥 🐛 💄 📝 🎉 ❤️ :nodoc: ✨ ✂️ |
| shell | ⬆️ 📝 🔧 ✨ 🎉 🐛 🎨 💄 🔥 📦 |
| swift | 🎨 ➕ 🔥 📝 🐛 😊 🎉 ⬆️ 📝 💄 |
| typescript | 🌹 📝 ✨ 💄 🐛 ⬆️ 🔧 🔥 ❤️ 💚 |
| viml | 📝 ✨ 🎨 🐛 🔥 🍺 ⚡ 😉 💥 🐎 |

Steps to reproduce

Find commit messages that include emojis

Run the following query in BigQuery:

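-- Collect commits whose message contains at least one :shortcode: emoji,
-- together with the dominant language of the repository.
SELECT
  LANGUAGE,
  repo,
  commit,
  message,
  -- Each match consumes its surrounding whitespace, so spaces are doubled
  -- first; otherwise the second of two adjacent shortcodes would be skipped.
  REGEXP_EXTRACT_ALL(REPLACE(message, ' ', '  '), r'(?:\s|^)(\:[A-Za-z_]+\:)(?:\s|$)') emoji
FROM
  `bigquery-public-data.github_repos.commits` a
JOIN
  `fh-bigquery.github_extracts.ght_project_languages` b
ON
  -- repo_name is an array; join on its first element
  a.repo_name[OFFSET(0)] = b.repo
WHERE
  REGEXP_CONTAINS(message, r'(?:\s|^)(\:[A-Za-z_]+\:)(?:\s|$)')
  AND b.percent > 0.5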

The resulting data table is available here.
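
The space-doubling step in the query is easy to overlook, so here is a minimal Python sketch of the same extraction. It assumes Python's `re` module behaves like BigQuery's RE2 engine for this particular pattern (one difference: `\s` in Python also matches Unicode whitespace). The function name `extract_emoji` and the sample message are made up for illustration.

```python
import re
from typing import List

# Same pattern as in the query: a :shortcode: delimited by whitespace
# or by the start/end of the message.
EMOJI_RE = re.compile(r'(?:\s|^)(\:[A-Za-z_]+\:)(?:\s|$)')


def extract_emoji(message: str) -> List[str]:
    """Mimic REGEXP_EXTRACT_ALL(REPLACE(message, ' ', '  '), pattern)."""
    # Doubling each space leaves one space for the trailing delimiter of a
    # match and one for the leading delimiter of the next match.
    return EMOJI_RE.findall(message.replace(' ', '  '))


# Hypothetical commit message with two adjacent shortcodes.
message = ":bug: :memo: fix typo in docs"

naive = EMOJI_RE.findall(message)   # no space doubling
doubled = extract_emoji(message)    # with space doubling

print(naive)
print(doubled)
```

Without the doubling, the single space after `:bug:` is consumed by the first match, so `:memo:` has no leading delimiter left and is missed. Note also that the character class `[A-Za-z_]` excludes shortcodes containing digits or symbols, such as `:+1:`.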

Select language and emoji columns for further analysis

SELECT
  LANGUAGE,
  emoji
FROM
  `in-full-gear.Dataset1.commits_with_emojis`,
  UNNEST(emoji) AS emoji

The result is available in emoji.csv
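
As a rough illustration of what the `UNNEST` join does, the following Python sketch flattens rows holding an array of emojis into one `(language, emoji)` pair per array element. The sample rows are invented for the example and are not taken from the dataset.

```python
# Hypothetical rows shaped like the commits_with_emojis table:
# one row per commit, with an array of extracted shortcodes.
rows = [
    {"language": "go", "emoji": [":arrow_up:", ":bug:"]},
    {"language": "ruby", "emoji": [":gem:"]},
    {"language": "c", "emoji": []},
]

# The comma join with UNNEST acts like a cross join between each row and
# its own array: one output row per element, and rows with an empty array
# produce no output at all.
pairs = [(row["language"], e) for row in rows for e in row["emoji"]]

for language, emoji in pairs:
    print(language, emoji)
```

A commit with several emojis therefore contributes several rows to emoji.csv, which is why the later counts are emoji occurrences rather than commits.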

Run an R script to obtain the summary information

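library("dplyr")
# Load the (language, emoji) pairs exported from BigQuery
emoji <- read.csv(file="emoji.csv", header=TRUE)
# Keep the 20 languages with the most emoji occurrences
top_lang <- emoji %>%
  group_by(language) %>% summarize(count=n()) %>% top_n(n=20, wt=count) %>%
  ungroup() %>% select(language)
# For each of those languages, count every emoji, keep the 10 most frequent,
# and collapse them into one space-separated string
stat <- emoji %>% merge(top_lang) %>%
  group_by(language, emoji) %>% summarize(count=n()) %>% ungroup() %>%
  arrange(desc(count)) %>% group_by(language) %>% slice(1:10) %>% ungroup() %>%
  group_by(language) %>% summarize(top_emojis=paste(emoji, collapse=" "))
write.csv(stat, row.names=FALSE, file="stat.csv")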
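
For readers who do not use R, the same aggregation can be sketched in Python with only the standard library. This is an approximation of the script above, not a port: the function name `top_emojis`, its parameters, and the sample pairs are hypothetical, and ties are handled differently (`top_n` keeps all tied languages, while `Counter.most_common` cuts off at exactly `n_languages`).

```python
from collections import Counter


def top_emojis(pairs, n_languages=20, n_emojis=10):
    """Return {language: "emoji emoji ..."} for the most emoji-heavy languages."""
    # Step 1: rank languages by total emoji occurrences.
    lang_counts = Counter(lang for lang, _ in pairs)
    top_langs = [lang for lang, _ in lang_counts.most_common(n_languages)]

    # Step 2: count each (language, emoji) combination.
    pair_counts = Counter(pairs)

    # Step 3: per language, sort emojis by descending count and keep the top few.
    result = {}
    for lang in sorted(top_langs):
        ranked = sorted(
            ((e, c) for (l, e), c in pair_counts.items() if l == lang),
            key=lambda item: -item[1],
        )
        result[lang] = " ".join(e for e, _ in ranked[:n_emojis])
    return result


# Hypothetical input, shaped like the rows of emoji.csv.
pairs = [
    ("go", ":bug:"), ("go", ":bug:"), ("go", ":art:"),
    ("ruby", ":gem:"),
]
summary = top_emojis(pairs)
print(summary)
```

In practice the `pairs` list would be read from emoji.csv, for example with `csv.DictReader`, before being passed to the function.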
