language | Most common emojis |
---|---|
c | 🎉 🎪 📝 🎊 👾 💄 🍭 🐛 |
c# | 🍭 🎊 👾 🎉 🎪 🎨 🚧 🐛 |
c++ | 📝 🎉 🐛 👾 💄 🍭 🎪 🎊 |
clojure | 😋 🔧 😒 🙀 🐛 ⬆️ 🍯 📝 🚿 🎨 |
coffeescript | ⬆️ 📝 🎨 🐛 💄 🔥 🆕 ✅ 🎁 💚 |
css | 📝 🎉 💃 🐛 🏀 🎨 ✨ 🎪 👾 |
go | ⬆️ 🐛 🎨 📝 💄 🎉 🔥 🚀 ✨ 🚧 |
html | 🎨 👾 🎉 💥 🙈 🚀 🎪 🍭 |
java | 🐛 🎉 👾 🎊 🎪 ⚡ 💄 🍭 |
javascript | ⬆️ 🐛 🎨 📝 🎉 🆕 🔧 ✨ 💄 🔥 |
makefile | 🔫 📝 💄 🎉 📖 🐛 🎪 🚮 🎈 |
objective-c | 🎨 🔥 🐟 📝 ⭕ 🎉 🐠 🐛 |
perl | 🎨 😄 🔧 🐸 🍰 🌈 🐜 🐛 💩 🌱 |
php | 🐛 🎉 📝 👾 🆙 🎊 🎪 🍭 |
python | ⬆️ 🎉 💄 🐛 👾 🚧 🎊 🆕 💪 |
ruby | 💎 🔥 🐛 💄 📝 🎉 ❤️ :nodoc: ✨ ✂️ |
shell | ⬆️ 📝 🔧 ✨ 🎉 🐛 🎨 💄 🔥 📦 |
swift | 🎨 ➕ 🔥 📝 🐛 😊 🎉 ⬆️ 📝 💄 |
typescript | 🌹 📝 ✨ 💄 🐛 ⬆️ 🔧 🔥 ❤️ 💚 |
viml | 📝 ✨ 🎨 🐛 🔥 🍺 ⚡ 😉 💥 🐎 |
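
Run the following query in BigQuery:

    SELECT
      LANGUAGE,
      repo,
      commit,
      message,
      -- Each space is doubled (assumed intent of this REPLACE) so that adjacent
      -- codes such as ":bug: :art:" are all matched: the pattern consumes the
      -- whitespace around every match.
      REGEXP_EXTRACT_ALL(
        REPLACE(message, ' ', '  '),
        r'(?:\s|^)(\:[A-Za-z_]+\:)(?:\s|$)') AS emoji
    FROM
      `bigquery-public-data.github_repos.commits` a
    JOIN
      `fh-bigquery.github_extracts.ght_project_languages` b
    ON
      a.repo_name[OFFSET(0)] = b.repo
    WHERE
      REGEXP_CONTAINS(message, r'(?:\s|^)(\:[A-Za-z_]+\:)(?:\s|$)')
      -- Filter on the language share recorded in ght_project_languages.
      AND b.percent > 0.5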
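
To see what the pattern picks up, the extraction step can be reproduced outside BigQuery. The following is a minimal Python sketch, not part of the original pipeline. It reuses the query's pattern and assumes the `REPLACE` step is meant to double each space so that adjacent codes are not lost. The names `EMOJI_CODE` and `extract_emoji_codes` are illustrative only.

```python
import re
from typing import List

# Same pattern as in the query: a :code: delimited by whitespace
# or by the start/end of the message.
EMOJI_CODE = re.compile(r'(?:\s|^)(:[A-Za-z_]+:)(?:\s|$)')


def extract_emoji_codes(message: str) -> List[str]:
    """Return the emoji codes found in a commit message.

    Each space is doubled first (assumed intent of the query's REPLACE):
    the pattern consumes the whitespace after a match, so the second code
    in ":bug: :art:" would otherwise be skipped.
    """
    return EMOJI_CODE.findall(message.replace(' ', '  '))


if __name__ == "__main__":
    print(extract_emoji_codes(":bug: fix crash"))
    print(extract_emoji_codes(":bug: :art: tidy up"))
    # Without doubling the spaces, only the first code is found.
    print(EMOJI_CODE.findall(":bug: :art: tidy up"))
```

Because the pattern accepts any colon-delimited word made of letters and underscores, non-emoji tokens such as `:nodoc:` are extracted as well, which is presumably why one shows up in the ruby row of the table.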
The table produced by this query is available here. A second query then unnests the emoji arrays into one row per language and emoji:

    SELECT
      LANGUAGE,
      emoji
    FROM
      `in-full-gear.Dataset1.commits_with_emojis`,
      UNNEST(emoji) AS emoji
The result is available in `emoji.csv`.
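
The flattening done by `UNNEST` can be pictured with a small Python sketch. It is an illustration under assumptions, not the export actually used: the helper names `unnest_rows` and `to_csv` are hypothetical, and the lowercase `language,emoji` header is assumed from the column names the R script expects.

```python
import csv
import io


def unnest_rows(rows):
    """Yield one (language, emoji) pair per array element.

    Like the comma join with UNNEST, rows whose emoji array is empty
    produce no output.
    """
    for language, emojis in rows:
        for emoji in emojis:
            yield language, emoji


def to_csv(pairs) -> str:
    """Serialize the pairs as CSV text with a header row (assumed layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["language", "emoji"])
    writer.writerows(pairs)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [("python", [":bug:", ":tada:"]), ("go", [":art:"])]
    print(to_csv(unnest_rows(sample)))
```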
Run the following R script to obtain the summary:

    library("dplyr")

    # Read the flattened (language, emoji) pairs exported from BigQuery.
    emoji <- read.csv(file = "emoji.csv", header = TRUE)

    # Keep the 20 languages with the most emoji occurrences.
    top_lang <- emoji %>%
      group_by(language) %>% summarize(count = n()) %>%
      top_n(n = 20, wt = count) %>%
      ungroup() %>% select(language)

    # For each of those languages, list its 10 most frequent emojis.
    stat <- emoji %>% merge(top_lang) %>%
      group_by(language, emoji) %>% summarize(count = n()) %>% ungroup() %>%
      arrange(desc(count)) %>% group_by(language) %>% slice(1:10) %>% ungroup() %>%
      group_by(language) %>% summarize(top_emojis = paste(emoji, collapse = " "))

    write.csv(stat, row.names = FALSE, file = "stat.csv")
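
For readers without R, the same summary can be approximated in Python. This is a rough sketch rather than a port of the script: the function name `top_emojis` and its parameters are hypothetical, and unlike `top_n`, `Counter.most_common` does not keep extra entries that tie at the cutoff, so results may differ slightly when counts are equal.

```python
from collections import Counter


def top_emojis(pairs, n_languages=20, n_emojis=10):
    """Map each of the busiest languages to its most frequent emoji codes.

    `pairs` is an iterable of (language, emoji) tuples, one per occurrence.
    """
    pairs = list(pairs)

    # Languages ranked by their total number of emoji occurrences.
    per_language = Counter(language for language, _ in pairs)
    keep = {language for language, _ in per_language.most_common(n_languages)}

    # Count each emoji within the languages that were kept.
    counts = {}
    for language, emoji in pairs:
        if language in keep:
            counts.setdefault(language, Counter())[emoji] += 1

    # Join the top emoji codes with spaces, as paste(collapse = " ") does.
    return {
        language: " ".join(code for code, _ in counter.most_common(n_emojis))
        for language, counter in sorted(counts.items())
    }


if __name__ == "__main__":
    sample = (
        [("python", ":bug:")] * 3
        + [("python", ":tada:")] * 2
        + [("go", ":art:")] * 2
        + [("go", ":bug:")]
    )
    for language, codes in top_emojis(sample).items():
        print(language, "|", codes)
```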