forked from alshedivat/al-folio
Commit
Added support for google scholar citations (alshedivat#2193)
Closes alshedivat#1809, but there are caveats:

1. It only works at build time, so the numbers will not update unless you build your site again.
2. Google might block the request if it receives too many of them, failing the whole process.

This is what it looks like when it can fetch the information:

![Screenshot from 2024-02-13 00-37-52](https://github.com/alshedivat/al-folio/assets/31376482/646d1f3c-1294-491b-bc13-9013e38918b4)

And this is what it looks like when it fails:

![image](https://github.com/alshedivat/al-folio/assets/31376482/516eefff-d394-44ad-8702-8982233f8c4f)

Signed-off-by: George Araujo <[email protected]>
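The two caveats can be illustrated with a minimal sketch. `CitationCache` and its `fetch` method are hypothetical names, not part of this commit. The sketch assumes the fetch is passed in as a block, so it runs without network access. It shows one lookup per article per build, and a fallback to `"N/A"` when the request raises.

```ruby
# Hypothetical helper, not part of the commit: memoize one result per
# article and fall back to "N/A" when the fetch raises.
class CitationCache
  def initialize
    @counts = {}
  end

  # Runs the caller-supplied block at most once per article_id.
  def fetch(article_id)
    return @counts[article_id] if @counts.key?(article_id)

    @counts[article_id] =
      begin
        yield(article_id).to_s
      rescue StandardError => e
        # Report the failure on stderr and keep the build going
        warn "Error fetching citation count for #{article_id}: #{e.class} - #{e.message}"
        "N/A"
      end
  end
end

cache = CitationCache.new
calls = 0
first  = cache.fetch("abc") { calls += 1; 42 }
second = cache.fetch("abc") { calls += 1; 99 } # served from the cache
failed = cache.fetch("xyz") { raise IOError, "request blocked" }
puts first, second, failed, calls
```

Because the value is stored only in memory during the build, a published page keeps showing the old count until the next build.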
1 parent aa0e9ad, commit b5b0e34
Showing 4 changed files with 88 additions and 2 deletions.
@@ -0,0 +1,78 @@
require "active_support/all"
require 'nokogiri'
require 'open-uri'

module Helpers
  extend ActiveSupport::NumberHelper
end

module Jekyll
  class GoogleScholarCitationsTag < Liquid::Tag
    Citations = { }

    def initialize(tag_name, params, tokens)
      super
      splitted = params.split(" ").map(&:strip)
      @scholar_id = splitted[0]
      @article_id = splitted[1]
    end

    def render(context)
      article_id = context[@article_id.strip]
      scholar_id = context[@scholar_id.strip]
      article_url = "https://scholar.google.com/citations?view_op=view_citation&hl=en&user=#{scholar_id}&citation_for_view=#{scholar_id}:#{article_id}"

      begin
        # If the citation count has already been fetched, return it
        if GoogleScholarCitationsTag::Citations[article_id]
          return GoogleScholarCitationsTag::Citations[article_id]
        end

        # Sleep for a random amount of time to avoid being blocked
        sleep(rand(1.5..3.5))

        # Fetch the article page
        doc = Nokogiri::HTML(URI.open(article_url, "User-Agent" => "Ruby/#{RUBY_VERSION}"))

        # Attempt to extract the "Cited by n" string from the meta tags
        citation_count = 0

        # Look for meta tags with "name" attribute set to "description"
        description_meta = doc.css('meta[name="description"]')
        og_description_meta = doc.css('meta[property="og:description"]')

        if !description_meta.empty?
          cited_by_text = description_meta[0]['content']
          matches = cited_by_text.match(/Cited by (\d+[,\d]*)/)

          if matches
            citation_count = matches[1].delete(",").to_i # drop thousands separators before converting
          end

        elsif !og_description_meta.empty?
          cited_by_text = og_description_meta[0]['content']
          matches = cited_by_text.match(/Cited by (\d+[,\d]*)/)

          if matches
            citation_count = matches[1].delete(",").to_i # drop thousands separators before converting
          end
        end

        citation_count = Helpers.number_to_human(citation_count, :format => '%n%u', :precision => 2, :units => { :thousand => 'K', :million => 'M', :billion => 'B' })

      rescue StandardError => e
        # Handle any errors that may occur during fetching
        citation_count = "N/A"

        # Print the error message including the exception class and message
        puts "Error fetching citation count for #{article_id}: #{e.class} - #{e.message}"
      end


      GoogleScholarCitationsTag::Citations[article_id] = citation_count
      return "#{citation_count}"
    end
  end
end

Liquid::Template.register_tag('google_scholar_citations', Jekyll::GoogleScholarCitationsTag)
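
The pattern `/Cited by (\d+[,\d]*)/` can capture a count with thousands separators, such as `1,234`. Ruby's `String#to_i` stops at the first non-digit character, so the commas have to be removed before converting. A minimal sketch follows, where `parse_cited_by` is a hypothetical helper that is not part of the commit. Whether Google Scholar actually emits separators in the description is an assumption here.

```ruby
# Hypothetical helper, not part of the commit: extract the integer from a
# "Cited by N" description string, tolerating thousands separators.
CITED_BY = /Cited by (\d+[,\d]*)/

def parse_cited_by(text)
  match = CITED_BY.match(text.to_s)
  return nil unless match

  # Remove commas first; "1,234".to_i on its own would stop at the comma
  match[1].delete(",").to_i
end

p parse_cited_by("Some title. Cited by 1,234. Some venue, 2020") # prints 1234
p parse_cited_by("No citation information here")                 # prints nil
p "1,234".to_i                                                   # prints 1
```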