Linkedin-scraper is a gem for scraping public Linkedin profiles. Given the URL of the profile, it gets the name, country, title, area, current companies, past companies, organizations, skills, groups, etc
Install the gem from RubyGems:
gem install linkedin-scraper
Include the gem
require 'linkedin-scraper'
Initialize a scraper instance
profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08")
With a http web-proxy:
profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08", { proxy_ip: '127.0.0.1', proxy_port: '3128', username: 'user', password: 'pass' })
The scraper can also get the details of each past and current companies. This will lead to multiple hits.
To enable this functionality, pass company_details=true
in options. You can pass them along with proxy options
as well
profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08", { company_details: true })
profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08", { company_details: true, proxy_ip: '127.0.0.1', proxy_port: '3128', username: 'user', password: 'pass' })
It's also possible to set number of seconds to wait for the connection to open.
Scraper will throw exception after given timeout. E. g. to set two seconds long timeout
pass open_timeout: 2
in options. Defaults to 60 seconds.
profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08", {open_timeout: 2 })
The returning object responds to the following methods
profile.first_name # The first name of the contact
profile.last_name # The last name of the contact
profile.name # The full name of the profile
profile.title # The job title
profile.summary # The summary of the profile
profile.location # The location of the contact
profile.country # The country of the contact
profile.industry # The domain for which the contact belongs
profile.picture # The profile picture link of profile
profile.skills # Array of skills of the profile
profile.organizations # Array organizations of the profile
profile.education # Array of hashes for education
profile.websites # Array of websites
profile.groups # Array of groups
profile.languages # Array of languages
profile.certifications # Array of certifications
profile.number_of_connections # The number of connections as a string
For current and past companies it also provides the details of the companies like company size, industry, address, etc The company details will only be scraped if you pass company_details=true. It is false by default.
profile.current_companies
[
[0] {
:title => "CEO",
:company => "LinkedIn",
:company_logo => "https://media.licdn.com/media/AAEAAQAAAAAAAAL0AAAAJGMwYWZhNTYxLWJkMTktNDAzMi05NzEzLTlhNzUxMGU0NDg0Mw.png",
:duration => "7 years 6 months",
:start_date => #<Date: 2008-12-01 ((2454802j,0s,0n),+0s,2299161j)>,
:end_date => "Present",
:linkedin_company_url => "https://www.linkedin.com/company/linkedin",
:website => "http://www.linkedin.com",
:description => "The future is all about what you do next and we’re excited to help you get there. Ready for your moonshot? You're closer than you think. \r\n\r\nFounded in 2003, LinkedIn connects the world's professionals to make them more productive and successful. With more than 430 million members worldwide, including executives from every Fortune 500 company, LinkedIn is the world's largest professional network on the Internet. The company has a diversified business model with revenue coming from Talent Solutions, Marketing Solutions and Premium Subscriptions products. Headquartered in Silicon Valley, LinkedIn has offices across the globe.",
:company_size => "5001-10,000 employees",
:type => "Public Company",
:industry => "Internet",
:founded => 2003,
:address => "2029 Stierlin Court Mountain View, CA 94043 United States",
:street1 => "2029 Stierlin Court",
:street2 => "",
:city => "Mountain View",
:zip => "94043",
:state => "CA",
:country => "United States"
}
]
[1] {
:current_company => "Intuit",
:current_title => "Member, Board of Directors",
:current_company_url => "http://network.intuit.com/",
:description => nil,
:linkedin_company_url => "http://www.linkedin.com/company/intuit?trk=ppro_cprof",
:url => "http://network.intuit.com/",
:type => "Public Company",
.
.
.
profile.past_companies
# Same as current companies
profile.recommended_visitors
# It is the list of visitors "Viewers of this profile also viewed..."
[
[0] {
:link => "http://www.linkedin.com/in/barackobama?trk=pub-pbmap",
:name => "Barack Obama",
:title => "President of the United States of ",
:company => nil
},
[1] {
:link => "http://www.linkedin.com/in/marissamayer?trk=pub-pbmap",
:name => "Marissa Mayer",
:title => "Yahoo!, President & CEO",
:company => nil
},
[2] {
:link => "http://www.linkedin.com/pub/sean-parker/0/1/826?trk=pub-pbmap",
:name => "Sean Parker",
:title => nil,
:company => nil
},
[3] {
:link => "http://www.linkedin.com/pub/eduardo-saverin/0/70a/31b?trk=pub-pbmap",
:name => "Eduardo Saverin",
:title => nil,
:company => nil
},
[4] {
:link => "http://www.linkedin.com/in/rbranson?trk=pub-pbmap",
:name => "Richard Branson",
:title => "Founder",
:company => "Virgin Group"
},
[5] {
:link => "http://www.linkedin.com/in/reidhoffman?trk=pub-pbmap",
:name => "Reid Hoffman",
:title => "Entrepreneur. Product Strategist. ",
:company => nil
},
[6] {
:link => "http://www.linkedin.com/in/mdell?trk=pub-pbmap",
:name => "Michael Dell",
:title => "Chairman and CEO",
:company => "Dell"
},
[7] {
:link => "http://www.linkedin.com/in/mittromney?trk=pub-pbmap",
:name => "Mitt Romney",
:title => "Believe in America",
:company => nil
},
[8] {
:link => "http://www.linkedin.com/pub/sheryl-sandberg/2/665/512?trk=pub-pbmap",
:name => "Sheryl Sandberg",
:title => nil,
:company => nil
}
]
The gem also comes with a binary and can be used from the command line to get a json response of the scraped data. It takes the url as the first argument. If the last argument is true it will fetch the company details for each company.
linkedin-scraper http://www.linkedin.com/in/jeffweiner08 127.0.0.1 3128 username password
linkedin-scraper http://www.linkedin.com/in/jeffweiner08 127.0.0.1 3128 username password true
Bug reports and pull requests are welcome on GitHub at https://github.com/yatish27/linkedin-scraper. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
The gem is available as open source under the terms of the MIT License.