
Handle URLs with anchors #8

Open
jhass opened this issue Sep 23, 2014 · 3 comments

jhass commented Sep 23, 2014

Anchors (URI fragments) are a valid part of a URI but shouldn't be included in the request.

url = "http://www.cnet.com/news/pi-top-the-3d-printable-raspberry-pi-laptop-anyone-can-build/#ftag=CAD590a51e"

require 'uri'
require 'open-uri'
require 'nokogiri'

p Nokogiri::HTML(open URI.parse(url), &:read).css('title').text #=> "Pi-Top: The 3D-printable Raspberry Pi laptop anyone can build - CNET"

require 'opengraph_parser'
p OpenGraph.new(url).title #=> "Page Not Found (404) - CNET"
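
One possible workaround on the caller's side, as a minimal sketch: drop the fragment before handing the URL to the library. The helper name `request_url` is hypothetical and not part of opengraph_parser; it only uses the standard `uri` library.

```ruby
require 'uri'

# Hypothetical helper (not part of opengraph_parser): returns the URL
# that should actually be requested, i.e. the input without its fragment.
def request_url(url)
  uri = URI.parse(url)
  uri.fragment = nil # the fragment is resolved client-side and is not sent to the server
  uri.to_s
end

url = "http://www.cnet.com/news/pi-top-the-3d-printable-raspberry-pi-laptop-anyone-can-build/#ftag=CAD590a51e"
puts request_url(url)
```

Passing the result to `OpenGraph.new` instead of the raw URL should avoid the fragment reaching the request at all, though that is untested against the gem itself.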

jhass commented Nov 25, 2014

To pinpoint the issue: the URI.escape call at https://github.com/huyha85/opengraph_parser/blob/master/lib/redirect_follower.rb#L21 is causing it. No commit introduces it with an explanation of why it is necessary; it was already there in f602681.

@huyha85 would you mind explaining why it's there?
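
A rough illustration of why escaping would break the request, assuming the escaping step percent-encodes `#` as `%23` (the example.com URLs are made up for the sketch): once the `#` is encoded, the parser no longer sees a fragment, and the former anchor becomes part of the path that is sent to the server.

```ruby
require 'uri'

original = URI.parse("http://example.com/article/#section")
# Assumption: this is what the URL looks like after "#" has been percent-encoded.
escaped = URI.parse("http://example.com/article/%23section")

# request_uri is the path (plus query) that an HTTP client puts on the request line.
puts original.request_uri     # the fragment is kept out of the request
puts original.fragment.inspect
puts escaped.request_uri      # the former anchor is now part of the path
puts escaped.fragment.inspect
```

That would explain the 404: the server is asked for a page whose path ends in the encoded anchor, and no such page exists.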

julien51 commented

We have the same issue... Seems like an easy fix?


jhass commented Oct 29, 2015

I ended up writing my own gem due to issues with this one.
