[BUG] Segmentation fault #203

Closed
lls opened this issue Dec 29, 2009 · 10 comments

@lls

lls commented Dec 29, 2009

Running this code crashes with a segmentation fault:

require 'rubygems'
require 'nokogiri'
document = Nokogiri::HTML.parse("  <div style=\"overflow: hidden; float: left; margin: 2px 0;\">\n    <a href=\"http://aaaa.aaaa.aaaa/aaaaa/aaaaa_aaaaa/aaaaaa/aaaaaa/aaaaaaaaaaaa/aaaaaa/aaaaaaaaaa/view\">\n      <div class=\"aaaa-aaaaa\" title=\"000_0000.jpg\" style=\"border: 1px solid #666; height: 62px; width: 62px; overflow: hidden; background-color: white;\"><a href=\"http://aaaa.aaaaa.aaa/aaaa/aaaaa_aaaaa/aaaa/aaaa/aaaaaaaaa/item/aaaaaaaaaaaa/view\"><img src=\"http://a.aaaa.aaa/aa/aaaaaa/aaaaaaaaaa/tn/aaaaaaa/name/000_0000.jpg\" style=\"border: none;\" alt=\"000_0000.jpg\"/></a></div>\n    <div class=\"aaaa-aaaa-aaaa\" style=\"clear: both; font-size: smaller; height: 12px; overflow: hidden; text-align: center; width: 72px;\"> <a style=\"text-decoration: none;\" href=\"http://aaaa.aaaa/aaaa/aaaaa_aaaaaa/aaaa/aaaa/aaaaaaaaaaaa/item/aaaaaaaaaaaaa/view\" title=\"000_0000.jpg\">000_0000.jpg</a></div>\n  </div>\n\t      \t       \t   ")
document.search('//a[@href]').each do |link|
  p link
  title = link.content
  puts "*************"
  p title
  link.content = title.nil? ? "" : title.strip
end

To reproduce, run it in irb:

$ irb
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require 'nokogiri'
=> true
irb(main):003:0> document = Nokogiri::HTML.parse("  <div style=\"overflow: hidden; float: left; margin: 2px 0;\">\n    <a href=\"http://aaaa.aaaa.aaaa/aaaaa/aaaaa_aaaaa/aaaaaa/aaaaaa/aaaaaaaaaaaa/aaaaaa/aaaaaaaaaa/view\">\n      <div class=\"aaaa-aaaaa\" title=\"000_0000.jpg\" style=\"border: 1px solid #666; height: 62px; width: 62px; overflow: hidden; background-color: white;\"><a href=\"http://aaaa.aaaaa.aaa/aaaa/aaaaa_aaaaa/aaaa/aaaa/aaaaaaaaa/item/aaaaaaaaaaaa/view\"><img src=\"http://a.aaaa.aaa/aa/aaaaaa/aaaaaaaaaa/tn/aaaaaaa/name/000_0000.jpg\" style=\"border: none;\" alt=\"000_0000.jpg\"/></a></div>\n    <div class=\"aaaa-aaaa-aaaa\" style=\"clear: both; font-size: smaller; height: 12px; overflow: hidden; text-align: center; width: 72px;\"> <a style=\"text-decoration: none;\" href=\"http://aaaa.aaaa/aaaa/aaaaa_aaaaaa/aaaa/aaaa/aaaaaaaaaaaa/item/aaaaaaaaaaaaa/view\" title=\"000_0000.jpg\">000_0000.jpg</a></div>\n  </div>\n\t      \t       \t   ")
=> #<Nokogiri::HTML::Document:0x3fdbeb4d1848 name="document" children=[#<Nokogiri::XML::DTD:0x3fdbeb4d1514 name="html">, #<Nokogiri::XML::Element:0x3fdbeb4d14c4 name="html" children=[#<Nokogiri::XML::Element:0x3fdbeb4d0b50 name="body" children=[#<Nokogiri::XML::Element:0x3fdbeb4d0858 name="div" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4d077c name="style" value="overflow: hidden; float: left; margin: 2px 0;">] children=[#<Nokogiri::XML::Text:0x3fdbeb4d0330 "\n    ">, #<Nokogiri::XML::Element:0x3fdbeb4d02e0 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4cff84 name="href" value="http://aaaa.aaaa.aaaa/aaaaa/aaaaa_aaaaa/aaaaaa/aaaaaa/aaaaaaaaaaaa/aaaaaa/aaaaaaaaaa/view">] children=[#<Nokogiri::XML::Text:0x3fdbeb4cf520 "\n      ">, #<Nokogiri::XML::Element:0x3fdbeb4cf480 name="div" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4cef30 name="class" value="aaaa-aaaaa">, #<Nokogiri::XML::Attr:0x3fdbeb4cedf0 name="title" value="000_0000.jpg">, #<Nokogiri::XML::Attr:0x3fdbeb4ced64 name="style" value="border: 1px solid #666; height: 62px; width: 62px; overflow: hidden; background-color: white;">] children=[#<Nokogiri::XML::Element:0x3fdbeb4c9d64 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c94f4 name="href" value="http://aaaa.aaaaa.aaa/aaaa/aaaaa_aaaaa/aaaa/aaaa/aaaaaaaaa/item/aaaaaaaaaaaa/view">] children=[#<Nokogiri::XML::Element:0x3fdbeb4c5174 name="img" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c505c name="src" value="http://a.aaaa.aaa/aa/aaaaaa/aaaaaaaaaa/tn/aaaaaaa/name/000_0000.jpg">, #<Nokogiri::XML::Attr:0x3fdbeb4c5020 name="style" value="border: none;">, #<Nokogiri::XML::Attr:0x3fdbeb4c4ff8 name="alt" value="000_0000.jpg">]>]>]>, #<Nokogiri::XML::Text:0x3fdbeb4c44cc "\n    ">, #<Nokogiri::XML::Element:0x3fdbeb4c4468 name="div" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c429c name="class" value="aaaa-aaaa-aaaa">, #<Nokogiri::XML::Attr:0x3fdbeb4c4288 name="style" value="clear: both; font-size: smaller; height: 12px; overflow: hidden; text-align: center; width: 
72px;">] children=[#<Nokogiri::XML::Text:0x3fdbeb4c3b6c " ">, #<Nokogiri::XML::Element:0x3fdbeb4c3b08 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c3978 name="style" value="text-decoration: none;">, #<Nokogiri::XML::Attr:0x3fdbeb4c3964 name="href" value="http://aaaa.aaaa/aaaa/aaaaa_aaaaaa/aaaa/aaaa/aaaaaaaaaaaa/item/aaaaaaaaaaaaa/view">, #<Nokogiri::XML::Attr:0x3fdbeb4c3950 name="title" value="000_0000.jpg">] children=[#<Nokogiri::XML::Text:0x3fdbeb4c3004 "000_0000.jpg">]>]>, #<Nokogiri::XML::Text:0x3fdbeb4c2dc0 "\n  ">]>]>]>]>]>
irb(main):004:0> document.search('//a[@href]').each do |link|
irb(main):005:1*       puts "-------------"
irb(main):006:1>       p link
irb(main):007:1>       title = link.content
irb(main):008:1>       puts "*************"
irb(main):009:1>       p title
irb(main):010:1>       link.content = title.nil? ? "":title.strip
irb(main):011:1>     end
-------------
#<Nokogiri::XML::Element:0x3fdbeb4d02e0 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4cff84 name="href" value="http://aaaa.aaaa.aaaa/aaaaa/aaaaa_aaaaa/aaaaaa/aaaaaa/aaaaaaaaaaaa/aaaaaa/aaaaaaaaaa/view">] children=[#<Nokogiri::XML::Text:0x3fdbeb4cf520 "\n      ">, #<Nokogiri::XML::Element:0x3fdbeb4cf480 name="div" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4cef30 name="class" value="aaaa-aaaaa">, #<Nokogiri::XML::Attr:0x3fdbeb4cedf0 name="title" value="000_0000.jpg">, #<Nokogiri::XML::Attr:0x3fdbeb4ced64 name="style" value="border: 1px solid #666; height: 62px; width: 62px; overflow: hidden; background-color: white;">] children=[#<Nokogiri::XML::Element:0x3fdbeb4c9d64 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c94f4 name="href" value="http://aaaa.aaaaa.aaa/aaaa/aaaaa_aaaaa/aaaa/aaaa/aaaaaaaaa/item/aaaaaaaaaaaa/view">] children=[#<Nokogiri::XML::Element:0x3fdbeb4c5174 name="img" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c505c name="src" value="http://a.aaaa.aaa/aa/aaaaaa/aaaaaaaaaa/tn/aaaaaaa/name/000_0000.jpg">, #<Nokogiri::XML::Attr:0x3fdbeb4c5020 name="style" value="border: none;">, #<Nokogiri::XML::Attr:0x3fdbeb4c4ff8 name="alt" value="000_0000.jpg">]>]>]>, #<Nokogiri::XML::Text:0x3fdbeb4c44cc "\n    ">, #<Nokogiri::XML::Element:0x3fdbeb4c4468 name="div" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c429c name="class" value="aaaa-aaaa-aaaa">, #<Nokogiri::XML::Attr:0x3fdbeb4c4288 name="style" value="clear: both; font-size: smaller; height: 12px; overflow: hidden; text-align: center; width: 72px;">] children=[#<Nokogiri::XML::Text:0x3fdbeb4c3b6c " ">, #<Nokogiri::XML::Element:0x3fdbeb4c3b08 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fdbeb4c3978 name="style" value="text-decoration: none;">, #<Nokogiri::XML::Attr:0x3fdbeb4c3964 name="href" value="http://aaaa.aaaa/aaaa/aaaaa_aaaaaa/aaaa/aaaa/aaaaaaaaaaaa/item/aaaaaaaaaaaaa/view">, #<Nokogiri::XML::Attr:0x3fdbeb4c3950 name="title" value="000_0000.jpg">] children=[#<Nokogiri::XML::Text:0x3fdbeb4c3004 
"000_0000.jpg">]>]>, #<Nokogiri::XML::Text:0x3fdbeb4c2dc0 "\n  ">]>
*************
"\n      \n     000_0000.jpg\n  "
-------------
#<Nokogiri::XML::Text:0x3fdbeb27da74 "000_0000.jpg">
*************
"000_0000.jpg"
-------------
(irb):6: [BUG] Segmentation fault
ruby 1.8.7 (2008-08-11 patchlevel 72) [x86_64-linux]

Aborted

$
@lls
Author

lls commented Dec 29, 2009

I'm using:

  • nokogiri 1.4.1
  • ruby 1.8.7
  • linux kernel 2.6.28-17

@tenderlove
Member

Thanks for the report! Can you also provide the output of:

$ nokogiri -v

@flavorjones
Member

I've reproduced. Investigating.

@flavorjones
Member

Here is a much simpler example that causes the crash:

document = Nokogiri::HTML <<-EOH
<div id='foo'>
  <div id='bar'></div>
</div>
EOH

divs = document.css("div")
divs.first.content = "hello"
divs.last.content  = "crash!"

@flavorjones
Member

or even:

...
divs = document.css("div")
divs.first.content = "hello"
puts divs.last.to_html

@tenderlove
Member

Ugh. I see the problem. Dude, this sucks.

@flavorjones
Member

Fixed by unlinking children within Node#content=, which prevents memory from being freed out from under living Ruby objects. Closed by 982a43f.
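
To make the difference between freeing and unlinking concrete, here is a toy model in plain Ruby. It is not Nokogiri's or libxml2's code: `ToyNode`, `free!`, `set_content_freeing`, `set_content_unlinking`, and `build_tree` are hypothetical names invented for this sketch. It assumes a native tree whose nodes can be released while a Ruby-side reference still points at them, and it raises an exception where a real stale pointer would segfault.

```ruby
# Toy model only: ToyNode stands in for a native (C-level) node that a Ruby
# wrapper may still reference. All names here are hypothetical.
class ToyNode
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name)
    @name = name
    @children = []
    @parent = nil
    @freed = false
  end

  def add_child(node)
    node.parent = self
    @children << node
    node
  end

  def freed?
    @freed
  end

  # Simulates releasing native memory for this node and its whole subtree.
  def free!
    @children.each(&:free!)
    @children.clear
    @freed = true
  end

  # Simulates a read through a wrapper. A real stale pointer would crash
  # the process instead of raising.
  def to_s
    raise "use after free: #{@name}" if @freed
    "<#{@name}>"
  end

  # Buggy strategy: old children are freed even if references still exist.
  def set_content_freeing(text)
    @children.each(&:free!)
    @children = [ToyNode.new("text:#{text}").tap { |t| t.parent = self }]
  end

  # Safer strategy: old children are only detached from the tree, so
  # existing references stay valid.
  def set_content_unlinking(text)
    @children.each { |child| child.parent = nil }
    @children = [ToyNode.new("text:#{text}").tap { |t| t.parent = self }]
  end
end

# Mirrors the nested-div reproduction: an outer node with one inner child.
def build_tree
  outer = ToyNode.new("div#foo")
  inner = outer.add_child(ToyNode.new("div#bar"))
  [outer, inner]
end

outer, inner = build_tree
outer.set_content_freeing("hello")
begin
  inner.to_s
rescue RuntimeError => e
  puts e.message # the toy equivalent of the segfault
end

outer, inner = build_tree
outer.set_content_unlinking("hello")
puts inner.to_s # the detached node is still safe to read
```

In the unlinking variant the inner node simply ends up without a parent. Who eventually releases it is outside the scope of this sketch.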

@edgarjs

edgarjs commented Jan 9, 2010

Sorry, has this fix been released in any version? I'm getting this error when trying to parse the response from http://gdata.youtube.com/feeds/api/videos?q=ruby

The crash is reported at:

Gems/1.8/gems/nokogiri-1.4.1/lib/nokogiri/xml/sax/parser.rb:108: [BUG] Segmentation fault
ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin10.2.0]

@tenderlove
Member

This particular bug won't be released until 1.4.2. You can grab one of our nightly builds and try that though. But judging by your error message, you're experiencing a different bug. Would you mind opening a new ticket with steps to reproduce your error? Make sure to include a script that crashes along with the output of nokogiri -v.

@edgarjs

edgarjs commented Jan 10, 2010

OK, I'll try a nightly build and open a new ticket if that doesn't work. Thanks.

This issue was closed.