Can Nokogiri use a SAX parser to parse an HTML fragment?
Posted
by .yahoo.co.jpaqwsykcj3aulh3h1k0cy6nzs3isj
on Stack Overflow
Published on 2010-03-16T05:18:54Z
I have this code.
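require 'logger'
require 'nokogiri'

LOG = Logger.new($stdout)

# SAX handler that logs every event it receives.
class MyParser < Nokogiri::XML::SAX::Document
  def characters(string)
    LOG.debug("characters #{string}")
  end

  def start_element(name, attrs = [])
    LOG.debug("start_element #{name}")
  end

  def end_element(name)
    LOG.debug("end_element #{name}")
  end
end

# The input file is given as the first command-line argument.
parser = Nokogiri::HTML::SAX::Parser.new(MyParser.new)
parser.parse(File.new(ARGV[0], 'rb'))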
Run on an HTML fragment like this,
<h1>Hello</h1>
<p>Hi.</p>
the output shows that only the first element is processed:
start_element h1
characters Hello
end_element h1
If I wrap the fragment in <html> and <body> tags, the whole input is parsed.
Is there a way to use a SAX style parser on HTML fragments?