XmlSlurper/NekoHTML document fragment parsing - No HTML or BODY tags wanted

Posted by Misha Koshelev on Stack Overflow
Published on 2010-06-11T16:31:20Z

Filed under: groovy | fragment | xmlslurper | cyberneko

Dear All, I am trying to parse the following HTML fragment and get the same fragment back as output, without the HTML and BODY tags. Is this possible? If so, how?

P.S. I have been reading http://nekohtml.sourceforge.net/faq.html#fragments and I believe I have set the correct option below. However, the output still contains the HTML, HEAD and BODY wrapper tags :(

Thank you, Misha

import groovy.xml.MarkupBuilder
import groovy.xml.StreamingMarkupBuilder
import groovy.util.XmlNodePrinter
import groovy.util.slurpersupport.NodeChild


def text="""
<div><h2>Test</h2>
<div>Hi</div>
</div>
"""

// Parse
// Note: `config` is created and configured here, but it is never passed to the SAXParser below
def config=new org.cyberneko.html.HTMLConfiguration()
config.setFeature("http://cyberneko.org/html/features/balance-tags/document-fragment",true)
def html=new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(text)

// Output
def printNode(NodeChild node) {
    def writer = new StringWriter()
    writer << new StreamingMarkupBuilder().bind {
        mkp.declareNamespace('':node[0].namespaceURI())
        mkp.yield node
    }
    new XmlNodePrinter().print(new XmlParser().parseText(writer.toString()))
}
printNode(html)

Output:

<HTML>
  <tag0:HEAD xmlns:tag0="http://www.w3.org/1999/xhtml"/>
  <BODY>
    <DIV>
      <H2>
        Test
      </H2>
      <DIV>
        Hi
      </DIV>
    </DIV>
  </BODY>
</HTML>
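
One possible explanation for the output above is that the `HTMLConfiguration` object in the script is configured but never handed to the parser, so the `SAXParser` given to `XmlSlurper` still runs with its default settings. A minimal sketch of a likely fix, assuming the feature can be set directly on the `SAXParser` instance through its `setFeature` method, is shown below. The `@Grab` coordinates, the NekoHTML version, the `@GrabExclude` line and the `groovy.xml.XmlSlurper` import (Groovy 3 or later) are assumptions added for illustration and are not part of the original post.

```groovy
// Assumption: NekoHTML 1.9.22 is fetched from Maven Central via Grape.
@Grab('net.sourceforge.nekohtml:nekohtml:1.9.22')
// Assumption: excluding xml-apis avoids clashes with the JDK's own XML classes.
@GrabExclude('xml-apis:xml-apis')
import org.cyberneko.html.parsers.SAXParser
// Groovy 3+ location; on older Groovy versions XmlSlurper lives in groovy.util.
import groovy.xml.XmlSlurper

def text = """
<div><h2>Test</h2>
<div>Hi</div>
</div>
"""

// Set the fragment feature on the very parser instance that XmlSlurper will use,
// instead of on a separate configuration object that is never attached to it.
def parser = new SAXParser()
parser.setFeature("http://cyberneko.org/html/features/balance-tags/document-fragment", true)
def html = new XmlSlurper(parser).parseText(text)

// Inspect the root element and its direct children.
println html.name()
println html.children()*.name()
```

If the feature takes effect, the root element reported by `html.name()` should be the outer `div` rather than `HTML`, and the `printNode` helper from the script above can then be applied to `html` unchanged.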
