Nokogiri changing custom elements
- by dagda1
Hi,
I have sample HTML that I have marked up with some special tags that will be used by a different program. An example is below; note the <START:organization>..<END> elements.
<html>
  <head/>
  <body>
    <ul>
      <li> <START:organization> Advanced Integrated Pest Management <END> </li>
      <li> <START:organization> American Bakers Association <END> </li>
    </ul>
  </body>
</html>
I wanted to use Nokogiri to preprocess the HTML so that I can easily remove irrelevant tags like <script>. I created the following extension to the Nokogiri document class:
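module Nokogiri
  module HTML
    class Document
      # Strip <script> elements, then serialize the document back to a string.
      def prepare_html
        xpath("//script").remove
        # remove_new_lines is not a core String method; it is my own String extension.
        to_html.remove_new_lines
      end
    end
  end
end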
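For completeness: remove_new_lines is not part of Ruby's String class, and its definition is not shown here. A minimal sketch of such a helper, assuming it does nothing more than delete carriage returns and line feeds, might look like this:

```ruby
# Hypothetical helper: the real remove_new_lines is not shown in this post,
# so this version simply assumes it deletes every CR and LF character.
class String
  def remove_new_lines
    delete("\r\n")
  end
end
```

With this definition, "a\nb".remove_new_lines would give "ab".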
The problem is that Nokogiri is changing the <START:organization> element to <organization>.
Is there any way that I can preserve the HTML so that my custom markup tags are kept intact?
Thanks
Paul
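One possible workaround, offered only as a sketch: since the parser treats <START:organization> as an element, the markers could be swapped for plain-text tokens before parsing and restored afterwards. The names MARKER and with_protected_markers, and the @@MARKERn@@ token format, are made up for illustration, and the approach assumes the tokens never occur in the real content and pass through the parser as ordinary text.

```ruby
# Matches the custom markers from the example: <START:label> and <END>.
# (Hypothetical pattern; adjust it if the labels contain other characters.)
MARKER = /<START:[\w-]+>|<END>/

# Replace each marker with a plain-text token, hand the protected HTML to
# the block (where the parser would run), then restore the original markers
# in whatever string the block returns.
def with_protected_markers(html)
  saved = []
  protected_html = html.gsub(MARKER) do |marker|
    saved << marker
    "@@MARKER#{saved.size - 1}@@"
  end
  processed = yield(protected_html)
  processed.gsub(/@@MARKER(\d+)@@/) { saved[Regexp.last_match(1).to_i] }
end

# Intended use with Nokogiri (not executed here):
#
#   cleaned = with_protected_markers(html) do |safe_html|
#     doc = Nokogiri::HTML(safe_html)
#     doc.xpath("//script").remove
#     doc.to_html
#   end
```

Because the markers are hidden from the parser entirely, this avoids relying on how any particular parser handles unknown tag names.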