Tricky issue with using XSLT with badly formed HTML...
Posted by Ryba on Stack Overflow
Published on 2010-05-21T22:30:49Z
Hi there, I am fairly new to XSLT (2.0) and am having trouble with a tricky issue. Essentially, I have a badly formed HTML file like the one below:
<html>
<body>
<p> text 1 </p>
<div> <p> text 2</p> </div>
<p> Here is a list
<ul>
<ol>
<li> ListItem1 </li>
<li> ListItem1 </li>
</ol>
<dl>
<li> dl item </li>
<li> dl item2 </li>
</dl>
</ul>
<div>
<p> I was here</p>
</div>
</p>
I am trying to turn it into a nicely formatted XML file. In my XSLT file I recursively check whether all children of a p or div are other p's or div's, and if so I just promote them; otherwise I treat them as standalone paragraphs. I extended this idea so that a p or div with a child list is also output properly, but without promoting the list's children.
The problem is that the output XML I get is the following:
<?xml version="1.0" encoding="utf-8"?><html>
<body>
<p> text 1 </p>
<p> text 2</p>
Here is a list
<ul>
<ol>
<li> ListItem1 </li>
<li> ListItem1 </li>
</ol>
<dl>
<li> dl item </li>
<li> dl item2 </li>
</dl>
</ul>
<p> I was here</p>
"Here is a list" needs to be in paragraph tags too! I am going crazy trying to solve this ... Any input/links would be greatly appreciated.
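One possible direction, as a minimal sketch and not the stylesheet from the question (which is not shown): when a p or div is unwrapped, group its children with XSLT 2.0's xsl:for-each-group and give each run of non-block content its own p. The set of "block" elements (p, div, ul, ol, dl) is an assumption taken from the sample input, and the sketch assumes the input is well-formed enough to parse as XML, with html and body closed.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="utf-8"/>

  <!-- Identity template: html, body, lists and their items are copied as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A p or div without block children becomes a plain paragraph.
       Attributes on the original element are dropped in this sketch. -->
  <xsl:template match="p | div">
    <p><xsl:apply-templates select="node()"/></p>
  </xsl:template>

  <!-- A p or div that has block children (assumed set: p, div, ul, ol, dl):
       drop the wrapper and promote its children. Adjacent non-block nodes
       (loose text, inline elements) are grouped and wrapped in their own p. -->
  <xsl:template match="*[self::p or self::div][p or div or ul or ol or dl]">
    <xsl:for-each-group select="node()"
        group-adjacent="boolean(self::p or self::div or self::ul
                                or self::ol or self::dl)">
      <xsl:choose>
        <!-- Run of block elements: process each one normally. -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- Run of loose content that is not just whitespace: wrap it. -->
        <xsl:when test="normalize-space(string-join(current-group(), ''))">
          <p><xsl:apply-templates select="current-group()"/></p>
        </xsl:when>
        <!-- Whitespace-only runs are dropped. -->
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input, the third template matches the outer p, so the loose text "Here is a list" should come out wrapped in its own p, followed by the untouched ul and then the promoted "I was here" paragraph. Note that a p or div nested inside a list item would still be processed by these templates; if lists must be copied completely verbatim, a dedicated template using xsl:copy-of for ul, ol, and dl would be needed.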