XSLT: grouping HTML elements into nested section levels
- by Jeff
Hello there, I'm trying to write an XSLT stylesheet that organizes an HTML file into nested section levels based on the header level. Here is my input:
<html>
<head>
<title></title>
</head>
<body>
<h1>HEADER 1 CONTENT</h1>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<h2>Header 2 CONTENT</h2>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
</body>
</html>
I'm working with a fairly simple structure at the moment, so this pattern will stay constant for the time being. I need output like this:
<document>
<section level="1">
<header1>HEADER 1 CONTENT</header1>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<section level="2">
<header2>Header 2 CONTENT</header2>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
</section>
</section>
</document>
I have been working from this example: Stack Overflow answer
However, I cannot get it to do exactly what I need.
I'm using Saxon 9 to run the XSLT within Oxygen during development. In production I'll run it from a cmd/bat file, still with Saxon 9. I'd like to handle up to 4 nested section levels if possible.
Any help is much appreciated!
I need to add to this question, as I've run into another requirement that I probably should have thought of earlier.
I'm now encountering input like the following:
<html>
<head>
<title></title>
</head>
<body>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<h1>Header 2 CONTENT</h1>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
</body>
</html>
As you can see, the leading <p> elements sit directly under <body> before any header, while in my first snippet every <p> followed a header. My desired result is the same as above, except that when I encounter <p> elements under <body> that precede the first header, they should be wrapped in their own <section level="1">:
<document>
<section level="1">
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
<p>Level 1 para</p>
</section>
<section level="1">
<header1>Header 2 CONTENT</header1>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
<p>Level 2 para</p>
</section>
</document>
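For reference, here is a minimal sketch of the kind of transformation described above. It is not a confirmed solution. It assumes XSLT 2.0 (which Saxon 9 supports), input that is well-formed XML with no namespace, and headers named h1 to h4. The function name f:group, the namespace URI urn:example:sections, and the variable max-level are hypothetical names chosen for illustration. The idea is to group the children of <body> with xsl:for-each-group and group-starting-with, one header level at a time, recursing into each group for the next level.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:sections"
    exclude-result-prefixes="xs f">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: h1..h4 open sections; anything deeper is copied as-is. -->
  <xsl:variable name="max-level" as="xs:integer" select="4"/>

  <!-- Hypothetical helper: splits $nodes into groups that start with
       h{$level} and recurses into each group for the next level. -->
  <xsl:function name="f:group" as="node()*">
    <xsl:param name="nodes" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:choose>
      <!-- Past the deepest level: no more grouping, just copy. -->
      <xsl:when test="$level gt $max-level">
        <xsl:apply-templates select="$nodes"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each-group select="$nodes"
            group-starting-with="*[local-name() eq concat('h', $level)]">
          <xsl:choose>
            <!-- Group opened by a header of the current level. -->
            <xsl:when test="self::*[local-name() eq concat('h', $level)]">
              <section level="{$level}">
                <xsl:element name="header{$level}">
                  <xsl:apply-templates/>
                </xsl:element>
                <xsl:sequence
                    select="f:group(current-group() except ., $level + 1)"/>
              </section>
            </xsl:when>
            <!-- Content before the first h1: wrap it in its own
                 level-1 section without a header element. -->
            <xsl:when test="$level eq 1">
              <section level="1">
                <xsl:sequence select="f:group(current-group(), 2)"/>
              </section>
            </xsl:when>
            <!-- Content before the first header of a deeper level:
                 no wrapper, just try the next level down. -->
            <xsl:otherwise>
              <xsl:sequence select="f:group(current-group(), $level + 1)"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Entry point: only the element children of body are grouped;
       head and title are dropped, as in the desired output. -->
  <xsl:template match="/">
    <document>
      <xsl:sequence select="f:group(html/body/*, 1)"/>
    </document>
  </xsl:template>

  <!-- Identity copy for paragraphs and any other content. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

In this sketch the level-1 pass is the only one that adds a wrapper around header-less content, which is what separates the second desired output from the first. If the real input is XHTML in a namespace, the html/body path and the copied <p> elements would need adjusting.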