Basic operations for modifying a source document with XSLT
- by SpliFF
All the tutorials and examples of XSLT processing I've found seem to assume that your destination will have a significantly different format or structure from your source, and that you know the structure of the source in advance. I'm struggling to find out how to perform simple "in-place" modifications to an HTML document without knowing anything else about its existing structure.
Could somebody show me a clear example that, given an arbitrary unknown HTML source, will:
1.) delete the classname 'foo' from all divs
2.) delete a node if it's empty (i.e. <p></p>)
3.) delete a <p> node if its first child is <br>
4.) add newattr="newvalue" to all <h1> elements
5.) replace 'heading' in text nodes with 'title'
6.) wrap all <u> tags in <b> tags (i.e. <u>foo</u> -> <b><u>foo</u></b>)
7.) output the transformed document without changing anything else
The above examples are the primary types of transform I wish to accomplish. Understanding how to do the above will go a long way towards helping me build more complex transforms.
To help clarify and test the examples, here is a sample source and output. However, I must reiterate that I want to work with arbitrary samples without rewriting the XSLT for each source:
<!doctype html>
<html>
<body>
<h1>heading</h1>
<p></p>
<p><br>line</p>
<div class="foo bar"><u>baz</u></div>
<p>untouched</p>
</body>
</html>
output:
<!doctype html>
<html>
<body>
<h1 newattr="newvalue">title</h1>
<div class="bar"><b><u>baz</u></b></div>
<p>untouched</p>
</body>
</html>
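One possible starting point for the seven requirements above is an identity transform (rule 7) with one small override template per rule. What follows is a minimal XSLT 1.0 sketch under stated assumptions, not a definitive answer. It assumes the input has already been parsed into a tree by an HTML-aware parser (for example xsltproc with its --html option), since the sample source is not well-formed XML; that element names are lowercase and in no namespace; and that "empty" in rule 2 can be limited to <p> elements. The template name replace-heading is made up for illustration. The sketch does not try to reproduce the original doctype.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="no"/>

  <!-- Rule 7: identity template. Copies every node and attribute
       unchanged unless a more specific template below matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rule 1: drop the token 'foo' from class on div elements.
       Only the first occurrence of the token is removed; the
       attribute is omitted entirely if nothing else remains. -->
  <xsl:template match="div/@class[contains(concat(' ', normalize-space(.), ' '), ' foo ')]">
    <xsl:variable name="padded" select="concat(' ', normalize-space(.), ' ')"/>
    <xsl:variable name="rest"
                  select="normalize-space(concat(substring-before($padded, ' foo '),
                                                 ' ',
                                                 substring-after($padded, ' foo ')))"/>
    <xsl:if test="$rest != ''">
      <xsl:attribute name="class">
        <xsl:value-of select="$rest"/>
      </xsl:attribute>
    </xsl:if>
  </xsl:template>

  <!-- Rule 2: delete empty p elements (no child nodes at all).
       Assumption: limited to p, because matching *[not(node())]
       would also remove void elements such as <br> or <img>. -->
  <xsl:template match="p[not(node())]"/>

  <!-- Rule 3: delete a p whose first child node is a br element. -->
  <xsl:template match="p[node()[1][self::br]]"/>

  <!-- Rule 4: add newattr="newvalue" to every h1. Attributes must be
       written before any child content is copied. -->
  <xsl:template match="h1">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="newattr">newvalue</xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rule 5: replace 'heading' with 'title' in text nodes. XSLT 1.0
       has no replace() function, so a recursive named template is
       used. The predicate keeps this template from tying with the
       identity template on priority. -->
  <xsl:template match="text()[contains(., 'heading')]">
    <xsl:call-template name="replace-heading">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="replace-heading">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'heading')">
        <xsl:value-of select="substring-before($text, 'heading')"/>
        <xsl:text>title</xsl:text>
        <xsl:call-template name="replace-heading">
          <xsl:with-param name="text" select="substring-after($text, 'heading')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Rule 6: wrap every u element in a b element. -->
  <xsl:template match="u">
    <b>
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </b>
  </xsl:template>

</xsl:stylesheet>
```

Because every rule is an independent template layered over the identity copy, none of them depends on the overall structure of the source, and further rules can be added the same way. Whitespace text nodes around deleted elements are copied through as-is, so the output may contain blank lines where the removed <p> elements used to be.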