Is it possible to split HTML using DOMDocument?
- by Lynn Adrianna
Using DOMDocument, is it possible to split a block of HTML into the pieces that are wrapped in tags and the pieces that are not, while maintaining their order? Sorry if this doesn't make sense; my example should make it clear.
Let's say I have the following block of HTML:
text1<b style="color:pink">text2</b>text3<b>text4</b> <b style="font-weight:bold">text5</b>
Is it possible to create an array like this:
array(
[0] => text1
[1] => <b style="color:pink">text2</b>
[2] => text3
[3] => <b>text4</b>
[4] =>
[5] => <b style="font-weight:bold">text5</b>
)
Below is my current working solution, which uses a regular expression to split the HTML:
$tokens = preg_split('/(<b\b[^>]*>.*?<\/b>)/i', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
However, I have always read that it is a bad idea to parse HTML using regular expressions, so I was wondering if there is a better way.
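For comparison, here is a minimal sketch of what a DOMDocument-based version might look like. It is one possible approach, not a confirmed answer. The function name `splitTopLevelNodes` and the wrapping `<div>` are assumptions made for illustration. Unlike the regular expression above, it splits on every top-level node rather than only on `<b>` elements.

```php
<?php
// Hypothetical helper (not part of the original post): returns the top-level
// nodes of an HTML fragment, in document order, each serialised to a string.
function splitTopLevelNodes(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Assumption: wrap the fragment in a <div> so it has one known parent.
    // Note that loadHTML() does not assume UTF-8 input by default.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    $tokens = [];
    foreach ($wrapper->childNodes as $node) {
        // saveHTML($node) serialises only this node: text nodes come back as
        // escaped text, elements come back with their tags and attributes.
        $tokens[] = $doc->saveHTML($node);
    }
    return $tokens;
}

$html = 'text1<b style="color:pink">text2</b>text3<b>text4</b> <b style="font-weight:bold">text5</b>';
print_r(splitTopLevelNodes($html));
```

With the sample HTML from the question, this should yield the same six pieces as the desired array, including the single space between the last two `<b>` elements. Because the parser normalises markup, the serialised strings may differ from the input for malformed or unusually quoted HTML.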