Is it possible to split HTML using DOMDocument?
Posted by Lynn Adrianna on Stack Overflow
Published on 2014-08-22T16:15:49Z
Using DOMDocument, is it possible to split a block of HTML into the pieces of text that are wrapped in tags and the pieces that are not, while preserving their order? Sorry if this doesn't make sense; my example should make it clear.
Let's say I have the following block of HTML:
text1<b style="color:pink">text2</b>text3<b>text4</b> <b style="font-weight:bold">text5</b>
Is it possible to create an array like this:
array(
[0] => text1
[1] => <b style="color:pink">text2</b>
[2] => text3
[3] => <b>text4</b>
[4] =>
[5] => <b style="font-weight:bold">text5</b>
)
My current working solution splits the HTML with a regular expression:
$tokens = preg_split('/(<b\b[^>]*>.*?<\/b>)/i', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
However, I keep reading that parsing HTML with regular expressions is a bad idea, so I was wondering whether there is a better way.
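For comparison, here is a minimal sketch of how the same split might look with DOMDocument. It is an illustration under assumptions, not a confirmed answer from the original thread: the function name `splitTopLevelNodes` is hypothetical, the fragment is wrapped in a `<div>` so that the leading bare text stays a direct child of a known element, and every top-level node is returned, not only `<b>` elements.

```php
<?php
// Hypothetical helper, not from the original post: returns the top-level
// nodes of an HTML fragment as strings, in document order.
function splitTopLevelNodes(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Assumption: wrapping the fragment in a <div> keeps leading bare text
    // as a direct child of a known element instead of letting the parser
    // move it elsewhere.
    $doc->loadHTML('<div>' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    $tokens = [];
    foreach ($wrapper->childNodes as $node) {
        // saveHTML($node) serializes one node: text nodes come back as
        // entity-escaped text, elements come back with their tags.
        $tokens[] = $doc->saveHTML($node);
    }
    return $tokens;
}

$html = 'text1<b style="color:pink">text2</b>text3<b>text4</b> '
      . '<b style="font-weight:bold">text5</b>';
print_r(splitTopLevelNodes($html));
```

Two differences from the regular expression are worth keeping in mind. The parser re-serializes the markup, so the tokens may not be byte-identical to the input (for example, attribute quoting can be normalized). Also, nested tags stay inside their top-level parent, and whether a whitespace-only text node such as the one between the last two `<b>` elements survives depends on the underlying libxml parser.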
© Stack Overflow or respective owner