Is there a way to optimise finding text items on a page (not regex)

Posted by Jeepstone on Stack Overflow
Published on 2010-05-06T10:25:25Z, indexed on 2010/05/06 10:28 UTC

After seeing several threads rubbishing the regex approach to matching terms within an HTML document, I've used the Simple HTML DOM PHP parser (http://simplehtmldom.sourceforge.net/) to get the bits of text I'm after, but I want to know whether my code is optimal. It feels like I'm looping too many times. Is there a way to optimise the following loop?

    // Get the HTML and look at the text nodes
    $html = str_get_html($buffer);
    // First we match the <body> tag, as we don't want to change the <head> items
    foreach ($html->find('body') as $body) {
        // Then we get the text nodes, rather than any HTML
        foreach ($body->find('text') as $text) {
            // Then we match each term
            foreach ($terms as $term) {
                // Wrap each occurrence of the term within the text node
                $text->outertext = str_replace($term, '<span class="highlight">'.$term.'</span>', $text->outertext);
            }
        }
    }

For example, would it make a difference to check whether there are any matches before I start the loop?
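
To make the pre-check idea concrete, here is a minimal sketch of what I have in mind, assuming PHP 7 or later. The function names (build_highlight_map, contains_any_term, highlight_text) are hypothetical helpers of my own, not part of Simple HTML DOM. It also swaps the per-term str_replace() calls for a single strtr() call with a map built once up front; strtr() does not re-scan text it has already replaced, so a term such as "class" would not be matched inside the inserted span tag. Whether any of this is measurably faster would need profiling on real pages.

```php
<?php
// Hypothetical helpers for illustration; none of these come from Simple HTML DOM.

// Build the term => replacement map once, outside any loop.
function build_highlight_map(array $terms): array
{
    $map = [];
    foreach ($terms as $term) {
        $term = (string) $term;
        if ($term === '') {
            continue; // an empty search term can never match usefully
        }
        $map[$term] = '<span class="highlight">' . $term . '</span>';
    }
    return $map;
}

// Cheap pre-check: does the raw string contain at least one term?
// Stops at the first hit, so it does not count or locate matches.
function contains_any_term(string $haystack, array $terms): bool
{
    foreach ($terms as $term) {
        $term = (string) $term;
        if ($term !== '' && strpos($haystack, $term) !== false) {
            return true;
        }
    }
    return false;
}

// Replace every term in one pass over a single text node's content.
function highlight_text(string $text, array $map): string
{
    return $map === [] ? $text : strtr($text, $map);
}

// Demonstration on plain strings, so no parser is needed to run it.
$terms = ['cat', 'dog'];
$map = build_highlight_map($terms);
if (contains_any_term('A cat and a dog.', $terms)) {
    echo highlight_text('A cat and a dog.', $map), "\n";
}
```

In the loop above, the map would be built once before the first foreach, and the innermost foreach would collapse to `$text->outertext = highlight_text($text->outertext, $map);`. Running contains_any_term() on $buffer before calling str_get_html() would skip parsing entirely when no term is present. Since $buffer also contains the head and the markup itself, a positive result only means a term appears somewhere, not necessarily in a body text node.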

© Stack Overflow or respective owner
