Squid proxy not serving modified HTML content
Posted by Matthew on Stack Overflow
Published on 2010-03-24T08:29:48Z
I'm trying to use Squid to modify the content of web pages requested through the proxy. I followed the upside-down-ternet tutorial, which shows how to flip the images on pages.
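For readers unfamiliar with this setup: in the approach the tutorial uses, Squid hands each request to a rewriter script as one line on standard input, with the URL as the first space-separated field, and the script answers with one line per request. The following is only a minimal sketch of that exchange, not the tutorial's code. The function name `reply_for`, the `$rules` array, and the extra fields in the sample request lines are assumptions made for illustration.

```php
<?php
// Sketch of one rewriter step (all names here are hypothetical).
// Assumption: each request line starts with the URL, followed by
// other space-separated fields such as client address and method.

/**
 * Return the single reply line for one request line.
 * $rules maps a regex to a replacement URL; the first match wins.
 */
function reply_for(string $line, array $rules = []): string
{
    $fields = explode(' ', trim($line));
    $url = $fields[0];
    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $url)) {
            return $target;
        }
    }
    return $url; // no rule matched: send the URL back unchanged
}

// Sample request lines (the fields after the URL are made up).
$rules = ['/\.jpg$/i' => 'http://127.0.0.1/cache/flipped.jpg'];
echo reply_for("http://example.com/a.jpg 192.168.0.5/- - GET\n", $rules), "\n";
echo reply_for("http://example.com/index.html 192.168.0.5/- - GET\n", $rules), "\n";
```

In a real helper this function would be called once for every line read from STDIN, and the reply would be the only thing written to standard output for that request.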
I need to change the actual HTML of the page. I've been trying to do the same thing as in the tutorial, but editing the HTML page instead of the image. Below is the PHP script I'm using.
All jpg images get flipped, but the page content does not get edited. The index.html files written to the cache contain the edited content, but the pages users receive do not.
#!/usr/bin/php
<?php
$temp = array();
while ( $input = fgets(STDIN) ) {
    // microtime(true) returns a float; plain microtime() returns "msec sec" with a space.
    $micro_time = microtime(true);
    // Split the output (space delimited) from squid into an array.
    $temp = explode(' ', $input); // split() was removed in PHP 7
    //Flip jpg images, this works correctly
    if (preg_match("/.*\.jpg/i", $temp[0])) {
        system("/usr/bin/wget -q -O /var/www/cache/$micro_time.jpg ". escapeshellarg($temp[0]));
        system("/usr/bin/mogrify -flip /var/www/cache/$micro_time.jpg");
        echo "http://127.0.0.1/cache/$micro_time.jpg\n";
    }
    //Don't edit files that are obviously not html. $temp[0] contains url of file to get
    elseif (preg_match("/(jpg|png|gif|css|js|\(|\))/i", $temp[0], $matches)) {
        echo $input;
    }   
    //Otherwise, could be html (e.g. `wget http://www.google.com` downloads index.html)
    else{ 
        $time = time() . microtime();       //For unique directory names
        $time = preg_replace("/ /", "", $time); //Simplify things by removing the spaces
        mkdir("/var/www/cache/". $time);    //Create unique folder
        system("/usr/bin/wget -q --directory-prefix=\"/var/www/cache/$time/\" ". escapeshellarg($temp[0]));
        $filename = system("ls /var/www/cache/$time/");     //Get filename of downloaded file
        //File is html, edit the content (this does not work)
        if(preg_match("/.*\.html/", $filename)){
            //Get the html file contents  
            $contentfh = fopen("/var/www/cache/$time/". $filename, 'r');
            $content = fread($contentfh, filesize("/var/www/cache/$time/". $filename));
            fclose($contentfh);
            //Edit the html file contents
            $content = preg_replace("/<\/body>/i", "<!-- content served by proxy --></body>", $content);
            //Write the edited file
            $contentfh = fopen("/var/www/cache/$time/". $filename, 'w');
            fwrite($contentfh, $content);
            fclose($contentfh);
            //Return the edited page
            echo "http://127.0.0.1/cache/$time/$filename\n";
        }               
        //Otherwise file is not html, don't edit
        else{
            echo $input;
        }
    }
}
?>
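One possible explanation, offered as a guess rather than a confirmed diagnosis: PHP's `system()` passes the command's output through to standard output, so `system("ls ...")` would also send the filename to Squid as an extra reply line ahead of the rewritten URL. `wget -q` and `mogrify` normally print nothing on success, which would be consistent with the image branch working. The sketch below finds the downloaded file and edits it without writing anything to stdout. The helper names `first_file_in` and `tag_html` and the temporary demo directory are assumptions for illustration.

```php
<?php
// Hypothetical helpers; names and paths are assumptions for illustration.

// Return the name of the first regular file in $dir, or null if none.
// scandir() returns the names instead of printing them.
function first_file_in(string $dir): ?string
{
    foreach (scandir($dir) ?: [] as $name) {
        if (is_file("$dir/$name")) {
            return $name;
        }
    }
    return null;
}

// Insert a marker comment before the closing </body> tag, in place.
function tag_html(string $path): void
{
    $html = file_get_contents($path);
    $html = preg_replace('/<\/body>/i', '<!-- content served by proxy --></body>', $html);
    file_put_contents($path, $html);
}

// Demo: a temporary directory stands in for /var/www/cache/$time.
$dir = sys_get_temp_dir() . '/rewrite_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/index.html", "<html><body>Hello</body></html>");

$name = first_file_in($dir);
if ($name !== null && preg_match('/\.html?$/i', $name)) {
    tag_html("$dir/$name");
}
echo file_get_contents("$dir/index.html"), "\n";
```

In the script above, the same idea would mean replacing the `system("ls ...")` call with a lookup that returns the filename silently, so that the only line written per request is the final URL.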
© Stack Overflow or respective owner