removing phone number from a document.
- by Grant Collins
Hi,
I've got a challenge that I am hoping that the SO community is able to help me with.
I trying to parse a lot of html documents in my PHP application to remove personal details, such as names, addresses and phone numbers. I can remove most of these details without too much trouble, however the phone number is a real problem for me.
My idea is to take the text from these documents and the use a regex to identify the phone numbers and replace them with another value such as 'xxxx'.
I've got 2 regex that I am using one for UK landline numbers and one for UK cell/mobile numbers.
However when I try and run them against the text it just returns an empty string.
I am using the following preg_replace code:
$pattens = array(
'/^(((\+44\s?\d{4}|\(?0\d{4}\)?)\s?\d{3}\s?\d{3})|((\+44\s?\d{3}|\(?0\d{3}\)?)\s?\d{3}\s?\d{4})|((\+44\s?\d{2}|\(?0\d{2}\)?)\s?\d{4}\s?\d{4}))(\s?\#(\d{4}|\d{3}))?$/',
'/^(\+44\s?7\d{3}|\(?07\d{3}\)?)\s?\d{3}\s?\d{3}$/'
);
$replace = array('xxxxx', 'xxxxx');
//do the search for the numbers.
$updatedContents = preg_replace($pattens, $replace, $htmlContents);
At the moment this is causing me a lot of head scratching as I thought that I had this nailed, but at the moment I can't see what's wrong??
I am sure that it is something really simple.
Thanks,
Grant