I have a free-text description field for classified listings on a site. There is a separate field for the contact phone number, but many people also put the phone number inside the description, which I do not want.
I use a regex to filter it out when it is entered as digits, but people get really creative and spell the digits out as words as well. Is there a way to filter that using a regex or some sort of preprocessing of the field?
Here is some example code (which only works when the phone number is written as digits):
$phone = '0888123123';
$text = 'Some description with phone set to 0888123123 as well as zero eight eight eight one two three one two three.';
preg_match_all('/('.implode('[\D]*', str_split($phone)).')/i', strip_tags($text), $matches);
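// $matches always has one entry per capture group, so check the full matches in $matches[0].
if (!empty($matches[0])) {
    // Normalize every matched variant to the plain phone number.
    $text = str_replace($matches[0], $phone, $text);
}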
I am thinking of replacing each spelled-out digit with its numeral using an array like array('one'=>1). That would work, except that it would also replace any number words that appear elsewhere in the text. Is there a way to extend the regex so that it also catches this phone number when it is spelled out in words?
EDIT:
I updated the regex using an array of the digits written as words, but there is another problem: if the spelled-out phone number appears more than once in the text, it doesn't work:
$text = 'Some description with phone set to 0888123123 as well as zero eight eight eight one two three one two three and a second time - zero eight eight eight one two three one two three.';
$phone_digits = array(
    0 => 'zero',
    1 => 'one',
    2 => 'two',
    3 => 'three',
    4 => 'four',
    5 => 'five',
    6 => 'six',
    7 => 'seven',
    8 => 'eight',
    9 => 'nine',
);
$phone_letters = array();
foreach (str_split($phone) as $number) {
    $phone_letters[] = "($number|$phone_digits[$number])";
}
preg_match_all('/('.implode('[\D]*', $phone_letters).')/i', $text, $matches);
It matches the spelled-out digits but doesn't stop at the last digit of the first occurrence of the phone number:
Matches: "zero eight eight eight one two three one two three and a second time - zero eight eight eight one two three one two three."
instead of matching twice:
"zero eight eight eight one two three one two three"
Can preg_match_all be made non-greedy, so that it stops as soon as the first complete match is found and then processes the rest of the string for additional matches?
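For reference, here is a minimal sketch of what a non-greedy version might look like. It is one possible direction, not a definitive fix. The separator is written as the lazy `\D*?` instead of `[\D]*`, so each gap between digits consumes as few characters as possible and a match ends at the nearest final digit. The function name `mask_phone` and the `[phone removed]` placeholder are hypothetical and not part of the code above. The sketch assumes English digit words only.

```php
<?php
// Hypothetical helper (not from the original code): replaces every occurrence
// of $phone in $text, whether written as digits, as words, or as a mix.
function mask_phone(string $text, string $phone, string $replacement = '[phone removed]'): string
{
    $words = ['zero', 'one', 'two', 'three', 'four',
              'five', 'six', 'seven', 'eight', 'nine'];

    // Assumption: only the digits of the phone number matter.
    $digits = preg_replace('/\D/', '', $phone);
    if ($digits === '') {
        return $text;
    }

    $parts = [];
    foreach (str_split($digits) as $digit) {
        // Non-capturing group: the digit itself or its English word.
        $parts[] = '(?:' . $digit . '|' . $words[(int) $digit] . ')';
    }

    // \D*? is lazy: it takes as few non-digit characters as possible
    // before the next digit, so each match stops at the nearest final digit.
    $pattern = '/' . implode('\D*?', $parts) . '/i';

    // preg_replace returns null on error; fall back to the unchanged text.
    return preg_replace($pattern, $replacement, $text) ?? $text;
}

$text = 'Some description with phone set to 0888123123 as well as zero eight eight eight one two three one two three and a second time - zero eight eight eight one two three one two three.';
echo mask_phone($text, '0888123123'), "\n";
// Some description with phone set to [phone removed] as well as [phone removed] and a second time - [phone removed].
```

Note that the lazy quantifier still allows arbitrary non-digit text between two digits, so a stricter separator class such as `[\s.,-]*` may be a safer choice if false positives are a concern.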