Sanitize a string with non-alphanum repetition
Posted by Toto on Stack Overflow
Published on 2010-03-28T10:07:32Z
I need to sanitize article titles when (creative) users try to "attract attention" by repeating non-alphanumeric characters.
Examples:
- Buy my product !!!!!!!!!!!!!!!!!!!!!!!!
- Buy my product !? !? !? !? !? !?
- Buy my product !!!!!!!!!.......!!!!!!!!
- Buy my product <-----------
An acceptable solution would be to reduce each repetition of non-alphanumeric characters to 2 occurrences.
So I would get:
- Buy my product !!
- Buy my product !? !?
- Buy my product !!..!!
- Buy my product <--
This attempt did not work that well:
preg_replace('/(\W{2,})(?=\1+)/', '', $title)
Any idea how to do it in PHP with regex?
Other, better solutions are also welcome (I cannot strip all the non-alphanumeric characters, as they can be meaningful).
Edit: the objective is only to avoid the most common issues. Other creative cases will be sanitized manually or with another regex.
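One possible approach, offered as a minimal sketch rather than a definitive answer, is to run two passes: first collapse runs of the same character, then collapse whitespace-separated repeats of the same group. The function name `collapseRepeatedSymbols` is hypothetical, and the sketch assumes UTF-8 titles and PHP 7 or later. It treats any character that is not a letter, digit, or whitespace as a "symbol", which is an assumption about what "non-alphanum" should cover.

```php
<?php
// Hypothetical helper: the name and the two-pass approach are illustrative assumptions.
function collapseRepeatedSymbols(string $title): string
{
    // A "symbol" is anything that is not a letter, a digit, or whitespace.
    $symbol = '[^\p{L}\p{N}\s]';

    // Pass 1: three or more copies of the same symbol become two ("!!!!" => "!!").
    $result = preg_replace('/(' . $symbol . ')\1{2,}/u', '$1$1', $title);

    // Pass 2: three or more whitespace-separated copies of the same symbol
    // group become two ("!? !? !?" => "!? !?").
    $result = preg_replace(
        '/(' . $symbol . '+)(?:(\s+)\1){2,}/u',
        '$1$2$1',
        $result ?? $title
    );

    // preg_replace() returns null on failure (for example invalid UTF-8),
    // so fall back to the untouched input in that case.
    return $result ?? $title;
}
```

Applied to the four example titles above, this gives "Buy my product !!", "Buy my product !? !?", "Buy my product !!..!!" and "Buy my product <--". Adjacent repeats with no separator, such as "!?!?!?!?", are not handled by this sketch and would need a further pattern.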
© Stack Overflow or respective owner