Sanitize a string containing ASCII art
- by Toto
I need to sanitize article titles when (creative) users try to "attract attention" with bad "ASCII art".
Examples:
Buy my product !!!!!!!!!!!!!!!!!!!!!!!!
Buy my product !? !? !? !? !? !?
Buy my product !!!!!!!!!.......!!!!!!!!
Buy my product <-----------
An acceptable solution would be to reduce any repetition of non-alphanumeric characters to 2 occurrences.
So I would get:
Buy my product !!
Buy my product !? !?
Buy my product !!..!!
Buy my product <--
This attempt did not work well:
preg_replace('/(\W{2,})(?=\1+)/', '', $title)
Any idea how to do this in PHP with a regex?
Other, better solutions are also welcome (I cannot strip all the non-alphanumeric characters, as they can be meaningful).
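For what it is worth, here is a minimal sketch of the "reduce to 2" rule described above, not a definitive answer. It assumes the repeated unit is a run of characters that are neither letters, digits, nor whitespace, optionally preceded by whitespace (so that "!? !? !?" counts as a repetition). The function name collapse_symbol_runs is hypothetical, and using \p{L} and \p{N} with the /u modifier is an assumption made so that accented letters are treated as alphanumeric.

```php
<?php
// Hypothetical helper name; it does not come from the original post.
function collapse_symbol_runs(string $title): string
{
    // Group 1: optional leading whitespace followed by the shortest run of
    // characters that are neither letters, digits, nor whitespace.
    // \1{2,} then requires at least two further copies of that unit,
    // so the pattern only fires on three or more repetitions in total.
    $pattern = '/(\s*[^\p{L}\p{N}\s]+?)\1{2,}/u';

    // Keep exactly two copies of the repeated unit.
    $result = preg_replace($pattern, '$1$1', $title);

    // preg_replace() returns null on failure (for example, invalid UTF-8
    // input with the /u modifier); fall back to the untouched title.
    return $result ?? $title;
}

$titles = [
    'Buy my product !!!!!!!!!!!!!!!!!!!!!!!!',
    'Buy my product !? !? !? !? !? !?',
    'Buy my product !!!!!!!!!.......!!!!!!!!',
    'Buy my product <-----------',
];

foreach ($titles as $title) {
    echo collapse_symbol_runs($title), PHP_EOL;
}
// Buy my product !!
// Buy my product !? !?
// Buy my product !!..!!
// Buy my product <--
```

Because the unit is matched lazily, a run of one repeated character is collapsed on its own, while a mixed unit such as " !?" is only collapsed when the whole unit repeats. One side effect of the rule as stated: a legitimate ellipsis "..." is also shortened to "..".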