remove words containing non-alpha characters
- by dnkb
Given a text file in which each line holds space-separated strings followed by a tab-separated integer, I'd like to get rid of all words that contain non-alpha characters, but keep the words consisting only of alpha characters, as well as the tab and the integer after it.
My attempts, like the ones below, didn't work. What I was trying to express is
something like: "replace anything within word boundaries that starts and ends with 0 or more of any character and has at least one :digit: or :punct: in between".
sed 's/\b.[:digits::punct:]+.\b//g'
sed 's/\b.[^:alpha:]+.\b//g'
What am I missing? See sample input data below.
Thank you!
asdf 754m 563
a2a 754mm 291
754n 463
754 ppp 1409
754pin 4652
pin pin 462
754pins 652
754 ppp 1409
754pin 4652
pi$n pin 462
754/p ins 652
754 pp+p 1409
754 p=in 4652
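
For reference, here is a minimal sketch of how the intended behaviour might be expressed with sed. It is not a definitive solution. It assumes that the integer is always preceded by a tab, that words are separated by single spaces, and that the sed in use accepts -E (GNU and BSD sed do). The function name strip_nonalpha_words is made up for illustration. Note that inside a bracket expression a POSIX class is written [[:digit:]] or [[:alpha:]], and that + is only a repetition operator in extended regular expressions, hence -E.

```shell
# Hypothetical helper: reads lines on stdin, writes filtered lines to stdout.
# Assumption: "words<TAB>integer" layout, words separated by single spaces.
strip_nonalpha_words() {
  # 1st expression: delete every word (with its leading space, if any) that
  #   contains at least one character that is neither alpha nor whitespace.
  #   A word must start at the beginning of the line or after a space, so
  #   the integer after the tab is never touched.
  # 2nd expression: drop spaces left at the start of the line when the
  #   first word was removed.
  sed -E \
    -e 's/(^| )[^[:space:]]*[^[:alpha:][:space:]][^[:space:]]*//g' \
    -e 's/^ +//'
}
```

Usage would be along the lines of strip_nonalpha_words < input.txt, where input.txt is a placeholder file name. With this sketch, a line such as "asdf 754m" followed by a tab and 563 should come out as "asdf", tab, 563.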