SED and Unicode Quotation Marks
- by Jonathan Patt
When testing against this string:
“… so that’s that… ”
The following should match the opening quotation mark, the ellipsis after it, and the following space, but it does not:
sed "s/\([“‘\"']…\) /\1/g"
However, this correctly matches the second ellipsis and following space and closing quotation mark:
sed "s/… \([”’\"'.!?]\)/…\1/g"
If I split the first expression into one expression per quotation mark, it works fine:
sed -e "s/\(“…\) /\1/g" \
-e "s/\(‘…\) /\1/g" \
-e "s/\(\"…\) /\1/g" \
-e "s/\('…\) /\1/g"
So why doesn't it work when the quotation marks are grouped in a single bracket expression, especially when the same kind of grouping works fine for the closing quotation marks?
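To help narrow this down, here is a minimal reproduction sketch. It assumes a POSIX shell with od available, and it assumes (this is a guess, not something established above) that the locale or a stray byte in the input might be involved. The variable names input, expr, current and bytewise are only for illustration. It dumps the bytes of the test string, then runs the grouped expression once under the current locale and once under LC_ALL=C, where sed should treat each byte as its own character:

```shell
#!/bin/sh
# Test string from the post, stored once so every run sees identical bytes.
input='“… so that’s that… ”'

# The grouped expression from the post, unchanged.
expr="s/\([“‘\"']…\) /\1/g"

# Dump the raw bytes so anything unexpected after the ellipsis
# (for example a no-break space instead of 0x20) becomes visible.
printf '%s\n' "$input" | od -An -tx1

# Run 1: whatever locale the shell currently uses.
current=$(printf '%s\n' "$input" | sed "$expr")

# Run 2: byte-wise matching; each curly quote is three separate bytes here.
bytewise=$(printf '%s\n' "$input" | LC_ALL=C sed "$expr")

printf 'current locale: %s\n' "$current"
printf 'LC_ALL=C:       %s\n' "$bytewise"
```

If the two runs give different results, the locale handling of the bracket expression is the likely place to look. If neither run removes the space, the od output is worth checking for something other than a plain space after the first ellipsis.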