Pulling out two separate words from a string using regular expressions?
Posted by Marvin on Stack Overflow
Published on 2010-03-29T21:37:43Z
Indexed on 2010/03/29 21:43 UTC
regex
I need to improve a regular expression I'm using. Here is the current pattern:
^[a-zA-Z\s/-]+
I'm using it to pull out medication names from a variety of formulation strings, for example:
- SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP
- AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
- AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE
- AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
- Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet
The resulting matches on these examples are:
- SULFAMETHOXAZOLE-TRIMETHOPRIM
- AMOX TR/POTASSIUM CLAVULANATE
- AMOXICILLIN TRIHYDRATE
- AMOX TR/POTASSIUM CLAVULANATE
- Amoxicillin
The first four are what I want, but on the fifth, I really need "Amoxicillin / Clavulanate".
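For reference, the behaviour described above can be reproduced with a short Python sketch. This assumes Python's `re` module and that the leading "- " in the examples is only list formatting, not part of the strings; the helper name `leading_name` is made up for illustration. The raw match includes the trailing whitespace before the first digit, so the sketch strips it.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"^[a-zA-Z\s/-]+")

formulations = [
    "SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP",
    "AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE",
    "AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE",
    "AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE",
    "Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet",
]


def leading_name(text):
    """Return the leading run of letters, spaces, '/' and '-' (hypothetical helper)."""
    m = PATTERN.match(text)
    # The character class includes \s, so the match ends with a space; strip it.
    return m.group(0).strip() if m else None


names = [leading_name(s) for s in formulations]
for name in names:
    print(name)
```

The match stops at the first character outside the class, which is always a digit in these examples. That is why the fifth string yields only "Amoxicillin": the "1000" ends the match before the slash is reached.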
How would I pull out patterns like "Amoxicillin / Clavulanate" (in the fifth row) while skipping patterns like "MG/5ML" (in the first row)?
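One possible direction, sketched in Python and not tested beyond the examples here: in the sample data, a slash that separates two ingredients has whitespace on both sides (" / "), while the slash in a unit such as "MG/5ML" or in "AMOX TR/POTASSIUM" does not. Assuming that convention holds for the rest of the data (which is a guess), the string can be split on the spaced slash and the existing pattern applied to each piece. The names `SEPARATOR` and `medication_names` are hypothetical.

```python
import re

# The existing pattern from the question, applied to each piece separately.
NAME = re.compile(r"^[a-zA-Z\s/-]+")
# Assumption: ingredients are separated by a slash with whitespace on both sides.
SEPARATOR = re.compile(r"\s+/\s+")


def medication_names(text):
    """Join the leading name of every ' / '-separated piece (hypothetical helper)."""
    parts = []
    for piece in SEPARATOR.split(text):
        m = NAME.match(piece)
        # Skip pieces that start with a digit or contain no name at all.
        if m and m.group(0).strip():
            parts.append(m.group(0).strip())
    return " / ".join(parts)


print(medication_names(
    "Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet"))
print(medication_names(
    "SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP"))
```

Splitting first keeps the original pattern untouched, so the first four rows behave exactly as before. The approach would break on data where unit slashes are written with spaces (for example "MG / 5ML" following a name), so it is worth checking against the full set of formulation strings.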
© Stack Overflow or respective owner