Pulling out two separate words from a string using regular expressions?
- by Marvin
I need to improve a regular expression I'm using. Currently it is:
^[a-zA-Z\s/-]+
I'm using it to pull out medication names from a variety of formulation strings, for example:
SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP
AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE
AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet
The resulting matches on these examples are:
SULFAMETHOXAZOLE-TRIMETHOPRIM
AMOX TR/POTASSIUM CLAVULANATE
AMOXICILLIN TRIHYDRATE
AMOX TR/POTASSIUM CLAVULANATE
Amoxicillin
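For reference, these matches can be reproduced with a minimal sketch, assuming Python's `re` module (the regex flavor is not stated here, so that is an assumption). The names `NAME_PATTERN` and `leading_name` are hypothetical, and the `.strip()` call is added only to drop the trailing space that the `\s` in the character class captures.

```python
import re

# The pattern from the question: letters, whitespace, "/" and "-" from the start of the string.
NAME_PATTERN = re.compile(r"^[a-zA-Z\s/-]+")

formulations = [
    "SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP",
    "AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE",
    "AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE",
    "Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet",
]


def leading_name(text):
    """Return the leading run matched by the pattern, without surrounding whitespace."""
    match = NAME_PATTERN.match(text)
    return match.group(0).strip() if match else ""


for formulation in formulations:
    print(leading_name(formulation))
```

The match stops at the first digit, which is why the last string yields only its first ingredient.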
The first four are what I want, but on the fifth, I really need "Amoxicillin / Clavulanate".
How would I pull out patterns like "Amoxicillin / Clavulanate" (in the fifth row) while skipping patterns like "MG/5ML" (in the first row)?
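One direction that might work is sketched below, assuming Python's `re` module. It relies on an observation that holds for the five examples but may not hold in general: a slash that separates ingredients has whitespace on both sides, while a slash inside a dose unit such as "MG/5ML" does not. The names `LEADING`, `LATER`, and `medication_name` are hypothetical.

```python
import re

# Leading name: the original pattern from the question.
LEADING = re.compile(r"^[a-zA-Z\s/-]+")

# Later ingredient (assumption): a "/" with whitespace on both sides,
# followed by a name made of letters, spaces, or hyphens, then a dose number.
LATER = re.compile(r"\s/\s+([a-zA-Z][a-zA-Z\s-]*?)\s+\d")


def medication_name(text):
    """Join the leading name with any later ' / Name' ingredients."""
    match = LEADING.match(text)
    if not match:
        return ""
    parts = [match.group(0).strip()]
    # Search only the remainder, so the leading name is not counted twice.
    parts += [name.strip() for name in LATER.findall(text[match.end():])]
    return " / ".join(parts)


print(medication_name("Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet"))
print(medication_name("SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP"))
```

This sketch would miss a later ingredient that is not followed by a number, so it is a starting point rather than a complete answer.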