codingBat separateThousands using regex (and unit testing how-to)
Posted by polygenelubricants on Stack Overflow
Published on 2010-04-24T08:04:18Z
Indexed on 2010/04/24 8:13 UTC
This question is a combination of regex practice and unit testing practice.
Regex part
I authored this problem, separateThousands, for personal practice:
Given a number as a string, introduce commas to separate thousands. The number may contain an optional minus sign, and an optional decimal part. There will not be any superfluous leading zeroes.
Here's my solution:
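    String separateThousands(String s) {
        return s.replaceAll(
            String.format("(?:%s)|(?:%s)",
                // "rest": exactly 3 digits after the previous match, with a digit ahead
                "(?<=\\G\\d{3})(?=\\d)",
                // "first": sign plus 1 to 3 leading digits, with whole triplets ahead
                "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))"
            ),
            ","
        );
    }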
The way it works is that it distinguishes two types of comma: the first, and the rest. In the above regex, the rest subpattern actually appears before the first. A match is always zero-length, and replaceAll puts "," at each matched position.
The rest subpattern looks behind to see if the previous match ended exactly 3 digits back (that is the \G followed by \d{3}), and looks ahead to see if there's a digit. It's a sort of chain reaction triggered by the previous match.
The first subpattern looks behind for the ^ anchor, followed by an optional minus sign and between 1 and 3 digits. The rest of the string from that point must consist of one or more triplets of digits that are not followed by another digit (so what comes next is either $ or \.).
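To make the chain reaction visible, here is a small sketch that lists where the zero-length matches land instead of replacing them. The class name CommaPositions and the helper positions are made up for illustration; the pattern is the same two alternatives as in the solution, written out without String.format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class CommaPositions {
    // Same two alternatives as the solution: "rest" first, then "first".
    static final Pattern COMMA = Pattern.compile(
        "(?:(?<=\\G\\d{3})(?=\\d))|(?:(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d)))");

    // Returns the indices at which a zero-length match (a comma slot) is found.
    static List<Integer> positions(String s) {
        List<Integer> found = new ArrayList<>();
        Matcher m = COMMA.matcher(s);
        while (m.find()) {
            found.add(m.start()); // start() == end() for a zero-length match
        }
        return found;
    }

    public static void main(String[] args) {
        // Index 2 comes from "first" (after "-1"); index 5 comes from "rest",
        // three digits after the previous match. The chain stops at the '.'.
        System.out.println(positions("-1234567.89")); // prints [2, 5]
        // Only three digits before the '.', so no comma slot at all.
        System.out.println(positions("123.456"));     // prints []
    }
}
```

The Matcher.find() loop walks the string the same way replaceAll does, so each reported index is exactly where a comma would be inserted.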
My questions for this part are:
- Can this regex be simplified?
- Can it be optimized further?
  - Ordering rest before first is deliberate, since first is only needed once
  - No capturing group
Unit testing part
As I've mentioned, I'm the author of this problem, so I'm also the one responsible for coming up with test cases for it. Here they are:
INPUT, OUTPUT
"1000", "1,000"
"-12345", "-12,345"
"-1234567890.1234567890", "-1,234,567,890.1234567890"
"123.456", "123.456"
".666666", ".666666"
"0", "0"
"123456789", "123,456,789"
"1234.5678", "1,234.5678"
"-55555.55555", "-55,555.55555"
"0.123456789", "0.123456789"
"123456.789", "123,456.789"
I haven't had much experience with industrial-strength unit testing, so I'm wondering if others can comment on whether this is good coverage, whether I've missed anything important, etc. (I can always add more tests if there's a scenario I've missed.)
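For reference, the INPUT/OUTPUT table above could be run as a table-driven check along these lines. This is a minimal sketch in plain Java with no test framework; the class name SeparateThousandsCheck and the helper failures are hypothetical, and a real project would more likely express the same table as a parameterized test in a framework such as JUnit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness, for illustration only.
class SeparateThousandsCheck {
    // The solution from the question, unchanged.
    static String separateThousands(String s) {
        return s.replaceAll(
            String.format("(?:%s)|(?:%s)",
                "(?<=\\G\\d{3})(?=\\d)",
                "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))"
            ),
            ","
        );
    }

    // Each row is {input, expected output}, copied from the table above.
    static final String[][] CASES = {
        {"1000", "1,000"},
        {"-12345", "-12,345"},
        {"-1234567890.1234567890", "-1,234,567,890.1234567890"},
        {"123.456", "123.456"},
        {".666666", ".666666"},
        {"0", "0"},
        {"123456789", "123,456,789"},
        {"1234.5678", "1,234.5678"},
        {"-55555.55555", "-55,555.55555"},
        {"0.123456789", "0.123456789"},
        {"123456.789", "123,456.789"},
    };

    // Runs every row and returns one message per mismatch (empty if all pass).
    static List<String> failures() {
        List<String> failed = new ArrayList<>();
        for (String[] c : CASES) {
            String actual = separateThousands(c[0]);
            if (!actual.equals(c[1])) {
                failed.add(c[0] + ": expected " + c[1] + " but got " + actual);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = failures();
        for (String message : failed) {
            System.out.println("FAIL " + message);
        }
        System.out.println((CASES.length - failed.size()) + " of " + CASES.length + " cases passed");
    }
}
```

Keeping the cases in a single array means a new scenario is one extra row, and collecting every mismatch rather than stopping at the first makes it easier to see whether a regex change breaks one case or a whole class of them.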
© Stack Overflow or respective owner