I have a problem where calling grep from Java gives different results than calling grep on the same file from the shell.
My grep command (run both from Java and from bash; I escaped the backslash in the Java string accordingly):
/bin/grep -vP --regexp='^[0-9]+\t.*' /usr/local/apache-tomcat-6.0.18/work/Catalina/localhost/saccitic/237482319867147879_1271411421
Java code:
String filepath = "/path/to/file";
String options = "P";
String grepparams = "^[0-9]+\\t.*";
String greppath = "/bin/";
String[] localeArray = new String[] {
    "LANG=",
    "LC_COLLATE=C",
    "LC_CTYPE=UTF-8",
    "LC_MESSAGES=C",
    "LC_MONETARY=C",
    "LC_NUMERIC=C",
    "LC_TIME=C",
    "LC_ALL="
};
options = "v" + options; // Assign optional params
if (options.contains("P")) {
    grepparams = "\'" + grepparams + "\'"; // Quote the regex expression if -P flag is used
} else {
    options = "E" + options; // Equivalent to calling egrep
}
proc = sysRuntime.exec(greppath + "/grep -" + options + " --regexp=" + grepparams + " " + filepath, localeArray);
System.out.println(greppath + "/grep -" + options + " --regexp=" + grepparams + " " + filepath);
inStream = proc.getInputStream();
The command is supposed to match and discard lines like this one:
85295371616 Hi Mr Lee, please be informed that...
My input file is this:
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
8~!95371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
852&^*&1616 Hi Mr Lee, please be informed that...
8529537Ax16 Hi Mr Lee, please be informed that...
85====ppq16 Hi Mr Lee, please be informed that...
85291234783 a3283784428349247233834728482984723333
85219299222
The command works when I call it from bash (results below):
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
8~!95371616 Hi Mr Lee, please be informed that...
852&^*&1616 Hi Mr Lee, please be informed that...
8529537Ax16 Hi Mr Lee, please be informed that...
85====ppq16 Hi Mr Lee, please be informed that...
85219299222
However, when I call the same grep command from Java, I get the entire file back (results below):
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85aaa234567 Hi Ms Chan, please be informed that...
85292vx5678 Hi Mrs Ng, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
8~!95371616 Hi Mr Lee, please be informed that...
85295371616 Hi Mr Lee, please be informed that...
852&^*&1616 Hi Mr Lee, please be informed that...
8529537Ax16 Hi Mr Lee, please be informed that...
85====ppq16 Hi Mr Lee, please be informed that...
85291234783 a3283784428349247233834728482984723333
85219299222
What could cause the grep called from Java to return incorrect results? I tried passing locale information via the environment string array in Runtime.exec, but nothing seems to change. Am I passing in the locale information incorrectly, or is the problem something else entirely?
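One thing that may be worth checking is how the single command string gets split into arguments. The JDK documents that Runtime.exec(String, String[]) breaks the command into tokens with a plain StringTokenizer, so no shell is involved and the single quotes are not stripped. The sketch below only reproduces that splitting and does not run grep. The class `ExecTokenDemo` and the methods `tokenize` and `explicitArgs` are hypothetical names made up for illustration, not part of the code above, and whether this is the actual cause here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ExecTokenDemo {

    // Splits on whitespace only, mirroring the documented behavior of
    // Runtime.exec(String): quote characters are NOT interpreted and
    // remain part of the resulting tokens.
    public static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Hypothetical alternative: build the argument list explicitly,
    // one element per argument, with no shell-style quoting at all.
    public static List<String> explicitArgs(String grepBinary, String options,
                                            String regex, String file) {
        List<String> args = new ArrayList<>();
        args.add(grepBinary);
        args.add("-" + options);
        args.add("--regexp=" + regex);
        args.add(file);
        return args;
    }

    public static void main(String[] args) {
        // Same shape as the command printed by the code in the question.
        String quoted = "/bin/grep -vP --regexp='^[0-9]+\\t.*' /path/to/file";
        // The third token still carries the literal single quotes.
        System.out.println(tokenize(quoted));
        // Here the pattern reaches grep without surrounding quotes.
        System.out.println(explicitArgs("/bin/grep", "vP", "^[0-9]+\\t.*", "/path/to/file"));
    }
}
```

If the quotes do turn out to be the issue, a list like the one from `explicitArgs` could be handed to `new ProcessBuilder(list)` or converted to an array for `Runtime.exec(String[], String[])`, both of which take the arguments as given without further splitting.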