Strange (Undefined?) Behavior of Free in C
- by Chris Cirefice
This is really strange... and I can't debug it (tried for about two hours, debugger starts going haywire after a while...). Anyway, I'm trying to do something really simple:
Free an array of strings. The array is in the form:
char **myStrings. The array elements are initialized as:
myString[index] = malloc(strlen(word));
myString[index] = word;
and I'm calling a function like this:
free_memory(myStrings, size); where size is the length of the array (I know this is not the problem, I tested it extensively and everything except this function is working).
free_memory looks like this:
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
free(list[i]);
}
free(list);
}
Now here comes the weird part. if (size> strlen(list[i])) then the program crashes. For example, imagine that I have a list of strings that looks something like this:
myStrings[0] = "Some";
myStrings[1] = "random";
myStrings[2] = "strings";
And thus the length of this array is 3.
If I pass this to my free_memory function, strlen(myStrings[0]) > 3 (4 3), and the program crashes.
However, if I change myStrings[0] to be "So" instead, then strlen(myStrings[0]) < 3 (2 < 3) and the program does not crash.
So it seems to me that free(list[i]) is actually going through the char[] that is at that location and trying to free each character, which I imagine is undefined behavior.
The only reason I say this is because I can play around with the size of the first element of myStrings and make the program crash whenever I feel like it, so I'm assuming that this is the problem area.
Note: I did try to debug this by stepping through the function that calls free_memory, noting any weird values and such, but the moment I step into the free_memory function, the debugger crashes, so I'm not really sure what is going on. Nothing is out of the ordinary until I enter the function, then the world explodes.
Another note: I also posted the shortened version of the source for this program (not too long; Pastebin) here. I am compiling on MinGW with the c99 flag on.
PS - I just thought of this. I am indeed passing numUniqueWords to the free function, and I know that this does not actually free the entire piece of memory that I allocated. I've called it both ways, that's not the issue. And I left it how I did because that is the way that I will be calling it after I get it to work in the first place, I need to revise some of my logic in that function.
Source, as per request (on-site):
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
#include "words.h"
int getNumUniqueWords(char text[], int size);
int main(int argc, char* argv[]) {
setvbuf(stdout, NULL, 4, _IONBF); // For Eclipse... stupid bug. --> does NOT affect the program, just the output to console!
int nbr_words;
char text[] = "Some - \"text, a stdin\". We'll have! also repeat? We'll also have a repeat!";
int length = sizeof(text);
nbr_words = getNumUniqueWords(text, length);
return 0;
}
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
// You can see that printing the values is fine, as long as free is not called.
// When free is called, the program will crash if (size > strlen(list[i]))
//printf("Wanna free value %d w/len of %d: %s\n", i, strlen(list[i]), list[i]);
free(list[i]);
}
free(list);
}
int getNumUniqueWords(char text[], int length) {
int numTotalWords = 0;
char *word;
printf("Length: %d characters\n", length);
char totalWords[length];
strcpy(totalWords, text);
word = strtok(totalWords, " ,.-!?()\"0123456789");
while (word != NULL) {
numTotalWords ++;
printf("%s\n", word);
word = strtok(NULL, " ,.-!?()\"0123456789");
}
printf("Looks like we counted %d total words\n\n", numTotalWords);
char *uniqueWords[numTotalWords];
char *tempWord;
int wordAlreadyExists = 0;
int numUniqueWords = 0;
char totalWordsCopy[length];
strcpy(totalWordsCopy, text);
for (int i = 0; i < numTotalWords; i++) {
uniqueWords[i] = NULL;
}
// Tokenize until all the text is consumed.
word = strtok(totalWordsCopy, " ,.-!?()\"0123456789");
while (word != NULL) {
// Look through the word list for the current token.
for (int j = 0; j < numTotalWords; j ++) {
// Just for clarity, no real meaning.
tempWord = uniqueWords[j];
// The word list is either empty or the current token is not in the list.
if (tempWord == NULL) {
break;
}
//printf("Comparing (%s) with (%s)\n", tempWord, word);
// If the current token is the same as the current element in the word list, mark and break
if (strcmp(tempWord, word) == 0) {
printf("\nDuplicate: (%s)\n\n", word);
wordAlreadyExists = 1;
break;
}
}
// Word does not exist, add it to the array.
if (!wordAlreadyExists) {
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
numUniqueWords ++;
printf("Unique: %s\n", word);
}
// Reset flags and continue.
wordAlreadyExists = 0;
word = strtok(NULL, " ,.-!?()\"0123456789");
}
// Print out the array just for funsies - make sure it's working properly.
for (int x = 0; x <numUniqueWords; x++) {
printf("Unique list %d: %s\n", x, uniqueWords[x]);
}
printf("\nNumber of unique words: %d\n\n", numUniqueWords);
// Right below is where things start to suck.
free_memory(uniqueWords, numUniqueWords);
return numUniqueWords;
}