Search Results

Search found 5 results on 1 pages for 'nedmalloc'.

Page 1/1 | 1 

  • NedMalloc / DlMalloc experiences

    - by Suma
    I am currently evaluating a few of scalable memory allocators, namely nedmalloc and ptmalloc (both built on top of dlmalloc), as a replacement for default malloc / new because of significant contention seen in multithreaded environment. Their published performance seems to be good, however I would like to check what are experiences of other people who have really used them. Were your performance goals satisfied? Did you experience any unexpected or hard to solve issues (like heap corruption)? If you have tried both ptmaalloc and nedmalloc, which of the two would you recommend? Why (ease of use, performance)?

    Read the article

  • nedmalloc: where does mem<fm come from?

    - by Suma
    While implementing nedmalloc into my application, I am frequently hitting a situation when nedmalloc refuses to free a block of memory, claiming it did not allocate it. While debugging I have come up to the point I see a particular condition which is failing, all other (including magic numbers) succeed. The condition is this: if((size_t)mem-(size_t)fm>=(size_t)1<<(SIZE_T_BITSIZE-1)) return 0; On Win32 this seems to be equivalent to: if((int)((size_t)mem-(size_t)fm)<0) return 0; Which seems to be the same as: if((size_t)mem<(size_t)fm) return 0; In my case I really see mem < fm. What I do not understand now is, where does this condition come from. I cannot find anything which would guarantee the fm <= m anywhere in code. Yet, "select isn't broken": I doubt it would really be a bug in nedmalloc, most likely I am doing something wrong somewhere, but I cannot find it. Once I turn debugging features of nedmalloc on, the problem goes away. If someone here understands inner working of nedmalloc, could you please explain to me why is fm <= m?

    Read the article

  • Boost::Mutex & Malloc

    - by M. Tibbits
    Hi all, I'm trying to use a faster memory allocator in C++. I can't use Hoard due to licensing / cost. I was using NEDMalloc in a single threaded setting and got excellent performance, but I'm wondering if I should switch to something else -- as I understand things, NEDMalloc is just a replacement for C-based malloc() & free(), not the C++-based new & delete operators (which I use extensively). The problem is that I now need to be thread-safe, so I'm trying to malloc an object which is reference counted (to prevent excess copying), but which also contains a mutex pointer. That way, if you're about to delete the last copy, you first need to lock the pointer, then free the object, and lastly unlock & free the mutex. However, using malloc to create a boost::mutex appears impossible because I can't initialize the private object as calling the constructor directly ist verboten. So I'm left with this odd situation, where I'm using new to allocate the lock and nedmalloc to allocate everything else. But when I allocate a large amount of memory, I run into allocation errors (which disappear when I switch to malloc instead of nedmalloc ~ but the performance is terrible). My guess is that this is due to fragmentation in the memory and an inability of nedmalloc and new to place nice side by side. There has to be a better solution. What would you suggest?

    Read the article

  • Random Complete System Unresponsiveness Running Mathematical Functions

    - by Computer Guru
    I have a program that loads a file (anywhere from 10MB to 5GB) a chunk at a time (ReadFile), and for each chunk performs a set of mathematical operations (basically calculates the hash). After calculating the hash, it stores info about the chunk in an STL map (basically <chunkID, hash>) and then writes the chunk itself to another file (WriteFile). That's all it does. This program will cause certain PCs to choke and die. The mouse begins to stutter, the task manager takes 2 min to show, ctrl+alt+del is unresponsive, running programs are slow.... the works. I've done literally everything I can think of to optimize the program, and have triple-checked all objects. What I've done: Tried different (less intensive) hashing algorithms. Switched all allocations to nedmalloc instead of the default new operator Switched from stl::map to unordered_set, found the performance to still be abysmal, so I switched again to Google's dense_hash_map. Converted all objects to store pointers to objects instead of the objects themselves. Caching all Read and Write operations. Instead of reading a 16k chunk of the file and performing the math on it, I read 4MB into a buffer and read 16k chunks from there instead. Same for all write operations - they are coalesced into 4MB blocks before being written to disk. Run extensive profiling with Visual Studio 2010, AMD Code Analyst, and perfmon. Set the thread priority to THREAD_MODE_BACKGROUND_BEGIN Set the thread priority to THREAD_PRIORITY_IDLE Added a Sleep(100) call after every loop. Even after all this, the application still results in a system-wide hang on certain machines under certain circumstances. Perfmon and Process Explorer show minimal CPU usage (with the sleep), no constant reads/writes from disk, few hard pagefaults (and only ~30k pagefaults in the lifetime of the application on a 5GB input file), little virtual memory (never more than 150MB), no leaked handles, no memory leaks. The machines I've tested it on run Windows XP - Windows 7, x86 and x64 versions included. None have less than 2GB RAM, though the problem is always exacerbated under lower memory conditions. I'm at a loss as to what to do next. I don't know what's causing it - I'm torn between CPU or Memory as the culprit. CPU because without the sleep and under different thread priorities the system performances changes noticeably. Memory because there's a huge difference in how often the issue occurs when using unordered_set vs Google's dense_hash_map. What's really weird? Obviously, the NT kernel design is supposed to prevent this sort of behavior from ever occurring (a user-mode application driving the system to this sort of extreme poor performance!?)..... but when I compile the code and run it on OS X or Linux (it's fairly standard C++ throughout) it performs excellently even on poor machines with little RAM and weaker CPUs. What am I supposed to do next? How do I know what the hell it is that Windows is doing behind the scenes that's killing system performance, when all the indicators are that the application itself isn't doing anything extreme? Any advice would be most welcome.

    Read the article

1