Search Results

Search found 1188 results on 48 pages for 'heap fragmentation'.

Page 10/48 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

Is there a fundamental difference between malloc and HeapAlloc (aside from the portability)?

- by Lambert

Hi, I'm having code that, for various reasons, I'm trying to port from the C runtime to one that uses the Windows Heap API. I've encountered a problem: If I redirect the malloc/calloc/realloc/free calls to HeapAlloc/HeapReAlloc/HeapFree (with GetProcessHeap for the handle), the memory seems to be allocated correctly (no bad pointer returned, and no exceptions thrown), but the library I'm porting says "failed to allocate memory" for some reason. I've tried this both with the Microsoft CRT (which uses the Heap API underneath) and with another company's run-time library (which uses the Global Memory API underneath); the malloc for both of those works well with the library, but for some reason, using the Heap API directly doesn't work. I've checked that the allocations aren't too big (= 0x7FFF8 bytes), and they're not. The only problem I can think of is memory alignment; is that the case? Or other than that, is there a fundamental difference between the Heap API and the CRT memory API that I'm not aware of? If so, what is it? And if not, then why does the static Microsoft CRT (included with Visual Studio) take some extra steps in malloc/calloc before calling HeapAlloc? I'm suspecting there's a difference but I can't think of what it might be. Thank you!

Read the article
Optimizing sorting container of objects with heap-allocated buffers - how to avoid hard-copying buff

- by Kache4

I was making sure I knew how to do the op= and copy constructor correctly in order to sort() properly, so I wrote up a test case. After getting it to work, I realized that the op= was hard-copying all the data_. I figure if I wanted to sort a container with this structure (its elements have heap allocated char buffer arrays), it'd be faster to just swap the pointers around. Is there a way to do that? Would I have to write my own sort/swap function? #include <deque> //#include <string> //#include <utility> //#include <cstdlib> #include <cstring> #include <iostream> //#include <algorithm> // I use sort(), so why does this still compile when commented out? #include <boost/filesystem.hpp> #include <boost/foreach.hpp> using namespace std; namespace fs = boost::filesystem; class Page { public: // constructor Page(const char* path, const char* data, int size) : path_(fs::path(path)), size_(size), data_(new char[size]) { // cout << "Creating Page..." << endl; strncpy(data_, data, size); // cout << "done creating Page..." << endl; } // copy constructor Page(const Page& other) : path_(fs::path(other.path())), size_(other.size()), data_(new char[other.size()]) { // cout << "Copying Page..." << endl; strncpy(data_, other.data(), size_); // cout << "done copying Page..." << endl; } // destructor ~Page() { delete[] data_; } // accessors const fs::path& path() const { return path_; } const char* data() const { return data_; } int size() const { return size_; } // operators Page& operator = (const Page& other) { if (this == &other) return *this; char* newImage = new char[other.size()]; strncpy(newImage, other.data(), other.size()); delete[] data_; data_ = newImage; path_ = fs::path(other.path()); size_ = other.size(); return *this; } bool operator < (const Page& other) const { return path_ < other.path(); } private: fs::path path_; int size_; char* data_; }; class Book { public: Book(const char* path) : path_(fs::path(path)) { cout << "Creating Book..." << endl; cout << "pushing back #1" << endl; pages_.push_back(Page("image1.jpg", "firstImageData", 14)); cout << "pushing back #3" << endl; pages_.push_back(Page("image3.jpg", "thirdImageData", 14)); cout << "pushing back #2" << endl; pages_.push_back(Page("image2.jpg", "secondImageData", 15)); cout << "testing operator <" << endl; cout << pages_[0].path().string() << (pages_[0] < pages_[1]? " < " : " > ") << pages_[1].path().string() << endl; cout << pages_[1].path().string() << (pages_[1] < pages_[2]? " < " : " > ") << pages_[2].path().string() << endl; cout << pages_[0].path().string() << (pages_[0] < pages_[2]? " < " : " > ") << pages_[2].path().string() << endl; cout << "sorting" << endl; BOOST_FOREACH (Page p, pages_) cout << p.path().string() << endl; sort(pages_.begin(), pages_.end()); cout << "done sorting\n"; BOOST_FOREACH (Page p, pages_) cout << p.path().string() << endl; cout << "checking datas" << endl; BOOST_FOREACH (Page p, pages_) { char data[p.size() + 1]; strncpy((char*)&data, p.data(), p.size()); data[p.size()] = '\0'; cout << p.path().string() << " " << data << endl; } cout << "done Creating Book" << endl; } private: deque<Page> pages_; fs::path path_; }; int main() { Book* book = new Book("/some/path/"); }

Read the article
Valgrind says "stack allocation," I say "heap allocation"

- by Joel J. Adamson

Dear Friends, I am trying to trace a segfault with valgrind. I get the following message from valgrind: ==3683== Conditional jump or move depends on uninitialised value(s) ==3683== at 0x4C277C5: sparse_mat_mat_kron (sparse.c:165) ==3683== by 0x4C2706E: rec_mating (rec.c:176) ==3683== by 0x401C1C: age_dep_iterate (age_dep.c:287) ==3683== by 0x4014CB: main (age_dep.c:92) ==3683== Uninitialised value was created by a stack allocation ==3683== at 0x401848: age_dep_init_params (age_dep.c:131) ==3683== ==3683== Conditional jump or move depends on uninitialised value(s) ==3683== at 0x4C277C7: sparse_mat_mat_kron (sparse.c:165) ==3683== by 0x4C2706E: rec_mating (rec.c:176) ==3683== by 0x401C1C: age_dep_iterate (age_dep.c:287) ==3683== by 0x4014CB: main (age_dep.c:92) ==3683== Uninitialised value was created by a stack allocation ==3683== at 0x401848: age_dep_init_params (age_dep.c:131) However, here's the offending line: /* allocate mating table */ age_dep_data->mtable = malloc (age_dep_data->geno * sizeof (double *)); if (age_dep_data->mtable == NULL) error (ENOMEM, ENOMEM, nullmsg, __LINE__); for (int j = 0; j < age_dep_data->geno; j++) { 131=> age_dep_data->mtable[j] = calloc (age_dep_data->geno, sizeof (double)); if (age_dep_data->mtable[j] == NULL) error (ENOMEM, ENOMEM, nullmsg, __LINE__); } What gives? I thought any call to malloc or calloc allocated heap space; there is no other variable allocated here, right? Is it possible there's another allocation going on (the offending stack allocation) that I'm not seeing? You asked to see the code, here goes: /* Copyright 2010 Joel J. Adamson <[email protected]> $Id: age_dep.c 1010 2010-04-21 19:19:16Z joel $ age_dep.c:main file Joel J. Adamson -- http://www.unc.edu/~adamsonj Servedio Lab University of North Carolina at Chapel Hill CB #3280, Coker Hall Chapel Hill, NC 27599-3280 This file is part of an investigation of age-dependent sexual selection. This code is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with haploid. If not, see <http://www.gnu.org/licenses/>. */ #include "age_dep.h" /* global variables */ extern struct argp age_dep_argp; /* global error message variables */ char * nullmsg = "Null pointer: %i"; /* error message for conversions: */ char * errmsg = "Representation error: %s"; /* precision for formatted output: */ const char prec[] = "%-#9.8f "; const size_t age_max = AGEMAX; /* maximum age of males */ static int keep_going_p = 1; int main (int argc, char ** argv) { /* often used counters: */ int i, j; /* read the command line */ struct age_dep_args age_dep_args = { NULL, NULL, NULL }; argp_parse (&age_dep_argp, argc, argv, 0, 0, &age_dep_args); /* set the parameters here: */ /* initialize an age_dep_params structure, set the members */ age_dep_params_t * params = malloc (sizeof (age_dep_params_t)); if (params == NULL) error (ENOMEM, ENOMEM, nullmsg, __LINE__); age_dep_init_params (params, &age_dep_args); /* initialize frequencies: this initializes a list of pointers to initial frqeuencies, terminated by a NULL pointer*/ params->freqs = age_dep_init (&age_dep_args); params->by = 0.0; /* what range of parameters do we want, and with what stepsize? */ /* we should go from 0 to half-of-theta with a step size of about 0.01 */ double from = 0.0; double to = params->theta / 2.0; double stepsz = 0.01; /* did you think I would spell the whole word? */ unsigned int numparts = floor(to / stepsz); do { #pragma omp parallel for private(i) firstprivate(params) \ shared(stepsz, numparts) for (i = 0; i < numparts; i++) { params->by = i * stepsz; int tries = 0; while (keep_going_p) { /* each time through, modify mfreqs and mating table, then go again */ keep_going_p = age_dep_iterate (params, ++tries); if (keep_going_p == ERANGE) error (ERANGE, ERANGE, "Failure to converge\n"); } fprintf (stdout, "%i iterations\n", tries); } /* for i < numparts */ params->freqs = params->freqs->next; } while (params->freqs->next != NULL); return 0; } inline double age_dep_pmate (double age_dep_t, unsigned int genot, double bp, double ba) { /* the probability of mating between these phenotypes */ /* the female preference depends on whether the female has the preference allele, the strength of preference (parameter bp) and the male phenotype (age_dep_t); if the female lacks the preference allele, then this will return 0, which is not quite accurate; it should return 1 */ return bits_isset (genot, CLOCI)? 1.0 - exp (-bp * age_dep_t) + ba: 1.0; } inline double age_dep_trait (int age, unsigned int genot, double by) { /* return the male trait, a function of the trait locus, age, the age-dependent scaling parameter (bx) and the males condition genotype */ double C; double T; /* get the male's condition genotype */ C = (double) bits_popcount (bits_extract (0, CLOCI, genot)); /* get his trait genotype */ T = bits_isset (genot, CLOCI + 1)? 1.0: 0.0; /* return the trait value */ return T * by * exp (age * C); } int age_dep_iterate (age_dep_params_t * data, unsigned int tries) { /* main driver routine */ /* number of bytes for female frequencies */ size_t geno = data->age_dep_data->geno; size_t genosize = geno * sizeof (double); /* female frequencies are equal to male frequencies at birth (before selection) */ double ffreqs[geno]; if (ffreqs == NULL) error (ENOMEM, ENOMEM, nullmsg, __LINE__); /* do not set! Use memcpy (we need to alter male frequencies (selection) without altering female frequencies) */ memmove (ffreqs, data->freqs->freqs[0], genosize); /* for (int i = 0; i < geno; i++) */ /* ffreqs[i] = data->freqs->freqs[0][i]; */ #ifdef PRMTABLE age_dep_pr_mfreqs (data); #endif /* PRMTABLE */ /* natural selection: */ age_dep_ns (data); /* normalized mating table with new frequencies */ age_dep_norm_mtable (ffreqs, data); #ifdef PRMTABLE age_dep_pr_mtable (data); #endif /* PRMTABLE */ double * newfreqs; /* mutate here */ /* i.e. get the new frequency of 0-year-olds using recombination; */ newfreqs = rec_mating (data->age_dep_data); /* return block */ { if (sim_stop_ck (data->freqs->freqs[0], newfreqs, GENO, TOL) == 0) { /* if we have converged, stop the iterations and handle the data */ age_dep_sim_out (data, stdout); return 0; } else if (tries > MAXTRIES) return ERANGE; else { /* advance generations */ for (int j = age_max - 1; j < 0; j--) memmove (data->freqs->freqs[j], data->freqs->freqs[j-1], genosize); /* advance the first age-class */ memmove (data->freqs->freqs[0], newfreqs, genosize); return 1; } } } void age_dep_ns (age_dep_params_t * data) { /* calculate the new frequency of genotypes given additive fitness and selection coefficient s */ size_t geno = data->age_dep_data->geno; double w[geno]; double wbar, dtheta, ttheta, dcond, tcond; double t, cond; /* fitness parameters */ double mu, nu; mu = data->wparams[0]; nu = data->wparams[1]; /* calculate fitness */ for (int j = 0; j < age_max; j++) { int i; for (i = 0; i < geno; i++) { /* calculate male trait: */ t = age_dep_trait(j, i, data->by); /* calculate condition: */ cond = (double) bits_popcount (bits_extract(0, CLOCI, i)); /* trait-based fitness term */ dtheta = data->theta - t; ttheta = (dtheta * dtheta) / (2.0 * nu * nu); /* condition-based fitness term */ dcond = CLOCI - cond; tcond = (dcond * dcond) / (2.0 * mu * mu); /* calculate male fitness */ w[i] = 1 + exp(-tcond) - exp(-ttheta); } /* calculate mean fitness */ /* as long as we calculate wbar before altering any values of freqs[], we're safe */ wbar = gen_mean (data->freqs->freqs[j], w, geno); for (i = 0; i < geno; i++) data->freqs->freqs[j][i] = (data->freqs->freqs[j][i] * w[i]) / wbar; } } void age_dep_norm_mtable (double * ffreqs, age_dep_params_t * params) { /* this function produces a single mating table that forms the input for recombination () */ /* i is female genotype; j is male genotype; k is male age */ int i,j,k; double norm_denom; double trait; size_t geno = params->age_dep_data->geno; for (i = 0; i < geno; i++) { double norm_mtable[geno]; /* initialize the denominator: */ norm_denom = 0.0; /* find the probability of mating and add it to the denominator */ for (j = 0; j < geno; j++) { /* initialize entry: */ norm_mtable[j] = 0.0; for (k = 0; k < age_max; k++) { trait = age_dep_trait (k, j, params->by); norm_mtable[j] += age_dep_pmate (trait, i, params->bp, params->ba) * (params->freqs->freqs)[k][j]; } norm_denom += norm_mtable[j]; } /* now calculate entry (i,j) */ for (j = 0; j < geno; j++) params->age_dep_data->mtable[i][j] = (ffreqs[i] * norm_mtable[j]) / norm_denom; } } My current suspicion is the array newfreqs: I can't memmove, memcpy or assign a stack variable then hope it will persist, can I? rec_mating() returns double *.

Read the article
OutOfMemoryError what to increase and how?

- by Pentium10

I have a really long collection with 10k items, and when running a toString() on the object it crashes. I need to use this output somehow. 05-21 12:59:44.586: ERROR/dalvikvm-heap(6415): Out of memory on a 847610-byte allocation. 05-21 12:59:44.636: ERROR/dalvikvm(6415): Out of memory: Heap Size=15559KB, Allocated=12932KB, Bitmap Size=613KB 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): Uncaught handler: thread main exiting due to uncaught exception 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): java.lang.OutOfMemoryError 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): at java.lang.AbstractStringBuilder.enlargeBuffer(AbstractStringBuilder.java:97) 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): at java.lang.AbstractStringBuilder.append0(AbstractStringBuilder.java:155) 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): at java.lang.StringBuilder.append(StringBuilder.java:202) 05-21 12:59:44.636: ERROR/AndroidRuntime(6415): at java.util.AbstractCollection.toString(AbstractCollection.java:384) I need step by step guide how to increase the heap for and Android application. I don't run the command line.

Read the article
How to analyse Dalvik GC behaviour?

- by HRJ

I am developing an application on Android. It is a long running application that continuously processes sensor data. While running the application I see a lot of GC messages in the logcat; about one every second. This is most probably because of objects being created and immediately de-referenced in a loop. How do I find which objects are being created and released immediately? All the java heap analysis tools that I have tried(*) are bothered with the counts and sizes of objects on the heap. While they are useful, I am more interested in finding out the site where temporary short-lived objects get created the most. (*) I tried jcat and Eclipse MAT. I couldn't get hat to work on the Android heap-dumps; it complained of an unsupported dump file version.

Read the article
Does Windows XP automatically reassemble UDP fragments?

- by Matt Davis

I've got a Windows application that receives and processes XML messages transmitted via UDP. The application collects the data using Windows "raw" sockets, so the entire layer 3 packet is visible. We've recently run across a problem that has me stumped. If the XML messages (i.e., UDP packets) are large (i.e., 1500 bytes), they get fragmented as expected. Ordinarily, this will cause the XML processor to fail because it attempts to process each UDP packet as if it is a complete XML message. This is a known short-coming in the system at this stage of its development. On Windows 7, this is exactly what happens. The fragments are received and logged, but no processing occurs. On Windows XP, however, the same fragments are seen, and the XML processor seems to handle everything just fine. Does Windows XP automatically reassemble UDP fragments? I guess I could expect this for a normal UDP socket, but it's not expected behavior for a "raw" socket, IMO. Further, if this is the case on Windows XP, why isn't the behavior the same on Windows 7? Is there a way to enable this?

Read the article
Memorystream and Large Object Heap

- by Flo

I have to transfer large files between computers on via unreliable connections using WCF. Because I want to be able to resume the file and I don't want to be limited in my filesize by WCF, I am chunking the files into 1MB pieces. These "chunk" are transported as stream. Which works quite nice, so far. My steps are: open filestream read chunk from file into byet[] and create memorystream transfer chunk back to 2. until the whole file is sent My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an outofmemory exception. I could not actually create this error, maybe I am wrong in my assumption. Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the max allowed array size and/or the size of my chunk, but I hope there is another solution. My actual question(s): Will my current solution create objects on the LOH and will that cause me problem? Is there a better way to solve this? Btw.: On the receiving side I simple read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays involved.

Read the article
Unusual heap size limitations in VS2003 C++

- by Shane MacLaughlin

I have a C++ app that uses large arrays of data, and have noticed while testing that it is running out of memory, while there is still plenty of memory available. I have reduced the code to a sample test case as follows; void MemTest() { size_t Size = 500*1024*1024; // 512mb if (Size > _HEAP_MAXREQ) TRACE("Invalid Size"); void * mem = malloc(Size); if (mem == NULL) TRACE("allocation failed"); } If I create a new MFC project, include this function, and run it from InitInstance, it works fine in debug mode (memory allocated as expected), yet fails in release mode (malloc returns NULL). Single stepping through release into the C run times, my function gets inlined I get the following // malloc.c void * __cdecl _malloc_base (size_t size) { void *res = _nh_malloc_base(size, _newmode); RTCCALLBACK(_RTC_Allocate_hook, (res, size, 0)); return res; } Calling _nh_malloc_base void * __cdecl _nh_malloc_base (size_t size, int nhFlag) { void * pvReturn; // validate size if (size > _HEAP_MAXREQ) return NULL; ' ' And (size _HEAP_MAXREQ) returns true and hence my memory doesn't get allocated. Putting a watch on size comes back with the exptected 512MB, which suggests the program is linking into a different run-time library with a much smaller _HEAP_MAXREQ. Grepping the VC++ folders for _HEAP_MAXREQ shows the expected 0xFFFFFFE0, so I can't figure out what is happening here. Anyone know of any CRT changes or versions that would cause this problem, or am I missing something way more obvious?

Read the article
java.lang.OutOfMemoryError: Java heap space

- by houlahan

i get this error when calling a mysql Prepared Statement every 30 seconds this is the code which is been called: public static int getUserConnectedatId(Connection conn, int i) throws SQLException { pstmt = conn.prepareStatement("SELECT UserId from connection where ConnectionId ='" + i + "'"); ResultSet rs = pstmt.executeQuery(); int id = -1; if (rs.next()) { id = rs.getInt(1); } pstmt = null; rs = null; return id; } not sure what the problem is :s thanks in advanced.

Read the article
Jelly Bean équipe un terminal Android sur deux, vers la fin de la fragmentation de l'OS ?

Jelly Bean équipe un terminal Android sur deux vers la fin de la fragmentation de l'OS ?Comme il est de coutume, Google vient de publier son baromètre mensuel à destination des développeurs Android.Jelly Bean, les versions cumulées les plus récentes du système d'exploitation mobile, continue à s'accaparer des parts au détriment des versions les plus anciennes (Froyo, Gingerbread).Concrètement, Jelly Bean (Android 4.1.x, 4.2.x et 4.3) équipe désormais plus de la moitié des terminaux Android, avec...

Read the article
No Significant Fragmentation? Look Closer…

If you are relying on using 'best-practice' percentage-based thresholds when you are creating an index maintenance plan for a SQL Server that checks the fragmentation in your pages, you may miss occasional 'edge' conditions on larger tables that will cause severe degradation in performance. It is worth being aware of patterns of data access in particular tables when judging the best threshold figure to use.

Read the article
No Significant Fragmentation? Look Closer…

If you are relying on using 'best-practice' percentage-based thresholds when you are creating an index maintenance plan for a SQL Server that checks the fragmentation in your pages, you may miss occasional 'edge' conditions on larger tables that will cause severe degradation in performance. It is worth being aware of patterns of data access in particular tables when judging the best threshold figure to use.

Read the article
strategy to allocate/free lots of small objects

- by aaa

hello I am toying with certain caching algorithm, which is challenging somewhat. Basically, it needs to allocate lots of small objects (double arrays, < 256 elements), with objects accessible through mapped value, map[key] = array. time to initialized array may be quite large, generally more than 10 thousand cpu cycles. By lots I mean around gigabyte in total. objects may need to be popped/pushed as needed, generally in random places, one object at a time. lifetime of an object is generally long, minutes or more, however, object may be subject to allocation/deallocation several times during duration of program. What would be good strategy to avoid memory fragmentation, while still maintaining reasonable allocate deallocate speed? I am using C++, so I can use new and malloc. Thanks. I know there a similar questions on website, http://stackoverflow.com/questions/2156745/efficiently-allocating-many-short-lived-small-objects, are somewhat different, thread safety is not immediate issue for me.

Read the article
6 important .NET concepts: - Stack, heap, Value types, reference types, boxing and Unboxing.

6 important .NET concepts: - Stack, heap, Value types, reference types, boxing and Unboxing.

Read the article
Can I switch the Visual C++ runtime to another heap?

- by sharptooth

My program uses a third party dynamic link library that has huge memory leaks inside. Both my program and the library are Visual C++ native code. Both link to the Visual C++ runtime dynamically. I'd like to force the library into another heap so that all allocations that are done through the Visual C++ runtime while the library code is running are done on that heap. I can call HeapCreate() and later HeapDestroy(). If I somehow ensure that all allocations are done in the new heap I don't care of the leaks anymore - they all go when I destroy the second heap. Is it possible to force the Visual C++ runtime to make all allocations on a specified heap?

Read the article
Design approach, string table data, variables, stl memory usage

- by howieh

I have an old structure class like this: typedef vector<vector<string>> VARTYPE_T; which works as a single variable. This variable can hold from one value over a list to data like a table. Most values are long,double, string or double [3] for coordinates (x,y,z). I just convert them as needed. The variables are managed in a map like this : map<string,VARTYPE_T *> where the string holds the variable name. Sure, they are wrapped in classes. Also i have a tree of nodes, where each node can hold one of these variablemaps. Using VS 2008 SP1 for this, i detect a lot of memory fragmentation. Checking against the stlport, stlport seemed to be faster (20% ) and uses lesser memory (30%, for my test cases). So the question is: What is the best implementation to solve this requirement with fast an properly used memory ? Should i write an own allocator like a pool allocator. How would you do this ? Thanks in advance, Howie

Read the article
Benefits of "Don't Fragment" on TCP Packets?

- by taspeotis

One of our customers is having trouble submitting data from our application (on their PC) to a server (different geographical location). When sending packets under 1100 bytes everything works fine, but above this we see TCP retransmitting the packet every few seconds and getting no response. The packets we are using for testing are about 1400 bytes (but less than 1472). I can send an ICMP ping to www.google.com that is 1472 bytes and get a response (so it's not their router/first few hops). I found that our application sets the DF flag for these packets, and I believe a router along the way to the server has an MTU less than/equal to 1100 and dropping the packet. This affects 1 client in 5000, but since everybody's routes will be different this is expected. The data is a SOAP envelope and we expect a SOAP response back. I can't justify WHY we do it, the code to do this was written by a previous developer. So... Are there are benefits OR justification to setting the DF flag on TCP packets for application data? I can think of reasons it is needed for network diagnostics applications but not in our situation (we want the data to get to the endpoint, fragmented or not). One of our sysadmins said that it might have something to do with us using SSL, but as far as I know SSL is like a stream and regardless of fragmentation, as long as the stream is rebuilt at the end, there's no problem. If there's no good justification I will be changing the behaviour of our application. Thanks in advance.

Read the article
Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article
JVM memory initializazion error after windows update

- by Pier Luigi

Hi all, I have three Windows Server 2003 with 2 GB RAM. Server1 tomcat 5.5.25 jvm version SUN 1.6.0_11-b03 Server2 tomcat 5.5.25 jvm version SUN 1.6.0_14-b08 Server3 tomcat 6.0.18 jvm version SUN 1.6.0_14-b08 For the three servers JVM parameters are: -XX:MaxPermSize=256m -Dcatalina.base=C:\Apache Group\apache-tomcat-5.5.25 -Dcatalina.home=C:\Apache Group\apache-tomcat-5.5.25 -Djava.endorsed.dirs=C:\Apache Group\apache-tomcat-5.5.25\common\endorsed -Djava.io.tmpdir=C:\Apache Group\apache-tomcat-5.5.25\temp vfprintf -Xms512m -Xmx1024m For some months everithing worked fine. Last friday we installed some windows updates. After the reboot tomcat doesn't start anymore, with error: Error occurred during initialization of VM Could not reserve enough space for object heap We reduced the parameter -Xmx1024m to -Xmx768m and now tomcat starts. But we need greater max heap size What happened to our servers ? Thanks in advance.

Read the article
Stack memory in Android

- by Matt

I'm writing an app that has a foreground service, content provider, and a Activity front end that binds to the service and gets back a List of objects using AIDL. The service does work and updates a database. If I leave the activity open for 4-8+ hours, and go to the "Running Services" section under settings on the phone (Nexus One) an unusually large amount of memory being used is shown (~42MB). I figure there is a leak. When I check the heap memory i get Heap size:~18MB, ~2MB allocated, ~16MB free. Analyzing the hprof in Eclipse MAT seems fine, which leads me to theorize that memory is leaking on the stack. Is this even possible? If it is, what can I do to stop or investigate the leak? Is the reported memory usage on the "Running Services" section of android even correct (I assume it is)? Another note: I have been unable to reproduce this issue when the UI is not up (with only the service running)

Read the article
What Is Disk Fragmentation and Do I Still Need to Defragment?

- by Jason Fitzpatrick

Do modern computers still need the kind of routine defragmentation procedures that older computers called for? Read on to learn about fragmentation and what modern operating systems and file systems do to minimize performance impacts. Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-drive grouping of Q&A web sites. Secure Yourself by Using Two-Step Verification on These 16 Web Services How to Fix a Stuck Pixel on an LCD Monitor How to Factory Reset Your Android Phone or Tablet When It Won’t Boot

Read the article
Efficient heaps in purely functional languages

- by Kim

As an exercise in Haskell, I'm trying to implement heapsort. The heap is usually implemented as an array in imperative languages, but this would be hugely inefficient in purely functional languages. So I've looked at binary heaps, but everything I found so far describes them from an imperative viewpoint and the algorithms presented are hard to translate to a functional setting. How to efficiently implement a heap in a purely functional language such as Haskell? Edit: By efficient I mean it should still be in O(n*log n), but it doesn't have to beat a C program. Also, I'd like to use purely functional programming. What else would be the point of doing it in Haskell?

Read the article
Memory in Eclipse

- by user247866

I'm getting the java.lang.OutOfMemoryError exception in Eclipse. I know that Eclipse by default uses heap size of 256M. I'm trying to increase it but nothing happens. For example: eclipse -vmargs -Xmx16g -XX:PermSize=2g -XX:MaxPermSize=2g I also tried different settings, using only the -Xmx option, using different cases of g, G, m, M, different memory sizes, but nothing helps. Does not matter which params I specify, the heap exception is thrown at the same time, so I assume there's something I'm doing wrong that Eclipse ignores the -Xmx parameter. I'm using a 32GB RAM machine and trying to execute something very simple such as: double[][] a = new double[15000][15000]; It only works when I reduce the array size to something around 10000 on 10000. I'm working on Linux and using the top command I can see how much memory the Java process is consuming; it's less than 2%. Thanks!

Read the article
Why can't I reclaim my dynamically allocated memory using the "delete" keyword?

- by synaptik

I have the following class: class Patient { public: Patient(int x); ~Patient(); private: int* RP; }; Patient::Patient(int x) { RP = new int [x]; } Patient::~Patient() { delete [] RP; } I create an instance of this class on the stack as follows: void f() { Patient p(10); } Now, when f() returns, I get a "double free or corruption" error, which signals to me that something is attempted to be deleted more than once. But I don't understand why that would be so. The space for the array is created on the heap, and just because the function from inside which the space was allocated returns, I wouldn't expect the space to be reclaimed. I thought that if I allocate space on the heap (using the new keyword), then the only way to reclaim that space is to use the delete keyword. Help! :)

Read the article
GCC/XCode equivalent of _CrtCheckMemory?

- by Chris Becke

When dealing with random memory overwrites, in MSVC it is possible to validate the state of the heap at various points with a call to _CrtCheckMemory, and know with at least a small level of confidence that the code up until the check was not responsible for any errors that might cause new or malloc to fail later. In XCode, whats the equivalent way to try and box in a memory overwrite? All I have at the moment is a random failure of a call to new, somewhere deep in the bowels of some code with no real idea of how long the code has been running with a corrupt heap up until that point.

Read the article

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >