oracle data mining - Page 491

Oracle 10g - Best way to escape single quotes

- by satynos

I have to generate some update statements based off a table in our database. I created the following script which is generating the update statements I need. But when I try to run those scripts I am getting errors pertaining to unescaped single quotes in the content and &B, &T characters which have special meaning in oracle. I took care of the &B and &T problem by setting SET DEFINE OFF. Whats the best way to escape single quotes within the content? DECLARE CURSOR C1 IS SELECT * FROM EMPLOYEES; BEGIN FOR I IN C1 LOOP DBMS_OUTPUT.PUT_LINE('UPDATE EMPLOYEES SET FIRST_NAME= ''' || I.FIRST_NAME|| ''', LAST_NAME = ''' || I.LAST_NAME ''', DOB = ''' || I.DOB|| ''' WHERE EMPLOYEE_ID = ''' || I.EMPLOYEE_ID || ''';'); END LOOP; END; Here if the first_name or last_name contains single quotes then the generated update statements break. Whats the best way to escape those single quotes within the first_name and last_name?

Read the article

Oracle global lock across process

- by Jimm

I would like to synchronize access to a particular insert. Hence, if multiple applications execute this "one" insert, the inserts should happen one at a time. The reason behind synchronization is that there should only be ONE instance of this entity. If multiple applications try to insert the same entity,only one should succeed and others should fail. One option considered was to create a composite unique key, that would uniquely identify the entity and rely on unique constraint. For some reasons, the dba department rejected this idea. Other option that came to my mind was to create a stored proc for the insert and if the stored proc can obtain a global lock, then multiple applications invoking the same stored proc, though in their seperate database sessions, it is expected that the stored proc can obtain a global lock and hence serialize the inserts. My question is it possible to for a stored proc in oracle version 10/11, to obtain such a lock and any pointers to documentation would be helpful.

Read the article

Cannot execute "LOAD DATA LOCAL INFILE" Mysql query in Rails after a connection reconnection

- by Ngan

On Rails 2.3.8 (but I think Rails 3 might have this issue as well, not sure): I get an error when trying to execute a LOAD DATA LOCAL INFILE query after reconnecting to a database. I have a process that parses a file that can potentially take a bit of time. During the parsing, Mysql closes the connection due to timeout. This is fine, I do a ActiveRecord::Base.verify_active_connections! and I get the connection back (I do this in several places through my app). However, running a LOAD DATA LOCAL INFILE statement, I get this error: Mysql::Error: The used command is not allowed with this MySQL version It's not a permission issue, I know that for sure. Check out my test in console: ActiveRecord::Base.connection.execute("LOAD DATA LOCAL INFILE '/tmp/test.infile' INTO TABLE users") [Sat Jan 08 00:09:29 2011] (9990) SQL (1.7ms) LOAD DATA LOCAL INFILE '/tmp/test.infile' INTO TABLE users => nil > ActiveRecord::Base.connection.disconnect! => #<Mysql:0x104c6f890> > ActiveRecord::Base.verify_active_connections! [Sat Jan 08 00:09:58 2011] (9990) SQL (0.2ms) SET SQL_AUTO_IS_NULL=0 => {...connection stuff...} > ActiveRecord::Base.connection.execute("LOAD DATA LOCAL INFILE '/tmp/test.infile' INTO TABLE users") [Sat Jan 08 00:10:00 2011] (9990) SQL (0.0ms) Mysql::Error: The used command is not allowed with this MySQL version: LOAD DATA LOCAL INFILE '/tmp/test.infile' INTO TABLE users ActiveRecord::StatementInvalid: Mysql::Error: The used command is not allowed with this MySQL version: LOAD DATA LOCAL INFILE '/tmp/test.infile' INTO TABLE users from ~/gems/activerecord-2.3.8/lib/active_record/connection_adapters/abstract_adapter.rb:221:in `log' from ~/gems/activerecord-2.3.8/lib/active_record/connection_adapters/mysql_adapter.rb:323:in `execute' from (irb):6 I am able to do other queries like SELECT and whatnot, and I will get the correct result. It's just this one that giving me the error. I even tested this with a fresh rails app. You'll notice that I am able to do the exact same query before the disconnect. Thanks for the help!

Read the article

Pre approve expenditure batch in oracle apps project module

- by nil

hi to all i have to crete one pre approve expence batch when i crete batch and then go to india local payble (MHE) but when i run the request Expense Report Import Report then i got following out put hear some error Rejection Reason = no location so my problem is that where i have to define location please give me guidance for that Total Functional Currency Invoice Amount: 100.00 Elecon Engineering Co. Ltd. Expense Report Import Report 17-MAY-10 16:57 Page: 2 Source: Oracle Projects Exceptions Report Supplier Supplier Invoice Invoice Invoice Invoice Name Number Name Number Number Date Currency Amount Rejection Reason ------------ Megha, Nilesh M. 90054 XSAM R17-MAY-1 31-MAY-10 INR 400.00 No Location Megha, Nilesh M. 90054 XT2 R17-MAY-10 31-MAY-10 INR 100.00 No Location Total Expense Reports Rejected: 2 Total Functional Currency Invoice Amount: 500.00 Edited by: user12921822 on May 17, 2010 9:00 PM

Read the article

Improving performance on data pasting 2000 rows with validations

- by Lohit

I have N rows (which could be nothing less than 1000) on an excel spreadsheet. And in this sheet our project has 150 columns like this: Now, our application needs data to be copied (using normal Ctrl+C) and pasted (using Ctrl+V) from the excel file sheet on our GUI sheet. Copy pasting 1000 records takes around 5-6 seconds which is okay for our requirement, but the problem is when we need to make sure the data entered is valid. So we have to validate data in each row generate appropriate error messages and format the data as per requirement. So we need to at runtime parse and evaluate data in each row. Now all the formatting of data and validations come from the back-end database and we have it in a data-table (dtValidateAndFormatConditions). The conditions would be around 50. So you can see how slow this whole process becomes since N X 150 X 50 operations are required to complete this whole process. Initially it took approximately 2-3 minutes but now i have reduced it to 20 - 30 seconds. However i have increased the speed by making an expression parser of my own - and not by any algorithm, is there any other way i can improve performance, by using Divide and Conquer or some other mechanism. Currently i am not really sure how to go about this. Here is what part of my code looks like: public virtual void ValidateAndFormatOnCopyPaste(DataTable DtCopied, int CurRow) { foreach (DataRow dRow in dtValidateAndFormatConditions.Rows) { string Condition = dRow["Condition"]; string FormatValue = Value = dRow["Value"]; GetValidatedFormattedData(DtCopied,ref Condition, ref FormatValue ,iRowIndex); Condition = Parse(Condition); dRow["Condition"] = Condition; FormatValue = Parse(FormatValue ); dRow["Value"] = FormatValue; } } The above code gets called row-wise like this: public override void ValidateAndFormat(DataTable dtChangedRecords, CellRange cr) { int iRowStart = cr.Row, iRowEnd = cr.Row + cr.RowCount; for (int iRow = iRowStart; iRow < iRowEnd; iRow++) { ValidateAndFormatOnCopyPaste(dtChangedRecords,iRow); } } Please know my question needs a more algorithmic solution than code optimization, however any answers containing code related optimizations will be appreciated as well. (Tagged Linq because although not seen i have been using linq in some parts of my code).

Read the article

BULK INSERT from one table to another all on the server

- by steve_d

I have to copy a bunch of data from one database table into another. I can't use SELECT ... INTO because one of the columns is an identity column. Also, I have some changes to make to the schema. I was able to use the export data wizard to create an SSIS package, which I then edited in Visual Studio 2005 to make the changes desired and whatnot. It's certainly faster than an INSERT INTO, but it seems silly to me to download the data to a different computer just to upload it back again. (Assuming that I am correct that that's what the SSIS package is doing). Is there an equivalent to BULK INSERT that runs directly on the server, allows keeping identity values, and pulls data from a table? (as far as I can tell, BULK INSERT can only pull data from a file) Edit: I do know about IDENTITY_INSERT, but because there is a fair amount of data involved, INSERT INTO ... SELECT is kinda of slow. SSIS/BULK INSERT dumps the data into the table without regards to indexes and logging and whatnot, so it's faster. (Of course creating the clustered index on the table once it's populated is not fast, but it's still faster than the INSERT INTO...SELECT that I tried in my first attempt) Edit 2: The schema changes include (but are not limited to) the following: 1. Splitting one table into two new tables. In the future each will have its own IDENTITY column, but for the migration I think it will be simplest to use the identity from the original table as the identity for the both new tables. Once the migration is over one of the tables will have a one-to-many relationship to the other. 2. Moving columns from one table to another. 3. Deleting some cross reference tables that only cross referenced 1-to-1. Instead the reference will be a foreign key in one of the two tables. 4. Some new columns will be created with default values. 5. Some tables aren’t changing at all, but I have to copy them over due to the "put it all in a new DB" request.

Read the article

Javascript terminates after trying to select data from an object passed to a function

- by Silmaril89

Here is my javascript: $(document).ready(function(){ var queries = getUrlVars(); $.get("mail3.php", { listid: queries["listid"], mindex: queries["mindex"] }, showData, 'html'); }); function showData(data) { var response = $(data).find("#mailing").html(); if (response == null) { $("#results").html("<h3>Server didn't respond, try again.</h3>"); } else if (response.length) { var old = $("#results").html(); old = old + "<br /><h3>" + response + "</h3>"; $("#results").html(old); var words = response.split(' '); words[2] = words[2] * 1; words[4] = words[4] * 1; if (words[2] < words[4]) { var queries = getUrlVars(); $.get("mail3.php", { listid: queries["listid"], mindex: words[2] }, function(data){showData(data);}, 'html'); } else { var done = $(data).find("#done").html(); old = old + "<br />" + done; $("#results").html(old); } } else { $("#results").html("<h3>Server responded with an empty reply, try again.</h3>"); } } function getUrlVars() { var vars = [], hash; var hashes = window.location.href.slice(window.location.href.indexOf('?') + 1).split('&'); for (var i = 0; i < hashes.length; i++) { hash = hashes[i].split('='); vars.push(hash[0]); vars[hash[0]] = hash[1]; } return vars; } After the first line in showData: var response = $(data).find("#mailing").html(); the javascript stops. If I put an alert before it, the alert pops up, after it, it doesn't pop up. There must be something wrong with using $(data), but why? Any ideas would be appreciated.

Read the article

Invoking a function call in a string in an Oracle Procedure

- by DMS

Hello, I writing an application using Oracle 10g. I am currently facing this problem. I take in "filename" as parameter of type varchar2. A sample value that filename may contain is: 'TEST || to_char(sysdate, 'DDD')'. In the procedure, I want to get the value of this file name as in TEST147. When i write: select filename into ffilename from dual; I get the value ffilename = TEST || to_char(sysdate, 'DDD') whick makes sense. But how can I get around this issue and invoke the function in the string value? Help appreciated. Thanks.

Read the article

MATLAB query about for loop, reading in data and plotting

- by mp7

Hi there, I am a complete novice at using matlab and am trying to work out if there is a way of optimising my code. Essentially I have data from model outputs and I need to plot them using matlab. In addition I have reference data (with 95% confidence intervals) which I plot on the same graph to get a visual idea on how close the model outputs and reference data is. In terms of the model outputs I have several thousand files (number sequentially) which I open in a loop and plot. The problem/question I have is whether I can preprocess the data and then plot later - to save time. The issue I seem to be having when I try this is that I have a legend which either does not appear or is inaccurate. My code (apolgies if it not elegant): fn= xlsread(['tbobserved' '.xls']); time= fn(:,1); totalreference=fn(:,4); totalreferencelowerci=fn(:,6); totalreferenceupperci=fn(:,7); figure plot(time,totalrefrence,'-', time, totalreferencelowerci,'--', time, totalreferenceupperci,'--'); xlabel('Year'); ylabel('Reference incidence per 100,000 population'); title ('Total'); clickableLegend('Observed reference data', 'Totalreferencelowerci', 'Totalreferenceupperci','Location','BestOutside'); xlim([1910 1970]); hold on start_sim=10000; end_sim=10005; h = zeros (1,1000); for i=start_sim:end_sim %is there any way of doing this earlier to save time? a=int2str(i); incidenceFile =strcat('result_', 'Sim', '_', a, 'I_byCal_total.xls'); est_tot=importdata(incidenceFile, '\t', 1); cal_tot=est_tot.data; magnitude=1; t1=cal_tot(:,1)+1750; totalmodel=cal_tot(:,3)+cal_tot(:,5); h(a)=plot(t1,totalmodel); xlim([1910 1970]); ylim([0 500]); hold all clickableLegend(h(a),a,'Location','BestOutside') end Essentially I was hoping to have a way of reading in the data and then plot later - ie. optimise the code. I hope you might be able to help. Thanks. mp

Read the article

Ping remote server and wait to get data

- by infinity

Hi I'm building my first application for android and I've reached a point where I can't find a solution even have no idea what to search for in Google. So the problem: I am pinging a remote server with GET request through the application passing some parameters like file_id. Then the server gives back confirmation if the file exists or error otherwise, both in plain text. The error string is $$$ERROR$$$. Actually the confirmation is JSON string that holds the path to the file. If the file doesn't exists on the server it generated the error message and start downloading the file and processing it which normally takes 10-30 seconds. What would be the best way to check if the file is ready for download? I have DownloadFile class that extends AsyncTask but before I reach the point to download the file I need the URL which is dependant on the previous request which is in the main class in the UI thread. Here is some code: public class MainActivity extends Activity { private String getInfo() { // Create a new HttpClient and Post Header HttpClient httpClient = new DefaultHttpClient(); HttpGet httpPost = new HttpGet(infoUrl); StringBuilder sb = null; String data; JSONObject jObject = null; try { HttpResponse response = httpClient.execute(httpPost); // This might be equal "$$$ERROR$$$" if no file exists sb = inputStreamToString(response.getEntity().getContent()); } catch(ClientProtocolException e) { // TODO Auto-generated catch block Log.v("Error: pushItem ClientProtocolException: ", e.toString()); } catch (IOException e) { // TODO Auto-generated catch block Log.v("Error: pushItem IOException: ", e.toString()); } // Clean the data to be complaint JSON format data = sb.toString().replace("info = ", ""); try { jObject = new JSONObject(data); data = jObject.getString("h"); fileTitle = jObject.getString("title"); } catch (JSONException e) { // TODO Auto-generated catch block e.printStackTrace(); } downloadUrl = String.format(downloadUrl, fileId, data); return downloadUrl; } } So my idea was to get the content and if equal to $$$ERROR$$$ go into loop until JSON data is passed but I guess there is better solution. Note: I don't have control over the server output so have to deal with what I have.

Read the article

Oracle curcular join sometimes give duplicates, but sometimes does not

- by Kaushik

By mistake I wrote a query like this: select * from a,b,c where a.col=b.col and b.col2=c.col2 and c.col3=a.col4 So there is a circular join here. Now the thing is sometimes this query returns duplicate result, sometimes it returns unique(correct) results. I am trying to understand why it does not give duplicate results always. Also if circular joins are not allowed, how come Oracle does not throw an error. EDIT: This is the actual query. After reading ti carefully, I am not sure anymore if this is a circular join or not.It does not seem so...but why I get duplicates only sometime? select * from a,b,c,d where a.col=b.col and b.col=c.col and c.col2=d.col2 and d.col2 =a.col2

Read the article

Persistence scheme & state data for low memory situations (iphone)

- by Robin Jamieson

What happens to state information held by a class's variable after coming back from a low memory situation? I know that views will get unloaded and then reloaded later but what about some ancillary classes & data held in them that's used by the controller that launched the view? Sample scenario in question: @interface MyCustomController: UIViewController { ServiceAuthenticator *authenticator; } -(id)initWithAuthenticator:(ServiceAuthenticator *)auth; // the user may press a button that will cause the authenticator // to post some data to the service. -(IBAction)doStuffButtonPressed:(id)sender; @end @interface ServiceAuthenticator { BOOL hasValidCredentials; // YES if user's credentials have been validated NSString *username; NSString *password; // password is not stored in plain text } -(id)initWithUserCredentials:(NSString *)username password:(NSString *)aPassword; -(void)postData:(NSString *)data; @end The app delegate creates the ServiceAuthenticator class with some user data (read from plist file) and the class logs the user with the remote service. inside MyAppDelegate's applicationDidFinishLaunching: - (void)applicationDidFinishLaunching:(UIApplication *)application { ServiceAuthenticator *auth = [[ServiceAuthenticator alloc] initWithUserCredentials:username password:userPassword]; MyCustomController *controller = [[MyCustomController alloc] initWithNibName:...]; controller.authenticator = auth; // Configure and show the window [window addSubview:..]; // make everything visible [window makeKeyAndVisible]; } Then whenever the user presses a certain button, 'MyCustomController's doStuffButtonPressed' is invoked. -(IBAction)doStuffButtonPressed:(id)sender { [authenticator postData:someDataFromSender]; } The authenticator in-turn checks to if the user is logged in (BOOL variable indicates login state) and if so, exchanges data with the remote service. The ServiceAuthenticator is the kind of class that validates the user's credentials only once and all subsequent calls to the object will be to postData. Once a low memory scenario occurs and the associated nib & MyCustomController will get unloaded -- when it's reloaded, what's the process for resetting up the 'ServiceAuthenticator' class & its former state? I'm periodically persisting all of the data in my actual model classes. Should I consider also persisting the state data in these utility style classes? Is that the pattern to follow?

Read the article

Creation of database in Oracle

- by macha

Hello, I am a newbie to Oracle, and I have used MySQL for most of the time. So now for testing scripts, I was just planning to create a database, but from the resources I have found on google, it doesn't look as simple it is maybe in mysql or in sqlserver. I just need to create a database, say "CREATE DATABASE TESTDB";. That is it, but of the resources I have found, it seems I need to create an instance identifier, decide an authentication method, create an initialization file etc. Do I really have to do all this or am I using the wrong resources. I just need to create a database and add a few tables into it, just to check my connection string etc. I need to check if I am able to connect to my web server.

Read the article

Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article

SQL Server 2000 tables

- by user40766

We currently have an SQL Server 2000 database with one table containing data for multiple users. The data is keyed by memberid which is an integer field. The table has a clustered index on memberid. The table is now about 200 million rows. Indexing and maintenance are becoming issues. We are debating splitting the table into one table per user model. This would imply that we would end up with a very large number of tables potentially upto the 2,147,483,647, considering just positive values. My questions: Does anyone have any experience with a SQL Server (2000/2005) installation with millions of tables? What are the implications of this architecture with regards to maintenance and access using Query Analyzer, Enterprise Manager etc. What are the implications to having such a large number of indexes in a database instance. All comments are appreciated. Thanks

Read the article

Automating an SSRS 2008 R2 Report Snapshots and run report with most recent data

- by Mr Shoubs

I would like to automate a report snapshot, but there is only an option to take a snapshot in the Report History Tab. All the resources I've found suggest I need to go to processing options and select "Render this report from a snapshot". But I don't want to do that - when I go to a report, I want to get the most recent data. However daily at midnight I'd like to take a snapshot and store it in the history in case I want to compare the reports as of midnight for the last few weeks. Or am I doing this wrong and have to create a subscription instead? Note: this is for an auditing database and has way to much data in to query a range with more than 1 day in it - reports are restricted as such. (1 day has over 1 million rows on it's own).

Read the article

Associating multiple data with a single entry in Open Office Base

- by idyllhands

I'm trying to build a database that I can use to track prices of groceries on certain dates. My problem is that I cannot figure out how to have a single entry associate with multiple data. For example, carrots. The index would be carrots. Then, a few categorizing fields (ie, Produce|Vegetable) Then, I can enter a price, date that the price was valid, store that was selling for said price, etc. And the next time I buy carrots, I can just add a new set of pricing data that would be associated with the original carrots entry. I know very little about database building, so if anyone has something I could just modify, I would greatly appreciate it. Alternatively, a step by step tutorial would be great.

Read the article

Need Some comparative data on Apache and IIS

- by zod

This is not a pure programming question but very programming related question I am not sure where should i ask. if you dont know answer please dont downvote . If you know the right place to ask please suggest or move this question We have a web application running on PHP 5 , Zend Framework , Apche . I need some comparitive data which states the above technologies and server is one of the best and thousands of domains are using this. Do you know any website i can get this type of data listing major websites developed in PHP. Major domains runs on apache Major website running on Zend A comparitive case study on Apache and IIS or PHP and .Net

Read the article

Stack , data and address space limits on an Ubuntu server

- by PaulDaviesC

I am running an Ubuntu server which has around 5000 users. The users are allowed to SSH in to the system. So in order to cap the memory used up by a process I have capped the address space limits using limits.conf. So my question is , should I be limiting the data and stack ? I feel that is not required since I am capping address space. Are there any pitfalls if I do not cap the stack and data limits?

Read the article

How to warehouse data that is not needed from MS SQL server

- by I__

I have been asked to truncate a large table in MS SQL Server 2008. The data is not needed but might be needed once every two years. It will NEVER have to be changed, only viewed. The question is, since I don't need the data on a day-to-day basis, what do I do with it to protect and back it up? Please keep in mind that I will need to have it accessible maybe once every two years, and it is FINE for us if the recovery process takes a few hours. The entire table is about 3 million rows and I need to truncate it to about 1 million rows.

Read the article

Using LDAP to store customer data

- by mechcow

We wish to store some data in 389 Directory Server LDAP that doesn't fit that well into the standard set of schema's that come with the product. Nothing too amazing, things like: when the customer joined are they currently active customer certificate[1] which environment they are using My question is this: should we register with OID and start writing up our own custom schema OR is there a standard schema definition not provided by Directory Server that we can download and use that would fit our needs? Should we munge/hack existing attributes and store the data among there (I'm strongly opposed to this, but would be interested in arguments about why its better than extending)? [1] I know there is a field for this userCertificate but we don't want to use it to authenticate the user for the purposes of binding Using CentOS 5.5 with 389 Directory Server 8.1

Read the article

Can't connect to wireless router anymore due to data rate problem

- by Jay White

I was playing around with my wireless router, and switched the mode to a fixed mode B. Now< I can no longer assoicate to the AP. Windows does not give any particular error message, but with wireshark I see that the returned error is that the client does not support the necessary data rate. My wireless card is type n, and it is set to mode a/b/g compatible. I tried setting ot to just b, however this made no difference. How can I set the data rate of my card so that I can connect again to my AP? I would prefer not to just reset the device, as there has been some configuration done that would be a pain to redo, and as well I do not have the ISP password handy. Regardless I would like to understand this situation better.

Read the article

AWS EC2 can't execute user-data script

- by Bloodnut

I'm pretty new to AWS and EC2 but I want to run instances with a user script after it's booted from another instance. I have installed ec2 tools and ran the command as it's explained in various examples like here http://www.turnkeylinux.org/blog/ec2-userdata and Eric Hammond's tutorials. however when I actually use the command: "ec2-run-instances --key my-key --user-data-file myscript my-ami" it only runs the new instance but doesn't execute the script myscript contains: #!/bin/bash echo "hello" ~/output.txt I'm running ubuntu server 12.04 AMIs. the target AMIs are duplicates of the initiating instance. if I run curl http:// 169.254.169.254/latest/user-data the imported script is there.

Read the article

Where do deleted items go on the hard drive ?

- by Jerry

After reading the quote below on the Casey Anthony trial (CNN) ,I am curious about where deleted files actually go on a hard drive, how they can be seen after being deleted, and to what extent the data can be recovered (fully, partially, etc). "Earlier in the trial, experts testified that someone conducted the keyword searches on a desktop computer in the home Casey Anthony shared with her parents. The searches were found in a portion of the computer's hard drive that indicated they had been deleted, Detective Sandra Osborne of the Orange County Sheriff's Office testified Wednesday in Anthony's capital murder trial." I know some of the questions here on SO address third party software that can used for this kind of thing, but I'm more interested in how this data can be seen after deletion, where it resides on the hard drive, etc. I find the whole topic intriguing, so any additional insight is welcome.

Read the article

Where do deleted items go on the hard drive?

- by Jerry

After reading the quote below on the Casey Anthony trial (CNN) ,I am curious about where deleted files actually go on a hard drive, how they can be seen after being deleted, and to what extent the data can be recovered (fully, partially, etc). "Earlier in the trial, experts testified that someone conducted the keyword searches on a desktop computer in the home Casey Anthony shared with her parents. The searches were found in a portion of the computer's hard drive that indicated they had been deleted, Detective Sandra Osborne of the Orange County Sheriff's Office testified Wednesday in Anthony's capital murder trial." I know some of the questions here on Super User address third party software that can used for this kind of thing, but I'm more interested in how this data can be seen after deletion, where it resides on the hard drive, etc. I find the whole topic intriguing, so any additional insight is welcome.

Search Results

Search found 69138 results on 2766 pages for 'oracle data mining'.

Page 491/2766 | < Previous Page | 487 488 489 490 491 492 493 494 495 496 497 498 | Next Page >

- by satynos

- by Jimm

- by Ngan

- by nil

- by Lohit

- by steve_d

- by Silmaril89

- by DMS

- by mp7

- by infinity

- by Kaushik

- by Robin Jamieson

- by macha

- by user12620111

- by user40766

- by Mr Shoubs

- by idyllhands

- by zod

- by PaulDaviesC

- by I__

- by mechcow

- by Jay White

- by Bloodnut

- by Jerry

- by Jerry

< Previous Page | 487 488 489 490 491 492 493 494 495 496 497 498 | Next Page >