refresh rate - Page 67 - Developer IT

Would like to change audio codec, but keep video settings with ffmpeg

- by Craig Tataryn

I have a video for which I'd like to convert the audio codec to AAC 320 kbps / 44.100 kHz. What would I use for ffmpeg switches such that all the video settings and codec remain the same, but only the audio codec and settings change? Here's my video: $ ffmpeg -i Winnipeg.rb\ Scala-Talk.mov FFmpeg version SVN-r25375, Copyright (c) 2000-2010 the FFmpeg developers built on Oct 6 2010 13:02:41 with gcc 4.2.1 (Apple Inc. build 5664) configuration: --enable-libmp3lame --enable-shared --disable-mmx --arch=x86_64 libavutil 50.32. 2 / 50.32. 2 libavcore 0. 9. 1 / 0. 9. 1 libavcodec 52.92. 0 / 52.92. 0 libavformat 52.80. 0 / 52.80. 0 libavdevice 52. 2. 2 / 52. 2. 2 libavfilter 1.48. 0 / 1.48. 0 libswscale 0.12. 0 / 0.12. 0 Seems stream 0 codec frame rate differs from container frame rate: 2000.00 (2000/1) -> 10.00 (10/1) Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Winnipeg.rb Scala-Talk.mov': Metadata: major_brand : qt minor_version : 537199360 compatible_brands: qt Duration: 01:10:53.00, start: 0.000000, bitrate: 283 kb/s Stream #0.0(eng): Video: h264, yuv420p, 800x598, 94 kb/s, 10 fps, 10 tbr, 1k tbn, 2k tbc Stream #0.1(eng): Audio: adpcm_ima_qt, 22050 Hz, 1 channels, s16 Stream #0.2(eng): Audio: adpcm_ima_qt, 22050 Hz, 1 channels, s16 At least one output file must be specified Many thanks in advance! One with with ffmpeg I've never been able to grok is how to just "tweak" files without having to regurgitate every little setting for things you don't want changes.

Read the article

Unexpected(?) high 'wasted' memory in memcached

- by Nanne

Looking at our memcached stats I think I have found an issue I was not aware of before. It seems that we have a strangely high amount of wasted space. I checked with phpmemcacheadmin for a change, and found this image staring at me: Now I was under the impression that the worst-case scenario would be that there is 50% waste, although I am the first to admit not knowing all the details. I have read - amongst others- this page which is indeed somewhat old, but so is our version of memcached. I think I do understand how the system works (e.g.) I believe, but I have a hard time understanding how we could get to 76% wasted space. The eviction rate that phpmemcacheadmin shows is 2 ev/s, so there is some problem here. The primary question is: what can I do to fix this. I could throw more memory at it (there is some extra available I think), maybe I should fiddle with the slab config (is that even possible with this version?), maybe there are other options? Upgrading the memcached version is not a quickly available option. The secondairy question, out of curiosity, is of course if the rate of 75% (and rising) wasted space is expected, and if so, why. System: This is currently not something I can do anything about, I know the memcached version isn't the newest, but these are the cards I've been dealt. Memcached 1.4.5 Apache 2.2.17 PHP 5.3.5

Read the article

How can I stop SipVicious ('friendly-scanner') from flooding my SIP server?

- by a1kmm

I run an SIP server which listens on UDP port 5060, and needs to accept authenticated requests from the public Internet. The problem is that occasionally it gets picked up by people scanning for SIP servers to exploit, who then sit there all day trying to brute force the server. I use credentials that are long enough that this attack will never feasibly work, but it is annoying because it uses up a lot of bandwidth. I have tried setting up fail2ban to read the Asterisk log and ban IPs that do this with iptables, which stops Asterisk from seeing the incoming SIP REGISTER attempts after 10 failed attempts (which happens in well under a second at the rate of attacks I'm seeing). However, SipVicious derived scripts do not immediately stop sending after getting an ICMP Destination Host Unreachable - they keep hammering the connection with packets. The time until they stop is configurable, but unfortunately it seems that the attackers doing these types of brute force attacks generally set the timeout to be very high (attacks continue at a high rate for hours after fail2ban has stopped them from getting any SIP response back once they have seen initial confirmation of an SIP server). Is there a way to make it stop sending packets at my connection?

Read the article

A web app provider has asked for specific browser config

- by Matthew

They have asks to turn off caching on our browsers. I was aghast that they would ask such a thing. I said to them; To avoid caching it is best practice to use; <meta http-equiv="pragma" content="no-cache" /> <meta http-equiv="cache-control" content="no-cache" /> This should work across all browsers. Their reply was; We need to refresh javascript at runtime, this will not help us – any more ideas? I replied; Unsure what you mean by “refresh javascript at runtime”. If you are using ajax, browser caching can effect the XMLHttpRequest open method. Adding these meta tags to the source has fixed this for me in the past. Browser caching only caches resources, it should have no effect on site scripting. These meta tags will bypass browser caching. This is a reasonable request, isn't it?

Read the article

Matched or unmatched drives for RAID arrays?

- by Will

Looking around there is conflciting information on this, with some strongly suggesting one or the other. From my understanding the issue with matched drives is that the wear on both drives is more or less the same, so the potential for the second drive failing with or very soon after the first is pretty high. People also claim matched drives give substianatally higher performance however assuming the unmatched drives are more or less the same (eg 2, 1 TB STATA II 7200rpm drives with 32MB cache), would the minor differences between say a Seagate and a Western Digital one (say one has a 128MB/s read rate, and the other a 150MB/s read rate, as well as I guess various other minor differences) actually cause any notable performance loss, ie potentialy worse than two matched 128MB/s drives, or does RAID not really care and give you essentially an optimal solution (eg upto 278MB/s total read speed for RAID 0 and 1) and similar for other RAID with more "unmatched" drives (5 and 1+0 come to mind as possibilities)? Also I couldnt find much info on how this is different on different RAID setups, eg RAID 0 or RAID 1, software or hardware RAID, etc. I'm assuming such things have an effect, and thats it's not all the same for RAID in general?

Read the article

Windows 7 breaks even in safe mode

- by delenda

Hi, I have a Dell XPS M1730 with Windows 7 installed. I noticed last night that after a few hours of use, the fans kicked into full and I couldn't do anything without it taking forever. Minimising windows, opening device manager or even opening process explorer took minutes and a game install I had just started took nearly 4 hours to complete. When procexp finally loaded, the refresh was so slow that it was mostly useless. From what I could gather, it was reporting 60% idle processes with procexp using nearly 40%. There were no hardware interrupts listed. When I rebooted, the problem went away for about 10 minutes and then the same thing happened. The issue persists in safe mode and even after I removed the graphics drivers, which have been an issue in the past, it still happens. Icons flash quite quickly on the desktop periodically and screen refresh is painfully slow. When booting now, the fans kick in to full as soon as the windows logon box comes up and it's taking 10 minutes to bring the desktop up. Chkdsk reports nothing and the raid check says that everything is fine. I'm thinking hardware failure, probably HDD but wanted some other opinions. I'm planning to try a linux live cd to see if it works without using the hard disks. If anyone has any input, it would be greatly appreciated. Delenda

Read the article

Linux Centos 6 becomes unavailable from time to time - OS&network issue

- by adoado0

I am encountering following problem. There is one server (DL160 G5) running Centos 6.3 with default kernel 2.6.32-220.2.1.el6.x86_64 - at this point I'd like to add that issue appeared also at older version - 6.1 and older kernel (do not remember exactly which version). There is cPanel installed and from time to time it becomes unavailable (network connection). What I've checked is (via KVMoIP): load average is completely normal it does not lack memory or disk space when problem occurs no console notifications checked all access logs and there is no sign that it can be caused by a client script cannot even access local interface (127.0.0.1) or main IP address running tcpdump I can only see packets arriving to server - no responses all services seem to be running properly (mail,sql,http,ssh) checked crontab and all clients' crontabs too network port utilisation is low ( up to several Mbits) arriving packet rate is low - hundreds per second (according to tcpdump) console (via kvmoip) works fine, no lags there is no conntrack at this server there is no ipv6 at this server flushing iptables, unloading modules does not resolve problem restarting network does not resolve problem, no errors appear it also occurs when two sepearate networks are configured (and multiple gateways) as well as one IP, one default gw and one network is configured - so it seems network configuration independent it seems to repeat randomly (load,packet rate,bandwith usage,load independent) checked server with different rootkit detection tools - it seems to be clean server has been rebooted, it did not change anything there are no interface errors it apperas randomly can be once a week or several times per day It usually works fine after 1-15 minutes. What I can also check? It is definitely OS issue - there is traffic at interface only in one direction when problem occurs, can not even ping loopback. Any ideas? Recommended checks? Anything I did not checked above.

Read the article

85 Hz on old/new driver looks the same like 75 Hz on previous one?

- by jon

I have old philips 107T5 CRT and Nvidia graphics card. I used old Nvidia driver (but it wasn't 'legacy' one when I installed it) for few years but recently I decided to install other Linux distribution. I used 75 Hz refresh rate and 1024x768 resolution on my previous distribution. After I installed the new distribution I had to install a Nvidia driver so I downloaded one from the Nvidia site (this time only legacy supported my card so I downloaded legacy and installed it). It wasn't automatically updating xorg.conf but I had my previous xorg.conf copy and I used it. When I run X I could only choose 85 and 75 Hz, 85 was checked as default. And now what shocks me: that default 85 Hz looks identically like 75 Hz on previous driver looked (at least to me). I tried 75 Hz out of curiosity and it's too bright, hurts, etc. But on the previous driver 75 Hz wasn't hurting my eyes. Why is it different? It's the same number after all, so it should always give the same results, right? That's my first question. Second question: Is 85 Hz OK for that monitor model? Would it break it? I tried to find the optimal refresh rate for this model but couldn't find it.

Read the article

ffmpeg - creating DNxHD MFX files with alphas

- by Hugh

I'm struggling with something in FFMpeg at the moment... I'm trying to make DNxHD 1080p/24, 36Mb/s MXF files from a sequence of PNG files. My current command-line is: ffmpeg -y -f image2 -i /tmp/temp.%04d.png -s 1920x1080 -r 24 -vcodec dnxhd -f mxf -pix_fmt rgb32 -b 36Mb /tmp/temp.mxf To which ffmpeg gives me the output: Input #0, image2, from '/tmp/temp.%04d.png': Duration: 00:00:01.60, start: 0.000000, bitrate: N/A Stream #0.0: Video: png, rgb32, 1920x1080, 25 tbr, 25 tbn, 25 tbc Output #0, mxf, to '/tmp/temp.mxf': Stream #0.0: Video: dnxhd, yuv422p, 1920x1080, q=2-31, 36000 kb/s, 90k tbn, 24 tbc Stream mapping: Stream #0.0 -> #0.0 [mxf @ 0x1005800]unsupported video frame rate Could not write header for output file #0 (incorrect codec parameters ?) There are a few things in here that concern me: The output stream is insisting on being yuv422p, which doesn't support alpha. 24fps is an unsupported video frame rate? I've tried 23.976 too, and get the same thing. I then tried the same thing, but writing to a quicktime (still DNxHD, though) with: ffmpeg -y -f image2 -i /tmp/temp.%04d.png -s 1920x1080 -r 24 -vcodec dnxhd -f mov -pix_fmt rgb32 -b 36Mb /tmp/temp.mov This gives me the output: Input #0, image2, from '/tmp/1274263259.28098.%04d.png': Duration: 00:00:01.60, start: 0.000000, bitrate: N/A Stream #0.0: Video: png, rgb32, 1920x1080, 25 tbr, 25 tbn, 25 tbc Output #0, mov, to '/tmp/1274263259.28098.mov': Stream #0.0: Video: dnxhd, yuv422p, 1920x1080, q=2-31, 36000 kb/s, 90k tbn, 24 tbc Stream mapping: Stream #0.0 -> #0.0 Press [q] to stop encoding frame= 39 fps= 9 q=1.0 Lsize= 7177kB time=1.62 bitrate=36180.8kbits/s video:7176kB audio:0kB global headers:0kB muxing overhead 0.013636% Which obviously works, to a certain extent, but still has the issue of being yuv422p, and therefore losing the alpha. If I'm going to QuickTime, then I can get what I need using Shake, but my main aim here is to be able to generate .mxf files. Any thoughts? Thanks

Read the article

Unexpected behaviour in a Lotus Notes programmable table

- by Mark B

I'm designing a workflow database in Lotus Notes 6.0.3 (soon upgrading to 8.5), and my OS is Windows XP. I have recently tried converting a tabbed table into a programmable one. This was so that I could control which tab was displayed to the user when it was opened, so that they were presented with the most appropriate one for that document's progress through the workflow. That part of it works! One of the tabs features a radio button that controls visibility of the next tab, and a pair of cascading dialogue boxes. One contains the static list "Person":"Team", and the other has a formula based on the first: view:=@If(PeerReview = "Team"; "GroupNames"; "GroupMembers"); @Unique(@DbColumn(""; ""; view; 1)) The dialogue boxes have the property "Refresh fields on keyword change" selected. The behaviour that I wasn't expecting is this. If the radio button is set to "Yes" and a value is selected in one of the dialogue boxes, the table opens the next tab. If the radio button is set to "No" and a value is selected in one of the dialogue boxes, the entire table is hidden. I can duplicate the latter by switching off the "Refresh fields on keyword change" property on the dialogue boxes and instead pressing F9 after selecting a value. I have no idea why the former occurs, though. The table is called "RFCInfo", and I have a field on the form called "$RFCInfo" which is editable, hidden from all users who aren't me and initially set by a Postopen script, which I can post if necessary - it's essentially a Select Case statement that looks at a particular item value and returns the name of the table row relating to that value. Can anyone offer any pointers?

Read the article

Oracle 11g Data Guard over a WAN

- by Dave LeJeune

Hi - We are in process of looking at using Oracle's Data Guard to replicate our 11g instance from a colo facility in Washington DC to Chicago. To give some basics we have approximately 25TB of storage and a healthy transaction rate in the 1-2K/sec range. Also, because we are processing data in real-time we have a 24x7x365 requirement for processing data. We don't have any respites as far as volume except for system upgrades (once every few months) where we take the system offline but then course experience a spike in transactions when we bring the system back on-line. Ideally we would want the second instance in the DG configuration semi-online in a read-only fashion for reports/etc. We evaluated DG in 10g and were not overly impressed and research seemed to show that earlier versions had issues with replication over a WAN but I have heard good things about modifications the product has gone through w/ 11g. Can anyone confirm an instance of this size and transaction rate being replicated over a WAN and if so what is the general latency? An information or experiences with a DG implementation that is of this size and scope would really be helpful (or larger - I also realize we are still relatively small compared to many others out there). Many thanks in advance.

Read the article

How can I write an excel formula to do row based calculations; where certain conditions need to be met?

- by BDY

I am given: An excel sheet contains around 200 tasks (described in rows 2-201 in Column A). Each task can be elegible for a max of two projects (There are 4 projects in total, called "P1-P4" - drop down lists in Columns B and D); and this with a specific %-rate allocation (columns C & E - Column C refers to the Project Column B, and Column E refers to the Project in Column D). Column F shows the amount of work days spent on each task. Example in row 2: Task 1 (Column A); P1 (Column B) ; 80% (Column C) ; P3 (Column D) ; 20% (Column E) ; 3 (Column F) I need to know the sum of the working days spent on Project P3 respecting the %-rate for elegibility. I know how to calculate it for each Task (each Row) - e.g. for Task 1: =IF(B2="P3";C2*F2)+IF(D2="P3";E2*F2) However instead of repeating this for each task, I need a formula that adds them all together. Unfortunately the following formula shows me an error: =IF(B2:B201="P3";C2:C201*F2:F201)+IF(D2:D201="P3";E2:E201*F2:F201) Can anyone help please? Thank you!!

Read the article

Matched or unmatched drives for RAID arrays?

- by Will

Looking around there is conflciting information on this, with some strongly suggesting one or the other. From my understanding the issue with matched drives is that the wear on both drives is more or less the same, so the potential for the second drive failing with or very soon after the first is pretty high. People also claim matched drives give substianatally higher performance however assuming the unmatched drives are more or less the same (eg 2, 1 TB STATA II 7200rpm drives with 32MB cache), would the minor differences between say a Seagate and a Western Digital one (say one has a 128MB/s read rate, and the other a 150MB/s read rate, as well as I guess various other minor differences) actually cause any notable performance loss, ie potentialy worse than two matched 128MB/s drives, or does RAID not really care and give you essentially an optimal solution (eg upto 278MB/s total read speed for RAID 0 and 1) and similar for other RAID with more "unmatched" drives (5 and 1+0 come to mind as possibilities)? Also I couldnt find much info on how this is different on different RAID setups, eg RAID 0 or RAID 1, software or hardware RAID, etc. I'm assuming such things have an effect, and thats it's not all the same for RAID in general?

Read the article

How can I use Perl regular expressions to parse XML data?

- by Luke

I have a pretty long piece of XML that I want to parse. I want to remove everything except for the subclass-code and city. So that I am left with something like the example below. EXAMPLE TEST SUBCLASS|MIAMI CODE <?xml version="1.0" standalone="no"?> <web-export> <run-date>06/01/2010 <pub-code>TEST <ad-type>TEST <cat-code>Real Estate</cat-code> <class-code>TEST</class-code> <subclass-code>TEST SUBCLASS</subclass-code> <placement-description></placement-description> <position-description>Town House</position-description> <subclass3-code></subclass3-code> <subclass4-code></subclass4-code> <ad-number>0000284708-01</ad-number> <start-date>05/28/2010</start-date> <end-date>06/09/2010</end-date> <line-count>6</line-count> <run-count>13</run-count> <customer-type>Private Party</customer-type> <account-number>100099237</account-number> <account-name>DOE, JOHN</account-name> <addr-1>207 CLARENCE STREET</addr-1> <addr-2> </addr-2> <city>MIAMI</city> <state>FL</state> <postal-code>02910</postal-code> <country>USA</country> <phone-number>4014612880</phone-number> <fax-number></fax-number> <url-addr> </url-addr> <email-addr>[email protected]</email-addr> <pay-flag>N</pay-flag> <ad-description>DEANESTATES2BEDS2BATHSAPPLIANCED</ad-description> <order-source>Import</order-source> <order-status>Live</order-status> <payor-acct>100099237</payor-acct> <agency-flag>N</agency-flag> <rate-note></rate-note> <ad-content> MIAMI/Dean Estates: 2 beds, 2 baths. Applianced. Central air. Carpets. Laundry. 2 decks. Pool. Parking. Close to everything.No smoking. No utilities. $1275 mo. 401-578-1501. </ad-content> </ad-type> </pub-code> </run-date> </web-export> PERL So what I want to do is open an existing file read the contents then use regular expressions to eliminate the unnecessary XML tags. open(READFILE, "FILENAME"); while(<READFILE>) { $_ =~ s/<\?xml version="(.*)" standalone="(.*)"\?>\n.*//g; $_ =~ s/<subclass-code>//g; $_ =~ s/<\/subclass-code>\n.*/|/g; $_ =~ s/(.*)PJ RER Houses /PJ RER Houses/g; $_ =~ s/\G //g; $_ =~ s/<city>//g; $_ =~ s/<\/city>\n.*//g; $_ =~ s/<(\/?)web-export>(.*)\n.*//g; $_ =~ s/<(\/?)run-date>(.*)\n.*//g; $_ =~ s/<(\/?)pub-code>(.*)\n.*//g; $_ =~ s/<(\/?)ad-type>(.*)\n.*//g; $_ =~ s/<(\/?)cat-code>(.*)<(\/?)cat-code>\n.*//g; $_ =~ s/<(\/?)class-code>(.*)<(\/?)class-code>\n.*//g; $_ =~ s/<(\/?)placement-description>(.*)<(\/?)placement-description>\n.*//g; $_ =~ s/<(\/?)position-description>(.*)<(\/?)position-description>\n.*//g; $_ =~ s/<(\/?)subclass3-code>(.*)<(\/?)subclass3-code>\n.*//g; $_ =~ s/<(\/?)subclass4-code>(.*)<(\/?)subclass4-code>\n.*//g; $_ =~ s/<(\/?)ad-number>(.*)<(\/?)ad-number>\n.*//g; $_ =~ s/<(\/?)start-date>(.*)<(\/?)start-date>\n.*//g; $_ =~ s/<(\/?)end-date>(.*)<(\/?)end-date>\n.*//g; $_ =~ s/<(\/?)line-count>(.*)<(\/?)line-count>\n.*//g; $_ =~ s/<(\/?)run-count>(.*)<(\/?)run-count>\n.*//g; $_ =~ s/<(\/?)customer-type>(.*)<(\/?)customer-type>\n.*//g; $_ =~ s/<(\/?)account-number>(.*)<(\/?)account-number>\n.*//g; $_ =~ s/<(\/?)account-name>(.*)<(\/?)account-name>\n.*//g; $_ =~ s/<(\/?)addr-1>(.*)<(\/?)addr-1>\n.*//g; $_ =~ s/<(\/?)addr-2>(.*)<(\/?)addr-2>\n.*//g; $_ =~ s/<(\/?)state>(.*)<(\/?)state>\n.*//g; $_ =~ s/<(\/?)postal-code>(.*)<(\/?)postal-code>\n.*//g; $_ =~ s/<(\/?)country>(.*)<(\/?)country>\n.*//g; $_ =~ s/<(\/?)phone-number>(.*)<(\/?)phone-number>\n.*//g; $_ =~ s/<(\/?)fax-number>(.*)<(\/?)fax-number>\n.*//g; $_ =~ s/<(\/?)url-addr>(.*)<(\/?)url-addr>\n.*//g; $_ =~ s/<(\/?)email-addr>(.*)<(\/?)email-addr>\n.*//g; $_ =~ s/<(\/?)pay-flag>(.*)<(\/?)pay-flag>\n.*//g; $_ =~ s/<(\/?)ad-description>(.*)<(\/?)ad-description>\n.*//g; $_ =~ s/<(\/?)order-source>(.*)<(\/?)order-source>\n.*//g; $_ =~ s/<(\/?)order-status>(.*)<(\/?)order-status>\n.*//g; $_ =~ s/<(\/?)payor-acct>(.*)<(\/?)payor-acct>\n.*//g; $_ =~ s/<(\/?)agency-flag>(.*)<(\/?)agency-flag>\n.*//g; $_ =~ s/<(\/?)rate-note>(.*)<(\/?)rate-note>\n.*//g; $_ =~ s/<ad-content>(.*)\n.*//g; $_ =~ s/\t(.*)\n.*//g; $_ =~ s/<\/ad-content>(.*)\n.*//g; } close( READFILE1 ); Is there an easier way of doing this? I don't want to use any modules. I know that it might make this easier but the file I am reading has a lot of data in it.

Read the article

Problems with real-valued input deep belief networks (of RBMs)

- by Junier

I am trying to recreate the results reported in Reducing the dimensionality of data with neural networks of autoencoding the olivetti face dataset with an adapted version of the MNIST digits matlab code, but am having some difficulty. It seems that no matter how much tweaking I do on the number of epochs, rates, or momentum the stacked RBMs are entering the fine-tuning stage with a large amount of error and consequently fail to improve much at the fine-tuning stage. I am also experiencing a similar problem on another real-valued dataset. For the first layer I am using a RBM with a smaller learning rate (as described in the paper) and with negdata = poshidstates*vishid' + repmat(visbiases,numcases,1); I'm fairly confident I am following the instructions found in the supporting material but I cannot achieve the correct errors. Is there something I am missing? See the code I'm using for real-valued visible unit RBMs below, and for the whole deep training. The rest of the code can be found here. rbmvislinear.m: epsilonw = 0.001; % Learning rate for weights epsilonvb = 0.001; % Learning rate for biases of visible units epsilonhb = 0.001; % Learning rate for biases of hidden units weightcost = 0.0002; initialmomentum = 0.5; finalmomentum = 0.9; [numcases numdims numbatches]=size(batchdata); if restart ==1, restart=0; epoch=1; % Initializing symmetric weights and biases. vishid = 0.1*randn(numdims, numhid); hidbiases = zeros(1,numhid); visbiases = zeros(1,numdims); poshidprobs = zeros(numcases,numhid); neghidprobs = zeros(numcases,numhid); posprods = zeros(numdims,numhid); negprods = zeros(numdims,numhid); vishidinc = zeros(numdims,numhid); hidbiasinc = zeros(1,numhid); visbiasinc = zeros(1,numdims); sigmainc = zeros(1,numhid); batchposhidprobs=zeros(numcases,numhid,numbatches); end for epoch = epoch:maxepoch, fprintf(1,'epoch %d\r',epoch); errsum=0; for batch = 1:numbatches, if (mod(batch,100)==0) fprintf(1,' %d ',batch); end %%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% data = batchdata(:,:,batch); poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1))); batchposhidprobs(:,:,batch)=poshidprobs; posprods = data' * poshidprobs; poshidact = sum(poshidprobs); posvisact = sum(data); %%%%%%%%% END OF POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% poshidstates = poshidprobs > rand(numcases,numhid); %%%%%%%%% START NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% negdata = poshidstates*vishid' + repmat(visbiases,numcases,1);% + randn(numcases,numdims) if not using mean neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1))); negprods = negdata'*neghidprobs; neghidact = sum(neghidprobs); negvisact = sum(negdata); %%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% err= sum(sum( (data-negdata).^2 )); errsum = err + errsum; if epoch>5, momentum=finalmomentum; else momentum=initialmomentum; end; %%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% vishidinc = momentum*vishidinc + ... epsilonw*( (posprods-negprods)/numcases - weightcost*vishid); visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact); hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact); vishid = vishid + vishidinc; visbiases = visbiases + visbiasinc; hidbiases = hidbiases + hidbiasinc; %%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% end fprintf(1, '\nepoch %4i error %f \n', epoch, errsum); end dofacedeepauto.m: clear all close all maxepoch=200; %In the Science paper we use maxepoch=50, but it works just fine. numhid=2000; numpen=1000; numpen2=500; numopen=30; fprintf(1,'Pretraining a deep autoencoder. \n'); fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch); load fdata %makeFaceData; [numcases numdims numbatches]=size(batchdata); fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); restart=1; rbmvislinear; hidrecbiases=hidbiases; save mnistvh vishid hidrecbiases visbiases; maxepoch=50; fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); batchdata=batchposhidprobs; numhid=numpen; restart=1; rbm; hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases; save mnisthp hidpen penrecbiases hidgenbiases; fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); batchdata=batchposhidprobs; numhid=numpen2; restart=1; rbm; hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases; save mnisthp2 hidpen2 penrecbiases2 hidgenbiases2; fprintf(1,'\nPretraining Layer 4 with RBM: %d-%d \n',numpen2,numopen); batchdata=batchposhidprobs; numhid=numopen; restart=1; rbmhidlinear; hidtop=vishid; toprecbiases=hidbiases; topgenbiases=visbiases; save mnistpo hidtop toprecbiases topgenbiases; backpropface; Thanks for your time

Read the article

Problems with real-valued deep belief networks (of RBMs)

- by Junier

I am trying to recreate the results reported in Reducing the dimensionality of data with neural networks of autoencoding the olivetti face dataset with an adapted version of the MNIST digits matlab code, but am having some difficulty. It seems that no matter how much tweaking I do on the number of epochs, rates, or momentum the stacked RBMs are entering the fine-tuning stage with a large amount of error and consequently fail to improve much at the fine-tuning stage. I am also experiencing a similar problem on another real-valued dataset. For the first layer I am using a RBM with a smaller learning rate (as described in the paper) and with negdata = poshidstates*vishid' + repmat(visbiases,numcases,1); I'm fairly confident I am following the instructions found in the supporting material but I cannot achieve the correct errors. Is there something I am missing? See the code I'm using for real-valued visible unit RBMs below, and for the whole deep training. The rest of the code can be found here. rbmvislinear.m: epsilonw = 0.001; % Learning rate for weights epsilonvb = 0.001; % Learning rate for biases of visible units epsilonhb = 0.001; % Learning rate for biases of hidden units weightcost = 0.0002; initialmomentum = 0.5; finalmomentum = 0.9; [numcases numdims numbatches]=size(batchdata); if restart ==1, restart=0; epoch=1; % Initializing symmetric weights and biases. vishid = 0.1*randn(numdims, numhid); hidbiases = zeros(1,numhid); visbiases = zeros(1,numdims); poshidprobs = zeros(numcases,numhid); neghidprobs = zeros(numcases,numhid); posprods = zeros(numdims,numhid); negprods = zeros(numdims,numhid); vishidinc = zeros(numdims,numhid); hidbiasinc = zeros(1,numhid); visbiasinc = zeros(1,numdims); sigmainc = zeros(1,numhid); batchposhidprobs=zeros(numcases,numhid,numbatches); end for epoch = epoch:maxepoch, fprintf(1,'epoch %d\r',epoch); errsum=0; for batch = 1:numbatches, if (mod(batch,100)==0) fprintf(1,' %d ',batch); end %%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% data = batchdata(:,:,batch); poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1))); batchposhidprobs(:,:,batch)=poshidprobs; posprods = data' * poshidprobs; poshidact = sum(poshidprobs); posvisact = sum(data); %%%%%%%%% END OF POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% poshidstates = poshidprobs > rand(numcases,numhid); %%%%%%%%% START NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% negdata = poshidstates*vishid' + repmat(visbiases,numcases,1);% + randn(numcases,numdims) if not using mean neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1))); negprods = negdata'*neghidprobs; neghidact = sum(neghidprobs); negvisact = sum(negdata); %%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% err= sum(sum( (data-negdata).^2 )); errsum = err + errsum; if epoch>5, momentum=finalmomentum; else momentum=initialmomentum; end; %%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% vishidinc = momentum*vishidinc + ... epsilonw*( (posprods-negprods)/numcases - weightcost*vishid); visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact); hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact); vishid = vishid + vishidinc; visbiases = visbiases + visbiasinc; hidbiases = hidbiases + hidbiasinc; %%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% end fprintf(1, '\nepoch %4i error %f \n', epoch, errsum); end dofacedeepauto.m: clear all close all maxepoch=200; %In the Science paper we use maxepoch=50, but it works just fine. numhid=2000; numpen=1000; numpen2=500; numopen=30; fprintf(1,'Pretraining a deep autoencoder. \n'); fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch); load fdata %makeFaceData; [numcases numdims numbatches]=size(batchdata); fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); restart=1; rbmvislinear; hidrecbiases=hidbiases; save mnistvh vishid hidrecbiases visbiases; maxepoch=50; fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); batchdata=batchposhidprobs; numhid=numpen; restart=1; rbm; hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases; save mnisthp hidpen penrecbiases hidgenbiases; fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); batchdata=batchposhidprobs; numhid=numpen2; restart=1; rbm; hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases; save mnisthp2 hidpen2 penrecbiases2 hidgenbiases2; fprintf(1,'\nPretraining Layer 4 with RBM: %d-%d \n',numpen2,numopen); batchdata=batchposhidprobs; numhid=numopen; restart=1; rbmhidlinear; hidtop=vishid; toprecbiases=hidbiases; topgenbiases=visbiases; save mnistpo hidtop toprecbiases topgenbiases; backpropface; Thanks for your time

Read the article

Web Browser Control – Specifying the IE Version

- by Rick Strahl

I use the Internet Explorer Web Browser Control in a lot of my applications to display document type layout. HTML happens to be one of the most common document formats and displaying data in this format – even in desktop applications, is often way easier than using normal desktop technologies. One issue the Web Browser Control has that it’s perpetually stuck in IE 7 rendering mode by default. Even though IE 8 and now 9 have significantly upgraded the IE rendering engine to be more CSS and HTML compliant by default the Web Browser control will have none of it. IE 9 in particular – with its much improved CSS support and basic HTML 5 support is a big improvement and even though the IE control uses some of IE’s internal rendering technology it’s still stuck in the old IE 7 rendering by default. This applies whether you’re using the Web Browser control in a WPF application, a WinForms app, a FoxPro or VB classic application using the ActiveX control. Behind the scenes all these UI platforms use the COM interfaces and so you’re stuck by those same rules. Rendering Challenged To see what I’m talking about here are two screen shots rendering an HTML 5 doctype page that includes some CSS 3 functionality – rounded corners and border shadows - from an earlier post. One uses IE 9 as a standalone browser, and one uses a simple WPF form that includes the Web Browser control. IE 9 Browser: Web Browser control in a WPF form: The IE 9 page displays this HTML correctly – you see the rounded corners and shadow displayed. Obviously the latter rendering using the Web Browser control in a WPF application is a bit lacking. Not only are the new CSS features missing but the page also renders in Internet Explorer’s quirks mode so all the margins, padding etc. behave differently by default, even though there’s a CSS reset applied on this page. If you’re building an application that intends to use the Web Browser control for a live preview of some HTML this is clearly undesirable. Feature Delegation via Registry Hacks Fortunately starting with Internet Explore 8 and later there’s a fix for this problem via a registry setting. You can specify a registry key to specify which rendering mode and version of IE should be used by that application. These are not global mind you – they have to be enabled for each application individually. There are two different sets of keys for 32 bit and 64 bit applications. 32 bit: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\MAIN\FeatureControl\FEATURE_BROWSER_EMULATION Value Key: yourapplication.exe 64 bit: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Internet Explorer\MAIN\FeatureControl\FEATURE_BROWSER_EMULATION Value Key: yourapplication.exe The value to set this key to is (taken from MSDN here) as decimal values: 9999 (0x270F) Internet Explorer 9. Webpages are displayed in IE9 Standards mode, regardless of the !DOCTYPE directive. 9000 (0x2328) Internet Explorer 9. Webpages containing standards-based !DOCTYPE directives are displayed in IE9 mode. 8888 (0x22B8) Webpages are displayed in IE8 Standards mode, regardless of the !DOCTYPE directive. 8000 (0x1F40) Webpages containing standards-based !DOCTYPE directives are displayed in IE8 mode. 7000 (0x1B58) Webpages containing standards-based !DOCTYPE directives are displayed in IE7 Standards mode. The added key looks something like this in the Registry Editor: With this in place my Html Html Help Builder application which has wwhelp.exe as its main executable now works with HTML 5 and CSS 3 documents in the same way that Internet Explorer 9 does. Incidentally I accidentally added an ‘empty’ DWORD value of 0 to my EXE name and that worked as well giving me IE 9 rendering. Although not documented I suspect 0 (or an invalid value) will default to the installed browser. Don’t have a good way to test this but if somebody could try this with IE 8 installed that would be great: What happens when setting 9000 with IE 8 installed? What happens when setting 0 with IE 8 installed? Don’t forget to add Keys for Host Environments If you’re developing your application in Visual Studio and you run the debugger you may find that your application is still not rendering right, but if you run the actual generated EXE from Explorer or the OS command prompt it works. That’s because when you run the debugger in Visual Studio it wraps your application into a debugging host container. For this reason you might want to also add another registry key for yourapp.vshost.exe on your development machine. If you’re developing in Visual FoxPro make sure you add a key for vfp9.exe to see the rendering adjustments in the Visual FoxPro development environment. Cleaner HTML - no more HTML mangling! There are a number of additional benefits to setting up rendering of the Web Browser control to the IE 9 engine (or even the IE 8 engine) beyond the obvious rendering functionality. IE 9 actually returns your HTML in something that resembles the original HTML formatting, as opposed to the IE 7 default format which mangled the original HTML content. If you do the following in the WPF application: private void button2_Click(object sender, RoutedEventArgs e) { dynamic doc = this.webBrowser.Document; MessageBox.Show(doc.body.outerHtml); } you get different output depending on the rendering mode active. With the default IE 7 rendering you get: <BODY><DIV> <H1>Rounded Corners and Shadows - Creating Dialogs in CSS</H1> <DIV class=toolbarcontainer><A class=hoverbutton href="./"><IMG src="../../css/images/home.gif"> Home</A> <A class=hoverbutton href="RoundedCornersAndShadows.htm"><IMG src="../../css/images/refresh.gif"> Refresh</A> </DIV> <DIV class=containercontent> <FIELDSET><LEGEND>Plain Box</LEGEND> <DIV style="BORDER-BOTTOM: steelblue 2px solid; BORDER-LEFT: steelblue 2px solid; WIDTH: 550px; BORDER-TOP: steelblue 2px solid; BORDER-RIGHT: steelblue 2px solid" class="roundbox boxshadow"> <DIV style="BACKGROUND: khaki" class="boxcontenttext roundbox">Simple Rounded Corner Box. </DIV></DIV></FIELDSET> <FIELDSET><LEGEND>Box with Header</LEGEND> <DIV style="BORDER-BOTTOM: steelblue 2px solid; BORDER-LEFT: steelblue 2px solid; WIDTH: 550px; BORDER-TOP: steelblue 2px solid; BORDER-RIGHT: steelblue 2px solid" class="roundbox boxshadow"> <DIV class="gridheaderleft roundbox-top">Box with a Header</DIV> <DIV style="BACKGROUND: khaki" class="boxcontenttext roundbox-bottom">Simple Rounded Corner Box. </DIV></DIV></FIELDSET> <FIELDSET><LEGEND>Dialog Style Window</LEGEND> <DIV style="POSITION: relative; WIDTH: 450px" id=divDialog class="dialog boxshadow" jQuery16107208195684204002="2"> <DIV style="POSITION: relative" class=dialog-header> <DIV class=closebox></DIV>User Sign-in <DIV class=closebox jQuery16107208195684204002="3"></DIV></DIV> <DIV class=descriptionheader>This dialog is draggable and closable</DIV> <DIV class=dialog-content><LABEL>Username:</LABEL> <INPUT name=txtUsername value=" "> <LABEL>Password</LABEL> <INPUT name=txtPassword value=" "> <HR> <INPUT id=btnLogin value=Login type=button> </DIV> <DIV class=dialog-statusbar>Ready</DIV></DIV></FIELDSET> </DIV> <SCRIPT type=text/javascript> $(document).ready(function () { $("#divDialog") .draggable({ handle: ".dialog-header" }) .closable({ handle: ".dialog-header", closeHandler: function () { alert("Window about to be closed."); return true; // true closes - false leaves open } }); }); </SCRIPT> </DIV></BODY> Now lest you think I’m out of my mind and create complete whacky HTML rooted in the last century, here’s the IE 9 rendering mode output which looks a heck of a lot cleaner and a lot closer to my original HTML of the page I’m accessing: <body> <div> <h1>Rounded Corners and Shadows - Creating Dialogs in CSS</h1> <div class="toolbarcontainer"> <a class="hoverbutton" href="./"> <img src="../../css/images/home.gif"> Home</a> <a class="hoverbutton" href="RoundedCornersAndShadows.htm"> <img src="../../css/images/refresh.gif"> Refresh</a> </div> <div class="containercontent"> <fieldset> <legend>Plain Box</legend>  <div style="border: 2px solid steelblue; width: 550px;" class="roundbox boxshadow"> <div style="background: khaki;" class="boxcontenttext roundbox"> Simple Rounded Corner Box. </div> </div> </fieldset> <fieldset> <legend>Box with Header</legend> <div style="border: 2px solid steelblue; width: 550px;" class="roundbox boxshadow"> <div class="gridheaderleft roundbox-top">Box with a Header</div> <div style="background: khaki;" class="boxcontenttext roundbox-bottom"> Simple Rounded Corner Box. </div> </div> </fieldset> <fieldset> <legend>Dialog Style Window</legend> <div style="width: 450px; position: relative;" id="divDialog" class="dialog boxshadow"> <div style="position: relative;" class="dialog-header"> <div class="closebox"></div> User Sign-in <div class="closebox"></div></div> <div class="descriptionheader">This dialog is draggable and closable</div> <div class="dialog-content"> <label>Username:</label> <input name="txtUsername" value=" " type="text"> <label>Password</label> <input name="txtPassword" value=" " type="text"> <hr/> <input id="btnLogin" value="Login" type="button"> </div> <div class="dialog-statusbar">Ready</div> </div> </fieldset> </div> <script type="text/javascript"> $(document).ready(function () { $("#divDialog") .draggable({ handle: ".dialog-header" }) .closable({ handle: ".dialog-header", closeHandler: function () { alert("Window about to be closed."); return true; // true closes - false leaves open } }); }); </script> </div> </body> IOW, in IE9 rendering mode IE9 is much closer (but not identical) to the original HTML from the page on the Web that we’re reading from. As a side note: Unfortunately, the browser feature emulation can't be applied against the Html Help (CHM) Engine in Windows which uses the Web Browser control (or COM interfaces anyway) to render Html Help content. I tried setting up hh.exe which is the help viewer, to use IE 9 rendering but a help file generated with CSS3 features will simply show in IE 7 mode. Bummer - this would have been a nice quick fix to allow help content served from CHM files to look better. HTML Editing leaves HTML formatting intact In the same vane, if you do any inline HTML editing in the control by setting content to be editable, IE 9’s control does a much more reasonable job of creating usable and somewhat valid HTML. It also leaves the original content alone other than the text your are editing or adding. No longer is the HTML output stripped of excess spaces and reformatted in IEs format. So if I do: private void button3_Click(object sender, RoutedEventArgs e) { dynamic doc = this.webBrowser.Document; doc.body.contentEditable = true; } and then make some changes to the document by typing into it using IE 9 mode, the document formatting stays intact and only the affected content is modified. The created HTML is reasonably clean (although it does lack proper XHTML formatting for things like <br/> <hr/>). This is very different from IE 7 mode which mangled the HTML as soon as the page was loaded into the control. Any editing you did stripped out all white space and lost all of your existing XHTML formatting. In IE 9 mode at least *most* of your original formatting stays intact. This is huge! In Html Help Builder I have supported HTML editing for a long time but the HTML mangling by the Web Browser control made it very difficult to edit the HTML later. Previously IE would mangle the HTML by stripping out spaces, upper casing all tags and converting many XHTML safe tags to its HTML 3 tags. Now IE leaves most of my document alone while editing, and creates cleaner and more compliant markup (with exception of self-closing elements like BR/HR). The end result is that I now have HTML editing in place that's much cleaner and actually capable of being manually edited. Caveats, Caveats, Caveats It wouldn't be Internet Explorer if there weren't some major compatibility issues involved in using this various browser version interaction. The biggest thing I ran into is that there are odd differences in some of the COM interfaces and what they return. I specifically ran into a problem with the document.selection.createRange() function which with IE 7 compatibility returns an expected text range object. When running in IE 8 or IE 9 mode however. I could not retrieve a valid text range with this code where loEdit is the WebBrowser control: loRange = loEdit.document.selection.CreateRange() The loRange object returned (here in FoxPro) had a length property of 0 but none of the other properties of the TextRange or TextRangeCollection objects were available. I figured this was due to some changed security settings but even after elevating the Intranet Security Zone and mucking with the other browser feature flags pertaining to security I had no luck. In the end I relented and used a JavaScript function in my editor document that returns a selection range object: function getselectionrange() { var range = document.selection.createRange(); return range; } and call that JavaScript function from my host applications code: *** Use a function in the document to get around HTML Editing issues loRange = loEdit.document.parentWindow.getselectionrange(.f.) and that does work correctly. This wasn't a big deal as I'm already loading a support script file into the editor page so all I had to do is add the function to this existing script file. You can find out more how to call script code in the Web Browser control from a host application in a previous post of mine. IE 8 and 9 also clamp down the security environment a little more than the default IE 7 control, so there may be other issues you run into. Other than the createRange() problem above I haven't seen anything else that is breaking in my code so far though and that's encouraging at least since it uses a lot of HTML document manipulation for the custom editor I've created (and would love to replace - any PROFESSIONAL alternatives anybody?) Registry Key Installation for your Application It’s important to remember that this registry setting is made per application, so most likely this is something you want to set up with your installer. Also remember that 32 and 64 bit settings require separate settings in the registry so if you’re creating your installer you most likely will want to set both keys in the registry preemptively for your application. I use Tarma Installer for all of my application installs and in Tarma I configure registry keys for both and set a flag to only install the latter key group in the 64 bit version: Because this setting is application specific you have to do this for every application you install unfortunately, but this also means that you can safely configure this setting in the registry because it is after only applied to your application. Another problem with install based installation is version detection. If IE 8 is installed I’d want 8000 for the value, if IE 9 is installed I want 9000. I can do this easily in code but in the installer this is much more difficult. I don’t have a good solution for this at the moment, but given that the app works with IE 7 mode now, IE 9 mode is just a bonus for the moment. If IE 9 is not installed and 9000 is used the default rendering will remain in use. It sure would be nice if we could specify the IE rendering mode as a property, but I suspect the ActiveX container has to know before it loads what actual version to load up and once loaded can only load a single version of IE. This would account for this annoying application level configuration… Summary The registry feature emulation has been available for quite some time, but I just found out about it today and started experimenting around with it. I’m stoked to see that this is available as I’d pretty much given up in ever seeing any better rendering in the Web Browser control. Now at least my apps can take advantage of newer HTML features. Now if we could only get better HTML Editing support somehow <snicker>… ah can’t have everything.© Rick Strahl, West Wind Technologies, 2005-2011Posted in .NET FoxPro Windows

Read the article

256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

- by Alan Smith

For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation delete the Azure deployment. The standing joke with the audience was that it was that it was a “$2 demo”, as the compute charges for running the 16 instances for an hour was $1.92, factor in the bandwidth charges and it’s a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing, you pay for what you use, and if you need massive compute power for a short period of time using Windows Azure can work out very cost effective. The “$2 demo” was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a “$2 demo” to a “$30 demo”. The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour. This article will take a run through how I achieved this. Ray Tracing Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90’s with companies like Pixar creating feature length computer animations, and also the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate if the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image. Pin-Board Toys Having very little artistic talent and a basic understanding of maths I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I’ve always liked the pin-board desktop toys that become popular in the 80’s and when I was working as a 3D animator back in the 90’s I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months. PolyRay Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours processing on a 486, produced a high quality ray-traced image. The following is an example of a basic PolyRay scene file. background Midnight_Blue static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005> define wooden texture { noise surface { ambient 0.2 diffuse 0.7 specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1 lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7 } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } } viewpoint { from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60 resolution 640, 480 aspect 1.6 image_format 0 } light <-10, 30, 20> light <-10, 30, -20> object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden } object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome } After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue } In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used. The position of the pin can be set based on the color of the pixel at the appropriate position in the image. When the Windows Azure logo is used to set the Z coordinate of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure Logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not be the same as the depth of the objects from the camera. In order to simulate the pin board accurately a series of frames from a depth camera could be used. Windows Kinect The Kenect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kenect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developers perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK, the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode to allow depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions. Creating a Depth Field Animation The depth field animation used to set the positions of the pin in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence JPEG files that will be used to animate the pins in the animation the Start and Stop buttons are used to start and stop the image recording. En example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin. Rendering a Test Frame In order to test the creation of frames and get an approximation of the time required to render each frame a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board. The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster that an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour. Windows Azure Worker Roles The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server. Number of Servers Cost 1 $500 16 $8,000 256 $128,000 As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor intensive compute tasks. With the scalability available in Windows Azure a job that takes 256 hours to complete could be perfumed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below. Number of Worker Roles Render Time Cost 1 256 hours $30.72 16 16 hours $30.72 256 1 hour $30.72 Using worker roles in Windows Azure provides the same cost for the 256 hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles. Creating a Render Farm in Windows Azure The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components: · On-Premise o Windows Kinect – Used combined with the Kinect Explorer to create a stream of depth images. o Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue. o Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process. o Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete. · Windows Azure o Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment. The architecture of each worker role is shown below. The worker role is configured to use local storage, which provides file storage on the worker role instance that can be use by the applications to render the image and transform the format of the image. The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" > <WorkerRole name="CloudRayWorkerRole" vmsize="Small"> <Imports> </Imports> <ConfigurationSettings> <Setting name="DataConnectionString" /> </ConfigurationSettings> <LocalResources> <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" /> </LocalResources> </WorkerRole> </ServiceDefinition> The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1. The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2. PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3. DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4. The JPG file is uploaded from local storage to the images blob container. 5. A message is placed on the images queue to indicate a new image is available for download. 6. The job message is deleted from the job queue. 7. The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() { // Set environment variables string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation); string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation); LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder"); string localStorageRootPath = rayStorage.RootPath; JobQueue jobQueue = new JobQueue("renderjobs"); JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs"); CloudRayBlob sceneBlob = new CloudRayBlob("scenes"); CloudRayBlob imageBlob = new CloudRayBlob("images"); RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource(); Frames = 0; while (true) { // Get the render job from the queue CloudQueueMessage jobMsg = jobQueue.Get(); if (jobMsg != null) { // Get the file details string sceneFile = jobMsg.AsString; string tgaFile = sceneFile.Replace(".pi", ".tga"); string jpgFile = sceneFile.Replace(".pi", ".jpg"); string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile); string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile); string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile); // Copy the scene file to local storage sceneBlob.DownloadFile(sceneFilePath); // Run the ray tracer. string polyrayArguments = string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath); Process polyRayProcess = new Process(); polyRayProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath); polyRayProcess.StartInfo.Arguments = polyrayArguments; polyRayProcess.Start(); polyRayProcess.WaitForExit(); // Convert the image string dtaArguments = string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath)); Process dtaProcess = new Process(); dtaProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath); dtaProcess.StartInfo.Arguments = dtaArguments; dtaProcess.Start(); dtaProcess.WaitForExit(); // Upload the image to blob storage imageBlob.UploadFile(jpgFilePath); // Add a download job. downloadQueue.Add(jpgFile); // Delete the render job message jobQueue.Delete(jobMsg); Frames++; } else { Thread.Sleep(1000); } // Log the worker role activity. roleLifecycleDataSource.Alive ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames); } } Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage. public class RoleLifecycle : TableServiceEntity { public string ServerName { get; set; } public string Status { get; set; } public DateTime StartTime { get; set; } public DateTime EndTime { get; set; } public long SecondsRunning { get; set; } public DateTime LastActiveTime { get; set; } public int Frames { get; set; } public string Comment { get; set; } public RoleLifecycle() { } public RoleLifecycle(string roleName) { PartitionKey = roleName; RowKey = Utils.GetAscendingRowKey(); Status = "Started"; StartTime = DateTime.UtcNow; LastActiveTime = StartTime; EndTime = StartTime; SecondsRunning = 0; Frames = 0; } } A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="16" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames. Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation. With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes. At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="256" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames. Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds. We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes. The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in distributing the load across the 256 worker role instances. The first 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each. Completed Animation I’ve uploaded the completed animation to YouTube, a low resolution preview is shown below. Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc Effective Use of Resources According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes CPU to render, this works out at 152 hours of compute time, rounded up to the nearest hour. As the usage for the worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been more optimal to start the application with maybe 200 worker roles, and utilized the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively. Grid Computing Scenarios Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective. · Windows Azure can provide massive compute power, on demand, in a matter of minutes. · The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution. · Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget. · No charges for inbound data transfer makes the uploading of large data sets to Windows Azure Storage services cost effective. (Transaction charges still apply.) Tips for using Windows Azure for Grid Computing Scenarios I found the implementation of a render farm using Windows Azure a fairly simple scenario to implement. I was impressed by ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes, in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure. · Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective. The design I have used in this scenario could easily scale to many thousands of worker role instances. · Windows Azure accounts are typically limited to 20 cores. If you need to use more than this, a call to support and a credit card check will be required. · Be aware of how the billing model works. You will be charged for worker role instances for the full clock our in which the instance is deployed. Schedule the workload to start just after the clock hour has started. · Monitor the utilization of the resources you are provisioning, ensure that you are not paying for worker roles that are idle. · If you are deploying third party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective. · Third party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a startup task and possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles. · Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored, if the scaling algorithms are not optimal it could get very expensive!

Read the article

Part 2–Load Testing In The Cloud

- by Tarun Arora

Welcome to Part 2, In Part 1 we discussed the advantages of creating a Test Rig in the cloud, the Azure edge and the Test Rig Topology we want to get to. In Part 2, Let’s start by understanding the components of Azure we’ll be making use of followed by manually putting them together to create the test rig, so… let’s get down dirty start setting up the Test Rig. What Components of Azure will I be using for building the Test Rig in the Cloud? To run the Test Agents we’ll make use of Windows Azure Compute and to enable communication between Test Controller and Test Agents we’ll make use of Windows Azure Connect. Azure Connect The Test Controller is on premise and the Test Agents are in the cloud (How will they talk?). To enable communication between the two, we’ll make use of Windows Azure Connect. With Windows Azure Connect, you can use a simple user interface to configure IPsec protected connections between computers or virtual machines (VMs) in your organization’s network, and roles running in Windows Azure. With this you can now join Windows Azure role instances to your domain, so that you can use your existing methods for domain authentication, name resolution, or other domain-wide maintenance actions. For more details refer to an overview of Windows Azure connect. A very useful video explaining everything you wanted to know about Windows Azure connect. Azure Compute Windows Azure compute provides developers a platform to host and manage applications in Microsoft’s data centres across the globe. A Windows Azure application is built from one or more components called ‘roles.’ Roles come in three different types: Web role, Worker role, and Virtual Machine (VM) role, we’ll be using the Worker role to set up the Test Agents. A very nice blog post discussing the difference between the 3 role types. Developers are free to use the .NET framework or other software that runs on Windows with the Worker role or Web role. Developers can also create applications using languages such as PHP and Java. More on Windows Azure Compute. Each Windows Azure compute instance represents a virtual server... Virtual Machine Size CPU Cores Memory Cost Per Hour Extra Small Shared 768 MB $0.04 Small 1 1.75 GB $0.12 Medium 2 3.50 GB $0.24 Large 4 7.00 GB $0.48 Extra Large 8 14.00 GB $0.96 You might want to review the Windows Azure Pricing FAQ. Let’s Get Started building the Test Rig… Configuration Machine Role Comments VM – 1 Domain Controller for Playpit.com On Premise VM – 2 TFS, Test Controller On Premise VM – 3 Test Agent Cloud In this blog post I would assume that you have the domain, Team Foundation Server and Test Controller Installed and set up already. If not, please refer to the TFS 2010 Installation Guide and this walkthrough on MSDN to set up your Test Controller. You can also download a preconfigured TFS 2010 VM from Brian Keller's blog, Brian also has some great hands on Labs on TFS 2010 that you may want to explore. I. Lets start building VM – 3: The Test Agent Download the Windows Azure SDK and Tools Open Visual Studio and create a new Windows Azure Project using the Cloud Template Choose the Worker Role for reasons explained in the earlier post The WorkerRole.cs implements the Run() and OnStart() methods, no code changes required. You should be able to compile the project and run it in the compute emulator (The compute emulator should have been installed as part of the Windows Azure Toolkit) on your local machine. We will only be making changes to WindowsAzureProject, open ServiceDefinition.csdef. Ensure that the vmsize is small (remember the cost chart above). Import the “Connect” module. I am importing the Connect module because I need to join the Worker role VM to the Playpit domain. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="WindowsAzureProject2" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"> <WorkerRole name="WorkerRole1" vmsize="Small"> <Imports> <Import moduleName="Diagnostics" /> <Import moduleName="Connect"/> </Imports> </WorkerRole> </ServiceDefinition> Go to the ServiceConfiguration.Cloud.cscfg and note that settings with key ‘Microsoft.WindowsAzure.Plugins.Connect.%%%%’ have been added to the configuration file. This is because you decided to import the connect module. See the config below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="WindowsAzureProject2" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="WorkerRole1"> <Instances count="1" /> <ConfigurationSettings> <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="UseDevelopmentStorage=true" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.ActivationToken" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Refresh" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.WaitForConnectivity" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Upgrade" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.EnableDomainJoin" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainFQDN" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainControllerFQDN" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainAccountName" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainPassword" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainOU" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Administrators" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainSiteName" value="" /> </ConfigurationSettings> </Role> </ServiceConfiguration> Let’s go step by step and understand all the highlighted parameters and where you can find the values for them. osFamily – By default this is set to 1 (Windows Server 2008 SP2). Change this to 2 if you want the Windows Server 2008 R2 operating system. The Advantage of using osFamily = “2” is that you get Powershell 2.0 rather than Powershell 1.0. In Powershell 2.0 you could simply use “powershell -ExecutionPolicy Unrestricted ./myscript.ps1” and it will work while in Powershell 1.0 you will have to change the registry key by including the following in your command file “reg add HKLM\Software\Microsoft\PowerShell\1\ShellIds\Microsoft.PowerShell /v ExecutionPolicy /d Unrestricted /f” before you can execute any power shell. The other reason you might want to move to os2 is if you wanted IIS 7.5. Activation Token – To enable communication between the on premise machine and the Windows Azure Worker role VM both need to have the same token. Log on to Windows Azure Management Portal, click on Connect, click on Get Activation Token, this should give you the activation token, copy the activation token to the clipboard and paste it in the configuration file. Note – Later in the blog I’ll be showing you how to install connect on the on premise machine. EnableDomainJoin – Set the value to true, ofcourse we want to join the on windows azure worker role VM to the domain. DomainFQDN, DomainControllerFQDN, DomainAccountName, DomainPassword, DomainOU, Administrators – This information is specific to your domain. I have extracted this information from the ‘service manager’ and ‘Active Directory Users and Computers’. Also, i created a new Domain-OU namely ‘CloudInstances’ so all my cloud instances joined to my domain show up here, this is optional. You can encrypt the DomainPassword – refer to the instructions here. Or hold fire, I’ll be covering that when i come to certificates and encryption in the coming section. Now once you have filled all this information up, the configuration file should look something like below, <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="WindowsAzureProject2" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="2" osVersion="*"> <Role name="WorkerRole1"> <Instances count="1" /> <ConfigurationSettings> <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="UseDevelopmentStorage=true" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.ActivationToken" value="45f55fea-f194-4fbc-b36e-25604faac784" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Refresh" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.WaitForConnectivity" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Upgrade" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.EnableDomainJoin" value="true" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainFQDN" value="play.pit.com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainControllerFQDN" value="WIN-KUDQMQFGQOL.play.pit.com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainAccountName" value="playpit\Administrator" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainPassword" value="************************" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainOU" value="OU=CloudInstances, DC=Play, DC=Pit, DC=com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Administrators" value="Playpit\Administrator" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainSiteName" value="" /> </ConfigurationSettings> </Role> </ServiceConfiguration> Next we will be enabling the Remote Desktop module in to the ServiceDefinition.csdef, we could make changes manually or allow a beautiful wizard to help us make changes. I prefer the second option. So right click on the Windows Azure project and choose Publish Now once you get the publish wizard, if you haven’t already you would be asked to import your Windows Azure subscription, this is simply the Msdn subscription activation key xml. Once you have done click Next to go to the Settings page and check ‘Enable Remote Desktop for all roles’. As soon as you do that you get another pop up asking you the details for the user that you would be logging in with (make sure you enter a reasonable expiry date, you do not want the user account to expire today). Notice the more information tag at the bottom, click that to get access to the certificate section. See screen shot below. From the drop down select the option to create a new certificate In the pop up window enter the friendly name for your certificate. In my case I entered ‘WAC – Test Rig’ and click ok. This will create a new certificate for you. Click on the view button to see the certificate details. Do you see the Thumbprint, this is the value that will go in the config file (very important). Now click on the Copy to File button to copy the certificate, we will need to import the certificate to the windows Azure Management portal later. So, make sure you save it a safe location. Click Finish and enter details of the user you would like to create with permissions for remote desktop access, once you have entered the details on the ‘Remote desktop configuration’ screen click on Ok. From the Publish Windows Azure Wizard screen press Cancel. Cancel because we don’t want to publish the role just yet and Yes because we want to save all the changes in the config file. Now if you go to the ServiceDefinition.csdef file you will see that the RemoteAccess and RemoteForwarder roles have been imported for you. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="WindowsAzureProject2" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"> <WorkerRole name="WorkerRole1" vmsize="Small"> <Imports> <Import moduleName="Diagnostics" /> <Import moduleName="Connect" /> <Import moduleName="RemoteAccess" /> <Import moduleName="RemoteForwarder" /> </Imports> </WorkerRole> </ServiceDefinition> Now go to the ServiceConfiguration.Cloud.cscfg file and you see a whole bunch for setting “Microsoft.WindowsAzure.Plugins.RemoteAccess.%%%” values added for you. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="WindowsAzureProject2" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="2" osVersion="*"> <Role name="WorkerRole1"> <Instances count="1" /> <ConfigurationSettings> <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="UseDevelopmentStorage=true" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.ActivationToken" value="45f55fea-f194-4fbc-b36e-25604faac784" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Refresh" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.WaitForConnectivity" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Upgrade" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.EnableDomainJoin" value="true" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainFQDN" value="play.pit.com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainControllerFQDN" value="WIN-KUDQMQFGQOL.play.pit.com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainAccountName" value="playpit\Administrator" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainPassword" value="************************" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainOU" value="OU=CloudInstances, DC=Play, DC=Pit, DC=com" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.Administrators" value="Playpit\Administrator" /> <Setting name="Microsoft.WindowsAzure.Plugins.Connect.DomainSiteName" value="" /> <Setting name="Microsoft.WindowsAzure.Plugins.RemoteAccess.Enabled" value="true" /> <Setting name="Microsoft.WindowsAzure.Plugins.RemoteAccess.AccountUsername" value="Administrator" /> <Setting name="Microsoft.WindowsAzure.Plugins.RemoteAccess.AccountEncryptedPassword" value="MIIBnQYJKoZIhvcNAQcDoIIBjjCCAYoCAQAxggFOMIIBSgIBADAyMB4xHDAaBgNVBAMME1dpbmRvd 3MgQXp1cmUgVG9vbHMCEGa+B46voeO5T305N7TSG9QwDQYJKoZIhvcNAQEBBQAEggEABg4ol5Xol66Ip6QKLbAPWdmD4ae ADZ7aKj6fg4D+ATr0DXBllZHG5Umwf+84Sj2nsPeCyrg3ZDQuxrfhSbdnJwuChKV6ukXdGjX0hlowJu/4dfH4jTJC7sBWS AKaEFU7CxvqYEAL1Hf9VPL5fW6HZVmq1z+qmm4ecGKSTOJ20Fptb463wcXgR8CWGa+1w9xqJ7UmmfGeGeCHQ4QGW0IDSBU6ccg vzF2ug8/FY60K1vrWaCYOhKkxD3YBs8U9X/kOB0yQm2Git0d5tFlIPCBT2AC57bgsAYncXfHvPesI0qs7VZyghk8LVa9g5IqaM Cp6cQ7rmY/dLsKBMkDcdBHuCTAzBgkqhkiG9w0BBwEwFAYIKoZIhvcNAwcECDRVifSXbA43gBApNrp40L1VTVZ1iGag+3O1" /> <Setting name="Microsoft.WindowsAzure.Plugins.RemoteAccess.AccountExpiration" value="2012-11-27T23:59:59.0000000+00:00" /> <Setting name="Microsoft.WindowsAzure.Plugins.RemoteForwarder.Enabled" value="true" /> </ConfigurationSettings> <Certificates> <Certificate name="Microsoft.WindowsAzure.Plugins.RemoteAccess.PasswordEncryption" thumbprint="AA23016CF0BDFC344400B5B82706B608B92E4217" thumbprintAlgorithm="sha1" /> </Certificates> </Role> </ServiceConfiguration> Okay let’s look at them one at a time, Enabled - Yes, we would like to enable Remote Access. AccountUserName – This is the user name you entered while you were on the publish windows azure role screen, as detailed above. AccountEncrytedPassword – Try and decode that, the certificate is used to encrypt the password you specified for the user account. Remember earlier i said, either use the instructions or wait and i’ll be showing you encryption, now the user account i am using for rdp has the same password as my domain password, so i can simply copy the value of the AccountEncryptedPassword to the DomainPassword as well. AccountExpiration – This is the expiration as you specified in the wizard earlier, make sure your account does not expire today. Remote Forwarder – Check out the documentation, below is how I understand it, -- One role in an application that implements a remote desktop connection must import the RemoteForwarder module. The two modules work together to enable the remote desktop connections to role instances. -- If you have multiple roles defined in the service model, it does not matter which role you add the RemoteForwarder module to, but you must add it to only one of the role definitions. Certificate – Remember the certificate thumbprint from the wizard, the on premise machine and windows azure role machine that need to speak to each other must have the same thumbprint. More on that when we install Windows Azure connect Endpoints on the on premise machine. As i said earlier, in this blog post, I’ll be showing you the manual process so i won’t be scripting any star up tasks to install the test agent or register the test agent with the TFS Server. I’ll be showing you all this cool stuff in the next blog post, that’s because it’s important to understand the manual side of it, it becomes easier for you to troubleshoot in case something fails. Having said that, the changes we have made are sufficient to spin up the Windows Azure Worker Role aka Test Agent VM, have it connected with the play.pit.com domain and have remote access enabled on it. Before we deploy the Test Agent VM we need to set up Windows Azure Connect on the TFS Server. II. Windows Azure Connect: Setting up Connect on VM – 2 i.e. TFS & Test Controller Glad you made it so far, now to enable communication between the on premise TFS/Test Controller and Azure-ed Test Agent we need to enable communication. We have set up the Azure connect module in the Test Agent configuration, now the connect end points need to be enabled on the on premise machines, let’s have a look at how we can do this. Log on to VM – 2 running the TFS Server and Test Controller Log on to the Windows Azure Management Portal and click on Virtual Network Click on Virtual Network, if you already have a subscription you should see the below screen shot, if not, you would be asked to complete the subscription first Click on Install Local Endpoints from the top left on the panel and you get a url appended with a token id in it, remember the token i showed you earlier, in theory the token you get here should match the token you added to the Test Agent config file. Copy the url to the clip board and paste it in IE explorer (important, the installation at present only works out of IE and you need to have cookies enabled in order to complete the installation). As stated in the pop up, you can NOT download and run the software later, you need to run it as is, since it contains a token. Once the installation completes you should see the Windows Azure connect icon in the system tray. Right click the Azure Connect icon, choose Diagnostics and refer to this link for diagnostic detail terminology. NOTE – Unfortunately I could not see the Windows Azure connect icon in the system tray, a bit of binging with Google revealed that the azure connect icon is only shown when the ‘Windows Azure Connect Endpoint’ Service is started. So go to services.msc and make sure that the service is started, if not start it, unfortunately again, the service did not start for me on a manual start and i realised that one of the dependant services was disabled, you can look at the service dependencies and start them and then start windows azure connect. Bottom line, you need to start Windows Azure connect service before you can proceed. Please refer here on MSDN for more on Troubleshooting Windows Azure connect. (Follow the next step as well) Now go back to the Windows Azure Management Portal and from Groups and Roles create a new group, lets call it ‘Test Rig’. Make sure you add the VM – 2 (the TFS Server VM where you just installed the endpoint). Now if you go back to the Azure Connect icon in the system tray and click ‘Refresh Policy’ you will notice that the disconnected status of the icon should change to ready for connection. III. Importing Certificate in to Windows Azure Management Portal But before that you need to import the certificate you created in Step I in to the Windows Azure Management Portal. Log on to the Windows Azure Management Portal and click on ‘Hosted Services, Storage Accounts & CDN’ and then ‘Management Certificates’ followed by Add Certificates as shown in the screen shot below Browse to the location where you saved the certificate earlier, remember… Refer to Step I in case you forgot. Now you should be able to see the imported certificate here, make sure the thumbprint of the certificate matches the one you inserted in the config files IV. Publish Windows Azure Worker Role aka Test Agent Having completed I, II and III, you are ready to publish the Test Agent VM – 3 to the cloud. Go to Visual Studio and right click the Windows Azure project and select Publish. Verify the infomration in the wizard, from the advanced settings tab, you can also enabled capture of intellitrace or profiling information. Click Next and Click Publish! From the view menu bar select the Windows Azure Activity Log window. Now you should be able to see the deployment progress in real time. In the Windows Azure Management Portal, you should also be able to see the progress of creation of a new Worker Role. Once the deployment is complete you should be able to RDP (go to run prompt type mstsc and in the pop up the machine name) in to the Test Agent Worker Role VM from the Playpit network using the domain admin user account. In case you are unable to log in to the Test Agent using the domain admin user account it means the process of joining the Test Agent to the domain has failed! But the good news is, because you imported the connect module, you can connect to the Test Agent machine using Windows Azure Management Portal and troubleshoot the reason for failure, you will be able to log in with the user name and password you specified in the config file for the keys ‘RemoteAccess.AccountUsername, RemoteAccess.EncryptedPassword (just that enter the password unencrypted)’, fix it or manually join the machine to the domain. Once you have managed to Join the Test Agent VM to the Domain move to the next step. So, log in to the Test Agent Worker Role VM with the Playpit Domain Administrator and verify that you can log in, the machine is connected to the domain and the connect service is successfully running. If yes, give your self a pat on the back, you are 80% mission accomplished! Go to the Windows Azure Management Portal and click on Virtual Network, click on Groups and Roles and click on Test Rig, click Edit Group, the edit the Test Rig group you created earlier. In the Connect to section, click on Add to select the worker role you have just deployed. Also, check the ‘Allow connections between endpoints in the group’ with this you will enable to communication between test controller and test agents and test agents/test agents. Click Save. Now, you are ready to deploy the Test Agent software on the Worker Role Test Agent VM and configure it to work with the Test Controller. V. Configuring VM – 3: Installing Test Agent and Associating Test Agent to Controller Log in to the Worker Role Test Agent VM that you have just successfully deployed, make sure you log in with the domain administrator account. Download the All Agents software from MSDN, ‘en_visual_studio_agents_2010_x86_x64_dvd_509679.iso’, extract the iso and navigate to where you have extracted the iso. In my case, i have extracted the iso to “C:\Resources\Temp\VsAgentSetup”. Open the Test Agent folder and double click on setup.exe. Once you have installed the Test Agent you should reach the configuration window. If you face any issues installing TFS Test Agent on the VM, refer to the walkthrough on MSDN. Once you have successfully installed the Test Agent software you will need to configure the test agent. Right click the test agent configuration tool and run as a different user. i.e. an Administrator. This is really to run the configuration wizard with elevated privileges (you might have UAC block something's otherwise). In the run options, you can select ‘service’ you do not need to run the agent as interactive un less you are running coded UI tests. I have specified the domain administrator to connect to the TFS Test Controller. In real life, i would never do that, i would create a separate test user service account for this purpose. But for the blog post, we are using the most powerful user so that any policies or restrictions don’t block you. Click the Apply Settings button and you should be all green! If not, the summary usually gives helpful error messages that you can resolve and proceed. As per my experience, you may run in to either a permission or a firewall blocking communication issue. And now the moment of truth! Go to VM –2 open up Visual Studio and from the Test Menu select Manage Test Controller Mission Accomplished! You should be able to see the Test Agent that you have just configured here, VI. Creating and Running Load Tests on your brand new Azure-ed Test Rig I have various blog posts on Performance Testing with Visual Studio Ultimate, you can follow the links and videos below, Blog Posts: - Part 1 – Performance Testing using Visual Studio 2010 Ultimate - Part 2 – Performance Testing using Visual Studio 2010 Ultimate - Part 3 – Performance Testing using Visual Studio 2010 Ultimate Videos: - Test Tools Configuration & Settings in Visual Studio - Why & How to Record Web Performance Tests in Visual Studio Ultimate - Goal Driven Load Testing using Visual Studio Ultimate Now that you have created your load tests, there is one last change you need to make before you can run the tests on your Azure Test Rig, create a new Test settings file, and change the Test Execution method to ‘Remote Execution’ and select the test controller you have configured the Worker Role Test Agent against in our case VM – 2 So, go on, fire off a test run and see the results of the test being executed on the Azur-ed Test Rig. Review and What’s next? A quick recap of the benefits of running the Test Rig in the cloud and what i will be covering in the next blog post AND I would love to hear your feedback! Advantages Utilizing the power of Azure compute to run a heavy virtual user load. Benefiting from the Azure flexibility, destroy Test Agents when not in use, takes < 25 minutes to spin up a new Test Agent. Most important test Network Latency, (network latency and speed of connection are two different things – usually network latency is very hard to test), by placing the Test Agents in Microsoft Data centres around the globe, one can actually test the lag in transferring the bytes not because of a slow connection but because the page has been requested from the other side of the globe. Next Steps The process of spinning up the Test Agents in windows Azure is not 100% automated. I am working on the Worker process and power shell scripts to make the role deployment, unattended install of test agent software and registration of the test agent to the test controller automated. In the next blog post I will show you how to make the complete process unattended and automated. Remember to subscribe to http://feeds.feedburner.com/TarunArora. Hope you enjoyed this post, I would love to hear your feedback! If you have any recommendations on things that I should consider or any questions or feedback, feel free to leave a comment. See you in Part III. Share this post : CodeProject

Read the article

Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article

Is there anyway to get the cyberphunks pidgin OTR plugin to be less verbose?

- by Matt

Is there anyway to get the cyberphunk piding OTR plugin to be less verbose? Sometimes I get hundreds of lines notifying me of the same thing... (6:40:56 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:40:57 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:40:57 PM) The last message to [email protected] was resent. (6:40:57 PM) Error setting up private conversation: Malformed message received (6:40:58 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:40:58 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:40:58 PM) The last message to [email protected] was resent. (6:40:59 PM) Error setting up private conversation: Malformed message received (6:40:59 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:41:00 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:41:00 PM) The last message to [email protected] was resent. (6:41:00 PM) Error setting up private conversation: Malformed message received (6:41:01 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:41:01 PM) Attempting to refresh the private conversation with [email protected]/Gaim939DD745... (6:41:01 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:41:01 PM) The last message to [email protected] was resent. (6:41:02 PM) OTR Error: You transmitted an unreadable encrypted message. (6:41:02 PM) Error setting up private conversation: Malformed message received (6:41:02 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:41:02 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:41:02 PM) The last message to [email protected] was resent. (6:41:03 PM) Error setting up private conversation: Malformed message received (6:41:03 PM) OTR Error: You transmitted an unreadable encrypted message. (6:41:03 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (6:41:03 PM) The last message to [email protected] was resent. (6:41:03 PM) OTR Error: You sent encrypted data to anoynmous, who wasn't expecting it. (6:41:04 PM) Error setting up private conversation: Malformed message received (6:41:09 PM) Attempting to refresh the private conversation with [email protected]/Adium39B22966... (6:41:11 PM) Unverified conversation with [email protected]/Adium39B22966 started. (6:41:11 PM) The last message to [email protected] was resent. (6:41:11 PM) OTR Error: You transmitted an unreadable encrypted message. **(6:41:16 PM) Anonymous: yea** (6:41:20 PM) [email protected]/91A31EC5: yarrr (**6:41:25 PM) Anonymous: sup? (6:41:29 PM) [email protected]/91A31EC5: nothing (6:41:33 PM) [email protected]/91A31EC5: just speaking like a pirate (6:41:51 PM) Anonymous: word (7:13:47 PM) Anonymous: how late you workin? (9:18:12 PM) [email protected]/91A31EC5: not too late** (9:18:13 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:13 PM) Unverified conversation with [email protected]/Gaim939DD745 started. (9:18:13 PM) The last message to [email protected] was resent. (9:18:14 PM) Error setting up private conversation: Malformed message received (9:18:14 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:14 PM) Unverified conversation with [email protected]/Adium39B22966 started. (9:18:14 PM) The last message to [email protected] was resent. (9:18:15 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:15 PM) Unverified conversation with [email protected]/Gaim939DD745 started. (9:18:15 PM) The last message to [email protected] was resent. (9:18:15 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:16 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (9:18:16 PM) The last message to [email protected] was resent. (9:18:16 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:17 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (9:18:17 PM) The last message to [email protected] was resent. (9:18:17 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:18 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (9:18:18 PM) The last message to [email protected] was resent. (9:18:18 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:19 PM) Unverified conversation with [email protected]/Adium39B22966 started. (9:18:19 PM) The last message to [email protected] was resent. (9:18:19 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:20 PM) Successfully refreshed the unverified conversation with [email protected]/Adium39B22966. (9:18:20 PM) The last message to [email protected] was resent. (9:18:20 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:20 PM) Attempting to refresh the private conversation with [email protected]/Gaim939DD745... (9:18:21 PM) Unverified conversation with [email protected]/Gaim939DD745 started. (9:18:21 PM) The last message to [email protected] was resent. (9:18:21 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:21 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:21 PM) Successfully refreshed the unverified conversation with [email protected]/Gaim939DD745. (9:18:21 PM) The last message to [email protected] was resent. (9:18:21 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:22 PM) We received an unreadable encrypted message from [email protected]. (9:18:23 PM) The last message to [email protected] was resent. (9:18:24 PM) OTR Error: You transmitted an unreadable encrypted message. (9:18:25 PM) We received an unreadable encrypted message from [email protected]. (9:18:26 PM) Unverified conversation with [email protected]/Adium39B22966 started. (9:18:26 PM) The last message to [email protected] was resent.

Read the article

Optimizing MySQL for small VPS

- by Chris M

I'm trying to optimize my MySQL config for a verrry small VPS. The VPS is also running NGINX/PHP-FPM and Magento; all with a limit of 250MB of RAM. This is an output of MySQL Tuner... -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.41-3ubuntu12.8 [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in MyISAM tables: 1M (Tables: 14) [--] Data in InnoDB tables: 29M (Tables: 301) [--] Data in MEMORY tables: 1M (Tables: 17) [!!] Total fragmented tables: 301 -------- Security Recommendations ------------------------------------------- [OK] All database users have passwords assigned -------- Performance Metrics ------------------------------------------------- [--] Up for: 2d 11h 14m 58s (1M q [8.038 qps], 33K conn, TX: 2B, RX: 618M) [--] Reads / Writes: 83% / 17% [--] Total buffers: 122.0M global + 8.6M per thread (100 max threads) [!!] Maximum possible memory usage: 978.2M (404% of installed RAM) [OK] Slow queries: 0% (37/1M) [OK] Highest usage of available connections: 6% (6/100) [OK] Key buffer size / total MyISAM indexes: 32.0M/282.0K [OK] Key buffer hit rate: 99.7% (358K cached / 1K reads) [OK] Query cache efficiency: 83.4% (1M cached / 1M selects) [!!] Query cache prunes per day: 48301 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 144K sorts) [OK] Temporary tables created on disk: 13% (27K on disk / 203K total) [OK] Thread cache hit rate: 99% (6 created / 33K connections) [!!] Table cache hit rate: 0% (32 open / 51K opened) [OK] Open file limit used: 1% (20/1K) [OK] Table locks acquired immediately: 99% (1M immediate / 1M locks) [!!] InnoDB data size / buffer pool: 29.2M/8.0M -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance Reduce your overall MySQL memory footprint for system stability Enable the slow query log to troubleshoot bad queries Increase table_cache gradually to avoid file descriptor limits Variables to adjust: *** MySQL's maximum memory usage is dangerously high *** *** Add RAM before increasing MySQL buffer variables *** query_cache_size (> 64M) table_cache (> 32) innodb_buffer_pool_size (>= 29M) and this is the config. # # The MySQL database server configuration file. # # You can copy this to one of: # - "/etc/mysql/my.cnf" to set global options, # - "~/.my.cnf" to set user-specific options. # # One can use all long options that the program supports. # Run program with --help to get a list of available options and with # --print-defaults to see which it would actually understand and use. # # For explanations see # http://dev.mysql.com/doc/mysql/en/server-system-variables.html # This will be passed to all mysql clients # It has been reported that passwords should be enclosed with ticks/quotes # escpecially if they contain "#" chars... # Remember to edit /etc/mysql/debian.cnf when changing the socket location. [client] port = 3306 socket = /var/run/mysqld/mysqld.sock # Here is entries for some specific programs # The following values assume you have at least 32M ram # This was formally known as [safe_mysqld]. Both versions are currently parsed. [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] # # * Basic Settings # # # * IMPORTANT # If you make changes to these settings and your system uses apparmor, you may # also need to also adjust /etc/apparmor.d/usr.sbin.mysqld. # user = mysql socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp skip-external-locking # # Instead of skip-networking the default is now to listen only on # localhost which is more compatible and is not less secure. bind-address = 127.0.0.1 # # * Fine Tuning # key_buffer = 32M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 sort_buffer_size = 4M read_buffer_size = 4M myisam_sort_buffer_size = 16M # This replaces the startup script and checks MyISAM tables if needed # the first time they are touched myisam-recover = BACKUP max_connections = 100 table_cache = 32 tmp_table_size = 128M #thread_concurrency = 10 # # * Query Cache Configuration # #query_cache_limit = 1M query_cache_type = 1 query_cache_size = 64M # # * Logging and Replication # # Both location gets rotated by the cronjob. # Be aware that this log type is a performance killer. # As of 5.1 you can enable the log at runtime! #general_log_file = /var/log/mysql/mysql.log #general_log = 1 log_error = /var/log/mysql/error.log # Here you can see queries with especially long duration #log_slow_queries = /var/log/mysql/mysql-slow.log #long_query_time = 2 #log-queries-not-using-indexes # # The following can be used as easy to replay backup logs or for replication. # note: if you are setting up a replication slave, see README.Debian about # other settings you may need to change. #server-id = 1 #log_bin = /var/log/mysql/mysql-bin.log expire_logs_days = 10 max_binlog_size = 100M #binlog_do_db = include_database_name #binlog_ignore_db = include_database_name # # * InnoDB # # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/. # Read the manual for more InnoDB related options. There are many! # # * Security Features # # Read the manual, too, if you want chroot! # chroot = /var/lib/mysql/ # # For generating SSL certificates I recommend the OpenSSL GUI "tinyca". # # ssl-ca=/etc/mysql/cacert.pem # ssl-cert=/etc/mysql/server-cert.pem # ssl-key=/etc/mysql/server-key.pem [mysqldump] quick quote-names max_allowed_packet = 16M [mysql] #no-auto-rehash # faster start of mysql but no tab completition [isamchk] key_buffer = 16M # # * IMPORTANT: Additional settings that can override those from this file! # The files must end with '.cnf', otherwise they'll be ignored. # !includedir /etc/mysql/conf.d/ The site contains 1 wordpress site,so lots of MYISAM but mostly static content as its not changing all that often (A wordpress cache plugin deals with this). And the Magento Site which consists of a lot of InnoDB tables, some MyISAM and some INMEMORY. The "read" side seems to be running pretty well with a mass of optimizations I've used on Magento, the NGINX setup and PHP-FPM + XCACHE. I'd love to have a kick in the right direction with the MySQL config so I'm not blindly altering it based on the MySQLTuner without understanding what I'm changing. Thanks

Read the article

Server Memory with Magento

- by Mohamed Elgharabawy

I have a cloud server with the following specifications: 2vCPUs 4G RAM 160GB Disk Space Network 400Mb/s System Image: Ubuntu 12.04 LTS I am only running Magento CE 1.7.0.2 on this server. Nothing else. Usually, the server has a loading time of 4-5 seconds. Recently, this has dropped to over 30 seconds and sometimes the server just goes away and I get HTTP error reports to my email stating that HTTP requests took more than 20000ms. Running top command and sorting them returns the following: top - 15:29:07 up 3:40, 1 user, load average: 28.59, 25.95, 22.91 Tasks: 112 total, 30 running, 82 sleeping, 0 stopped, 0 zombie Cpu(s): 90.2%us, 9.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.2%st PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 31901 www-data 20 0 360m 71m 5840 R 7 1.8 1:39.51 apache2 32084 www-data 20 0 362m 72m 5548 R 7 1.8 1:31.56 apache2 32089 www-data 20 0 348m 59m 5660 R 7 1.5 1:41.74 apache2 32295 www-data 20 0 343m 54m 5532 R 7 1.4 2:00.78 apache2 32303 www-data 20 0 354m 65m 5260 R 7 1.6 1:38.76 apache2 32304 www-data 20 0 346m 56m 5544 R 7 1.4 1:41.26 apache2 32305 www-data 20 0 348m 59m 5640 R 7 1.5 1:50.11 apache2 32291 www-data 20 0 358m 69m 5256 R 6 1.7 1:44.26 apache2 32517 www-data 20 0 345m 56m 5532 R 6 1.4 1:45.56 apache2 30473 www-data 20 0 355m 66m 5680 R 6 1.7 2:00.05 apache2 32093 www-data 20 0 352m 63m 5848 R 6 1.6 1:53.23 apache2 32302 www-data 20 0 345m 56m 5512 R 6 1.4 1:55.87 apache2 32433 www-data 20 0 346m 57m 5500 S 6 1.4 1:31.58 apache2 32638 www-data 20 0 354m 65m 5508 R 6 1.6 1:36.59 apache2 32230 www-data 20 0 347m 57m 5524 R 6 1.4 1:33.96 apache2 32231 www-data 20 0 355m 66m 5512 R 6 1.7 1:37.47 apache2 32233 www-data 20 0 354m 64m 6032 R 6 1.6 1:59.74 apache2 32300 www-data 20 0 355m 66m 5672 R 6 1.7 1:43.76 apache2 32510 www-data 20 0 347m 58m 5512 R 6 1.5 1:42.54 apache2 32521 www-data 20 0 348m 59m 5508 R 6 1.5 1:47.99 apache2 32639 www-data 20 0 344m 55m 5512 R 6 1.4 1:34.25 apache2 32083 www-data 20 0 345m 56m 5696 R 5 1.4 1:59.42 apache2 32085 www-data 20 0 347m 58m 5692 R 5 1.5 1:42.29 apache2 32293 www-data 20 0 353m 64m 5676 R 5 1.6 1:52.73 apache2 32301 www-data 20 0 348m 59m 5564 R 5 1.5 1:49.63 apache2 32528 www-data 20 0 351m 62m 5520 R 5 1.6 1:36.11 apache2 31523 mysql 20 0 3460m 576m 8288 S 5 14.4 2:06.91 mysqld 32002 www-data 20 0 345m 55m 5512 R 5 1.4 2:01.88 apache2 32080 www-data 20 0 357m 68m 5512 S 5 1.7 1:31.30 apache2 32163 www-data 20 0 347m 58m 5512 S 5 1.5 1:58.68 apache2 32509 www-data 20 0 345m 56m 5504 R 5 1.4 1:49.54 apache2 32306 www-data 20 0 358m 68m 5504 S 4 1.7 1:53.29 apache2 32165 www-data 20 0 344m 55m 5524 S 4 1.4 1:40.71 apache2 32640 www-data 20 0 345m 56m 5528 R 4 1.4 1:36.49 apache2 31888 www-data 20 0 359m 70m 5664 R 4 1.8 1:57.07 apache2 32511 www-data 20 0 357m 67m 5512 S 3 1.7 1:47.00 apache2 32054 www-data 20 0 357m 68m 5660 S 2 1.7 1:53.10 apache2 1 root 20 0 24452 2276 1232 S 0 0.1 0:01.58 init Moreover, running free -m returns the following: total used free shared buffers cached Mem: 4003 3919 83 0 118 901 -/+ buffers/cache: 2899 1103 Swap: 0 0 0 To investigate this further, I have installed apache buddy, it recommeneded that I need to reduce the maxclient connections. Which I did. I also installed MysqlTuner and it suggests that I need to set my innodb_buffer_pool_size to = 3.0G. However, I cannot do that, since the whole memory is 4G. Here is the output from apache buddy: ### GENERAL REPORT ### Settings considered for this report: Your server's physical RAM: 4003MB Apache's MaxClients directive: 40 Apache MPM Model: prefork Largest Apache process (by memory): 73.77MB [ OK ] Your MaxClients setting is within an acceptable range. Max potential memory usage: 2950.8 MB Percentage of RAM allocated to Apache 73.72 % And this is the output of MySQLTuner: -------- Performance Metrics ------------------------------------------------- [--] Up for: 47m 22s (675K q [237.552 qps], 12K conn, TX: 1B, RX: 300M) [--] Reads / Writes: 45% / 55% [--] Total buffers: 2.1G global + 2.7M per thread (151 max threads) [OK] Maximum possible memory usage: 2.5G (64% of installed RAM) [OK] Slow queries: 0% (0/675K) [OK] Highest usage of available connections: 26% (40/151) [OK] Key buffer size / total MyISAM indexes: 36.0M/18.7M [OK] Key buffer hit rate: 100.0% (245K cached / 105 reads) [OK] Query cache efficiency: 92.5% (500K cached / 541K selects) [!!] Query cache prunes per day: 302886 [OK] Sorts requiring temporary tables: 0% (1 temp sorts / 15K sorts) [!!] Joins performed without indexes: 12135 [OK] Temporary tables created on disk: 25% (8K on disk / 32K total) [OK] Thread cache hit rate: 90% (1K created / 12K connections) [!!] Table cache hit rate: 17% (400 open / 2K opened) [OK] Open file limit used: 12% (123/1K) [OK] Table locks acquired immediately: 100% (196K immediate / 196K locks) [!!] InnoDB buffer pool / data size: 2.0G/3.5G [OK] InnoDB log waits: 0 -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Read this before increasing table_cache over 64: http://bit.ly/1mi7c4C Variables to adjust: query_cache_size ( 64M) join_buffer_size ( 128.0K, or always use indexes with joins) table_cache ( 400) innodb_buffer_pool_size (= 3G) Last but not least, the server still has more than 60% of free disk space. Now, based on the above, I have few questions: Are these numbers normal? Do they make sense? Do I need to upgrade the server? If I don't need to upgrade and my configuration is not correct, how do I optimize it?

Read the article

Add Free Windows Live Apps to Your Website or Blog

- by Matthew Guay

Would you like to use Hotmail, Office Web Apps, Messenger, and more on your website domain? Here’s how you can add Windows Live to your website for free. Microsoft offers a popular suite of online communications products including Hotmail and Messenger. Although Hotmail hasn’t been as popular in recent years as Gmail, it is getting a refresh this summer that might make it an even better email solution. Additionally, the new Office Web Apps offer great compatibility with Office documents. While Skydrive offers 25Gb of free online file storage for all users, so Windows Live can make a great communications solution for your domain. Note: To signup for Windows Live for your domain, you will need to be able to add info to your WordPress.com blog or change Domain settings manually. Getting Started Open the Windows Live Custom Domains page (Link below) to get started adding Windows Live to your domain. Your free Windows Live account will let you create up to 500 accounts, so it’s great for teams and groups that want to have customized email addresses in addition to those who just want an email account for their website. Enter your domain or subdomain you want to add to Windows Live in the box, and then select whether you want to setup Hotmail with this or now. We want to add email to our domain, so select Set up Windows Live Hotmail for my domain and click Continue. You’ll need to sign in with a Windows Live ID to create the account, or choose to create a new Windows Live account associated with your domain. Sign in with your Windows Live ID…this can be a Hotmail, Live Messenger, XBOX Live, Zune ID, or Microsoft.com account. Or, enter your information to create a new Windows Live ID if you selected the second option. Now, review your settings and make sure everything looks correct. Click the I Accept button to setup your account. Your account is now fully setup, but you’ll need to add or edit DNS information on your site. The steps are slightly different depending if your site is hosted on WordPress.com, on your own server, or hosting service. We’ll show you how to do it on either one. First, though, note the information below this box. You’ll see settings for your Mail setup… Security settings… And Messenger integration. Make note of the settings, especially the circled ones, as we’ll need them in the next step. Integrate Windows Live with Your WordPress Blog If the domain you added to Windows Live is for your WordPress blog, login to your WordPress dashboard in a separate browser window or tab. Click the arrow beside Upgrades, and select Domains from the menu. Click the Edit DNS link beside the domain name you’re adding to Windows Live. In the text box on this page, enter the following, replacing Your_info with your code from the Mail Setup box in your Windows Live Dashboard. Note that this is the blurred section in our screenshots. It should be a numerical code like 1234567890.pamx1.hotmail.com. MX 10 Your_info.pamx1.hotmail.com. TXT v=spf1 include:hotmail.com ~all CNAME Your_info domains.live.com. Click Save DNS records, and your settings are saved to WordPress. Note that this will only integrate email with your WordPress account; you cannot integrate Messenger with a domain hosted on WordPress.com. Finally, return to your Windows Live Settings page and click Refresh. If your settings are correct, you’ll now be ready to use Windows Live on your WordPress.com domain. Integrate Windows Live with Your Own Server If your website is hosted on your own server or hosting account, you’ll need to take a few more steps to add Windows Live to your domain. This is fairly easy, but the steps may be different depending on your hosting company or registrar. With some hosts, you may have to contact support to have them add the MX records for you. Our site’s host uses the popular cPanel for website administration, so here’s how we added the MX Entries through cPanel. Login to your website’s cPanel, and select MX Entry under the Mail section. In the text box on this page, enter the following, replacing Your_info with your code from the Mail Setup box in your Windows Live Dashboard. Note that this is the blurred section in our screenshots. It should be a numerical code like 1234567890.pamx1.hotmail.com. MX 10 Your_info.pamx1.hotmail.com. Now, go back to your cPanel home, and select Advanced DNS Zone Editor under Domains. Here, add a TXT record with the following info: Name: yoursite.com. TTL: 3600 TXT Data: v=spf1 include:hotmail.com ~all Click Add Record and your Mail integration data is all configured. To integrate Messenger with your own domain, you’ll have to add an SRV entry to your DNS settings. cPanel doesn’t have an option for this, so we had to contact our site’s hosting company and they added the entry for us. Copy all of the information in the Messenger box and send it to your domain support, and they should be able to add this for you. Alternately, if you don’t want or need Messenger, then you can simply skip this step. Once all of your settings are in place, return to your Windows Live Settings page and click Refresh. If your settings are correct, you’ll now be ready to use Windows Live on your WordPress.com domain. Create a New Email Account On Your Domain Welcome to your new Windows Live admin page! Now you can add email accounts so you and anyone else you want can access Hotmail and the other Windows Live apps with your domain. Click Add to add an account. Enter an account name, which will be the email address of the account, e.g. [email protected]. Then enter the user’s name and a password for the account. By default this will be a temporary password, and the user will have to change it on first log-in, but if you’re setting up this account for yourself, you can uncheck the box and keep this as your standard password. Now, go to www.mail.live.com, and sign in with your new email address and password. Remember, your email address is your username previously entered followed by @yourdomain.com. To finish setting up the email account, enter your password, secret question and answer, alternate email, and location information. Click I accept to finish setting up your new email account. Enter the characters in the Captcha to confirm you’re a human, and click Continue. Your new Hotmail inbox will now load, and you’ll have a welcome email in your inbox. This works the same as normal Hotmail, except this time, your email address is with your own domain. You can now access any of the Windows Live services from the top-level menu. Here’s an Excel Spreadsheet open in the new Office Web Apps via SkyDrive on our new Windows Live account. If you setup Messenger access previously, you can now sign in to Windows Live Messenger using your new @yourdomain.com account as well. Important Links Accessing your Windows Live accounts is easy. Simply go to any Windows Live site, such as www.hotmail.com or www.skydrive.com, and sign in with your new Windows Live ID from your domain as normal. You don’t need a special address to access your account; it works just like the standard public Hotmail accounts. To administer your Windows Live for your domain, go to https://domains.live.com/ and sign in with the Windows Live ID you used to create the account. Here you can add more users, change settings, and view usage details for the Windows Live accounts on your domain. Conclusion Windows Live is easy to add to your domain, and lets you create up to 500 email address for it. With the upcoming updates to Hotmail and Office Web Apps coming this summer, this can be a nice way to make your domain even more useful. And with 500 email accounts, you can easily let your team take advantage of your unique address as well. If you’d rather use Google’s online applications with your domain, check out our article on how to add free Google apps to your website or blog. Link Signup for Windows Live for Your Domain Similar Articles Productive Geek Tips Tools to Help Post Content On Your WordPress BlogBackup Your Windows Live Writer SettingsInstall Windows Live Essentials In Windows 7Add Your Gmail To Windows Live MailMysticgeek Blog: A Look at Internet Explorer 8 Beta 1 on Windows XP TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips HippoRemote Pro 2.2 Xobni Plus for Outlook All My Movies 5.9 CloudBerry Online Backup 1.5 for Windows Home Server Backup Drivers With Driver Magician TubeSort: YouTube Playlist Organizer XPS file format & XPS Viewer Explained Microsoft Office Web Apps Guide Know if Someone Accessed Your Facebook Account Shop for Music with Windows Media Player 12

Read the article

Working with Timelines with LINQ to Twitter

- by Joe Mayo

When first working with the Twitter API, I thought that using SinceID would be an effective way to page through timelines. In practice it doesn’t work well for various reasons. To explain why, Twitter published an excellent document that is a must-read for anyone working with timelines: Twitter Documentation: Working with Timelines This post shows how to implement the recommended strategies in that document by using LINQ to Twitter. You should read the document in it’s entirety before moving on because my explanation will start at the bottom and work back up to the top in relation to the Twitter document. What follows is an explanation of SinceID, MaxID, and how they come together to help you efficiently work with Twitter timelines. The Role of SinceID Specifying SinceID says to Twitter, “Don’t return tweets earlier than this”. What you want to do is store this value after every timeline query set so that it can be reused on the next set of queries. The next section will explain what I mean by query set, but a quick explanation is that it’s a loop that gets all new tweets. The SinceID is a backstop to avoid retrieving tweets that you already have. Here’s some initialization code that includes a variable named sinceID that will be used to populate the SinceID property in subsequent queries: // last tweet processed on previous query set ulong sinceID = 210024053698867204; ulong maxID; const int Count = 10; var statusList = new List<status>(); Here, I’ve hard-coded the sinceID variable, but this is where you would initialize sinceID from whatever storage you choose (i.e. a database). The first time you ever run this code, you won’t have a value from a previous query set. Initially setting it to 0 might sound like a good idea, but what if you’re querying a timeline with lots of tweets? Because of the number of tweets and rate limits, your query set might take a very long time to run. A caveat might be that Twitter won’t return an entire timeline back to Tweet #0, but rather only go back a certain period of time, the limits of which are documented for individual Twitter timeline API resources. So, to initialize SinceID at too low of a number can result in a lot of initial tweets, yet there is a limit to how far you can go back. What you’re trying to accomplish in your application should guide you in how to initially set SinceID. I have more to say about SinceID later in this post. The other variables initialized above include the declaration for MaxID, Count, and statusList. The statusList variable is a holder for all the timeline tweets collected during this query set. You can set Count to any value you want as the largest number of tweets to retrieve, as defined by individual Twitter timeline API resources. To effectively page results, you’ll use the maxID variable to set the MaxID property in queries, which I’ll discuss next. Initializing MaxID On your first query of a query set, MaxID will be whatever the most recent tweet is that you get back. Further, you don’t know what MaxID is until after the initial query. The technique used in this post is to do an initial query and then use the results to figure out what the next MaxID will be. Here’s the code for the initial query: var userStatusResponse = (from tweet in twitterCtx.Status where tweet.Type == StatusType.User && tweet.ScreenName == "JoeMayo" && tweet.SinceID == sinceID && tweet.Count == Count select tweet) .ToList(); statusList.AddRange(userStatusResponse); // first tweet processed on current query maxID = userStatusResponse.Min( status => ulong.Parse(status.StatusID)) - 1; The query above sets both SinceID and Count properties. As explained earlier, Count is the largest number of tweets to return, but the number can be less. A couple reasons why the number of tweets that are returned could be less than Count include the fact that the user, specified by ScreenName, might not have tweeted Count times yet or might not have tweeted at least Count times within the maximum number of tweets that can be returned by the Twitter timeline API resource. Another reason could be because there aren’t Count tweets between now and the tweet ID specified by sinceID. Setting SinceID constrains the results to only those tweets that occurred after the specified Tweet ID, assigned via the sinceID variable in the query above. The statusList is an accumulator of all tweets receive during this query set. To simplify the code, I left out some logic to check whether there were no tweets returned. If the query above doesn’t return any tweets, you’ll receive an exception when trying to perform operations on an empty list. Yeah, I cheated again. Besides querying initial tweets, what’s important about this code is the final line that sets maxID. It retrieves the lowest numbered status ID in the results. Since the lowest numbered status ID is for a tweet we already have, the code decrements the result by one to keep from asking for that tweet again. Remember, SinceID is not inclusive, but MaxID is. The maxID variable is now set to the highest possible tweet ID that can be returned in the next query. The next section explains how to use MaxID to help get the remaining tweets in the query set. Retrieving Remaining Tweets Earlier in this post, I defined a term that I called a query set. Essentially, this is a group of requests to Twitter that you perform to get all new tweets. A single query might not be enough to get all new tweets, so you’ll have to start at the top of the list that Twitter returns and keep making requests until you have all new tweets. The previous section showed the first query of the query set. The code below is a loop that completes the query set: do { // now add sinceID and maxID userStatusResponse = (from tweet in twitterCtx.Status where tweet.Type == StatusType.User && tweet.ScreenName == "JoeMayo" && tweet.Count == Count && tweet.SinceID == sinceID && tweet.MaxID == maxID select tweet) .ToList(); if (userStatusResponse.Count > 0) { // first tweet processed on current query maxID = userStatusResponse.Min( status => ulong.Parse(status.StatusID)) - 1; statusList.AddRange(userStatusResponse); } } while (userStatusResponse.Count != 0 && statusList.Count < 30); Here we have another query, but this time it includes the MaxID property. The SinceID property prevents reading tweets that we’ve already read and Count specifies the largest number of tweets to return. Earlier, I mentioned how it was important to check how many tweets were returned because failing to do so will result in an exception when subsequent code runs on an empty list. The code above protects against this problem by only working with the results if Twitter actually returns tweets. Reasons why there wouldn’t be results include: if the first query got all the new tweets there wouldn’t be more to get and there might not have been any new tweets between the SinceID and MaxID settings of the most recent query. The code for loading the returned tweets into statusList and getting the maxID are the same as previously explained. The important point here is that MaxID is being reset, not SinceID. As explained in the Twitter documentation, paging occurs from the newest tweets to oldest, so setting MaxID lets us move from the most recent tweets down to the oldest as specified by SinceID. The two loop conditions cause the loop to continue as long as tweets are being read or a max number of tweets have been read. Logically, you want to stop reading when you’ve read all the tweets and that’s indicated by the fact that the most recent query did not return results. I put the check to stop after 30 tweets are reached to keep the demo from running too long – in the console the response scrolls past available buffer and I wanted you to be able to see the complete output. Yet, there’s another point to be made about constraining the number of items you return at one time. The Twitter API has rate limits and making too many queries per minute will result in an error from twitter that LINQ to Twitter raises as an exception. To use the API properly, you’ll have to ensure you don’t exceed this threshold. Looking at the statusList.Count as done above is rather primitive, but you can implement your own logic to properly manage your rate limit. Yeah, I cheated again. Summary Now you know how to use LINQ to Twitter to work with Twitter timelines. After reading this post, you have a better idea of the role of SinceID - the oldest tweet already received. You also know that MaxID is the largest tweet ID to retrieve in a query. Together, these settings allow you to page through results via one or more queries. You also understand what factors affect the number of tweets returned and considerations for potential error handling logic. The full example of the code for this post is included in the downloadable source code for LINQ to Twitter. @JoeMayo

Search Results

Search found 5290 results on 212 pages for 'refresh rate'.

Page 67/212 | < Previous Page | 63 64 65 66 67 68 69 70 71 72 73 74 | Next Page >

- by Craig Tataryn

- by Nanne

- by a1kmm

- by Matthew

- by Will

- by delenda

- by adoado0

- by jon

- by Hugh

- by Mark B

- by Dave LeJeune

- by BDY

- by Will

- by Luke

- by Junier

- by Junier

- by Rick Strahl

- by Alan Smith

- by Tarun Arora

- by user12620111

- by Matt

- by Chris M

- by Mohamed Elgharabawy

- by Matthew Guay

- by Joe Mayo

< Previous Page | 63 64 65 66 67 68 69 70 71 72 73 74 | Next Page >