Search Results

Search found 2156 results on 87 pages for 'weighted average'.

Page 66/87 | < Previous Page | 62 63 64 65 66 67 68 69 70 71 72 73  | Next Page >

  • FFmpeg creates emtpy (black) frames

    - by resamsel
    I have a set of images from a timelapse shot (172 JPG files) that I want to convert into a movie. I tried several parameters with FFmpeg, but all I get is a video with black frames (though it has the expected length). ffmpeg -f image2 -vcodec mjpeg -y -i img_%03d.jpg timelapse2.mpg The command above creates this video: http://sdm-net.org/data/timelapse2.mpg What I'm expecting is something like this (created with Time Lapse Assembler.app): https://vimeo.com/39038362 - This is my fallback option, but I'd really like to create timelapse movies from a script. I'm on OSX Lion (10.7.3) with FFmpeg version (0.10) installed via Homebrew. I also tried to find a proper version of mencoder for OSX, but this doesn't seem to be an easy task. Also, ImageMagick's convert doesn't seem to work nicely, it creates really bad output and it seems there's not much I can do about it... Edit: With libx264 and an mp4 container: ffmpeg -f image2 -y -i img_%03d.jpg -vcodec libx264 timelapse4.mp4 Output: ffmpeg version 0.10 Copyright (c) 2000-2012 the FFmpeg developers built on Mar 26 2012 13:47:02 with clang 3.0 (tags/Apple/clang-211.12) configuration: --prefix=/usr/local/Cellar/ffmpeg/0.10 --enable-shared --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-libfreetype --cc=/usr/bin/clang --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-librtmp --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libass --disable-ffplay libavutil 51. 34.101 / 51. 34.101 libavcodec 53. 60.100 / 53. 60.100 libavformat 53. 31.100 / 53. 31.100 libavdevice 53. 4.100 / 53. 4.100 libavfilter 2. 60.100 / 2. 60.100 libswscale 2. 1.100 / 2. 1.100 libswresample 0. 6.100 / 0. 6.100 libpostproc 52. 0.100 / 52. 0.100 Input #0, image2, from 'img_%03d.jpg': Duration: 00:00:06.88, start: 0.000000, bitrate: N/A Stream #0:0: Video: mjpeg, yuvj420p, 3888x2592 [SAR 72:72 DAR 3:2], 25 fps, 25 tbr, 25 tbn, 25 tbc [buffer @ 0x7f8ec9415f20] w:3888 h:2592 pixfmt:yuvj420p tb:1/1000000 sar:72/72 sws_param: [libx264 @ 0x7f8ec981d800] using SAR=1/1 [libx264 @ 0x7f8ec981d800] frame MB size (243x162) > level limit (36864) [libx264 @ 0x7f8ec981d800] MB rate (984150) > level limit (983040) [libx264 @ 0x7f8ec981d800] using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2 AVX [libx264 @ 0x7f8ec981d800] profile High, level 5.1 [libx264 @ 0x7f8ec981d800] 264 - core 120 - H.264/MPEG-4 AVC codec - Copyleft 2003-2011 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00 Output #0, mp4, to 'timelapse4.mp4': Metadata: encoder : Lavf53.31.100 Stream #0:0: Video: h264 (![0][0][0] / 0x0021), yuvj420p, 3888x2592 [SAR 72:72 DAR 3:2], q=-1--1, 25 tbn, 25 tbc Stream mapping: Stream #0:0 -> #0:0 (mjpeg -> libx264) Press [q] to stop, [?] for help frame= 172 fps= 18 q=-1.0 Lsize= 259kB time=00:00:06.80 bitrate= 312.3kbits/s video:256kB audio:0kB global headers:0kB muxing overhead 1.089647% [libx264 @ 0x7f8ec981d800] frame I:1 Avg QP: 9.60 size:212820 [libx264 @ 0x7f8ec981d800] frame P:43 Avg QP:30.50 size: 291 [libx264 @ 0x7f8ec981d800] frame B:128 Avg QP:31.00 size: 285 [libx264 @ 0x7f8ec981d800] consecutive B-frames: 0.6% 0.0% 1.7% 97.7% [libx264 @ 0x7f8ec981d800] mb I I16..4: 22.5% 77.2% 0.3% [libx264 @ 0x7f8ec981d800] mb P I16..4: 0.0% 0.0% 0.0% P16..4: 0.0% 0.0% 0.0% 0.0% 0.0% skip:100.0% [libx264 @ 0x7f8ec981d800] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 0.0% 0.0% 0.0% direct: 0.0% skip:100.0% L0: 1.2% L1:98.8% BI: 0.0% [libx264 @ 0x7f8ec981d800] 8x8 transform intra:77.2% inter:100.0% [libx264 @ 0x7f8ec981d800] coded y,uvDC,uvAC intra: 41.2% 23.4% 0.6% inter: 0.0% 0.0% 0.0% [libx264 @ 0x7f8ec981d800] i16 v,h,dc,p: 40% 25% 35% 1% [libx264 @ 0x7f8ec981d800] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 36% 32% 30% 1% 0% 0% 0% 0% 0% [libx264 @ 0x7f8ec981d800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 51% 40% 6% 1% 1% 0% 1% 0% 1% [libx264 @ 0x7f8ec981d800] i8c dc,h,v,p: 60% 21% 19% 0% [libx264 @ 0x7f8ec981d800] Weighted P-Frames: Y:0.0% UV:0.0% [libx264 @ 0x7f8ec981d800] ref P L0: 92.3% 0.0% 0.0% 7.7% [libx264 @ 0x7f8ec981d800] ref B L0: 50.0% 0.0% 50.0% [libx264 @ 0x7f8ec981d800] ref B L1: 99.4% 0.6% [libx264 @ 0x7f8ec981d800] kb/s:304.49 Output timelapse4.mp4 (beacause of spam protection I can only post two links with my reputation): http sdm-net.org/data/timelapse4.mp4

    Read the article

  • Screen Casting using ffmpeg (too fast)

    - by rowman
    I can use ffmpeg to make screen casts: ffmpeg -f x11grab -s 1280x800 -i :0.0 -c:v libx264 -framerate 30 -r 30 -crf 18 out.mkv However the output comes out to be too fast paced. It also happens with GTK RecordMyDesktop if I enable the encode on the fly. So, the questions is how to get a normal video pace. Also in order to capture the sound with ffmpeg what option should be used? FFmpeg Output: ffmpeg -f x11grab -s 1280x800 -r 30 -i :0.0 -c:v libx264 -framerate 30 -r 30 -crf 18 out.mkv ffmpeg version N-35162-g87244c8 Copyright (c) 2000-2012 the FFmpeg developers built on Oct 7 2012 15:56:19 with gcc 4.6 (Ubuntu/Linaro 4.6.3-1ubuntu5) configuration: --enable-gpl --enable-libfaac --enable-libfdk-aac --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-librtmp --enable-libtheora --enable-libvorbis --enable-libvpx --enable-x11grab --enable-libx264 --enable-nonfree --enable-version3 libavutil 51. 73.102 / 51. 73.102 libavcodec 54. 64.100 / 54. 64.100 libavformat 54. 29.105 / 54. 29.105 libavdevice 54. 3.100 / 54. 3.100 libavfilter 3. 19.102 / 3. 19.102 libswscale 2. 1.101 / 2. 1.101 libswresample 0. 16.100 / 0. 16.100 libpostproc 52. 1.100 / 52. 1.100 [x11grab @ 0xab896a0] device: :0.0 -> display: :0.0 x: 0 y: 0 width: 1280 height: 800 [x11grab @ 0xab896a0] shared memory extension found [x11grab @ 0xab896a0] Estimating duration from bitrate, this may be inaccurate Input #0, x11grab, from ':0.0': Duration: N/A, start: 1350136942.608988, bitrate: 983040 kb/s Stream #0:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1280x800, 983040 kb/s, 30 tbr, 1000k tbn, 30 tbc [libx264 @ 0xab87320] using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64 SlowCTZ SlowAtom [libx264 @ 0xab87320] profile High 4:4:4 Predictive, level 3.2, 4:4:4 8-bit [libx264 @ 0xab87320] 264 - core 128 r2 198a7ea - H.264/MPEG-4 AVC codec - Copyleft 2003-2012 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=4 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=18.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00 Output #0, matroska, to 'out.mkv': Metadata: encoder : Lavf54.29.105 Stream #0:0: Video: h264, yuv444p, 1280x800, q=-1--1, 1k tbn, 30 tbc Stream mapping: Stream #0:0 -> #0:0 (rawvideo -> libx264) Press [q] to stop, [?] for help frame= 10 fps=0.0 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 19 fps= 17 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 28 fps= 17 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 37 fps= 17 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 45 fps= 16 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 47 fps= 14 q=0.0 size= 1kB time=00:00:00.00 bitrate= 0.0kbits/sframe= 52 fps= 13 q=24.0 size= 257kB time=00:00:00.00 bitrate=2101632.0kbiframe= 55 fps= 12 q=24.0 size= 257kB time=00:00:00.10 bitrate=20808.2kbitsframe= 59 fps= 11 q=24.0 size= 289kB time=00:00:00.23 bitrate=10145.0kbitsframe= 64 fps= 11 q=24.0 size= 289kB time=00:00:00.40 bitrate=5894.7kbits/frame= 70 fps= 11 q=24.0 size= 289kB time=00:00:00.60 bitrate=3933.1kbits/frame= 72 fps= 10 q=24.0 size= 289kB time=00:00:00.66 bitrate=3549.2kbits/frame= 77 fps=9.8 q=24.0 size= 289kB time=00:00:00.83 bitrate=2837.7kbits/frame= 80 fps=9.6 q=24.0 size= 289kB time=00:00:00.93 bitrate=2533.5kbits/frame= 85 fps=9.3 q=24.0 size= 289kB time=00:00:01.10 bitrate=2146.9kbits/frame= 89 fps=9.3 q=24.0 size= 289kB time=00:00:01.23 bitrate=1917.1kbits/frame= 92 fps=9.1 q=24.0 size= 289kB time=00:00:01.33 bitrate=1773.3kbits/frame= 96 fps=9.0 q=24.0 size= 289kB time=00:00:01.46 bitrate=1612.4kbits/frame= 99 fps=8.8 q=24.0 size= 321kB time=00:00:01.56 bitrate=1676.8kbits/frame= 104 fps=8.7 q=24.0 size= 321kB time=00:00:01.73 bitrate=1515.2kbits/frame= 109 fps=5.3 q=24.0 Lsize= 1093kB time=00:00:03.56 bitrate=2511.5kbits/s video:1092kB audio:0kB subtitle:0 global headers:0kB muxing overhead 0.120198% [libx264 @ 0xab87320] frame I:3 Avg QP:18.93 size:142610 [libx264 @ 0xab87320] frame P:43 Avg QP:20.79 size: 15751 [libx264 @ 0xab87320] frame B:63 Avg QP:23.75 size: 195 [libx264 @ 0xab87320] consecutive B-frames: 21.1% 1.8% 11.0% 66.1% [libx264 @ 0xab87320] mb I I16..4: 50.0% 21.1% 28.9% [libx264 @ 0xab87320] mb P I16..4: 6.1% 0.9% 3.2% P16..4: 5.5% 1.2% 0.6% 0.0% 0.0% skip:82.5% [libx264 @ 0xab87320] mb B I16..4: 0.4% 0.1% 0.0% B16..8: 2.9% 0.1% 0.0% direct: 0.0% skip:96.5% L0:40.7% L1:57.0% BI: 2.3% [libx264 @ 0xab87320] 8x8 transform intra:14.5% inter:46.1% [libx264 @ 0xab87320] coded y,u,v intra: 33.5% 24.1% 25.4% inter: 0.9% 0.4% 0.4% [libx264 @ 0xab87320] i16 v,h,dc,p: 70% 26% 1% 3% [libx264 @ 0xab87320] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 21% 30% 5% 7% 5% 7% 4% 10% [libx264 @ 0xab87320] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 32% 35% 12% 2% 4% 3% 4% 3% 5% [libx264 @ 0xab87320] Weighted P-Frames: Y:0.0% UV:0.0% [libx264 @ 0xab87320] ref P L0: 57.0% 5.6% 26.8% 10.6% [libx264 @ 0xab87320] ref B L0: 69.4% 22.6% 8.0% [libx264 @ 0xab87320] ref B L1: 93.7% 6.3% [libx264 @ 0xab87320] kb/s:2460.40

    Read the article

  • Using R to Analyze G1GC Log Files

    - by user12620111
    Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period:     par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight.   plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

    Read the article

  • Allocation algorithm help, using Python.

    - by Az
    Hi there, I've been working on this general allocation algorithm for students. The pseudocode for it (a Python implementation) is: for a student in a dictionary of students: for student's preference in a set of preferences (ordered from 1 to 10): let temp_project be the first preferred project check if temp_project is available if so, allocate it to them and make the project UNavailable to others Quite simply this will try to allocate projects by starting from their most preferred. The way it works, out of a set of say 100 projects, you list 10 you would want to do. So the 10th project wouldn't be the "least preferred overall" but rather the least preferred in their chosen set, which isn't so bad. Obviously if it can't allocate a project, a student just reverts to the base case which is an allocation of None, with a rank of 11. What I'm doing is calculating the allocation "quality" based on a weighted sum of the ranks. So the lower the numbers (i.e. more highly preferred projects), the better the allocation quality (i.e. more students have highly preferred projects). That's basically what I've currently got. Simple and it works. Now I'm working on this algorithm that tries to minimise the allocation weight locally (this pseudocode is a bit messy, sorry). The only reason this will probably work is because my "search space" as it is, isn't particularly large (just a very general, anecdotal observation, mind you). Since the project is only specific to my Department, we have their own limits imposed. So the number of students can't exceed 100 and the number of preferences won't exceed 10. for student in a dictionary/list/whatever of students: where i = 0 take the (i)st student, (i+1)nd student for their ranks: allocate the projects and set local_weighting to be sum(student_i.alloc_proj_rank, student_i+1.alloc_proj_rank) these are the cases: if local_weighting is 2 (i.e. both ranks are 1): then i += 1 and and continue above if local weighting is = N>2 (i.e. one or more ranks are greater than 1): let temp_local_weighting be N: pick student with lowest rank and then move him to his next rank and pick the other student and reallocate his project after this if temp_local_weighting is < N: then allocate those projects to the students move student with lowest rank to the next rank and reallocate other if temp_local_weighting < previous_temp_allocation: let these be the new allocated projects try moving for the lowest rank and reallocate other else: if this weighting => previous_weighting let these be the allocated projects i += 1 and move on for the rest of the students So, questions: This is sort of a modification of simulated annealing, but any sort of comments on this would be appreciated. How would I keep track of which student is (i) and which student is (i+1) If my overall list of students is 100, then the thing would mess up on (i+1) = 101 since there is none. How can I circumvent that? Any immediate flaws that can be spotted? Extra info: My students dictionary is designed as such: students[student_id] = Student(student_id, student_name, alloc_proj, alloc_proj_rank, preferences) where preferences is in the form of a dictionary such that preferences[rank] = {project_id}

    Read the article

  • Why is there a Null Pointer Exception in this Java Code?

    - by algorithmicCoder
    This code takes in users and movies from two separate files and computes a user score for a movie. When i run the code I get the following error: Exception in thread "main" java.lang.NullPointerException at RecommenderSystem.makeRecommendation(RecommenderSystem.java:75) at RecommenderSystem.main(RecommenderSystem.java:24) I believe the null pointer exception is due to an error in this particular class but I can't spot it....any thoughts? import java.io.*; import java.lang.Math; public class RecommenderSystem { private Movie[] m_movies; private User[] m_users; /** Parse the movies and users files, and then run queries against them. */ public static void main(String[] argv) throws FileNotFoundException, ParseError, RecommendationError { FileReader movies_fr = new FileReader("C:\\workspace\\Recommender\\src\\IMDBTop10.txt"); FileReader users_fr = new FileReader("C:\\workspace\\Recommender\\src\\IMDBTop10-users.txt"); MovieParser mp = new MovieParser(movies_fr); UserParser up = new UserParser(users_fr); Movie[] movies = mp.getMovies(); User[] users = up.getUsers(); RecommenderSystem rs = new RecommenderSystem(movies, users); System.out.println("Alice would rate \"The Shawshank Redemption\" with at least a " + rs.makeRecommendation("The Shawshank Redemption", "asmith")); System.out.println("Carol would rate \"The Dark Knight\" with at least a " + rs.makeRecommendation("The Dark Knight", "cd0")); } /** Instantiate a recommender system. * * @param movies An array of Movie that will be copied into m_movies. * @param users An array of User that will be copied into m_users. */ public RecommenderSystem(Movie[] movies, User[] users) throws RecommendationError { m_movies = movies; m_users = users; } /** Suggest what the user with "username" would rate "movieTitle". * * @param movieTitle The movie for which a recommendation is made. * @param username The user for whom the recommendation is made. */ public double makeRecommendation(String movieTitle, String username) throws RecommendationError { int userNumber; int movieNumber; int j=0; double weightAvNum =0; double weightAvDen=0; for (userNumber = 0; userNumber < m_users.length; ++userNumber) { if (m_users[userNumber].getUsername().equals(username)) { break; } } for (movieNumber = 0; movieNumber < m_movies.length; ++movieNumber) { if (m_movies[movieNumber].getTitle().equals(movieTitle)) { break; } } // Use the weighted average algorithm here (don't forget to check for // errors). while(j<m_users.length){ if(j!=userNumber){ weightAvNum = weightAvNum + (m_users[j].getRating(movieNumber)- m_users[j].getAverageRating())*(m_users[userNumber].similarityTo(m_users[j])); weightAvDen = weightAvDen + (m_users[userNumber].similarityTo(m_users[j])); } j++; } return (m_users[userNumber].getAverageRating()+ (weightAvNum/weightAvDen)); } } class RecommendationError extends Exception { /** An error for when something goes wrong in the recommendation process. * * @param s A string describing the error. */ public RecommendationError(String s) { super(s); } }

    Read the article

  • httpd high cpu usage slowing down server response

    - by max
    my client has a image sharing website with about 100.000 visitor per day it has been slowed down considerably since this morning when i checked processes i've notice high cpu usage from http .... some has suggested ddos attack ... i'm not a webmaster and i've no idea whts going on top top - 20:13:30 up 5:04, 4 users, load average: 4.56, 4.69, 4.59 Tasks: 284 total, 3 running, 281 sleeping, 0 stopped, 0 zombie Cpu(s): 12.1%us, 0.9%sy, 1.7%ni, 69.0%id, 16.4%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 16037152k total, 15875096k used, 162056k free, 360468k buffers Swap: 4194288k total, 888k used, 4193400k free, 14050008k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 4151 apache 20 0 277m 84m 3784 R 50.2 0.5 0:01.98 httpd 4115 apache 20 0 210m 16m 4480 S 18.3 0.1 0:00.60 httpd 12885 root 39 19 4296 692 308 S 13.0 0.0 11:09.53 gzip 4177 apache 20 0 214m 20m 3700 R 12.3 0.1 0:00.37 httpd 2219 mysql 20 0 4257m 198m 5668 S 11.0 1.3 42:49.70 mysqld 3691 apache 20 0 206m 14m 6416 S 1.7 0.1 0:03.38 httpd 3934 apache 20 0 211m 17m 4836 S 1.0 0.1 0:03.61 httpd 4098 apache 20 0 209m 17m 3912 S 1.0 0.1 0:04.17 httpd 4116 apache 20 0 211m 17m 4476 S 1.0 0.1 0:00.43 httpd 3867 apache 20 0 217m 23m 4672 S 0.7 0.1 1:03.87 httpd 4146 apache 20 0 209m 15m 3628 S 0.7 0.1 0:00.02 httpd 4149 apache 20 0 209m 15m 3616 S 0.7 0.1 0:00.02 httpd 12884 root 39 19 22336 2356 944 D 0.7 0.0 0:19.21 tar 4054 apache 20 0 206m 12m 4576 S 0.3 0.1 0:00.32 httpd another top top - 15:46:45 up 5:08, 4 users, load average: 5.02, 4.81, 4.64 Tasks: 288 total, 6 running, 281 sleeping, 0 stopped, 1 zombie Cpu(s): 18.4%us, 0.9%sy, 2.3%ni, 56.5%id, 21.8%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 16037152k total, 15792196k used, 244956k free, 360924k buffers Swap: 4194288k total, 888k used, 4193400k free, 13983368k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 4622 apache 20 0 209m 16m 3868 S 54.2 0.1 0:03.99 httpd 4514 apache 20 0 213m 20m 3924 R 50.8 0.1 0:04.93 httpd 4627 apache 20 0 221m 27m 4560 R 18.9 0.2 0:01.20 httpd 12885 root 39 19 4296 692 308 S 18.9 0.0 11:51.79 gzip 2219 mysql 20 0 4257m 199m 5668 S 18.3 1.3 43:19.04 mysqld 4512 apache 20 0 227m 33m 4736 R 5.6 0.2 0:01.93 httpd 4520 apache 20 0 213m 19m 4640 S 1.3 0.1 0:01.48 httpd 4590 apache 20 0 212m 19m 3932 S 1.3 0.1 0:00.06 httpd 4573 apache 20 0 210m 16m 3556 R 1.0 0.1 0:00.03 httpd 4562 root 20 0 15164 1388 952 R 0.7 0.0 0:00.08 top 98 root 20 0 0 0 0 S 0.3 0.0 0:04.89 kswapd0 100 root 39 19 0 0 0 S 0.3 0.0 0:02.85 khugepaged 4579 apache 20 0 209m 16m 3900 S 0.3 0.1 0:00.83 httpd 4637 apache 20 0 209m 15m 3668 S 0.3 0.1 0:00.03 httpd ps aux [root@server ~]# ps aux | grep httpd root 2236 0.0 0.0 207524 10124 ? Ss 15:09 0:03 /usr/sbin/http d -k start -DSSL apache 3087 2.7 0.1 226968 28232 ? S 20:04 0:06 /usr/sbin/http d -k start -DSSL apache 3170 2.6 0.1 221296 22292 ? R 20:05 0:05 /usr/sbin/http d -k start -DSSL apache 3171 9.0 0.1 225044 26768 ? R 20:05 0:17 /usr/sbin/http d -k start -DSSL apache 3188 1.5 0.1 223644 24724 ? S 20:05 0:03 /usr/sbin/http d -k start -DSSL apache 3197 2.3 0.1 215908 17520 ? S 20:05 0:04 /usr/sbin/http d -k start -DSSL apache 3198 1.1 0.0 211700 13000 ? S 20:05 0:02 /usr/sbin/http d -k start -DSSL apache 3272 2.4 0.1 219960 21540 ? S 20:06 0:03 /usr/sbin/http d -k start -DSSL apache 3273 2.0 0.0 211600 12804 ? S 20:06 0:03 /usr/sbin/http d -k start -DSSL apache 3279 3.7 0.1 229024 29900 ? S 20:06 0:05 /usr/sbin/http d -k start -DSSL apache 3280 1.2 0.0 0 0 ? Z 20:06 0:01 [httpd] <defun ct> apache 3285 2.9 0.1 218532 21604 ? S 20:06 0:04 /usr/sbin/http d -k start -DSSL apache 3287 30.5 0.4 265084 65948 ? R 20:06 0:43 /usr/sbin/http d -k start -DSSL apache 3297 1.9 0.1 216068 17332 ? S 20:06 0:02 /usr/sbin/http d -k start -DSSL apache 3342 2.7 0.1 216716 17828 ? S 20:06 0:03 /usr/sbin/http d -k start -DSSL apache 3356 1.6 0.1 217244 18296 ? S 20:07 0:01 /usr/sbin/http d -k start -DSSL apache 3365 6.4 0.1 226044 27428 ? S 20:07 0:06 /usr/sbin/http d -k start -DSSL apache 3396 0.0 0.1 213844 16120 ? S 20:07 0:00 /usr/sbin/http d -k start -DSSL apache 3399 5.8 0.1 215664 16772 ? S 20:07 0:05 /usr/sbin/http d -k start -DSSL apache 3422 0.7 0.1 214860 17380 ? S 20:07 0:00 /usr/sbin/http d -k start -DSSL apache 3435 3.3 0.1 216220 17460 ? S 20:07 0:02 /usr/sbin/http d -k start -DSSL apache 3463 0.1 0.0 212732 15076 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3492 0.0 0.0 207660 7552 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3493 1.4 0.1 218092 19188 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3500 1.9 0.1 224204 26100 ? R 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3501 1.7 0.1 216916 17916 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3502 0.0 0.0 207796 7732 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3505 0.0 0.0 207660 7548 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3529 0.0 0.0 207660 7524 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3531 4.0 0.1 216180 17280 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3532 0.0 0.0 207656 7464 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3543 1.4 0.1 217088 18648 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3544 0.0 0.0 207656 7548 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3545 0.0 0.0 207656 7560 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3546 0.0 0.0 207660 7540 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3547 0.0 0.0 207660 7544 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3548 2.3 0.1 216904 17888 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3550 0.0 0.0 207660 7540 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3551 0.0 0.0 207660 7536 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3552 0.2 0.0 214104 15972 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3553 6.5 0.1 216740 17712 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3554 6.3 0.1 216156 17260 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3555 0.0 0.0 207796 7716 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3556 1.8 0.0 211588 12580 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3557 0.0 0.0 207660 7544 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3565 0.0 0.0 207660 7520 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3570 0.0 0.0 207660 7516 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL apache 3571 0.0 0.0 207660 7504 ? S 20:08 0:00 /usr/sbin/http d -k start -DSSL root 3577 0.0 0.0 103316 860 pts/2 S+ 20:08 0:00 grep httpd httpd error log [Mon Jul 01 18:53:38 2013] [error] [client 2.178.12.67] request failed: error reading the headers, referer: http://akstube.com/image/show/27023/%D9%86%DB%8C%D9%88%D8%B4%D8%A7-%D8%B6%DB%8C%D8%BA%D9%85%DB%8C-%D9%88-%D8%AE%D9%88%D8%A7%D9%87%D8%B1-%D9%88-%D9%87%D9%85%D8%B3%D8%B1%D8%B4 [Mon Jul 01 18:55:33 2013] [error] [client 91.229.215.240] request failed: error reading the headers, referer: http://akstube.com/image/show/44924 [Mon Jul 01 18:57:02 2013] [error] [client 2.178.12.67] Invalid method in request [Mon Jul 01 18:57:02 2013] [error] [client 2.178.12.67] File does not exist: /var/www/html/501.shtml [Mon Jul 01 19:21:36 2013] [error] [client 127.0.0.1] client denied by server configuration: /var/www/html/server-status [Mon Jul 01 19:21:36 2013] [error] [client 127.0.0.1] File does not exist: /var/www/html/403.shtml [Mon Jul 01 19:23:57 2013] [error] [client 151.242.14.31] request failed: error reading the headers [Mon Jul 01 19:37:16 2013] [error] [client 2.190.16.65] request failed: error reading the headers [Mon Jul 01 19:56:00 2013] [error] [client 151.242.14.31] request failed: error reading the headers Not a JPEG file: starts with 0x89 0x50 also there is lots of these in the messages log Jul 1 20:15:47 server named[2426]: client 203.88.6.9#11926: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 20:15:47 server named[2426]: client 203.88.6.9#26255: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 20:15:48 server named[2426]: client 203.88.6.9#20093: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 20:15:48 server named[2426]: client 203.88.6.9#8672: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:07 server named[2426]: client 203.88.6.9#39352: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:08 server named[2426]: client 203.88.6.9#25382: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:08 server named[2426]: client 203.88.6.9#9064: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:09 server named[2426]: client 203.88.23.9#35375: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:09 server named[2426]: client 203.88.6.9#61932: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:09 server named[2426]: client 203.88.23.9#4423: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:09 server named[2426]: client 203.88.6.9#40229: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#46128: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.6.10#62128: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#35240: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.6.10#36774: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#28361: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.6.10#14970: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#20216: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.10#31794: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#23042: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.6.10#11333: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.10#41807: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.23.9#20092: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:14 server named[2426]: client 203.88.6.10#43526: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:15 server named[2426]: client 203.88.23.9#17173: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:15 server named[2426]: client 203.88.23.9#62412: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:15 server named[2426]: client 203.88.23.10#63961: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:15 server named[2426]: client 203.88.23.10#64345: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:15 server named[2426]: client 203.88.23.10#31030: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:45:16 server named[2426]: client 203.88.6.9#17098: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:16 server named[2426]: client 203.88.6.9#17197: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:16 server named[2426]: client 203.88.6.9#18114: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:16 server named[2426]: client 203.88.6.9#59138: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:45:17 server named[2426]: client 203.88.6.9#28715: query (cache) 'www.xxxmaza.com/A/IN' denied Jul 1 15:48:33 server named[2426]: client 203.88.23.9#26355: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:34 server named[2426]: client 203.88.23.9#34473: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:34 server named[2426]: client 203.88.23.9#62658: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:34 server named[2426]: client 203.88.23.9#51631: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:35 server named[2426]: client 203.88.23.9#54701: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:36 server named[2426]: client 203.88.6.10#63694: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:36 server named[2426]: client 203.88.6.10#18203: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:37 server named[2426]: client 203.88.6.10#9029: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:38 server named[2426]: client 203.88.6.10#58981: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:48:38 server named[2426]: client 203.88.6.10#29321: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:47 server named[2426]: client 119.160.127.42#42355: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:49 server named[2426]: client 119.160.120.42#46285: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:53 server named[2426]: client 119.160.120.42#30696: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:54 server named[2426]: client 119.160.127.42#14038: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:55 server named[2426]: client 119.160.120.42#33586: query (cache) 'xxxmaza.com/A/IN' denied Jul 1 15:49:56 server named[2426]: client 119.160.127.42#55114: query (cache) 'xxxmaza.com/A/IN' denied

    Read the article

  • What’s New from the Oracle Marketing Cloud at Oracle OpenWorld 2014?

    - by Richard Lefebvre
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Marketing—CX Central is your hub for all things Marketing related at OpenWorld in San Francisco, September 28-October 2, 2014. Learn how to personalize the modern marketing journey to improve customer loyalty. We’re hosting more than 60 breakout sessions, half of which will highlight customer success stories from marquee brands including Bizo, Comcast, Dell, Epson, John Deere, Lane Bryant, ReadyTalk and Shutterfly. Moscone West, Levels 2 and 3 To learn more about how modern marketing works, visit Moscone West, levels 2 and 3, for exciting demos of each of the Oracle Marketing Cloud solutions (BlueKai, Compendium, Eloqua, Push I/O, and Responsys). You also can check out our stations for Vertical Marketing Best Practices, the Markie Awards, and more! CX Spotlight Sessions “Accelerating Big Profits in Big Data,” Jeff Tanner, Baylor University “Using Content Marketing to Impact Every Stage of the Buyer’s Journey,” Jennifer Agustin, Bizo “Expanding Your Marketing with Proven Testing and Optimization,” Brian Border, Shutterfly and Matthew Balthazor, Epson “Modern Marketing: The New Digital Dialogue,” Cory Treffiletti, Oracle A Special Marquee Session Dell’s Hayden Mugford will speak on “The Digital Ecosystem: Driving Experience Through Contact Engagement.” She will highlight how the organization built a digital ecosystem that supports a behaviorally driven, multivehicle nurturing campaign. The Dell 1:1 Global Marketing team worked with multiple partners to innovate integrations with Oracle Eloqua, Oracle Real-Time Decisions for real-time decision logic, and a content management system (CMS) that enables 100 percent customized e-mails. The program doubled average order values for nurtured contacts versus non-nurtured and tripled open and click-through rates versus push e-mail. Other Oracle Marketing Cloud Session Highlights Thought leadership by role Exploring the benefits of moving to the Cloud Product line roadmaps and innovations in Marketing Technical deep dives for product lines within Marketing Best practices and impactful business measurements Solutions that are Integrated across CX Target Audience Session content is geared toward professionals in Marketing, Marketing Operations, Marketing Demand Generation, Social: Chief Marketing Officers, Vice Presidents, Directors and Managers. Outcomes Customers attending Marketing—CX Central @ OpenWorld will be able to: Gain insight into delivering consistent cross-channel marketing Discover how to provide the right information to the right customer at the right time and with the right channel Get answers to burning questions and advice on business challenges Hear from other Oracle customers about recommended best practices to help their organization move forward Network and share ideas to help create a strategy for connecting with customers in better ways It Wouldn’t Be an Oracle Marketing Cloud Event Without a Party! We’re hosting CX Central Fest:  a unique customer experience specifically designed for attendees of CX Central. It will include a chance to rock out at a private concert featuring Los Angeles indie electronic pop group, Capital Cities! Join us Tuesday, September 30 from 7-9 p.m. OpenWorld is a fabulous way for your customers to see all that Oracle Marketing Cloud has to offer. Pass on an invitation today. By Laura Vogel (Oracle) /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • Sun Fire X4270 M3 SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode) Two-Tier Standard Sales and Distribution (SD) Benchmark

    - by Brian
    Oracle's Sun Fire X4270 M3 server achieved 8,320 SAP SD Benchmark users running SAP enhancement package 4 for SAP ERP 6.0 with unicode software using Oracle Database 11g and Oracle Solaris 10. The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat both IBM Flex System x240 and IBM System x3650 M4 server running DB2 9.7 and Windows Server 2008 R2 Enterprise Edition. The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the HP ProLiant BL460c Gen8 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 6%. The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat Cisco UCS C240 M3 server running SQL Server 2008 and Windows Server 2008 R2 Datacenter Edition by 9%. The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the Fujitsu PRIMERGY RX300 S7 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 10%. Performance Landscape SAP-SD 2-Tier Performance Table (in decreasing performance order). SAP ERP 6.0 Enhancement Pack 4 (Unicode) Results (benchmark version from January 2009 to April 2012) System OS Database Users SAPERP/ECCRelease SAPS SAPS/Proc Date Sun Fire X4270 M3 2xIntel Xeon E5-2690 @2.90GHz 128 GB Oracle Solaris 10 Oracle Database 11g 8,320 20096.0 EP4(Unicode) 45,570 22,785 10-Apr-12 IBM Flex System x240 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE DB2 9.7 7,960 20096.0 EP4(Unicode) 43,520 21,760 11-Apr-12 HP ProLiant BL460c Gen8 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE SQL Server 2008 7,865 20096.0 EP4(Unicode) 42,920 21,460 29-Mar-12 IBM System x3650 M4 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE DB2 9.7 7,855 20096.0 EP4(Unicode) 42,880 21,440 06-Mar-12 Cisco UCS C240 M3 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 DE SQL Server 2008 7,635 20096.0 EP4(Unicode) 41,800 20,900 06-Mar-12 Fujitsu PRIMERGY RX300 S7 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE SQL Server 2008 7,570 20096.0 EP4(Unicode) 41,320 20,660 06-Mar-12 Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark. Configuration and Results Summary Hardware Configuration: Sun Fire X4270 M3 2 x 2.90 GHz Intel Xeon E5-2690 processors 128 GB memory Sun StorageTek 6540 with 4 * 16 * 300GB 15Krpm 4Gb FC-AL Software Configuration: Oracle Solaris 10 Oracle Database 11g SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Certified Results (published by SAP): Number of benchmark users: 8,320 Average dialog response time: 0.95 seconds Throughput: Fully processed order line: 911,330 Dialog steps/hour: 2,734,000 SAPS: 45,570 SAP Certification: 2012014 Benchmark Description The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products. See Also SAP Benchmark Website Sun Fire X4270 M3 Server oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Disclosure Statement Two-tier SAP Sales and Distribution (SD) standard SAP SD benchmark based on SAP enhancement package 4 for SAP ERP 6.0 (Unicode) application benchmark as of 04/11/12: Sun Fire X4270 M3 (2 processors, 16 cores, 32 threads) 8,320 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, Oracle 11g, Solaris 10, Cert# 2012014. IBM Flex System x240 (2 processors, 16 cores, 32 threads) 7,960 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012016. IBM System x3650 M4 (2 processors, 16 cores, 32 threads) 7,855 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012010. Cisco UCS C240 M3 (2 processors, 16 cores, 32 threads) 7,635 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 DE, Cert# 2012011. Fujitsu PRIMERGY RX300 S7 (2 processors, 16 cores, 32 threads) 7,570 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012008. HP ProLiant DL380p Gen8 (2 processors, 16 cores, 32 threads) 7,865 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012012. SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark

    Read the article

  • The Incremental Architect&rsquo;s Napkin - #5 - Design functions for extensibility and readability

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities.   PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

    Read the article

  • SQLAuthority News – Author Visit – SQL Server 2008 R2 Launch

    - by pinaldave
    June 11, 2010 was a wonderful day because I attended the very first SQL Server 2008 R2 Launch event held by Microsoft at Mumbai. I traveled to Mumbai from my home town, Ahmedabad. The event was located at one of the best hotels in Mumbai,”The Leela”. SQL Server R2 Launch was an evening event that had a few interesting talks. SQL PASS is associated with this event as one of the partners and its goal is to increase the awareness of the Community about SQL Server. I met many interesting people and had a great networking opportunity at the event. This event was kicked off with an awesome laser show and a “Welcome” video, which was followed by a Microsoft Executive session wherein there were several interesting demo. The very first demo was about Powerpivot. I knew beforehand that there will be Powerpivot demos because it is a very popular subject; however, I was really hoping to see other interesting demos from SQL Server 2008 R2. And believe me; I was happier to see the later demos. There were demos from SQL Server Utility Control Point, as well an integration of Bing Map with Reporting Servers. I really enjoyed the interactive and informative session by Shivaram Venkatesh. He had excellent presentation skills as well as ample technical knowledge to keep the audience attentive. I really liked his presentations skills wherein he did not read the whole slide deck; rather, he picked one point and using that point he told the story of the whole slide deck. I also enjoyed my conversation with Afaq Choonawala, who is one of the “gem guys” in Microsoft. I also want to acknowledge Ashwin Kini and Mohit Panchal for their excellent support to this event. Mumbai IT Pro is a user group which you can really count on for any kind of help. After excellent demos and a vibrant start of the event, all the audience was jazzed up. There were two vendors’ sessions right after the first session. Intel had 15 minutes to present; however, Intel’s representative, who had good knowledge of the subject, had nearly 30+ slides in his presentation, so he had to rush a bit to cover the whole slide deck. Intel presentations were followed up by another vendor presentation from NetApp. I have previously heard about this tool. After I saw the demo which did not work the first time the Net App presenter demonstrated it, I started to have a doubt on this product. I personally went to clarify my doubt to the demo booth after the presentation was over, but I realize the NetApp presenter or booth owner had absolutely a POOR KNOWLEDGE of SQL Server and even of their own NetApp product. The NetApp people tried to misguide us and when we argued, they started to say different things against what they said earlier. At one point in their presentation, they claimed their application does something very fast, which did not really happen in front of all the audience. They blamed SQL Server R2 DBCC CHECKDB command for their product’s failed demonstration. I know that NetApp has many great products; however, this one was not conveyed clearly and even created a negative impression to all of us. Well, let us not judge the potential, fun, education and enigma of the launch event through a small glitch. This event was jam-packed and extremely well-received by everybody who attended it. As what I said, average demos and good presentations by MS folks were really something to cheer about. Any launch event is considered as successful if it achieves its goal to excite users with its cutting edge technology; just like this event that left a very deep impression on me. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, SQLAuthority News, T SQL, Technology Tagged: PASS, SQLPASS

    Read the article

  • Java Script – Content delivery networks (CDN) can bit you in the butt.

    - by Ryan Ternier
    As much as I love the new CDN’s that Google, Microsoft and a few others have publically released, there are some strong gotchas that could come up and bite you in the ass if you’re not careful. But before we jump into that, for those that are not 100% sure what a CDN is (besides Canadian).   Content Delivery Network. A way of distributing your static content across various servers in different physical locations.  Because this static content is stored on many servers around the world, whenever a user needs to access this content, they are given the closest server to their location for this data. Already you can probably see the immediate bonuses to a system like this: Lower bandwidth Even small script files downloaded thousands of times will start to take a noticeable hit on your bandwidth meter. Less connections/hits to your web server which gives better latency If you manage many servers, you don’t need to manually update each server with scripts. A user will download a script for each website they visit. If a user is redirected to many domains/sub-domains within your web site, they might download many copies of the same file. When a system sees multiple requests from the same  domain, they will ignore the download   Those are just a handful of the many bonuses a CDN will give you. And for the average website, a CDN is great choice. Check out the following CDN links for their solutions: Google AJAX Library: http://code.google.com/apis/ajaxlibs/ Microsoft Ajax library: http://www.asp.net/ajaxlibrary/cdn.ashx The Gotcha There is always a catch. Here are some issues I found with using CDN’s that hopefully can help you make your decision. HTTP / HTTPS If you are running a website behind SSL, make sure that when you reference your CDN data that you use https:// vs. http://. If you forget this users will get a very nice message telling them that their secure connection is trying to access unsecure data. For a developer this is fairly simple, but general users will get a bit anxious when seeing this. Trusted Sites Internet Explorer has this really nifty feature that allows users to specify what sites they trust, and by some defaults IE7 only allows trusted sites to be viewed.  No problem, they set your website as trusted. But what about your CDN? If a user sets your websites to trusted, but not the CDN, they will not download those static files. This has the potential to totally break your web site. Pedantic Network Admins This alone is sometimes the killer of projects. However, always be careful when you are going to use a CDN for a professional project. If a network / security admin sees that you’re referencing an outside source, or that a call from a website might hit an outside domain.. panties will be bunched, emails will be spewed out and well, no one wants that.

    Read the article

  • Grouping data in LINQ with the help of group keyword

    - by vik20000in
    While working with any kind of advanced query grouping is a very important factor. Grouping helps in executing special function like sum, max average etc to be performed on certain groups of data inside the date result set. Grouping is done with the help of the Group method. Below is an example of the basic group functionality.     int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };         var numberGroups =         from num in numbers         group num by num % 5 into numGroup         select new { Remainder = numGroup.Key, Numbers = numGroup };  In the above example we have grouped the values based on the reminder left over when divided by 5. First we are grouping the values based on the reminder when divided by 5 into the numgroup variable.  numGroup.Key gives the value of the key on which the grouping has been applied. And the numGroup itself contains all the records that are contained in that group. Below is another example to explain the same. string[] words = { "blueberry", "abacus", "banana", "apple", "cheese" };         var wordGroups =         from num in words         group num by num[0] into grp         select new { FirstLetter = grp.Key, Words = grp }; In the above example we are grouping the value with the first character of the string (num[0]). Just like the order operator the group by clause also allows us to write our own logic for the Equal comparison (That means we can group Item by ignoring case also by writing out own implementation). For this we need to pass an object that implements the IEqualityComparer<string> interface. Below is an example. public class AnagramEqualityComparer : IEqualityComparer<string> {     public bool Equals(string x, string y) {         return getCanonicalString(x) == getCanonicalString(y);     }      public int GetHashCode(string obj) {         return getCanonicalString(obj).GetHashCode();     }         private string getCanonicalString(string word) {         char[] wordChars = word.ToCharArray();         Array.Sort<char>(wordChars);         return new string(wordChars);     } }  string[] anagrams = {"from   ", " salt", " earn", "  last   ", " near "}; var orderGroups = anagrams.GroupBy(w => w.Trim(), new AnagramEqualityComparer()); Vikram  

    Read the article

  • How to make software development decisions based on facts

    - by Laila
    We love to hear stories about the many and varied ways our customers use the tools that we develop, but in our earnest search for stories and feedback, we'd rather forgotten that some of our keenest users are fellow RedGaters, in the same building. It was almost by chance that we discovered how the SQL Source Control team were using SmartAssembly. As it happens, there is a separate account (here on Simple-Talk) of how SmartAssembly was used to support the Early Access program; by providing answers to specific questions about how the SQL Source Control product was used. But what really got us all grinning was how valuable the SQL Source Control team found the reports that SmartAssembly was quickly and painlessly providing. So gather round, my friends, and I'll tell you the Tale Of The Framework Upgrade . <strange mirage effect to denote a flashback. A subtle background string of music starts playing in minor key> Kevin and his team were undecided. They weren't sure whether they could move their software product from .NET 2 to .NET 3.5 , let alone to .NET 4. You see, they were faced with having to guess what version of .NET was already installed on the average user's machine, which I'm sure you'll agree is no easy task. Upgrading their code to .NET 3.5 might put a barrier to people trying the tool, which was the last thing Kevin wanted: "what if our users have to download X, Y, and Z before being able to open the application?" he asked. That fear of users having to do half an hour of downloads (.followed by at least ten minutes of installation. followed by a five minute restart) meant that Kevin's team couldn't take advantage of WCF (Windows Communication Foundation). This made them sad, because WCF would have allowed them to write their code in a much simpler way, and in hours instead of days (as was the case with .NET 2). Oh sure, they had a gut feeling that this probably wasn't the case, 3.5 had been out for so many years, but they weren't sure. <background music switches to major key> SmartAssembly Feature Usage Reporting gave Kevin and his team exactly what they needed: hard data on their users' systems, both hardware and software. I was there, I saw it happen, and that's not the sort of thing a woman quickly forgets. I'll always remember his last words (before he went to lunch): "You get lots of free information by just checking a box in SmartAssembly" is what he said. For example, they could see how many CPU cores their customers were using, and found out that they should be making use of parallelism to take advantage of available cores. But crucially, (and this is the moral of my tale, dear reader), Kevin saw that 99% of SQL Source Control's users were on .NET 3.5 or above.   So he knew that they could make the switch and that is was safe to do so. With this reassurance, they could use WCF to not only make development easier, but to also give them a really nice way to do inter-process communication between the Source Control and the SQL Compare products. To have done that on .NET 2.0 was certainly possible <knowing chuckle>, but Microsoft have made it a lot easier with WCF. <strange mirage effect to denote end of flashback> So you see, with Feature Usage Reporting, they finally got the hard evidence they needed to safely make the switch to .NET 3.5, knowing it would not inconvenience their users. And that, my friends, is just the sort of thing we like to hear.

    Read the article

  • Observations From The Corner of a Starbucks

    - by Chris Williams
    I’ve spent the last 3 days sitting in a Starbucks for 4-8 hours at a time. As a result, I’ve observed a lot of interesting behavior and people (most of whom were uninteresting themselves.) One of the things I’ve noticed is that most people don’t sit down. They come in, get their drink and go. The ones that do sit down, stay much longer than it takes to consume their drink. The drink is just an incidental purchase. Certainly not the reason they are here. Most of the people who sit also have laptops. Probably around 75%. Only a few have kids (with them) but the ones that do, have very small kids. Toddlers or younger. Of all the “campers” only a small percentage are wearing headphone, presumably because A) external noise doesn’t bother them or B) they aren’t working on anything important. My buddy George falls into category A, but he grew up in a house full of people. Silence freaks him out far more than noise. My brother and I, on the other hand, were both only children and don’t handle noisy distractions well. He needs it quiet (like a tomb) and I need music. Go figure… I can listen to Britney Spears mixed with Apoptygma Berzerk and Anthrax and crank out 30 pages, but if your toddler is banging his spoon on the table, you’re getting a dirty look… unless I have music, then all is right with the world. Anyway, enough about me. Most of the people who come in as a group are smiling when they enter. Half as many are smiling when they leave. People who come in alone typically aren’t smiling at all. The average age, over the last three days seems to be early 30s… with a couple of senior citizens and teenagers at either end of the curve. The teenagers almost never stay. They have better stuff to do on a nice day. The senior citizens are split nearly evenly between campers and in&outs. Most of the non-solo campers have 1 person with a laptop, while the other reads the paper or a book. Some campers bring multiple laptops… but only really look at one of them. This Starbucks has a drive through. The line is almost never more than 2-3 cars long but apparently a lot of the in&out people would rather come in and stand in line behind (up to) 5 people. The music in here sucks. My musical tastes can best be described as eclectic to bad, but I can still get work done (see above.) I find the music in this particular Starbucks to be discordant and jarring. At this Starbucks, the coffee lingo is apparently something that is meant to occur between employees only. The nice lady at the counter can handle orders in plain English and translate them to Baristaspeak (Baristese?) quite efficiently. If you order in Baristaspeak however, she will look confused and repeat your order back to you in plain English to confirm you actually meant what you said. Then she will say it in Baristaspeak to the lady making your drink. Nobody in this Starbucks (other than the Baristas) makes eye-contact… at least not with me. Of course that may be indicative of a separate issue. ;)

    Read the article

  • What’s New from the Oracle Marketing Cloud at Oracle OpenWorld 2014

    - by Kathryn Perry
    A Guest Post by Laura Vogel, Director, Oracle Marketing Cloud Events (pictured left) Marketing—CX Central is your hub for all things Marketing related at OpenWorld in San Francisco, September 28-October 2, 2014. Learn how to personalize the modern marketing journey to improve customer loyalty. We’re hosting more than 60 breakout sessions, half of which will highlight customer success stories from marquee brands including Bizo, Comcast, Dell, Epson, John Deere, Lane Bryant, ReadyTalk and Shutterfly. Moscone West, Levels 2 and 3To learn more about how modern marketing works, visit Moscone West, levels 2 and 3, for exciting demos of each of the Oracle Marketing Cloud solutions (BlueKai, Compendium, Eloqua, Push I/O, and Responsys). You also can check out our stations for Vertical Marketing Best Practices, the Markie Awards, and more! CX Spotlight Sessions “Accelerating Big Profits in Big Data,” Jeff Tanner, Baylor University “Using Content Marketing to Impact Every Stage of the Buyer’s Journey,” Jennifer Agustin, Bizo “Expanding Your Marketing with Proven Testing and Optimization,” Brian Border, Shutterfly and Matthew Balthazor, Epson “Modern Marketing: The New Digital Dialogue,” Cory Treffiletti, Oracle A Special Marquee SessionDell’s Hayden Mugford will speak on "The Digital Ecosystem: Driving Experience Through Contact Engagement.” She will highlight how the organization built a digital ecosystem that supports a behaviorally driven, multivehicle nurturing campaign. The Dell 1:1 Global Marketing team worked with multiple partners to innovate integrations with Oracle Eloqua, Oracle Real-Time Decisions for real-time decision logic, and a content management system (CMS) that enables 100 percent customized e-mails. The program doubled average order values for nurtured contacts versus non-nurtured and tripled open and click-through rates versus push e-mail. It Wouldn’t Be an Oracle Marketing Cloud Event Without a Party!We’re hosting CX Central Fest: a unique customer experience specifically designed for attendees of CX Central. It will include a chance to rock out at a private concert featuring Los Angeles indie electronic pop group, Capital Cities! Join us Tuesday, September 30 from 7-9 p.m. Other Oracle Marketing Cloud Session Highlights Thought leadership by role Exploring the benefits of moving to the Cloud Product line roadmaps and innovations in Marketing Technical deep dives for product lines within Marketing Best practices and impactful business measurements Solutions that are integrated across CX Target AudienceSession content is geared toward professionals in Marketing, Marketing Operations, Marketing Demand Generation, Social: Chief Marketing Officers, Vice Presidents, Directors and Managers. OutcomesCustomers attending Marketing—CX Central @ OpenWorld will be able to: Gain insight into delivering consistent cross-channel marketing Discover how to provide the right information to the right customer at the right time and with the right channel Get answers to burning questions and advice on business challenges Hear from other Oracle customers about recommended best practices to help their organization move forward Network and share ideas to help create a strategy for connecting with customers in better ways Resources At a Glance Register Now Track Site—View Marketing Sessions 72 1024x768 Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Focus on Session Doc Downloadable Justification Email OpenWorld is a fabulous way for you to see all that Oracle Marketing Cloud has to offer. Register today.

    Read the article

  • Is there a low carbon future for the retail industry?

    - by user801960
    Recently Oracle published a report in conjunction with The Future Laboratory and a global panel of experts to highlight the issue of energy use in modern industry and the serious need to reduce carbon emissions radically by 2050.  Emissions must be cut by 80-95% below the levels in 1990 – but what can the retail industry do to keep up with this? There are three key aspects to the retail industry where carbon emissions can be cut:  manufacturing, transport and IT.  Manufacturing Naturally, manufacturing is going to be a big area where businesses across all industries will be forced to make considerable savings in carbon emissions as well as other forms of pollution.  Many retailers of all sizes will use third party factories and will have little control over specific environmental impacts from the factory, but retailers can reduce environmental impact at the factories by managing orders more efficiently – better planning for stock requirements means economies of scale both in terms of finance and the environment. The John Lewis Partnership has made detailed commitments to reducing manufacturing and packaging waste on both its own-brand products and products it sources from third party suppliers. It aims to divert 95 percent of its operational waste from landfill by 2013, which is a huge logistics challenge.  The John Lewis Partnership’s website provides a large amount of information on its responsibilities towards the environment. Transport Similarly to manufacturing, tightening up on logistical planning for stock distribution will make savings on carbon emissions from haulage.  More accurate supply and demand analysis will mean less stock re-allocation after initial distribution, and better warehouse management will mean more efficient stock distribution.  UK grocery retailer Morrisons has introduced double-decked trailers to its haulage fleet and adjusted distribution logistics accordingly to reduce the number of kilometers travelled by the fleet.  Morrisons measures route planning efficiency in terms of cases moved per kilometre and has, over the last two years, increased the number of cases per kilometre by 12.7%.  See Morrisons Corporate Responsibility report for more information. IT IT infrastructure is often initially overlooked by businesses when considering environmental efficiency.  Datacentres and web servers often need to run 24/7 to handle both consumer orders and internal logistics, and this both requires a lot of energy and puts out a lot of heat.  Many businesses are lowering environmental impact by reducing IT system fragmentation in their offices, while an increasing number of businesses are outsourcing their datacenters to cloud-based services.  Using centralised datacenters reduces the power usage at smaller offices, while using cloud based services means the datacenters can be based in a more environmentally friendly location.  For example, Facebook is opening a massive datacentre in Sweden – close to the Arctic Circle – to reduce the need for artificial cooling methods.  In addition, moving to a cloud-based solution makes IT services more easily scaleable, reducing redundant IT systems that would still use energy.  In store, the UK’s Carbon Trust reports that on average, lighting accounts for 25% of a retailer’s electricity costs, and for grocery retailers, up to 50% of their electricity bill comes from refrigeration units.  On a smaller scale, retailers can invest in greener technologies in store and in their offices.  The report concludes that widely shared objectives of energy security, reduced emissions and continued economic growth are dependent on the development of a smart grid capable of delivering energy efficiency and demand response, as well as integrating renewable and variable sources of energy. The report is available to download from http://emeapressoffice.oracle.com/imagelibrary/detail.aspx?MediaDetailsID=1766I’d be interested to hear your thoughts on the report.   

    Read the article

  • Speed up SQL Server queries with PREFETCH

    - by Akshay Deep Lamba
    Problem The SAN data volume has a throughput capacity of 400MB/sec; however my query is still running slow and it is waiting on I/O (PAGEIOLATCH_SH). Windows Performance Monitor shows data volume speed of 4MB/sec. Where is the problem and how can I find the problem? Solution This is another summary of a great article published by R. Meyyappan at www.sqlworkshops.com.  In my opinion, this is the first article that highlights and explains with working examples how PREFETCH determines the performance of a Nested Loop join.  First of all, I just want to recall that Prefetch is a mechanism with which SQL Server can fire up many I/O requests in parallel for a Nested Loop join. When SQL Server executes a Nested Loop join, it may or may not enable Prefetch accordingly to the number of rows in the outer table. If the number of rows in the outer table is greater than 25 then SQL will enable and use Prefetch to speed up query performance, but it will not if it is less than 25 rows. In this section we are going to see different scenarios where prefetch is automatically enabled or disabled. These examples only use two tables RegionalOrder and Orders.  If you want to create the sample tables and sample data, please visit this site www.sqlworkshops.com. The breakdown of the data in the RegionalOrders table is shown below and the Orders table contains about 6 million rows. In this first example, I am creating a stored procedure against two tables and then execute the stored procedure.  Before running the stored proceudre, I am going to include the actual execution plan. --Example provided by www.sqlworkshops.com --Create procedure that pulls orders based on City --Do not forget to include the actual execution plan CREATE PROC RegionalOrdersProc @City CHAR(20) AS BEGIN DECLARE @OrderID INT, @OrderDetails CHAR(200) SELECT @OrderID = o.OrderID, @OrderDetails = o.OrderDetails       FROM RegionalOrders ao INNER JOIN Orders o ON (o.OrderID = ao.OrderID)       WHERE City = @City END GO SET STATISTICS time ON GO --Example provided by www.sqlworkshops.com --Execute the procedure with parameter SmallCity1 EXEC RegionalOrdersProc 'SmallCity1' GO After running the stored procedure, if we right click on the Clustered Index Scan and click Properties we can see the Estimated Numbers of Rows is 24.    If we right click on Nested Loops and click Properties we do not see Prefetch, because it is disabled. This behavior was expected, because the number of rows containing the value ‘SmallCity1’ in the outer table is less than 25.   Now, if I run the same procedure with parameter ‘BigCity’ will Prefetch be enabled? --Example provided by www.sqlworkshops.com --Execute the procedure with parameter BigCity --We are using cached plan EXEC RegionalOrdersProc 'BigCity' GO As we can see from the below screenshot, prefetch is not enabled and the query takes around 7 seconds to execute. This is because the query used the cached plan from ‘SmallCity1’ that had prefetch disabled. Please note that even if we have 999 rows for ‘BigCity’ the Estimated Numbers of Rows is still 24.   Finally, let’s clear the procedure cache to trigger a new optimization and execute the procedure again. DBCC freeproccache GO EXEC RegionalOrdersProc 'BigCity' GO This time, our procedure runs under a second, Prefetch is enabled and the Estimated Number of Rows is 999.   The RegionalOrdersProc can be optimized by using the below example where we are using an optimizer hint. I have also shown some other hints that could be used as well. --Example provided by www.sqlworkshops.com --You can fix the issue by using any of the following --hints --Create procedure that pulls orders based on City DROP PROC RegionalOrdersProc GO CREATE PROC RegionalOrdersProc @City CHAR(20) AS BEGIN DECLARE @OrderID INT, @OrderDetails CHAR(200) SELECT @OrderID = o.OrderID, @OrderDetails = o.OrderDetails       FROM RegionalOrders ao INNER JOIN Orders o ON (o.OrderID = ao.OrderID)       WHERE City = @City       --Hinting optimizer to use SmallCity2 for estimation       OPTION (optimize FOR (@City = 'SmallCity2'))       --Hinting optimizer to estimate for the currnet parameters       --option (recompile)       --Hinting optimize not to use histogram rather       --density for estimation (average of all 3 cities)       --option (optimize for (@City UNKNOWN))       --option (optimize for UNKNOWN) END GO Conclusion, this tip was mainly aimed at illustrating how Prefetch can speed up query execution and how the different number of rows can trigger this.

    Read the article

  • BizTalk 2009 - BizTalk Benchmark Wizard: Running a Test

    - by StuartBrierley
    The BizTalk Benchmark Wizard is a ultility that can be used to gain some validation of a BizTalk installation, giving a level of guidance on whether it is performing as might be expected.  It should be used after BizTalk Server has been installed and before any solutions are deployed to the environment.  This will ensure that you are getting consistent and clean results from the BizTalk Benchmark Wizard. The BizTalk Benchmark Wizard applies load to the BizTalk Server environment under a choice of specific scenarios. During these scenarios performance counter information is collected and assessed against statistics that are appropriate to the BizTalk Server environment. For details on installing the Benchmark Wizard see my previous post. The BizTalk Benchmarking Wizard provides two simple test scenarios, one for messaging and one for Orchestrations, which can be used to test your BizTalk implementation. Messaging Loadgen generates a new XML message and sends it over NetTCP A WCF-NetTCP Receive Location receives a the xml document from Loadgen. The PassThruReceive pipeline performs no processing and the message is published by the EPM to the MessageBox. The WCF One-Way Send Port, which is the only subscriber to the message, retrieves the message from the MessageBox The PassThruTransmit pipeline provides no additional processing The message is delivered to the back end WCF service by the WCF NetTCP adapter Orchestrations Loadgen generates a new XML message and sends it over NetTCP A WCF-NetTCP Receive Location receives a the xml document from Loadgen. The XMLReceive pipeline performs no processing and the message is published by the EPM to the MessageBox. The message is delivered to a simple Orchestration which consists of a receive location and a send port The WCF One-Way Send Port, which is the only subscriber to the Orchestration message, retrieves the message from the MessageBox The PassThruTransmit pipeline provides no additional processing The message is delivered to the back end WCF service by the WCF NetTCP adapter Below is a quick outline of how to run the BizTalk Benchmark Wizard on a single server, although it should be noted that this is not ideal as this server is then both generating and processing the load.  In order to separate this load out you should run the "Indigo" service on a seperate server. To start the BizTalk Benchmark Wizard click Start > All Programs > BizTalk Benchmark Wizard > BizTalk Benchmark Wizard. On this screen click next, you will then get the following pop up window. Check the server and database names and check the "check prerequsites" check-box before pressing ok.  The wizard will then check that the appropriate test scenarios are installed. You should then choose the test scenario that wish to run (messaging or orchestration) and the architecture that most closely matches your environment. You will then be asked to confirm the host server for each of the host instances. Next you will be presented with the prepare screen.  You will need to start the indigo service before pressing the Test Indigo Service Button. If you are running the indigo service on a separate server you can enter the server name here.  To start the indigo service click Start > All Programs > BizTalk Benchmark Wizard > Start Indigo Service.   While the test is running you will be presented with two speed dial type displays - one for the received messages per second and one for the processed messages per second. The green dial shows the current rate and the red dial shows the overall average rate.  Optionally you can view the CPU usage of the various servers involved in processing the tests. For my development environment I expected low results and this is what I got.  Although looking at the online high scores table and comparing to the quad core system listed, the results are perhaps not really that bad. At some time I may look at what improvements I can make to this score, but if you are interested in that now take a look at Benchmark your BizTalk Server (Part 3).

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • How to Set Up Your Enterprise Social Organization

    - by Mike Stiles
    The rush for business organizations to establish, grow, and adopt social was driven out of necessity and inevitability. The result, however, was a sudden, booming social presence creating touch points with customers, partners and influencers, but without any corporate social organization or structure in place to effectively manage it. Even today, many business leaders remain uncertain as to how to corral this social media thing so that it makes sense for their enterprise. Imagine their panic when they hear one of the most beneficial approaches to corporate use of social involves giving up at least some hierarchical control and empowering employees to publicly engage customers. And beyond that, they should also be empowered, regardless of their corporate status, to engage and collaborate internally, spurring “off the grid” innovation. An HBR blog points out that traditionally, enterprise organizations function from the top down, and employees work end-to-end, structured around business processes. But the social enterprise opens up structures that up to now have not exactly been embraced by turf-protecting executives and managers. The blog asks, “What if leaders could create a future where customers, associates and suppliers are no longer seen as objects in the system but as valued sources of innovation, ideas and energy?” What if indeed? The social enterprise activates internal resources without the usual obsession with position. It is the dawn of mass collaboration. That does not, however, mean this mass collaboration has to lead to uncontrolled chaos. In an extended interview with Oracle, Altimeter Group analyst Jeremiah Owyang and Oracle SVP Reggie Bradford paint a complete picture of today’s social enterprise, including internal organizational structures Altimeter Group has seen emerge. One sign of a mature social enterprise is the establishing of a social Center of Excellence (CoE), which serves as a hub for high-level social strategy, training and education, research, measurement and accountability, and vendor selection. This CoE is led by a corporate Social Strategist, most likely from a Marketing or Corporate Communications background. Reporting to them are the Community Managers, the front lines of customer interaction and engagement; business unit liaisons that coordinate the enterprise; and social media campaign/product managers, social analysts, and developers. With content rising as the defining factor for social success, Altimeter also sees a Content Strategist position emerging. Across the enterprise, Altimeter has seen 5 organizational patterns. Watching the video will give you the pros and cons of each. Decentralized - Anyone can do anything at any time on any social channel. Centralized – One central groups controls all social communication for the company. Hub and Spoke – A centralized group, but business units can operate their own social under the hub’s guidance and execution. Most enterprises are using this model. Dandelion – Each business unit develops their own social strategy & staff, has its own ability to deploy, and its own ability to engage under the central policies of the CoE. Honeycomb – Every employee can do social, but as opposed to the decentralized model, it’s coordinated and monitored on one platform. The average enterprise has a whopping 178 social accounts, nearly ¼ of which are usually semi-idle and need to be scrapped. The last thing any C-suite needs is to cope with fragmented technologies, solutions and platforms. It’s neither scalable nor strategic. The prepared, effective social enterprise has a technology partner that can quickly and holistically integrate emerging platforms and technologies, such that whatever internal social command structure you’ve set up can continue efficiently executing strategy without skipping a beat. @mikestiles

    Read the article

  • MySQL for Excel 1.3.0 Beta has been released

    - by Javier Treviño
    The MySQL Windows Experience Team is proud to announce the release of MySQL for Excel version 1.3.0.  This is a beta release for 1.3.x. MySQL for Excel is an application plug-in enabling data analysts to very easily access and manipulate MySQL data within Microsoft Excel. It enables you to directly work with a MySQL database from within Microsoft Excel so you can easily do tasks such as: Importing MySQL data into Excel Exporting Excel data directly into MySQL to a new or existing table Editing MySQL data directly within Excel As this is a beta version the MySQL for Excel product can be downloaded only by using the product standalone installer at this link http://dev.mysql.com/downloads/windows/excel/ Your feedback on this beta version is very well appreciated, you can raise bugs on the MySQL bugs page or give us your comments on the MySQL for Excel forum. Changes in MySQL for Excel 1.3.0 (2014-06-06, Beta) This section documents all changes and bug fixes applied to MySQL for Excel since the release of 1.2.1. Several new features were added, for more information see What Is New In MySQL for Excel (http://dev.mysql.com/doc/refman/5.6/en/mysql-for-excel-what-is-new.html). Known limitations: Upgrading from versions MySQL for Excel 1.2.0 and lower is not possible due to a bug fixed in MySQL for Excel 1.2.1. In that scenario, the old version (MySQL for Excel 1.2.0 or lower) must be uninstalled first. Upgrading from version 1.2.1 works correctly. <CTRL> + <A> cannot be used to select all database objects. Either <SHIFT> + <Arrow Key> or <CTRL> + click must be used instead. PivotTables are normally placed to the right (skipping one column) of the imported data, they will not be created if there is another existing Excel object at that position. Functionality Added or Changed Imported data can now be refreshed by using the native Refresh feature. Fields in the imported data sheet are then updated against the live MySQL database using the saved connection ID. Functionality was added to import data directly into PivotTables, which can be created from any Import operation. Multiple objects (tables and views) can now be imported into Excel, when before only one object could be selected. Relational information is also utilized when importing multiple objects. All options now have descriptive tooltips. Hovering over an option/preference displays helpful information about its use. A new Export Data, Advanced Options option was added that shows all available data types in the Data Type combo box, instead of only showing a subset of the most popular data types. The option dialogs now include a Refresh to Defaults button that resets the dialog's options to their defaults values. Each option dialog is set individually. A new Add Summary Fields for Numeric Columns option was added to the Import Data dialog that automatically adds summary fields for numeric data after the last row of the imported data. The specific summary function is selectable from many options, such as "Total" and "Average." A new collation option was added for the schema and table creation wizards. The default schema collation is "Server Default", and the default table collation is "Schema Default". Collation options may be selected from a drop-down list of all available collations. Quick links: MySQL for Excel documentation: http://dev.mysql.com/doc/en/mysql-for-excel.html. MySQL on Windows blog: http://blogs.oracle.com/MySQLOnWindows. MySQL for Excel forum: http://forums.mysql.com/list.php?172. MySQL YouTube channel: http://www.youtube.com/user/MySQLChannel. Enjoy and thanks for the support! 

    Read the article

  • Extend Your Applications Your Way: Oracle OpenWorld Live Poll Results

    - by Applications User Experience
    Lydia Naylor, Oracle Applications User Experience Manager At OpenWorld 2012, I attended one of our team’s very exciting sessions: “Extend Your Applications, Your Way”. It was clear that customers were engaged by the topics presented. Not only did we see many heads enthusiastically nodding in agreement during the presentation, and witness a large crowd surround our speakers Killian Evers, Kristin Desmond and Greg Nerpouni afterwards, but we can prove it…with data! Figure 1. Killian Evers, Kristin Desmond, and Greg Nerpouni of Oracle at the OOW 2012 session. At the beginning of our OOW 2012 journey, Greg Nerpouni, Fusion HCM Principal Product Manager, told me he really wanted to get feedback from the audience on our extensibility direction. Initially, we were thinking of doing a group activity at the OOW UX labs events that we hold every year, but Greg was adamant- he wanted “real-time” feedback. So, after a little tinkering, we came up with a way to use an online survey tool, a simple QR code (Quick Response code: a matrix barcode that can include information like URLs and can be read by mobile device cameras), and the audience’s mobile devices to do just that. Figure 2. Actual QR Code for survey Prior to the session, we developed a short survey in Vovici (an online survey tool), with questions to gather feedback on certain points in the presentation, as well as demographic data from our participants. We used Vovici’s feature to generate a mobile HTML version of the survey. At the session, attendees accessed the survey by simply scanning a QR code or typing in a TinyURL (a shorthand web address that is easily accessible through mobile devices). Killian, Kristin and Greg paused at certain points during the session and asked participants to answer a few survey questions about what they just presented. Figure 3. Session survey deployed on a mobile phone The nice thing about Vovici’s survey tool is that you can see the data real-time as participants are responding to questions - so we knew during the session that not only was our direction on track but we were hitting the mark and fulfilling Greg’s request. We planned on showing the live polling results to the audience at the end of the presentation but it ran just a little over time, and we were gently nudged out of the room by the session attendants. We’ve included a quick summary below and this link to the full results for your enjoyment. Figure 4. Most important extensions to Fusion Applications So what did participants think of our direction for extensibility? A total of 94% agreed that it was an improvement upon their current process. The vast majority, 80%, concurred that the extensibility model accounts for the major roles involved: end user, business systems analyst and programmer. Attendees suggested a few supporting roles such as systems administrator, data architect and integrator. Customers and partners in the audience verified that Oracle‘s Fusion Composers allow them to make changes in the most common areas they need to: user interface, business processes, reporting and analytics. Integrations were also suggested. All top 10 things customers can do on a page rated highly in importance, with all but two getting an average rating above 4.4 on a 5 point scale. The kinds of layout changes our composers allow customers to make align well with customers’ needs. The most common were adding columns to a table (94%) and resizing regions and drag and drop content (both selected by 88% of participants). We want to thank the attendees of the session for allowing us another great opportunity to gather valuable feedback from our customers! If you didn’t have a chance to attend the session, we will provide a link to the OOW presentation when it becomes available.

    Read the article

  • How to Set Up Your Enterprise Social Organization?

    - by Richard Lefebvre
    By Mike Stiles on Dec 04, 2012 The rush for business organizations to establish, grow, and adopt social was driven out of necessity and inevitability. The result, however, was a sudden, booming social presence creating touch points with customers, partners and influencers, but without any corporate social organization or structure in place to effectively manage it. Even today, many business leaders remain uncertain as to how to corral this social media thing so that it makes sense for their enterprise. Imagine their panic when they hear one of the most beneficial approaches to corporate use of social involves giving up at least some hierarchical control and empowering employees to publicly engage customers. And beyond that, they should also be empowered, regardless of their corporate status, to engage and collaborate internally, spurring “off the grid” innovation. An HBR blog points out that traditionally, enterprise organizations function from the top down, and employees work end-to-end, structured around business processes. But the social enterprise opens up structures that up to now have not exactly been embraced by turf-protecting executives and managers. The blog asks, “What if leaders could create a future where customers, associates and suppliers are no longer seen as objects in the system but as valued sources of innovation, ideas and energy?” What if indeed? The social enterprise activates internal resources without the usual obsession with position. It is the dawn of mass collaboration. That does not, however, mean this mass collaboration has to lead to uncontrolled chaos. In an extended interview with Oracle, Altimeter Group analyst Jeremiah Owyang and Oracle SVP Reggie Bradford paint a complete picture of today’s social enterprise, including internal organizational structures Altimeter Group has seen emerge. One sign of a mature social enterprise is the establishing of a social Center of Excellence (CoE), which serves as a hub for high-level social strategy, training and education, research, measurement and accountability, and vendor selection. This CoE is led by a corporate Social Strategist, most likely from a Marketing or Corporate Communications background. Reporting to them are the Community Managers, the front lines of customer interaction and engagement; business unit liaisons that coordinate the enterprise; and social media campaign/product managers, social analysts, and developers. With content rising as the defining factor for social success, Altimeter also sees a Content Strategist position emerging. Across the enterprise, Altimeter has seen 5 organizational patterns. Watching the video will give you the pros and cons of each. Decentralized - Anyone can do anything at any time on any social channel. Centralized – One central groups controls all social communication for the company. Hub and Spoke – A centralized group, but business units can operate their own social under the hub’s guidance and execution. Most enterprises are using this model. Dandelion – Each business unit develops their own social strategy & staff, has its own ability to deploy, and its own ability to engage under the central policies of the CoE. Honeycomb – Every employee can do social, but as opposed to the decentralized model, it’s coordinated and monitored on one platform. The average enterprise has a whopping 178 social accounts, nearly ¼ of which are usually semi-idle and need to be scrapped. The last thing any C-suite needs is to cope with fragmented technologies, solutions and platforms. It’s neither scalable nor strategic. The prepared, effective social enterprise has a technology partner that can quickly and holistically integrate emerging platforms and technologies, such that whatever internal social command structure you’ve set up can continue efficiently executing strategy without skipping a beat. @mikestiles

    Read the article

  • Edit the Windows Live Writer Custom Dictionary

    - by Matthew Guay
    Windows Live Writer is a great tool for writing and publishing posts to your blog, but its spell check unfortunately doesn’t include many common tech words.  Here’s how you can easily edit your custom dictionary and add your favorite words. Customize Live Writer’s Dictionary Adding an individual word to the Windows Live Writer dictionary works as you would expect.  Right-click on a word and select Add to dictionary. And changing the default spell check settings is easy too.  In the menu, click Tools, then Options, and select the Spelling tab in this dialog.  Here you can choose your dictionary language and turn on/off real-time spell checking and other settings. But there’s no obvious way to edit your custom dictionary.  Editing the custom dictionary directly is nice if you accidently add a misspelled word to your dictionary and want to remove it, or if you want to add a lot of words to the dictionary at once. Live Writer actually stores your custom dictionary entries in a plain text file located in your appdata folder.  It is saved as User.dic in the C:\Users\user_name\AppData\Roaming\Windows Live Writer\Dictionaries folder.  The easiest way to open the custom dictionary is to enter the following in the Run box or the address bar of an Explorer window: %appdata%\Windows Live Writer\Dictionaries\User.dic   This will open the User.dic file in your default text editor.  Add any new words to the custom dictionary on separate lines, and delete any misspelled words you accidently added to the dictionary.   Microsoft Office Word also stores its custom dictionary in a plain text file.  If you already have lots of custom words in it and want to import them into Live Writer, enter the following in the Run command or Explorer’s address bar to open Word’s custom dictionary.  Then copy the words, and past them into your Live Writer custom dictionary file. %AppData%\Microsoft\UProof\Custom.dic Don’t forget to save the changes when you’re done.  Note that the changes to the dictionary may not show up in Live Writer’s spell check until you restart the program.  If it’s currently running, save any posts you’re working on, exit, and then reopen, and all of your new words should be in the dictionary. Conclusion Whether you use Live Writer daily in your job or occasionally post an update to a personal blog, adding your own custom words to the dictionary can save you a lot of time and frustration in editing.  Plus, if you’ve accidently added a misspelled word to the dictionary, this is a great way to undo your mistake and make sure your spelling is up to par! Similar Articles Productive Geek Tips Backup Your Windows Live Writer SettingsTransfer or Move Your Microsoft Office Custom DictionaryFuture Date a Post in Windows Live WriterTools to Help Post Content On Your WordPress BlogInstall Windows Live Essentials In Windows 7 TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics How to Add Exceptions to the Windows Firewall

    Read the article

  • FairWarning Privacy Monitoring Solutions Rely on MySQL to Secure Patient Data

    - by Rebecca Hansen
    FairWarning® solutions have audited well over 120 billion events, each of which was processed and stored in a MySQL database. FairWarning is the world's leading supplier of privacy monitoring solutions for electronic health records, relied on by over 1,200 Hospitals and 5,000 Clinics to keep their patients' data safe. In January 2014, FairWarning was awarded the highest commendation in healthcare IT as the first ever Category Leader for Patient Privacy Monitoring in the "2013 Best in KLAS: Software & Services" report[1]. FairWarning has used MySQL as their solutions’ database from their start in 2005 to worldwide expansion and market leadership. FairWarning recently migrated their solutions from MyISAM to InnoDB and updated from MySQL 5.5 to 5.6. Following are some of benefits they’ve had as a result of those changes and reasons for their continued reliance on MySQL (from FairWarning MySQL Case Study). Scalability to Handle Terabytes of Data FairWarning's customers have a lot of data: On average, FairWarning customers receive over 700,000 events to be processed daily. Over 25% of their customers receive over 30 million events per day, which equates to over 1 billion events and nearly one terabyte (TB) of new data each month. Databases range in size from a few hundred GBs to 10+ TBs for enterprise deployments (data are rolled off after 13 months). Low or Zero Admin = Few DBAs "MySQL has not required a lot of administration. After it's been tuned, configured, and optimized for size on initial setup, we have very low administrative costs. I can scale and add more customers without adding DBAs. This has had a big, positive impact on our business.” - Chris Arnold, FairWarning Vice President of Product Management and Engineering. Performance Schema  As the size of FairWarning's customers has increased, so have their tables and data volumes. MySQL 5.6’ new maintenance and management features have helped FairWarning keep up. In particular, MySQL 5.6 performance schema’s low-level metrics have provided critical insight into how the system is performing and why. Support for Mutli-CPU Threads MySQL 5.6' support for multiple concurrent CPU threads, and FairWarning's custom data loader allow multiple files to load into a single table simultaneously vs. one at a time. As a result, their data load time has been reduced by 500%. MySQL Enterprise Hot Backup Because hospitals and clinics never stop, FairWarning solutions can’t either. FairWarning changed from using mysqldump to MySQL Enterprise Hot Backup, which has reduced downtime, restore time, and storage requirements. For many of their larger customers, restore time has decreased by 80%. MySQL Enterprise Edition and Product Roadmap Provide Complete Solution "MySQL's product roadmap fully addresses our needs. We like the fact that MySQL Enterprise Edition has everything included; there's no need to purchase separate modules."  - Chris Arnold Learn More>> FairWarning MySQL Case Study Why MySQL 5.6 is an Even Better Embedded Database for Your Products presentation Updating Your Products to MySQL 5.6, Best Practices for OEMs on-demand webinar (audio and / or slides + Q&A transcript) MyISAM to InnoDB – Why and How on-demand webinar (same stuff) Top 10 Reasons to Use MySQL as an Embedded Database white paper [1] 2013 Best in KLAS: Software & Services report, January, 2014. © 2014 KLAS Enterprises, LLC. All rights reserved.

    Read the article

< Previous Page | 62 63 64 65 66 67 68 69 70 71 72 73  | Next Page >