Search Results

Search found 97855 results on 3915 pages for 'code performance'.

Page 10/3915 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

Improving performance of a particle system (OpenGL ES)

- by Jason

I'm in the process of implementing a simple particle system for a 2D mobile game (using OpenGL ES 2.0). It's working, but it's pretty slow. I start getting frame rate battering after about 400 particles, which I think is pretty low. Here's a summary of my approach: I start with point sprites (GL_POINTS) rendered in a batch just using a native float buffer (I'm in Java-land on Android, so that translates as a java.nio.FloatBuffer). On GL context init, the following are set: GLES20.glViewport(0, 0, width, height); GLES20.glClearColor(0.0f, 0.0f, 0.0f, 0.0f); GLES20.glEnable(GLES20.GL_CULL_FACE); GLES20.glDisable(GLES20.GL_DEPTH_TEST); Each draw frame sets the following: GLES20.glEnable(GLES20.GL_BLEND); GLES20.glBlendFunc(GLES20.GL_ONE, GLES20.GL_ONE_MINUS_SRC_ALPHA); And I bind a single texture: GLES20.glActiveTexture(GLES20.GL_TEXTURE0); GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureHandle); GLES20.glUniform1i(mUniformTextureHandle, 0); Which is just a simple circle with some blur (and hence some transparency) http://cl.ly/image/0K2V2p2L1H2x Then there are a bunch of glVertexAttribPointer calls: mBuffer.position(position); mGlEs20.glVertexAttribPointer(mAttributeRGBHandle, valsPerRGB, GLES20.GL_FLOAT, false, stride, mBuffer); ...4 more of these Then I'm drawing: GLES20.glUniformMatrix4fv(mUniformProjectionMatrixHandle, 1, false, Camera.mProjectionMatrix, 0); GLES20.glDrawArrays(GLES20.GL_POINTS, 0, drawCalls); GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, 0); My vertex shader does have some computation in it, but given that they're point sprites (with only 2 coordinate values) I'm not sure this is the problem: #ifdef GL_ES // Set the default precision to low. precision lowp float; #endif uniform mat4 u_ProjectionMatrix; attribute vec4 a_Position; attribute float a_PointSize; attribute vec3 a_RGB; attribute float a_Alpha; attribute float a_Burn; varying vec4 v_Color; void main() { vec3 v_FGC = a_RGB * a_Alpha; v_Color = vec4(v_FGC.x, v_FGC.y, v_FGC.z, a_Alpha * (1.0 - a_Burn)); gl_PointSize = a_PointSize; gl_Position = u_ProjectionMatrix * a_Position; } My fragment shader couldn't really be simpler: #ifdef GL_ES // Set the default precision to low. precision lowp float; #endif uniform sampler2D u_Texture; varying vec4 v_Color; void main() { gl_FragColor = texture2D(u_Texture, gl_PointCoord) * v_Color; } That's about it. I had read that transparent pixels in point sprites can cause issues, but surely not at only 400 points? I'm running on a fairly new device (12 month old Galaxy Nexus). My question is less about my approach (although I'm open to suggestion) but more about whether there are any specific OpenGL "no no's" that have leaked into my code. I'm sure there's GL master out there facepalming right now... I'd love to hear any critique.

Read the article
FreeBSD performance tuning. Sysctls, loader.conf, kernel.

- by SaveTheRbtz

I wanted to share knowledge of tuning FreeBSD via sysctls, so i'm posting them with comments. Based on Igor Sysoev (author of nginx) presentation about FreeBSD tuning up to 100,000-200,000 active connections. Sysctls are for 7.x FreeBSD. Since 7.2 amd64 some of them are tuned well by default. Prior 7.0 some of them are boot only (set via /boot/loader.conf) or does not exist at all. Highload web server sysctls: # Max. backlog size kern.ipc.somaxconn=4096 # Shared memory // 7.2+ can use shared memory > 2Gb kern.ipc.shmmax=2147483648 # Sockets kern.ipc.maxsockets=204800 # Do not use lager sockbufs on 8.0 # ( http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 ) kern.ipc.maxsockbuf=262144 # Recive clusters (on amd64 7.2+ 65k is default) # For such high value vm.kmem_size must be increased to 3G #kern.ipc.nmbclusters=229376 # Jumbo pagesize(4k/8k) clusters # Used as general packet storage for jumbo frames # can be monitored via `netstat -m` #kern.ipc.nmbjumbop=192000 # Jumbo 9k/16k clusters # If you are using them #kern.ipc.nmbjumbo9=24000 #kern.ipc.nmbjumbo16=10240 # Every socket is a file, so increase them kern.maxfiles=204800 kern.maxfilesperproc=200000 kern.maxvnodes=200000 # Turn off receive autotuning #net.inet.tcp.recvbuf_auto=0 # Small receive space, only usable on http-server, on file server this # should be increased to 65535 or even more #net.inet.tcp.recvspace=8192 # Small send space is useful for http servers that serve small files # Autotuned since 7.x net.inet.tcp.sendspace=16384 # This should be enabled if you going to use big spaces (>64k) #net.inet.tcp.rfc1323=1 # Turn this off on highspeed, lossless connections (LAN 1Gbit+) #net.inet.tcp.delayed_ack=0 # This feature is useful if you are serving data over modems, Gigabit Ethernet, # or even high speed WAN links (or any other link with a high bandwidth delay product), # especially if you are also using window scaling or have configured a large send window. # You can try setting it to 0 on fileserver with 1GBit+ interfaces # Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 ) #net.inet.tcp.inflight.enable=0 # Disable randomizing of ports to avoid false RST # Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf # (it's also says that port randomization auto-disables at some conn.rates, but I didn't tested it thou) #net.inet.ip.portrange.randomized=0 # Increase portrange # For outgoing connections only. Good for seed-boxes and ftp servers. net.inet.ip.portrange.first=1024 net.inet.ip.portrange.last=65535 # Security net.inet.ip.redirect=0 net.inet.ip.sourceroute=0 net.inet.ip.accept_sourceroute=0 net.inet.icmp.maskrepl=0 net.inet.icmp.log_redirect=0 net.inet.icmp.drop_redirect=1 net.inet.tcp.drop_synfin=1 # Security net.inet.udp.blackhole=1 net.inet.tcp.blackhole=2 # Increases default TTL, sometimes useful # Default is 64 net.inet.ip.ttl=128 # Lessen max segment life to conserve resources # ACK waiting time in miliseconds (default: 30000 from RFC) net.inet.tcp.msl=5000 # Max bumber of timewait sockets net.inet.tcp.maxtcptw=40960 # Don't use tw on local connections # As of 15 Apr 2009. Igor Sysoev says that nolocaltimewait has some buggy realization. # So disable it or now till get fixed #net.inet.tcp.nolocaltimewait=1 # FIN_WAIT_2 state fast recycle net.inet.tcp.fast_finwait2_recycle=1 # Time before tcp keepalive probe is sent # default is 2 hours (7200000) #net.inet.tcp.keepidle=60000 # Should be increased until net.inet.ip.intr_queue_drops is zero net.inet.ip.intr_queue_maxlen=4096 # Interrupt handling via multiple CPU, but with context switch. # You can play with it. Default is 1; #net.isr.direct=0 # This is for routers only #net.inet.ip.forwarding=1 #net.inet.ip.fastforwarding=1 # This speed ups dummynet when channel isn't saturated net.inet.ip.dummynet.io_fast=1 # Increase dummynet(4) hash #net.inet.ip.dummynet.hash_size=2048 #net.inet.ip.dummynet.max_chain_len # Should be increased when you have A LOT of files on server # (Increase until vfs.ufs.dirhash_mem becames lower) vfs.ufs.dirhash_maxmem=67108864 # Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification) net.inet.tcp.ecn.enable=1 # Flowtable - flow caching mechanism # Useful for routers #net.inet.flowtable.enable=1 #net.inet.flowtable.nmbflows=65535 # Extreme polling tuning #kern.polling.burst_max=1000 #kern.polling.each_burst=1000 #kern.polling.reg_frac=100 #kern.polling.user_frac=1 #kern.polling.idle_poll=0 # IPFW dynamic rules and timeouts tuning # Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower net.inet.ip.fw.dyn_buckets=65536 net.inet.ip.fw.dyn_max=65536 net.inet.ip.fw.dyn_ack_lifetime=120 net.inet.ip.fw.dyn_syn_lifetime=10 net.inet.ip.fw.dyn_fin_lifetime=2 net.inet.ip.fw.dyn_short_lifetime=10 # Make packets pass firewall only once when using dummynet # i.e. packets going thru pipe are passing out from firewall with accept #net.inet.ip.fw.one_pass=1 # shm_use_phys Wires all shared pages, making them unswappable # Use this to lessen Virtual Memory Manager's work when using Shared Mem. # Useful for databases #kern.ipc.shm_use_phys=1 /boot/loader.conf: # Accept filters for data, http and DNS requests # Usefull when your software uses select() instead of kevent/kqueue or when you under DDoS # DNS accf available on 8.0+ accf_data_load="YES" accf_http_load="YES" accf_dns_load="YES" # Async IO system calls aio_load="YES" # Adds NCQ support in FreeBSD # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ # 8.0+ only #ahci_load= #siis_load= # Increase kernel memory size to 3G. # # Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM # Otherwise panic will happen on next reboot! # # It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc # Useful on highload stateful firewalls, proxies or ZFS fileservers # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #vm.kmem_size="3G" # Older versions of FreeBSD can't tune maxfiles on the fly #kern.maxfiles="200000" # Useful for databases # Sets maximum data size to 1G # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #kern.maxdsiz="1G" # Maximum buffer size(vfs.maxbufspace) # You can check current one via vfs.bufspace # Should be lowered/upped depending on server's load-type # Usually decreased to preserve kmem # (default is 200M) #kern.maxbcache="512M" # Sendfile buffers # For i386 only #kern.ipc.nsfbufs=10240 # syncache Hash table tuning net.inet.tcp.syncache.hashsize=1024 net.inet.tcp.syncache.bucketlimit=100 # Incresed hostcache net.inet.tcp.hostcache.hashsize="16384" net.inet.tcp.hostcache.bucketlimit="100" # TCP control-block Hash table tuning net.inet.tcp.tcbhashsize=4096 # Enable superpages, for 7.2+ only # Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html vm.pmap.pg_ps_enabled=1 # Usefull if you are using Intel-Gigabit NIC #hw.em.rxd=4096 #hw.em.txd=4096 #hw.em.rx_process_limit="-1" # Also if you have ALOT interrupts on NIC - play with following parameters # NOTE: You should set them for every NIC #dev.em.0.rx_int_delay: 250 #dev.em.0.tx_int_delay: 250 #dev.em.0.rx_abs_int_delay: 250 #dev.em.0.tx_abs_int_delay: 250 # There is also multithreaded version of em drivers can be found here: # http://people.yandex-team.ru/~wawa/ # # for additional em monitoring and statistics use # `sysctl dev.em.0.stats=1 ; dmesg` # #Same tunings for igb #hw.igb.rxd=4096 #hw.igb.txd=4096 #hw.igb.rx_process_limit=100 # Some useful netisr tunables. See sysctl net.isr #net.isr.defaultqlimit=4096 #net.isr.maxqlimit: 10240 # Bind netisr threads to CPUs #net.isr.bindthreads=1 # Nicer boot logo =) loader_logo="beastie" And finally here is my additions to GENERIC kernel # Just some of them, see also # cat /sys/{i386,amd64,}/conf/NOTES # This one useful only on i386 #options KVA_PAGES=512 # You can play with HZ in environments with high interrupt rate (default is 1000) # 100 is for my notebook to prolong it's battery life #options HZ=100 # Polling is goot on network loads with high packet rates and low-end NICs # NB! Do not enable it if you want more than one netisr thread #options DEVICE_POLLING # Eliminate datacopy on socket read-write # To take advantage with zero copy sockets you should have an MTU of 8K(amd64) # (4k for i386). This req. is only for receiving data. # Read more in man zero_copy_sockets #options ZERO_COPY_SOCKETS # Support TCP sign. Used for IPSec options TCP_SIGNATURE options IPSEC # This ones can be loaded as modules. They described in loader.conf section #options ACCEPT_FILTER_DATA #options ACCEPT_FILTER_HTTP # Adding ipfw, also can be loaded as modules options IPFIREWALL options IPFIREWALL_VERBOSE options IPFIREWALL_VERBOSE_LIMIT=10 options IPFIREWALL_DEFAULT_TO_ACCEPT options IPFIREWALL_FORWARD # Adding kernel NAT options IPFIREWALL_NAT options LIBALIAS # Traffic shaping options DUMMYNET # Divert, i.e. for userspace NAT options IPDIVERT # This is for OpenBSD's pf firewall device pf device pflog # pf's QoS - ALTQ options ALTQ options ALTQ_CBQ # Class Bases Queuing (CBQ) options ALTQ_RED # Random Early Detection (RED) options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC) options ALTQ_PRIQ # Priority Queuing (PRIQ) options ALTQ_NOPCC # Required for SMP build # Pretty console # Manual can be found here http://forums.freebsd.org/showthread.php?t=6134 #options VESA #options SC_PIXEL_MODE # Disable reboot on Ctrl Alt Del #options SC_DISABLE_REBOOT # Change normal|kernel messages color options SC_NORM_ATTR=(FG_GREEN|BG_BLACK) options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK) # More scroll space options SC_HISTORY_SIZE=8192 # Adding hardware crypto device device crypto device cryptodev # Useful network interfaces device vlan device tap #Virtual Ethernet driver device gre #IP over IP tunneling device if_bridge #Bridge interface device pfsync #synchronization interface for PF device carp #Common Address Redundancy Protocol device enc #IPsec interface device lagg #Link aggregation interface device stf #IPv4-IPv6 port # Also for my notebook, but may be used with Opteron #device amdtemp # Support for ECMP. More than one route for destination # Works even with default route so one can use it as LB for two ISP # For now code is unstable and panics (panic: rtfree 2) on route deletions. #options RADIX_MPATH # Multicast routing #options MROUTING #options PIM # DTrace options KDTRACE_HOOKS # all architectures - enable general DTrace hooks options DDB_CTF # all architectures - kernel ELF linker loads CTF data #options KDTRACE_FRAME # amd64-only # Adaptive spining in lockmgr (8.x+) # See http://www.mail-archive.com/[email protected]/msg10782.html options ADAPTIVE_LOCKMGRS # UTF-8 in console (9.x+) #options TEKEN_UTF8 #options TEKEN_XTERM # NCQ support # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ #options ATA_CAM # FreeBSD 9+ # Deadlock resolver thread # For additional information see http://www.mail-archive.com/[email protected]/msg18124.html #options DEADLKRES PS. Also most of FreeBSD's limits can be monitored by # vmstat -z and # limits PPS. variety of network counters can be monitored via # netstat -s In FreeBSD-9 netstat's -Q option appeared, try following command to display netisr stats # netstat -Q PPPS. also see # man 7 tuning PPPPS. I wanted to thank FreeBSD community, especially author of nginx - Igor Sysoev, nginx-ru@ and FreeBSD-performance@ mailing lists for providing useful information about FreeBSD tuning. So here is the question: What tunings are you using on yours FreeBSD servers? You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc with description of its' meaning (do not copy-paste from sysctl -d). Don't forget to specify server type (web, smb, gateway, etc) Let's share experience!

Read the article
How can a code editor effectively hint at code nesting level - without using indentation?

- by pgfearo

I've written an XML text editor that provides 2 view options for the same XML text, one indented (virtually), the other left-justified. The motivation for the left-justified view is to help users 'see' the whitespace characters they're using for indentation of plain-text or XPath code without interference from indentation that is an automated side-effect of the XML context. I want to provide visual clues (in the non-editable part of the editor) for the left-justified mode that will help the user, but without getting too elaborate. I tried just using connecting lines, but that seemed too busy. The best I've come up with so far is shown in a mocked up screenshot of the editor below, but I'm seeking better/simpler alternatives (that don't require too much code). [Edit] Taking the heatmap idea (from: @jimp) I get this and 3 alternatives - labelled a, b and c: The following section describes the accepted answer as a proposal, bringing together ideas from a number of other answers and comments. As this question is now community wiki, please feel free to update this. NestView The name for this idea which provides a visual method to improve the readability of nested code without using indentation. Contour Lines The name for the differently shaded lines within the NestView The image above shows the NestView used to help visualise an XML snippet. Though XML is used for this illustration, any other code syntax that uses nesting could have been used for this illustration. An Overview: The contour lines are shaded (as in a heatmap) to convey nesting level The contour lines are angled to show when a nesting level is being either opened or closed. A contour line links the start of a nesting level to the corresponding end. The combined width of contour lines give a visual impression of nesting level, in addition to the heatmap. The width of the NestView may be manually resizable, but should not change as the code changes. Contour lines can either be compressed or truncated to keep acheive this. Blank lines are sometimes used code to break up text into more digestable chunks. Such lines could trigger special behaviour in the NestView. For example the heatmap could be reset or a background color contour line used, or both. One or more contour lines associated with the currently selected code can be highlighted. The contour line associated with the selected code level would be emphasized the most, but other contour lines could also 'light up' in addition to help highlight the containing nested group Different behaviors (such as code folding or code selection) can be associated with clicking/double-clicking on a Contour Line. Different parts of a contour line (leading, middle or trailing edge) may have different dynamic behaviors associated. Tooltips can be shown on a mouse hover event over a contour line The NestView is updated continously as the code is edited. Where nesting is not well-balanced assumptions can be made where the nesting level should end, but the associated temporary contour lines must be highlighted in some way as a warning. Drag and drop behaviors of Contour Lines can be supported. Behaviour may vary according to the part of the contour line being dragged. Features commonly found in the left margin such as line numbering and colour highlighting for errors and change state could overlay the NestView. Additional Functionality The proposal addresses a range of additional issues - many are outside the scope of the original question, but a useful side-effect. Visually linking the start and end of a nested region The contour lines connect the start and end of each nested level Highlighting the context of the currently selected line As code is selected, the associated nest-level in the NestView can be highlighted Differentiating between code regions at the same nesting level In the case of XML different hues could be used for different namespaces. Programming languages (such as c#) support named regions that could be used in a similar way. Dividing areas within a nesting area into different visual blocks Extra lines are often inserted into code to aid readability. Such empty lines could be used to reset the saturation level of the NestView's contour lines. Multi-Column Code View Code without indentation makes the use of a multi-column view more effective because word-wrap or horizontal scrolling is less likely to be required. In this view, once code has reach the bottom of one column, it flows into the next one: Usage beyond merely providing a visual aid As proposed in the overview, the NestView could provide a range of editing and selection features which would be broadly in line with what is expected from a TreeView control. The key difference is that a typical TreeView node has 2 parts: an expander and the node icon. A NestView contour line can have as many as 3 parts: an opener (sloping), a connector (vertical) and a close (sloping). On Indentation The NestView presented alongside non-indented code complements, but is unlikely to replace, the conventional indented code view. It's likely that any solutions adopting a NestView, will provide a method to switch seamlessly between indented and non-indented code views without affecting any of the code text itself - including whitespace characters. One technique for the indented view would be 'Virtual Formatting' - where a dynamic left-margin is used in lieu of tab or space characters. The same nesting-level data used to dynamically render the NestView could also used for the more conventional-looking indented view. Printing Indentation will be important for the readability of printed code. Here, the absence of tab/space characters and a dynamic left-margin means that the text can wrap at the right-margin and still maintain the integrity of the indented view. Line numbers can be used as visual markers that indicate where code is word-wrapped and also the exact position of indentation: Screen Real-Estate: Flat Vs Indented Addressing the question of whether the NestView uses up valuable screen real-estate: Contour lines work well with a width the same as the code editor's character width. A NestView width of 12 character widths can therefore accommodate 12 levels of nesting before contour lines are truncated/compressed. If an indented view uses 3 character-widths for each nesting level then space is saved until nesting reaches 4 levels of nesting, after this nesting level the flat view has a space-saving advantage that increases with each nesting level. Note: A minimum indentation of 4 character widths is often recommended for code, however XML often manages with less. Also, Virtual Formatting permits less indentation to be used because there's no risk of alignment issues A comparison of the 2 views is shown below: Based on the above, its probably fair to conclude that view style choice will be based on factors other than screen real-estate. The one exception is where screen space is at a premium, for example on a Netbook/Tablet or when multiple code windows are open. In these cases, the resizable NestView would seem to be a clear winner. Use Cases Examples of real-world examples where NestView may be a useful option: Where screen real-estate is at a premium a. On devices such as tablets, notepads and smartphones b. When showing code on websites c. When multiple code windows need to be visible on the desktop simultaneously Where consistent whitespace indentation of text within code is a priority For reviewing deeply nested code. For example where sub-languages (e.g. Linq in C# or XPath in XSLT) might cause high levels of nesting. Accessibility Resizing and color options must be provided to aid those with visual impairments, and also to suit environmental conditions and personal preferences: Compatability of edited code with other systems A solution incorporating a NestView option should ideally be capable of stripping leading tab and space characters (identified as only having a formatting role) from imported code. Then, once stripped, the code could be rendered neatly in both the left-justified and indented views without change. For many users relying on systems such as merging and diff tools that are not whitespace-aware this will be a major concern (if not a complete show-stopper). Other Works: Visualisation of Overlapping Markup Published research by Wendell Piez, dated from 2004, addresses the issue of the visualisation of overlapping markup, specifically LMNL. This includes SVG graphics with significant similarities to the NestView proposal, as such, they are acknowledged here. The visual differences are clear in the images (below), the key functional distinction is that NestView is intended only for well-nested XML or code, whereas Wendell Piez's graphics are designed to represent overlapped nesting. The graphics above were reproduced - with kind permission - from http://www.piez.org Sources: Towards Hermenutic Markup Half-steps toward LMNL

Read the article
Using the StopWatch class to calculate the execution time of a block of code

- by vik20000in

Many of the times while doing the performance tuning of some, class, webpage, component, control etc. we first measure the current time taken in the execution of that code. This helps in understanding the location in code which is actually causing the performance issue and also help in measuring the amount of improvement by making the changes. This measurement is very important as it helps us understand the problem in code, Helps us to write better code next time (as we have already learnt what kind of improvement can be made with different code) . Normally developers create 2 objects of the DateTime class. The exact time is collected before and after the code where the performance needs to be measured. Next the difference between the two objects is used to know about the time spent in the code that is measured. Below is an example of the sample code. DateTime dt1, dt2; dt1 = DateTime.Now; for (int i = 0; i < 1000000; i++) { string str = "string"; } dt2 = DateTime.Now; TimeSpan ts = dt2.Subtract(dt1); Console.WriteLine("Time Spent : " + ts.TotalMilliseconds.ToString()); The above code works great. But the dot net framework also provides for another way to capture the time spent on the code without doing much effort (creating 2 datetime object, timespan object etc..). We can use the inbuilt StopWatch class to get the exact time spent. Below is an example of the same work with the help of the StopWatch class. Stopwatch sw = Stopwatch.StartNew(); for (int i = 0; i < 1000000; i++) { string str = "string"; } sw.Stop(); Console.WriteLine("Time Spent : " +sw.Elapsed.TotalMilliseconds.ToString()); [Note the StopWatch class resides in the System.Diagnostics namespace] If you use the StopWatch class the time taken for measuring the performance is much better, with very little effort. Vikram

Read the article
Why VM snapshots are affecting performance?

- by Samselvaprabu

I read in one of the VMware KB article says that snapshots will directly proportional to VM performance. But my team keep asking me how snapshots can affect performance. I would like to give them solid reason behind the statement that snapshots are performance killers. Can any one explain a little bit theory behind why actually snapshots are affecting the performance? Is it just because Disk I/O rate of hard disk would be slow?

Read the article
How should code reviews be Carried Out?

- by Graviton

My previous question has to do with how to advance code reviews among the developers. Here I am interested in how a code review session should be carried out, so that both the reviewer and reviewed are feeling comfortable with it. I have done some code reviews before and the experience has been very unpleasant. My previous manager would come to us --on an ad hoc basis-- and tell us to explain our code to him. Since he wasn't very familiar with the code base, whenever he would ask me to explain my code, I'd find myself spending a huge amount of time explaining the most basic structure of my code. As a result, each review would last much too long, and the process would leave both of us exhausted. Once I was done explaining my work, he would continue by raising issues with it. Most of the issues he raised were cosmetic in nature ( e.g, don't use region for this code block, change the variable name from xxx to yyy even though the later makes even less sense, and so on). After trying this process for few rounds, we found the review session didn't derive much benefits for either of us, and we stopped. How would you go about making each code review a natural, enjoyable, thought stimulating, bug-fixing and mutual-learning experience? Also, how frequently you do your code reviews - as soon as the code is checked in? Do you allocate a fixed time every week to do this? What are the guidelines that you follow during your code reviews?

Read the article
Low graphics performance with Intel HD graphics

- by neil

hey, my laptop should be capable of running some games fine but doesn't. Examples are egoboo and tome. http://www.ebuyer.com/product/237739 this is my laptop. I tried the gears test and i only get 60 FPS, on IRC they said thats a big issue and should try the forums. I am using Ubuntu 11.04 and was told I should have the newest drivers. neil@neil-K52F:~$ /usr/lib/nux/unity_support_test --print OpenGL vendor string: Tungsten Graphics, Inc OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile GEM 20100330 DEVELOPMENT OpenGL version string: 2.1 Mesa 7.10.2 Not software rendered: yes Not blacklisted: yes GLX fbconfig: yes GLX texture from pixmap: yes GL npot or rect textures: yes GL vertex program: yes GL fragment program: yes GL vertex buffer object: yes GL framebuffer object: yes GL version is 1.4+: yes Unity supported: yes

Read the article
SQLAuthority News – SQL Server Technical Article – The Data Loading Performance Guide

- by pinaldave

The white paper describes load strategies for achieving high-speed data modifications of a Microsoft SQL Server database. “Bulk Load Methods” and “Other Minimally Logged and Metadata Operations” provide an overview of two key and interrelated concepts for high-speed data loading: bulk loading and metadata operations. After this background knowledge, white paper describe how these methods can be [...]

Read the article
SQLAuthority News – Hyderabad Techies February Fever Feb 11, 2010 – Indexing for Performance

- by pinaldave

I recently presented in Hyderabad User Group on the subject of The Other Side of SQL Server Index: Advanced Solutions to Ancient Problem , you can read more about this event here SQLAuthority News – MUGH – Microsoft User Group Hyderabad – Feb 2, 2010 Session Review. I really had great time talking about Index [...]

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Bad performance with ATI Radeon X1300?

- by stighy

Hi, i'm having problem with Ubuntu 10.04 and my Ati radeon X1300. In particular i can't enable effect (compiz) because they are SLOW, and, for example, the same game (hedgewars) on the same pc run very slowly on Linux, nor in Windows. With my old Ubuntu (9.04) i didn't have the same problem. Does anyone help me to "configure" the right driver for my video card ? I've tested with proprietary (fglrx) and open (xorg..-ati-radeon)... Either give me some problem :(! Thank you!

Read the article
Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 2

- by Tarun Arora

Welcome back, in part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies. In this blog post I’ll get into the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. Tools => Options => Test Tools Have you visited the treasures of Visual Studio Menu bar tools => Options => Test Tools lately? The options to enable disable prompts on creating, editing, deleting or running manual/automated tests can be controller from here. The default test project language and default test types created on a new test project creation could be selected/unselected from here. Ever wondered how you can change the default limit of 25 test results, this can again be changed from here. If you record a lot of Web Tests and wish for the web test recorder to start with “that” URL populated, well this again can be specified from here. If you haven’t so far, I would urge you to spend 2 minutes in the test tools options. Test Menu => Ready Steady Test Action! The Test tools are under the Test Menu in Visual Studio, apart from being able to create a new Test and Test List you can also load an existing vsmdi file. You can also manage your test controllers from here. A solution can have one or more test setting files, but there can only be one active test settings file at any time. Again, this selection can be done from here. You can open the various test windows from under the windows option from the test menu. If you open the Test view window you will see that you have the option to group the tests by work items, project, test type, etc. You can set these properties by right clicking a test in the test list and choosing properties from the context menu. So, what is a vsmdi file? vsmdi stands for Visual Studio Test Metadata File. Placed under the Solution Items this file keeps track of the list of unit tests in your solution. If you open the vsmdi file as an xml file you will see a series of Test Links nested with in the list Test List tags along with the Run Configuration tag. When in visual studio you run tests, the IDE looks at the vsmdi file to see what tests need to be run. You also have the option of using the vsmdi file in your team builds to specify which tests need to run as part of the build. Refer here for a walkthrough from a fellow blogger on how to use the vsmdi file in the team builds. Web Performance Test – The Truth! In Visual Studio 2010 “Web Tests” have been renamed to “Web Performance Tests”. Apart from renaming this test type there have been several improvements to this test type in visual studio 2010. I am very active on the MSDN Visual Studio And Load Testing forum and a frequent question from many users is “Do Web Tests support Pages that run JavaScript?” I will start with a little bit of background before answering this question. Web Performance Tests operate at the HTTP Layer, but why? To enable you to generate high loads with a relatively low amount of hardware, Web performance tests are driven at the protocol layer rather than instantiating a browser.The most common source of confusion is that users do not realize Web Performance Tests work at the HTTP layer. The tool adds to that misconception. After all, you record in IE, and when running a Web test you can select which browser to use, and then the result viewer shows the results in a browser window. So that means the tests run through the browser, right? NO! The Web test engine works at the HTTP layer, and does not instantiate a browser. What does that mean? In the diagram below, you can see there are no browsers running when the engine is sending and receiving requests. Does that mean I can’t test pages that use Java script? The best example for java script generating HTTP traffic is AJAX calls. The most common example of browser plugins are Silverlight or Flash. The Web test recorder will record HTTP traffic from AJAX calls and from most (but not all) browser plugins. This means you will still be able to web performance test pages that use java script or plugin and play back the results but the playback engine will not show the java script or plug in results in the ‘browser control’. If you want to test the page behaviour as a result of the java script or plug in consider using Coded UI Tests. This page looks like it failed, when in fact it succeeded! Looking closely at the response, and subsequent requests, it is clear the operation succeeded. As stated above, the reason why the browser control is pasting this message is because java script has been disabled in this control. So, to reiterate, the web performance test recorder: - Sends and receives data at the HTTP layer. - Does NOT run a browser. - Does NOT run java script. - Does NOT host ActiveX controls or plugins. There is a great series of blog posts from Ed Glas, i would highly recommend his blog to any one performing Load/Performance testing through Visual Studio. Demo – Web Performance Test [Demo] - Visual Studio Ultimate 2010: Test Settings and Configuration [Demo]–Visual Studio Ultimate 2010: Web Performance Test In this short video I try and answer the following questions, Why is performance Testing important? How does Visual Studio Help you performance Test your applications? How do i record a web performance test? How do make a web performance test data driven, transaction driven, loop driven, convert to code, add validations? Best practices for recording Web Performance Tests. I have a web performance test, what next? Creating the Web Performance Test was the first step towards load testing your application. Now that we have the base test we can test the page behaviour when N-users access the page. Have you ever had the head of business call you and mention that the marketing team has done a fantastic job and are expecting increased traffic on the web site, can the website survive the weekend with that additional load? This is the perfect opportunity to capacity test your application to see how your website holds up under various levels of load, you can work the results backwards to see how much hardware you may need to scale up your application to survive the weekend. Apart from that it is always a good idea to have some benchmarks around how the application performs under light loads for short duration, under heavy load for long duration and soak test the application run a constant load for a very week or two to record the effects of constant load for really long durations, this is a great way of identifying how your application handles the default IIS application pool reset which by default is configured to once every 25 hours. These bench marks will act as the perfect yard stick to measure performance gains when you start making improvements. BUT there are some best practices! => Goal Based Load Testing Approach Since the subject is vast and there are a lot of things to measure and analyse, … it is very easy to get distracted from the real goal! You can optimize your application once you know where the pain points are. There is no point performing a load test of 5000 users if your intranet application will only have a 100 simultaneous users, it is important to keep focussed on the real goals of the project. So the idea is to have a user story around your load testing scenarios and test realistically. So it is recommended that you follow the below outline, It is an Iterative process, refine your objectives, identify the key scenarios, what is the expected workload, key metrics you want to report, record the web performance tests, simulate load and analyse results. Is your application already deployed in Production? This is great! You can analyse the IIS Logs to understand the user behaviour… But what are IIS LOGS? The IIS logs allow you to record events for each application and Web site on the Web server. You can create separate logs for each of your applications and Web sites. Logging information in IIS goes beyond the scope of the event logging or performance monitoring features provided by Windows. The IIS logs can include information, such as who has visited your site, what the visitor viewed, and when the information was last viewed. You can use the IIS logs to identify any attempts to gain unauthorized access to your Web server. How to configure IIS LOGS? For those Ninjas who already have IIS Logs configured (by the way its on by default) and need a way to analyse the IIS Logs, can use the Windows IIS Utility – Log Parser. Log Parser is a very powerful tool that provides a generic SQL-like language on top of many types of data like IIS Logs, Event Viewer entries, XML files, CSV files, File System and others; and it allows you to export the result of the queries to many output formats such as CSV, XML, SQL Server, Charts and others; and it works well with IIS 5, 6, 7 and 7.5. Frequently used Log Parser queries. Demo – Load Test [Demo]–Visual Studio Ultimate 2010: Load Testing In this short video I try and answer the following questions, - Types of Performance Testing? - Perform Goal driven Load Testing, analyse Test Run Result and Generate a report? Recap A quick recap of what we have covered so far, Thank you for taking the time out and reading this blog post, in part III of this blog series I’ll be getting into the details of Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, and the Asp.net Profiler. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. See you on in Part III Share this post : CodeProject

Read the article
Improving the performance of a db import process

- by mmr

I have a program in Microsoft Access that processes text and also inserts data in MySQL database. This operation takes 30 mins or less to finished. I translated it into VB.NET and it takes 2 hours to finish. The program goes like this: A text file contains individual swipe from a corresponding person, it contains their id, time and date of swipe in the machine, and an indicator if it is a time-in or a time-out. I process this text, segregate the information and insert the time-in and time-out per row. I also check if there are double occurrences in the database. After checking, I simply merge the time-in and time-out of the corresponding person into one row only. This process takes 2 hours to finished in VB.NET considering I have a table to compare which contains 600,000+ rows. Now, I read in the internet that python is best in text processing, i already have a test but i doubt in database operation. What do you think is the best programming language for this kind of problem? How can I speed up the process? My first idea was using python instead of VB.NET, but since people here telling me here on SO that this most probably won't help I am searching for different solutions.

Read the article
Enhancing performance in Entity Framework applications by precompiling LINQ to Entities queries

- by nikolaosk

This is going to be the tenth post of a series of posts regarding ASP.Net and the Entity Framework and how we can use Entity Framework to access our datastore. You can find the first one here , the second one here , the third one here , the fourth one here , the fifth one here ,the sixth one here ,the seventh one here ,the eighth one here and the ninth one here . I have a post regarding ASP.Net and EntityDataSource . You can read it here .I have 3 more posts on Profiling Entity Framework applications...(read more)

Read the article
high performance with xen, vmware or virtualbox

- by Marchosius

I was wondering which is the best method to go about if I want to play win based games. I do not want to go with the dual boot method as this will cost me time to restart, login and run a os to do my work or pass the time, and some of my apps rely on win and my graphics to run. for example Daz3d, Photoshop, Flash etc. Now I read about HVM(hardware virtual machines) and then I know about the 3D virtualisation of VMware and VirtualBox. How ever the 2 later virtualise the 3D not using the full power of the GPU. So this option wont perform perfect for latest games like D3. I was wondering if anyone have experience in HVM(like xen if i am not mistaken) and tried something similar to access the full power of the GPU and successfully run newer games and other products relying on the GPU? Will be the first time setting up a HVM, no experience in this so don't know what to expect. This will help a lot as I do not want to revert back to win or as mentioned do dual boot.

Read the article
Linked servers and performance impact: Direction matters!

- by Linchi Shea

When you have some data on a SQL Server instance (say SQL01) and you want to move the data to another SQL Server instance (say SQL02) through openquery(), you can either push the data from SQL01, or pull the data from SQL02. To push the data, you can run a SQL script like the following on SQL01, which is the source server: -- The push script -- Run this on SQL01 use testDB go insert openquery(SQL02, 'select * from testDB.dbo.target_table') select * from source_table; To pull the data, you can run...(read more)

Read the article
IOS OpenGl transparency performance issue

- by user346443

I have built a game in Unity that uses OpenGL ES 1.1 for IOS. I have a nice constant frame rate of 30 until i place a semi transparent texture over the top on my entire scene. I expect the drop in frames is due to the blending overhead with sorting the frame buffer. On 4s and 3gs the frames stay at 30 but on the iPhone 4 the frame rate drops to 15-20. Probably due to the extra pixels in the retina compared to the 3gs and smaller cpu/gpu compared to the 4s. I would like to know if there is anything i can do to try and increase the frame rate when a transparent texture is rendered on top of the entire scene. Please not the the transparent texture overlay is a core part of the game and i can't disable anything else in the scene to speed things up. If its guaranteed to make a difference I guess I can switch to OpenGl ES 2.0 and write the shaders but i would prefer not to as i need to target older devices. I should add that the depth buffer is disabled and I'm blending using SrcAlpha One. Any advice would be highly appreciated. Cheers

Read the article
Slow Firefox Javascript Canvas Performance?

- by jujumbura

As a followup from a previous post, I have been trying to track down some slowdown I am having when drawing a scene using Javascript and the canvas element. I decided to narrow down my focus to a REALLY barebones animation that only clears the canvas and draws a single image, once per-frame. This of course runs silky smooth in Chrome, but it still stutters in Firefox. I added a simple FPS calculator, and indeed it appears that my page is typically getting an FPS in the 50's when running Firefox. This doesn't seem right to me, I must be doing something wrong here. Can anybody see anything I might be doing that is causing this drop in FPS? <!DOCTYPE HTML> <html> <head> </head> <body bgcolor=silver> <canvas id="myCanvas" width="600" height="400"></canvas> <img id="myHexagon" src="Images/Hexagon.png" style="display: none;"> <script> window.requestAnimFrame = (function(callback) { return window.requestAnimationFrame || window.webkitRequestAnimationFrame || window.mozRequestAnimationFrame || window.oRequestAnimationFrame || window.msRequestAnimationFrame || function(callback) { window.setTimeout(callback, 1000 / 60); }; })(); var animX = 0; var frameCounter = 0; var fps = 0; var time = new Date(); function animate() { var canvas = document.getElementById("myCanvas"); var context = canvas.getContext("2d"); context.clearRect(0, 0, canvas.width, canvas.height); animX += 1; if (animX == canvas.width) { animX = 0; } var image = document.getElementById("myHexagon"); context.drawImage(image, animX, 128); context.lineWidth=1; context.fillStyle="#000000"; context.lineStyle="#ffffff"; context.font="18px sans-serif"; context.fillText("fps: " + fps, 20, 20); ++frameCounter; var currentTime = new Date(); var elapsedTimeMS = currentTime - time; if (elapsedTimeMS >= 1000) { fps = frameCounter; frameCounter = 0; time = currentTime; } // request new frame requestAnimFrame(function() { animate(); }); } window.onload = function() { animate(); }; </script> </body> </html>

Read the article
Improving SpriteBatch performance for tiles

- by Richard Rast

I realize this is a variation on what has got to be a common question, but after reading several (good answers) I'm no closer to a solution here. So here's my situation: I'm making a 2D game which has (among some other things) a tiled world, and so, drawing this world implies drawing a jillion tiles each frame (depending on resolution: it's roughly a 64x32 tile with some transparency). Now I want the user to be able to maximize the game (or fullscreen mode, actually, as its a bit more efficient) and instead of scaling textures (bleagh) this will just allow lots and lots of tiles to be shown at once. Which is great! But it turns out this makes upward of 2000 tiles on the screen each time, and this is framerate-limiting (I've commented out enough other parts of the game to make sure this is the bottleneck). It gets worse if I use multiple source rectangles on the same texture (I use a tilesheet; I believe changing textures entirely makes things worse), or if you tint the tiles, or whatever. So, the general question is this: What are some general methods for improving the drawing of thousands of repetitive sprites? Answers pertaining to XNA's SpriteBatch would be helpful but I'm equally happy with general theory. Also, any tricks pertaining to this situation in particular (drawing a tiled world efficiently) are also welcome. I really do want to draw all of them, though, and I need the SpriteMode.BackToFront to be active, because

Read the article
Should I reorder partitions regarding performance

- by Marcel

I have in principal 3 big partitions on my Netbook. One Windows, one for shared files, one for Ubuntu. I recently find out (using hdparm) the the hardisk seems to have much better perfomance on the first 2/3 (~ 60MB/s) than on the last 1/3 (~ 40MB/s). I am thinking to delete the second partition and create new partitions for "swap" and / directly after Windows. Does this effort make sense? I also wanna upgrade to 10.4/10.10 but keep the option to go back to the old system, so maybe I install ubuntu completely in a/this new partition?

Read the article
Ubuntu with KDE and kernel 3.6.2 - performance issues

- by Pelda

I have recently swiched to Ubuntu and installed KDe on it. I like software included in Ubuntu but interface from Kubuntu. My problem is, that after installation of kernel 3.6.2 (deafult ubuntu 12.04 is 3.2 I think) - whole KDE interface is laggy and I have to render using Xrended because Opel AL doesnt work. So please tell me - I didint find it anywhere - Does KDE has some problems with new kernel? should i downgrade back to 3.x.x? Thank you for answers and your time. Pelda

Read the article
Fresh install 12.04 Performance & Stability: SONY VPCCW2S1E

- by George Katsanos

just yesterday I installed 12.04. Is it normal that installing a package (Chrome for example) takes 5 to 10 minutes (compared to Windows 7 for example which takes 2 or 3)? (with end result: installation failed) Otherwise is it normal that while the system is doing some installation,extracting packs and so on, other applications often become unresponsive? Sidenote: I don't care about fancy desktop effects, I installed ubuntu to go on and experiment with web servers, memcache, Varnish and git/svn. So basically I plan to do lots of console-only operations. My surprise was also the difference of stability compared to an old FreeBSD installation I had on a dinosaur P3 550Mhz :) (I am on a SONY VPCCW2S1E) (I guess it might have to do something with my problems?)

Read the article
Google bots are severely affecting site performance

- by Lynn

I have an aggregate site on a linux server that pulls in feeds from a universe of about 2,000 blogs. It's in Wordpress 3.4.2 and I have a cron job that is staggered to run five times an hour on another server to pull in the stories and then publish them to the front page of this site. This is so I didn't put too much pressure all on one server. However, the Google bots, which visit a few times every hour bring the server to its knees in the morning and evenings when there is an increase in traffic on the site. The bots have something like 30,000 links to follow at this point. How do I throttle the bots to simply grab the new stories off the front page and stop there? EDIT- Details of my server configuration: The way we have this set up is the server that handles all the publishing is an unmanaged instance via AWS. It mounts the NFS server and connects to the RDS to update content, etc. You get to this publishing instance via a plugin that detects the wp-admin link and then redirects you into there. The front end app server also mounts the NFS and requests data from the RDS. It is the only one that has the WP Super Cache on it.... The OS is Ubuntu on the App server and the NFS runs CentOs. The front end is Nginx and the publishing server is Apache.

Read the article
Optimizing hash lookup & memory performance in Go

- by Moishe

As an exercise, I'm implementing HashLife in Go. In brief, HashLife works by memoizing nodes in a quadtree so that once a given node's value in the future has been calculated, it can just be looked up instead of being re-calculated. So eg. if you have a node at the 8x8 level, you remember it by its four children (each at the 2x2 level). So next time you see an 8x8 node, when you calculate the next generation, you first check if you've already seen a node with those same four children. This is extended up through all levels of the quadtree, which gives you some pretty amazing optimizations if eg. you're 10 levels above the leaves. Unsurprisingly, it looks like the perfmance crux of this is the lookup of nodes by child-node values. Currently I have a hashmap of {&upper_left_node,&upper_right_node,&lower_left_node,&lower_right_node} -> node So my lookup function is this: func FindNode(ul, ur, ll, lr *Node) *Node { var node *Node var ok bool nc := NodeChildren{ul, ur, ll, lr} node, ok = NodeMap[nc] if ok { return node } node = &Node{ul, ur, ll, lr, 0, ul.Level + 1, nil} NodeMap[nc] = node return node } What I'm trying to figure out is if the "nc := NodeChildren..." line causes a memory allocation each time the function is called. If it does, can I/should I move the declaration to the global scope and just modify the values each time this function is called? Or is there a more efficient way to do this? Any advice/feedback would be welcome. (even coding style nits; this is literally the first thing I've written in Go so I'd love any feedback)

Read the article
Performance cost of running Ubuntu from external hard drive

- by dandan78

A friend just complained to me about Ubuntu being slow. Although I've noticed a certain lack of snappiness with Linux vs Windows in the past, I really can't say I've had much to grumble about with the recent distributions of Ubuntu. That said, his objections seem much worse than the ones I used to have and I know that his current setup is significantly more powerful than my laptop. And then it turned out he is running Ubuntu off an external HDD hooked up via USB2.0. The HD enclosure is USB3.0 but apparently he can't manage to get it to boot on USB3.0 so he switched to one of the USB2.0 ports or whatever and that works, albeit not very well. Now I would expect USB to add some overhead to communication between the computer and the HDD; SATA is after all designed to get the maximum out of a hard drive, whereas USB is, well, universal. What are your expreriences with booting off external HDDs? Edit: Does anybody know just how much of a slowdown can be expected?

Read the article

Search Results

Search found 97855 results on 3915 pages for 'code performance'.

Page 10/3915 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by Jason

- by SaveTheRbtz

- by pgfearo

- by vik20000in

- by Samselvaprabu

- by Graviton

- by neil

- by pinaldave

- by pinaldave

- by FransBouma

- by stighy

- by Tarun Arora

- by mmr

- by nikolaosk

- by Marchosius

- by Linchi Shea

- by user346443

- by jujumbura

- by Richard Rast

- by Marcel

- by Pelda

- by George Katsanos

- by Lynn

- by Moishe

- by dandan78

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >