Search Results

Search found 13151 results on 527 pages for 'performance counters'.

Page 43/527 | < Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >

  • What criteria would I use SQL Stream Insight vs TPL Dataflow [closed]

    - by makerofthings7
    There is an add-in to the Task Parallel Library (TPL) called TPL Dataflow that allows a variety of data processing scenarios. It seems that there are some parallels to the SQL Stream Insight product, however since SQL's Stream Insight has some interesting licensing around it, and it has a better performance depending on what license I get... I found myself asking myself should I use TPL Dataflow and not have any licensing issues, and possibly better performance. Can anyone tell me if performance is a valid criteria for comparing SQL Stream Insight vs TPL Dataflow? What other criteria should I be looking at when comparing the two?

    Read the article

  • Can a call to WaitHandle.SignalAndWait be ignored for performance profiling purposes?

    - by Dan Tao
    I just downloaded the trial version of ANTS Performance Profiler from Red Gate and am investigating some of my team's code. Immediately I notice that there's a particular section of code that ANTS is reporting as eating up to 99% CPU time. I am completely unfamiliar with ANTS or performance profiling in general (that is, aside from self-profiling using what I'm sure are extremely crude and frowned-upon methods such as double timeToComplete = (endTime - startTime).TotalSeconds), so I'm still fiddling around with the application and figuring out how it's used. But I did call the developer responsible for the code in question and his immediate reaction was "Yeah, that doesn't surprise me that it says that; but that code calls SignalAndWait [which I could see for myself, thanks to ANTS], which doesn't use any CPU, it just sits there waiting for something to do." He advised me to simply ignore that code and look for anything ELSE I could find. My question: is it true that SignalAndWait requires NO CPU overhead (and if so, how is this possible?), and is it reasonable that a performance profiler would view it as taking up 99% CPU time? I find this particularly curious because, if it's at 99%, that would suggest that our application is often idle, wouldn't it? And yet its performance has become rather sluggish lately. Like I said, I really am just a beginner when it comes to this tool, and I don't know anything about the WaitHandle class. So ANY information to help me to understand what's going on here would be appreciated.

    Read the article

  • What is the performance impact of CSS's universal selector?

    - by Bungle
    I'm trying to find some simple client-side performance tweaks in a page that receives millions of monthly pageviews. One concern that I have is the use of the CSS universal selector (*). As an example, consider a very simple HTML document like the following: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> <title>Example</title> <style type="text/css"> * { margin: 0; padding: 0; } </head> <body> <h1>This is a heading</h1> <p>This is a paragraph of text.</p> </body> </html> The universal selector will apply the above declaration to the body, h1 and p elements, since those are the only ones in the document. In general, would I see better performance from a rule such as: body, h1, p { margin: 0; padding: 0; } Or would this have exactly the same net effect? Essentially, what I'm asking is if these rules are effectively equivalent in this case, or if the universal selector has to perform more unnecessary work that I may not be aware of. I realize that the performance impact in this example may be very small, but I'm hoping to learn something that may lead to more significant performance improvements in real-world situations. Thanks for any help!

    Read the article

  • Is there a IDE/compiler PC benchmark I can use to compare my PCs performance?

    - by RickL
    I'm looking for a benchmark (and results on other PCs) which would give me an idea of the development performance gain I could get by upgrading my PC, also the benchmark could be used to justify the upgrade to my boss. I use Visual Studio 2008 for my development, so I'd like to get an idea of by what factor the build times would be improved, and also it would be good if the benchmark could incorporate IDE performance (i.e. when editing, using intellisense, opening code files etc) into its result. I currently have an AMD 3800x2, with 2GB RAM on Vista 32. For example, I'd like to know what kind of performance gain I'd see in Visual Studio 2008 with a Q6600, 4GB RAM on Vista 64. And also with other processors, and other RAM sizes... also see whether hard disk performance is a big factor. EDIT: I mentioned Vista 64 because I'm aware that Vista 32 can only use 3GB RAM maximum. So I'd presume that wanting to use more RAM would require Vista 64, but perhaps it could still be slower overall there is a large overhead in using the 32 bit VS 2008 on 64 bit OS.

    Read the article

  • Performance of stored proc when updating columns selectively based on parameters?

    - by kprobst
    I'm trying to figure out if this is relatively well-performing T-SQL (this is SQL Server 2008). I need to create a stored procedure that updates a table. The proc accepts as many parameters as there are columns in the table, and with the exception of the PK column, they all default to NULL. The body of the procedure looks like this: CREATE PROCEDURE proc_repo_update @object_id bigint ,@object_name varchar(50) = NULL ,@object_type char(2) = NULL ,@object_weight int = NULL ,@owner_id int = NULL -- ...etc AS BEGIN update object_repo set object_name = ISNULL(@object_name, object_name) ,object_type = ISNULL(@object_type, object_type) ,object_weight = ISNULL(@object_weight, object_weight) ,owner_id = ISNULL(@owner_id, owner_id) -- ...etc where object_id = @object_id return @@ROWCOUNT END So basically: Update a column only if its corresponding parameter was provided, and leave the rest alone. This works well enough, but as the ISNULL call will return the value of the column if the received parameter was null, will SQL Server optimize this somehow? This might be a performance bottleneck on the application where the table might be updated heavily (insertion will be uncommon so the performance there is not a problem). So I'm trying to figure out what's the best way to do this. Is there a way to condition the column expressions with something like CASE WHEN or something? The table will be indexed up the wazoo as well for read performance. Is this the best approach? My alternative at this point is to create the UPDATE expression in code (e.g. inline SQL) and execute it against the server. This would solve my doubts about performance, but I'd rather leave this in a stored proc if possible.

    Read the article

  • Recent improvements in Console Performance

    - by loren.konkus
    Recently, the WebLogic Server development and support organizations have worked with a number of customers to quantify and improve the performance of the Administration Console in large, distributed configurations where there is significant latency in the communications between the administration server and managed servers. These improvements fall into two categories: Constraining the amount of time that the Console stalls waiting for communication Reducing and streamlining the amount of data required for an update A few releases ago, we added support for a configurable domain-wide mbean "Invocation Timeout" value on the Console's configuration: general, advanced section for a domain. The default value for this setting is 0, which means wait indefinitely and was chosen for compatibility with the behavior of previous releases. This configuration setting applies to all mbean communications between the admin server and managed servers, and is the first line of defense against being blocked by a stalled or completely overloaded managed server. Each site should choose an appropriate timeout value for their environment and network latency. In the next release of WebLogic Server, we've added an additional console preference, "Management Operation Timeout", to the Console's shared preference page. This setting further constrains how long certain console pages will wait for slowly responding servers before returning partial results. While not all Console pages support this yet, key pages such as the Servers Configuration and Control table pages and the Deployments Control pages have been updated to support this. For example, if a user requests a Servers Table page and a Management Operation Timeout occurs, the table is displayed with both local configuration and remote runtime information from the responding managed servers and only local configuration information for servers that did not yet respond. This means that a troublesome managed server does not impede your ability to manage your domain using the Console. To support these changes, these Console pages have been re-written to use the Work Management feature of WebLogic Server to interact with each server or deployment concurrently, which further improves the responsiveness of these pages. The basic algorithm for these pages is: For each configuration mbean (ie, Servers) populate rows with configuration attributes from the fast, local mbean server Find a WorkManager For each server, Create a Work instance to obtain runtime mbean attributes for the server Schedule Work instance in the WorkManager Call WorkManager.waitForAll to wait WorkItems to finish, constrained by Management Operation Timeout For each WorkItem, if the runtime information obtained was not complete, add a message indicating which server has incomplete data Display collected data in table In addition to these changes to constrain how long the console waits for communication, a number of other changes have been made to reduce the amount and scope of managed server interactions for key pages. For example, in previous releases the Deployments Control table looked at the status of a deployment on every managed server, even those servers that the deployment was not currently targeted on. (This was done to handle an edge case where a deployment's target configuration was changed while it remained running on previously targeted servers.) We decided supporting that edge case did not warrant the performance impact for all, and instead only look at the status of a deployment on the servers it is targeted to. Comprehensive status continues to be available if a user clicks on the 'status' field for a deployment. Finally, changes have been made to the System Status portlet to reduce its impact on Console page display times. Obtaining health information for this display requires several mbean interactions with managed servers. In previous releases, this mbean interaction occurred with every display, and any delay or impediment in these interactions was reflected in the display time for every page. To reduce this impact, we've made several changes in this portlet: Using Work Management to obtain health concurrently Applying the operation timeout configuration to constrain how long we will wait Caching health information to reduce the cost during rapid navigation from page to page and only obtaining new health information if the previous information is over 30 seconds old. Eliminating heath collection if this portlet is minimized. Together, these Console changes have resulted in significant performance improvements for the customers with large configurations and high latency that we have worked with during their development, and some lesser performance improvements for those with small configurations and very fast networks. These changes will be included in the 11g Rel 1 patch set 2 (10.3.3.0) release of WebLogic Server.

    Read the article

  • How do I find the cause for a huge difference in performance between two identical Ubuntu servers?

    - by the.duckman
    I am running two Dell R410 servers in the same rack of a data center. Both have the same hardware configuration, run Ubuntu 10.4, have the same packages installed and run the same Java web servers. No other load. One of them is 20-30% faster than the other, very consistently. I used dstat to figure out, if there are more context switches, IO, swapping or anything, but I see no reason for the difference. With the same workload, (no swapping, virtually no IO), the cpu usage and load is higher on one server. So the difference appears to be mainly CPU bound, but while a simple cpu benchmark using sysbench (with all other load turned off) did yield a difference, it was only 6%. So maybe it is not only CPU but also memory performance. I tried to figure out if the BIOS settings differ in some parameter, did a dump using dmidecode, but that yielded no difference. I compared /proc/cpuinfo, no difference. I compared the output of cpufreq-info, no difference. I am lost. What can I do, to figure out, what is going on?

    Read the article

  • mysql - moving to a lower performance server, how small can I go?

    - by pedalpete
    I've been running a site for a few years now which really isn't growing in traffic, and I want to save some money on hosting, but keep it going for the loyal users of the site and api. The database has one a nearly 4 million row table, and on a 4gb dual xeon 5320 server. When I check server stats on this server with ps -aux, i get returns of mysql running at about 11% capacity, so no serious load. The main query against mysql runs in about 0.45 seconds. I popped over to linode.com to see what kind of performance I could get out of one of their tiny boxes, and their 360mb ram XEN vps returns the same query in 20 seconds. Clearly not good enough. I've looked at the mysql variables, and they are both very similar (I've included the show variables output below, if anybody is interested). Is there a good way to decide on what size server is needed based on what I'm coming from? Is it RAM that is likely making the difference with the large table size? Is there a way for me to figure out how much ram would be ideal?? Here's the output of the show variables (though I'm not sure it is important). +---------------------------------+------------------------------------------------------------+ | Variable_name | Value | +---------------------------------+------------------------------------------------------------+ | auto_increment_increment | 1 | | auto_increment_offset | 1 | | automatic_sp_privileges | ON | | back_log | 50 | | basedir | /usr/ | | bdb_cache_size | 8384512 | | bdb_home | /var/lib/mysql/ | | bdb_log_buffer_size | 262144 | | bdb_logdir | | | bdb_max_lock | 10000 | | bdb_shared_data | OFF | | bdb_tmpdir | /tmp/ | | binlog_cache_size | 32768 | | bulk_insert_buffer_size | 8388608 | | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | | collation_connection | latin1_swedish_ci | | collation_database | latin1_swedish_ci | | collation_server | latin1_swedish_ci | | completion_type | 0 | | concurrent_insert | 1 | | connect_timeout | 10 | | datadir | /var/lib/mysql/ | | date_format | %Y-%m-%d | | datetime_format | %Y-%m-%d %H:%i:%s | | default_week_format | 0 | | delay_key_write | ON | | delayed_insert_limit | 100 | | delayed_insert_timeout | 300 | | delayed_queue_size | 1000 | | div_precision_increment | 4 | | keep_files_on_create | OFF | | engine_condition_pushdown | OFF | | expire_logs_days | 0 | | flush | OFF | | flush_time | 0 | | ft_boolean_syntax | + - For some reason, that table formats properly in the preview, but apparently not when viewing the question. Hopefully it isn't needed anyway.

    Read the article

  • Windows 7 host with Ubuntu Guest and a performance hit, memory locks?

    - by Cyrylski
    I have a brand new Lenovo T510 with Core i5 and 4GB of RAM with Windows 7 on it. I Installed Ubuntu 10.10 in a Virtualbox. For some reason system gets really slow on this setup which makes me really angry. There's a video card shared with full 3D support enabled and 1GB of RAM allocated for the Ubuntu machine. It may sound stupid, but WHY is the whole memory consumed in an instant when I run Virtualbox? I struggled for like 10 minutes restraining myself from a brutal reset, and now everything runs smooth but memory "in use" in Resource Monitor is 3GB flat with only Chrome running. I'm new to Windows 7, but I'm really disappointed with performance at this point... I used to work in a different environment with much slower hardware and there was no such problem (WinXP over Ubuntu, 1GB out of 2GB allocated for WinXP guest on intel GMA). This is, until I clogged RAM totally there. But I was capable of running Chrome, Firefox and Apache server on a 1GB RAM in Ubuntu there and Photoshop CS4 on Windows XP and it worked. In this case I can't go beyond setting up Ubuntu properly. I bet I'm doing something wrong.

    Read the article

  • Save object states in .data or attr - Performance vs CSS?

    - by Neysor
    In response to my answer yesterday about rotating an Image, Jamund told me to use .data() instead of .attr() First I thought that he is right, but then I thought about a bigger context... Is it always better to use .data() instead of .attr()? I looked in some other posts like what-is-better-data-or-attr or jquery-data-vs-attrdata The answers were not satisfactory for me... So I moved on and edited the example by adding CSS. I thought it might be useful to make a different Style on each image if it rotates. My style was the following: .rp[data-rotate="0"] { border:10px solid #FF0000; } .rp[data-rotate="90"] { border:10px solid #00FF00; } .rp[data-rotate="180"] { border:10px solid #0000FF; } .rp[data-rotate="270"] { border:10px solid #00FF00; } Because design and coding are often separated, it could be a nice feature to handle this in CSS instead of adding this functionality into JavaScript. Also in my case the data-rotate is like a special state which the image currently has. So in my opinion it make sense to represent it within the DOM. I also thought this could be a case where it is much better to save with .attr() then with .data(). Never mentioned before in one of the posts I read. But then i thought about performance. Which function is faster? I built my own test following: <!DOCTYPE HTML> <html> <head> <title>test</title> <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script> <script type="text/javascript"> function runfirst(dobj,dname){ console.log("runfirst "+dname); console.time(dname+"-attr"); for(i=0;i<10000;i++){ dobj.attr("data-test","a"+i); } console.timeEnd(dname+"-attr"); console.time(dname+"-data"); for(i=0;i<10000;i++){ dobj.data("data-test","a"+i); } console.timeEnd(dname+"-data"); } function runlast(dobj,dname){ console.log("runlast "+dname); console.time(dname+"-data"); for(i=0;i<10000;i++){ dobj.data("data-test","a"+i); } console.timeEnd(dname+"-data"); console.time(dname+"-attr"); for(i=0;i<10000;i++){ dobj.attr("data-test","a"+i); } console.timeEnd(dname+"-attr"); } $().ready(function() { runfirst($("#rp4"),"#rp4"); runfirst($("#rp3"),"#rp3"); runlast($("#rp2"),"#rp2"); runlast($("#rp1"),"#rp1"); }); </script> </head> <body> <div id="rp1">Testdiv 1</div> <div id="rp2" data-test="1">Testdiv 2</div> <div id="rp3">Testdiv 3</div> <div id="rp4" data-test="1">Testdiv 4</div> </body> </html> It should also show if there is a difference with a predefined data-test or not. One result was this: runfirst #rp4 #rp4-attr: 515ms #rp4-data: 268ms runfirst #rp3 #rp3-attr: 505ms #rp3-data: 264ms runlast #rp2 #rp2-data: 260ms #rp2-attr: 521ms runlast #rp1 #rp1-data: 284ms #rp1-attr: 525ms So the .attr() function did always need more time than the .data() function. This is an argument for .data() I thought. Because performance is always an argument! Then I wanted to post my results here with some questions, and in the act of writing I compared with the questions Stack Overflow showed me (similar titles) And true enough, there was one interesting post about performance I read it and run their example. And now I am confused! This test showed that .data() is slower then .attr() !?!! Why is that so? First I thought it is because of a different jQuery library so I edited it and saved the new one. But the result wasn't changing... So now my questions to you: Why are there some differences in the performance in these two examples? Would you prefer to use data- HTML5 attributes instead of data, if it represents a state? Although it wouldn't be needed at the time of coding? Why - Why not? Now depending on the performance: Would performance be an argument for you using .attr() instead of data, if it shows that .attr() is better? Although data is meant to be used for .data()? UPDATE 1: I did see that without overhead .data() is much faster. Misinterpreted the data :) But I'm more interested in my second question. :) Would you prefer to use data- HTML5 attributes instead of data, if it represents a state? Although it wouldn't be needed at the time of coding? Why - Why not? Are there some other reasons you can think of, to use .attr() and not .data()? e.g. interoperability? because .data() is jquery style and HTML Attributes can be read by all... UPDATE 2: As we see from T.J Crowder's speed test in his answer attr is much faster then data! which is again confusing me :) But please! Performance is an argument, but not the highest! So give answers to my other questions please too!

    Read the article

  • Trending Buffer Pool Performance Using DMV sys.dm_os_performance_counters

    I'd seen you posted a tip on capturing SQL based PerfMon counters using sys.dm_os_performance_counters. What queries can I run against those stored results that would allow me to examine memory usage on my SQL instance? Join SQL Backup’s 35,000+ customers to compress and strengthen your backups "SQL Backup will be a REAL boost to any DBA lucky enough to use it." Jonathan Allen. Download a free trial now.

    Read the article

  • Troubleshooting High-CPU Utilization for SQL Server

    - by Susantha Bathige
    The objective of this FAQ is to outline the basic steps in troubleshooting high CPU utilization on  a server hosting a SQL Server instance. The first and the most common step if you suspect high CPU utilization (or are alerted for it) is to login to the physical server and check the Windows Task Manager. The Performance tab will show the high utilization as shown below: Next, we need to determine which process is responsible for the high CPU consumption. The Processes tab of the Task Manager will show this information: Note that to see all processes you should select Show processes from all user. In this case, SQL Server (sqlserver.exe) is consuming 99% of the CPU (a normal benchmark for max CPU utilization is about 50-60%). Next we examine the scheduler data. Scheduler is a component of SQLOS which evenly distributes load amongst CPUs. The query below returns the important columns for CPU troubleshooting. Note – if your server is under severe stress and you are unable to login to SSMS, you can use another machine’s SSMS to login to the server through DAC – Dedicated Administrator Connection (see http://msdn.microsoft.com/en-us/library/ms189595.aspx for details on using DAC) SELECT scheduler_id ,cpu_id ,status ,runnable_tasks_count ,active_workers_count ,load_factor ,yield_count FROM sys.dm_os_schedulers WHERE scheduler_id See below for the BOL definitions for the above columns. scheduler_id – ID of the scheduler. All schedulers that are used to run regular queries have ID numbers less than 1048576. Those schedulers that have IDs greater than or equal to 1048576 are used internally by SQL Server, such as the dedicated administrator connection scheduler. cpu_id – ID of the CPU with which this scheduler is associated. status – Indicates the status of the scheduler. runnable_tasks_count – Number of workers, with tasks assigned to them that are waiting to be scheduled on the runnable queue. active_workers_count – Number of workers that are active. An active worker is never preemptive, must have an associated task, and is either running, runnable, or suspended. current_tasks_count - Number of current tasks that are associated with this scheduler. load_factor – Internal value that indicates the perceived load on this scheduler. yield_count – Internal value that is used to indicate progress on this scheduler.                                                                 Now to interpret the above data. There are four schedulers and each assigned to a different CPU. All the CPUs are ready to accept user queries as they all are ONLINE. There are 294 active tasks in the output as per the current_tasks_count column. This count indicates how many activities currently associated with the schedulers. When a  task is complete, this number is decremented. The 294 is quite a high figure and indicates all four schedulers are extremely busy. When a task is enqueued, the load_factor  value is incremented. This value is used to determine whether a new task should be put on this scheduler or another scheduler. The new task will be allocated to less loaded scheduler by SQLOS. The very high value of this column indicates all the schedulers have a high load. There are 268 runnable tasks which mean all these tasks are assigned a worker and waiting to be scheduled on the runnable queue.   The next step is  to identify which queries are demanding a lot of CPU time. The below query is useful for this purpose (note, in its current form,  it only shows the top 10 records). SELECT TOP 10 st.text  ,st.dbid  ,st.objectid  ,qs.total_worker_time  ,qs.last_worker_time  ,qp.query_plan FROM sys.dm_exec_query_stats qs CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp ORDER BY qs.total_worker_time DESC This query as total_worker_time as the measure of CPU load and is in descending order of the  total_worker_time to show the most expensive queries and their plans at the top:      Note the BOL definitions for the important columns: total_worker_time - Total amount of CPU time, in microseconds, that was consumed by executions of this plan since it was compiled. last_worker_time - CPU time, in microseconds, that was consumed the last time the plan was executed.   I re-ran the same query again after few seconds and was returned the below output. After few seconds the SP dbo.TestProc1 is shown in fourth place and once again the last_worker_time is the highest. This means the procedure TestProc1 consumes a CPU time continuously each time it executes.      In this case, the primary cause for high CPU utilization was a stored procedure. You can view the execution plan by clicking on query_plan column to investigate why this is causing a high CPU load. I have used SQL Server 2008 (SP1) to test all the queries used in this article.

    Read the article

  • Speeding up procedural texture generation

    - by FalconNL
    Recently I've begun working on a game that takes place in a procedurally generated solar system. After a bit of a learning curve (having neither worked with Scala, OpenGL 2 ES or Libgdx before), I have a basic tech demo going where you spin around a single procedurally textured planet: The problem I'm running into is the performance of the texture generation. A quick overview of what I'm doing: a planet is a cube that has been deformed to a sphere. To each side, a n x n (e.g. 256 x 256) texture is applied, which are bundled in one 8n x n texture that is sent to the fragment shader. The last two spaces are not used, they're only there to make sure the width is a power of 2. The texture is currently generated on the CPU, using the updated 2012 version of the simplex noise algorithm linked to in the paper 'Simplex noise demystified'. The scene I'm using to test the algorithm contains two spheres: the planet and the background. Both use a greyscale texture consisting of six octaves of 3D simplex noise, so for example if we choose 128x128 as the texture size there are 128 x 128 x 6 x 2 x 6 = about 1.2 million calls to the noise function. The closest you will get to the planet is about what's shown in the screenshot and since the game's target resolution is 1280x720 that means I'd prefer to use 512x512 textures. Combine that with the fact the actual textures will of course be more complicated than basic noise (There will be a day and night texture, blended in the fragment shader based on sunlight, and a specular mask. I need noise for continents, terrain color variation, clouds, city lights, etc.) and we're looking at something like 512 x 512 x 6 x 3 x 15 = 70 million noise calls for the planet alone. In the final game, there will be activities when traveling between planets, so a wait of 5 or 10 seconds, possibly 20, would be acceptable since I can calculate the texture in the background while traveling, though obviously the faster the better. Getting back to our test scene, performance on my PC isn't too terrible, though still too slow considering the final result is going to be about 60 times worse: 128x128 : 0.1s 256x256 : 0.4s 512x512 : 1.7s This is after I moved all performance-critical code to Java, since trying to do so in Scala was a lot worse. Running this on my phone (a Samsung Galaxy S3), however, produces a more problematic result: 128x128 : 2s 256x256 : 7s 512x512 : 29s Already far too long, and that's not even factoring in the fact that it'll be minutes instead of seconds in the final version. Clearly something needs to be done. Personally, I see a few potential avenues, though I'm not particularly keen on any of them yet: Don't precalculate the textures, but let the fragment shader calculate everything. Probably not feasible, because at one point I had the background as a fullscreen quad with a pixel shader and I got about 1 fps on my phone. Use the GPU to render the texture once, store it and use the stored texture from then on. Upside: might be faster than doing it on the CPU since the GPU is supposed to be faster at floating point calculations. Downside: effects that cannot (easily) be expressed as functions of simplex noise (e.g. gas planet vortices, moon craters, etc.) are a lot more difficult to code in GLSL than in Scala/Java. Calculate a large amount of noise textures and ship them with the application. I'd like to avoid this if at all possible. Lower the resolution. Buys me a 4x performance gain, which isn't really enough plus I lose a lot of quality. Find a faster noise algorithm. If anyone has one I'm all ears, but simplex is already supposed to be faster than perlin. Adopt a pixel art style, allowing for lower resolution textures and fewer noise octaves. While I originally envisioned the game in this style, I've come to prefer the realistic approach. I'm doing something wrong and the performance should already be one or two orders of magnitude better. If this is the case, please let me know. If anyone has any suggestions, tips, workarounds, or other comments regarding this problem I'd love to hear them.

    Read the article

  • Why do I see a large performance hit with DRBD?

    - by BHS
    I see a much larger performance hit with DRBD than their user manual says I should get. I'm using DRBD 8.3.7 (Fedora 13 RPMs). I've setup a DRBD test and measured throughput of disk and network without DRBD: dd if=/dev/zero of=/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 4.62985 s, 116 MB/s / is a logical volume on the disk I'm testing with, mounted without DRBD iperf: [ 4] 0.0-10.0 sec 1.10 GBytes 941 Mbits/sec According to Throughput overhead expectations, the bottleneck would be whichever is slower, the network or the disk and DRBD should have an overhead of 3%. In my case network and I/O seem to be pretty evenly matched. It sounds like I should be able to get around 100 MB/s. So, with the raw drbd device, I get dd if=/dev/zero of=/dev/drbd2 bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 6.61362 s, 81.2 MB/s which is slower than I would expect. Then, once I format the device with ext4, I get dd if=/dev/zero of=/mnt/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 9.60918 s, 55.9 MB/s This doesn't seem right. There must be some other factor playing into this that I'm not aware of. global_common.conf global { usage-count yes; } common { protocol C; } syncer { al-extents 1801; rate 33M; } data_mirror.res resource data_mirror { device /dev/drbd1; disk /dev/sdb1; meta-disk internal; on cluster1 { address 192.168.33.10:7789; } on cluster2 { address 192.168.33.12:7789; } } For the hardware I have two identical machines: 6 GB RAM Quad core AMD Phenom 3.2Ghz Motherboard SATA controller 7200 RPM 64MB cache 1TB WD drive The network is 1Gb connected via a switch. I know that a direct connection is recommended, but could it make this much of a difference? Edited I just tried monitoring the bandwidth used to try to see what's happening. I used ibmonitor and measured average bandwidth while I ran the dd test 10 times. I got: avg ~450Mbits writing to ext4 avg ~800Mbits writing to raw device It looks like with ext4, drbd is using about half the bandwidth it uses with the raw device so there's a bottleneck that is not the network.

    Read the article

  • WebLogic Server Performance and Tuning: Part I - Tuning JVM

    - by Gokhan Gungor
    Each WebLogic Server instance runs in its own dedicated Java Virtual Machine (JVM) which is their runtime environment. Every Admin Server in any domain executes within a JVM. The same also applies for Managed Servers. WebLogic Server can be used for a wide variety of applications and services which uses the same runtime environment and resources. Oracle WebLogic ships with 2 different JVM, HotSpot and JRocket but you can choose which JVM you want to use. JVM is designed to optimize itself however it also provides some startup options to make small changes. There are default values for its memory and garbage collection. In real world, you will not want to stick with the default values provided by the JVM rather want to customize these values based on your applications which can produce large gains in performance by making small changes with the JVM parameters. We can tell the garbage collector how to delete garbage and we can also tell JVM how much space to allocate for each generation (of java Objects) or for heap. Remember during the garbage collection no other process is executed within the JVM or runtime, which is called STOP THE WORLD which can affect the overall throughput. Each JVM has its own memory segment called Heap Memory which is the storage for java Objects. These objects can be grouped based on their age like young generation (recently created objects) or old generation (surviving objects that have lived to some extent), etc. A java object is considered garbage when it can no longer be reached from anywhere in the running program. Each generation has its own memory segment within the heap. When this segment gets full, garbage collector deletes all the objects that are marked as garbage to create space. When the old generation space gets full, the JVM performs a major collection to remove the unused objects and reclaim their space. A major garbage collect takes a significant amount of time and can affect system performance. When we create a managed server either on the same machine or on remote machine it gets its initial startup parameters from $DOMAIN_HOME/bin/setDomainEnv.sh/cmd file. By default two parameters are set:     Xms: The initial heapsize     Xmx: The max heapsize Try to set equal initial and max heapsize. The startup time can be a little longer but for long running applications it will provide a better performance. When we set -Xms512m -Xmx1024m, the physical heap size will be 512m. This means that there are pages of memory (in the state of the 512m) that the JVM does not explicitly control. It will be controlled by OS which could be reserve for the other tasks. In this case, it is an advantage if the JVM claims the entire memory at once and try not to spend time to extend when more memory is needed. Also you can use -XX:MaxPermSize (Maximum size of the permanent generation) option for Sun JVM. You should adjust the size accordingly if your application dynamically load and unload a lot of classes in order to optimize the performance. You can set the JVM options/heap size from the following places:     Through the Admin console, in the Server start tab     In the startManagedWeblogic script for the managed servers     $DOMAIN_HOME/bin/startManagedWebLogic.sh/cmd     JAVA_OPTIONS="-Xms1024m -Xmx1024m" ${JAVA_OPTIONS}     In the setDomainEnv script for the managed servers and admin server (domain wide)     USER_MEM_ARGS="-Xms1024m -Xmx1024m" When there is free memory available in the heap but it is too fragmented and not contiguously located to store the object or when there is actually insufficient memory we can get java.lang.OutOfMemoryError. We should create Thread Dump and analyze if that is possible in case of such error. The second option we can use to produce higher throughput is to garbage collection. We can roughly divide GC algorithms into 2 categories: parallel and concurrent. Parallel GC stops the execution of all the application and performs the full GC, this generally provides better throughput but also high latency using all the CPU resources during GC. Concurrent GC on the other hand, produces low latency but also low throughput since it performs GC while application executes. The JRockit JVM provides some useful command-line parameters that to control of its GC scheme like -XgcPrio command-line parameter which takes the following options; XgcPrio:pausetime (To minimize latency, parallel GC) XgcPrio:throughput (To minimize throughput, concurrent GC ) XgcPrio:deterministic (To guarantee maximum pause time, for real time systems) Sun JVM has similar parameters (like  -XX:UseParallelGC or -XX:+UseConcMarkSweepGC) to control its GC scheme. We can add -verbosegc -XX:+PrintGCDetails to monitor indications of a problem with garbage collection. Try configuring JVM’s of all managed servers to execute in -server mode to ensure that it is optimized for a server-side production environment.

    Read the article

  • SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

    - by Pinal Dave
    In SQL Server 2014 Microsoft has introduced a new database engine component called In-Memory OLTP aka project “Hekaton” which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory resident data. In-memory OLTP helps us create memory optimized tables which in turn offer significant performance improvement for our typical OLTP workload. The main objective of memory optimized table is to ensure that highly transactional tables could live in memory and remain in memory forever without even losing out a single record. The most significant part is that it still supports majority of our Transact-SQL statement. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. This engine is designed to ensure higher concurrency and minimal blocking. In-Memory OLTP alleviates the issue of locking, using a new type of multi-version optimistic concurrency control. It also substantially reduces waiting for log writes by generating far less log data and needing fewer log writes. Points to remember Memory-optimized tables refer to tables using the new data structures and key words added as part of In-Memory OLTP. Disk-based tables refer to your normal tables which we used to create in SQL Server since its inception. These tables use a fixed size 8 KB pages that need to be read from and written to disk as a unit. Natively compiled stored procedures refer to an object Type which is new and is supported by in-memory OLTP engine which convert it into machine code, which can further improve the data access performance for memory –optimized tables. Natively compiled stored procedures can only reference memory-optimized tables, they can’t be used to reference any disk –based table. Interpreted Transact-SQL stored procedures, which is what SQL Server has always used. Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables. Interop refers to interpreted Transact-SQL that references memory-optimized tables. Using In-Memory OLTP In-Memory OLTP engine has been available as part of SQL Server 2014 since June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014 hence they are not available with 32-bit editions. Creating Databases Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables, and although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables: CREATE DATABASE InMemoryDB ON PRIMARY(NAME = [InMemoryDB_data], FILENAME = 'D:\data\InMemoryDB_data.mdf', size=500MB), FILEGROUP [SampleDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [InMemoryDB_mod_dir], FILENAME = 'S:\data\InMemoryDB_mod_dir'), (NAME = [InMemoryDB_mod_dir], FILENAME = 'R:\data\InMemoryDB_mod_dir') LOG ON (name = [SampleDB_log], Filename='L:\log\InMemoryDB_log.ldf', size=500MB) COLLATE Latin1_General_100_BIN2; Above example code creates files on three different drives (D:  S: and R:) for the data files and in memory storage so if you would like to run this code kindly change the drive and folder locations as per your convenience. Also notice that binary collation was specified as Windows (non-SQL). BIN2 collation is the only collation support at this point for any indexes on memory optimized tables. It is also possible to add a MEMORY_OPTIMIZED_DATA file group to an existing database, use the below command to achieve the same. ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA; GO ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod; GO Creating Tables There is no major syntactical difference between creating a disk based table or a memory –optimized table but yes there are a few restrictions and a few new essential extensions. Essentially any memory-optimized table should use the MEMORY_OPTIMIZED = ON clause as shown in the Create Table query example. DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY) Memory-optimized table should always be defined with a DURABILITY value which can be either SCHEMA_AND_DATA or  SCHEMA_ONLY the former being the default. A memory-optimized table defined with DURABILITY=SCHEMA_ONLY will not persist the data to disk which means the data durability is compromised whereas DURABILITY= SCHEMA_AND_DATA ensures that data is also persisted along with the schema. Indexing Memory Optimized Table A memory-optimized table must always have an index for all tables created with DURABILITY= SCHEMA_AND_DATA and this can be achieved by declaring a PRIMARY KEY Constraint at the time of creating a table. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified. CREATE TABLE Mem_Table ( [Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), [City] VARCHAR(32) NULL, [State_Province] VARCHAR(32) NULL, [LastModified] DATETIME NOT NULL, ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); Now as you can see in the above query example we have used the clause MEMORY_OPTIMIZED = ON to make sure that it is considered as a memory optimized table and not just a normal table and also used the DURABILITY Clause= SCHEMA_AND_DATA which means it will persist data along with metadata and also you can notice this table has a PRIMARY KEY mentioned upfront which is also a mandatory clause for memory-optimized tables. We will talk more about HASH Indexes and BUCKET_COUNT in later articles on this topic which will be focusing more on Row and Index storage on Memory-Optimized tables. So stay tuned for that as well. Now as we covered the basics of Memory Optimized tables and understood the key things to remember while using memory optimized tables, let’s explore more using examples to understand the Performance gains using memory-optimized tables. I will be using the database which i created earlier in this article i.e. InMemoryDB in the below Demo Exercise. USE InMemoryDB GO -- Creating a disk based table CREATE TABLE dbo.Disktable ( Id INT IDENTITY, Name CHAR(40) ) GO CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id) GO -- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA CREATE TABLE dbo.Memorytable_durable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO -- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only CREATE TABLE dbo.Memorytable_nondurable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_only) GO -- Now insert 100000 records in dbo.Disktable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END The above 3 Inserts took 1.20 minutes, 54 secs, and 2 secs respectively to insert 100000 records on my machine with 8 Gb RAM. This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional business table and memory- optimized tables with Durability SCHEMA_ONLY is even faster as it does not bother persisting its data to disk which makes it supremely fast. Koenig Solutions is one of the few organizations which offer IT training on SQL Server 2014 and all its updates. Now, I leave the decision on using memory_Optimized tables on you, I hope you like this article and it helped you understand  the fundamentals of IN-Memory OLTP . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Koenig

    Read the article

  • Understanding LINQ to SQL (11) Performance

    - by Dixin
    [LINQ via C# series] LINQ to SQL has a lot of great features like strong typing query compilation deferred execution declarative paradigm etc., which are very productive. Of course, these cannot be free, and one price is the performance. O/R mapping overhead Because LINQ to SQL is based on O/R mapping, one obvious overhead is, data changing usually requires data retrieving:private static void UpdateProductUnitPrice(int id, decimal unitPrice) { using (NorthwindDataContext database = new NorthwindDataContext()) { Product product = database.Products.Single(item => item.ProductID == id); // SELECT... product.UnitPrice = unitPrice; // UPDATE... database.SubmitChanges(); } } Before updating an entity, that entity has to be retrieved by an extra SELECT query. This is slower than direct data update via ADO.NET:private static void UpdateProductUnitPrice(int id, decimal unitPrice) { using (SqlConnection connection = new SqlConnection( "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=True")) using (SqlCommand command = new SqlCommand( @"UPDATE [dbo].[Products] SET [UnitPrice] = @UnitPrice WHERE [ProductID] = @ProductID", connection)) { command.Parameters.Add("@ProductID", SqlDbType.Int).Value = id; command.Parameters.Add("@UnitPrice", SqlDbType.Money).Value = unitPrice; connection.Open(); command.Transaction = connection.BeginTransaction(); command.ExecuteNonQuery(); // UPDATE... command.Transaction.Commit(); } } The above imperative code specifies the “how to do” details with better performance. For the same reason, some articles from Internet insist that, when updating data via LINQ to SQL, the above declarative code should be replaced by:private static void UpdateProductUnitPrice(int id, decimal unitPrice) { using (NorthwindDataContext database = new NorthwindDataContext()) { database.ExecuteCommand( "UPDATE [dbo].[Products] SET [UnitPrice] = {0} WHERE [ProductID] = {1}", id, unitPrice); } } Or just create a stored procedure:CREATE PROCEDURE [dbo].[UpdateProductUnitPrice] ( @ProductID INT, @UnitPrice MONEY ) AS BEGIN BEGIN TRANSACTION UPDATE [dbo].[Products] SET [UnitPrice] = @UnitPrice WHERE [ProductID] = @ProductID COMMIT TRANSACTION END and map it as a method of NorthwindDataContext (explained in this post):private static void UpdateProductUnitPrice(int id, decimal unitPrice) { using (NorthwindDataContext database = new NorthwindDataContext()) { database.UpdateProductUnitPrice(id, unitPrice); } } As a normal trade off for O/R mapping, a decision has to be made between performance overhead and programming productivity according to the case. In a developer’s perspective, if O/R mapping is chosen, I consistently choose the declarative LINQ code, unless this kind of overhead is unacceptable. Data retrieving overhead After talking about the O/R mapping specific issue. Now look into the LINQ to SQL specific issues, for example, performance in the data retrieving process. The previous post has explained that the SQL translating and executing is complex. Actually, the LINQ to SQL pipeline is similar to the compiler pipeline. It consists of about 15 steps to translate an C# expression tree to SQL statement, which can be categorized as: Convert: Invoke SqlProvider.BuildQuery() to convert the tree of Expression nodes into a tree of SqlNode nodes; Bind: Used visitor pattern to figure out the meanings of names according to the mapping info, like a property for a column, etc.; Flatten: Figure out the hierarchy of the query; Rewrite: for SQL Server 2000, if needed Reduce: Remove the unnecessary information from the tree. Parameterize Format: Generate the SQL statement string; Parameterize: Figure out the parameters, for example, a reference to a local variable should be a parameter in SQL; Materialize: Executes the reader and convert the result back into typed objects. So for each data retrieving, even for data retrieving which looks simple: private static Product[] RetrieveProducts(int productId) { using (NorthwindDataContext database = new NorthwindDataContext()) { return database.Products.Where(product => product.ProductID == productId) .ToArray(); } } LINQ to SQL goes through above steps to translate and execute the query. Fortunately, there is a built-in way to cache the translated query. Compiled query When such a LINQ to SQL query is executed repeatedly, The CompiledQuery can be used to translate query for one time, and execute for multiple times:internal static class CompiledQueries { private static readonly Func<NorthwindDataContext, int, Product[]> _retrieveProducts = CompiledQuery.Compile((NorthwindDataContext database, int productId) => database.Products.Where(product => product.ProductID == productId).ToArray()); internal static Product[] RetrieveProducts( this NorthwindDataContext database, int productId) { return _retrieveProducts(database, productId); } } The new version of RetrieveProducts() gets better performance, because only when _retrieveProducts is first time invoked, it internally invokes SqlProvider.Compile() to translate the query expression. And it also uses lock to make sure translating once in multi-threading scenarios. Static SQL / stored procedures without translating Another way to avoid the translating overhead is to use static SQL or stored procedures, just as the above examples. Because this is a functional programming series, this article not dive into. For the details, Scott Guthrie already has some excellent articles: LINQ to SQL (Part 6: Retrieving Data Using Stored Procedures) LINQ to SQL (Part 7: Updating our Database using Stored Procedures) LINQ to SQL (Part 8: Executing Custom SQL Expressions) Data changing overhead By looking into the data updating process, it also needs a lot of work: Begins transaction Processes the changes (ChangeProcessor) Walks through the objects to identify the changes Determines the order of the changes Executes the changings LINQ queries may be needed to execute the changings, like the first example in this article, an object needs to be retrieved before changed, then the above whole process of data retrieving will be went through If there is user customization, it will be executed, for example, a table’s INSERT / UPDATE / DELETE can be customized in the O/R designer It is important to keep these overhead in mind. Bulk deleting / updating Another thing to be aware is the bulk deleting:private static void DeleteProducts(int categoryId) { using (NorthwindDataContext database = new NorthwindDataContext()) { database.Products.DeleteAllOnSubmit( database.Products.Where(product => product.CategoryID == categoryId)); database.SubmitChanges(); } } The expected SQL should be like:BEGIN TRANSACTION exec sp_executesql N'DELETE FROM [dbo].[Products] AS [t0] WHERE [t0].[CategoryID] = @p0',N'@p0 int',@p0=9 COMMIT TRANSACTION Hoverer, as fore mentioned, the actual SQL is to retrieving the entities, and then delete them one by one:-- Retrieves the entities to be deleted: exec sp_executesql N'SELECT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued] FROM [dbo].[Products] AS [t0] WHERE [t0].[CategoryID] = @p0',N'@p0 int',@p0=9 -- Deletes the retrieved entities one by one: BEGIN TRANSACTION exec sp_executesql N'DELETE FROM [dbo].[Products] WHERE ([ProductID] = @p0) AND ([ProductName] = @p1) AND ([SupplierID] IS NULL) AND ([CategoryID] = @p2) AND ([QuantityPerUnit] IS NULL) AND ([UnitPrice] = @p3) AND ([UnitsInStock] = @p4) AND ([UnitsOnOrder] = @p5) AND ([ReorderLevel] = @p6) AND (NOT ([Discontinued] = 1))',N'@p0 int,@p1 nvarchar(4000),@p2 int,@p3 money,@p4 smallint,@p5 smallint,@p6 smallint',@p0=78,@p1=N'Optimus Prime',@p2=9,@p3=$0.0000,@p4=0,@p5=0,@p6=0 exec sp_executesql N'DELETE FROM [dbo].[Products] WHERE ([ProductID] = @p0) AND ([ProductName] = @p1) AND ([SupplierID] IS NULL) AND ([CategoryID] = @p2) AND ([QuantityPerUnit] IS NULL) AND ([UnitPrice] = @p3) AND ([UnitsInStock] = @p4) AND ([UnitsOnOrder] = @p5) AND ([ReorderLevel] = @p6) AND (NOT ([Discontinued] = 1))',N'@p0 int,@p1 nvarchar(4000),@p2 int,@p3 money,@p4 smallint,@p5 smallint,@p6 smallint',@p0=79,@p1=N'Bumble Bee',@p2=9,@p3=$0.0000,@p4=0,@p5=0,@p6=0 -- ... COMMIT TRANSACTION And the same to the bulk updating. This is really not effective and need to be aware. Here is already some solutions from the Internet, like this one. The idea is wrap the above SELECT statement into a INNER JOIN:exec sp_executesql N'DELETE [dbo].[Products] FROM [dbo].[Products] AS [j0] INNER JOIN ( SELECT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued] FROM [dbo].[Products] AS [t0] WHERE [t0].[CategoryID] = @p0) AS [j1] ON ([j0].[ProductID] = [j1].[[Products])', -- The Primary Key N'@p0 int',@p0=9 Query plan overhead The last thing is about the SQL Server query plan. Before .NET 4.0, LINQ to SQL has an issue (not sure if it is a bug). LINQ to SQL internally uses ADO.NET, but it does not set the SqlParameter.Size for a variable-length argument, like argument of NVARCHAR type, etc. So for two queries with the same SQL but different argument length:using (NorthwindDataContext database = new NorthwindDataContext()) { database.Products.Where(product => product.ProductName == "A") .Select(product => product.ProductID).ToArray(); // The same SQL and argument type, different argument length. database.Products.Where(product => product.ProductName == "AA") .Select(product => product.ProductID).ToArray(); } Pay attention to the argument length in the translated SQL:exec sp_executesql N'SELECT [t0].[ProductID] FROM [dbo].[Products] AS [t0] WHERE [t0].[ProductName] = @p0',N'@p0 nvarchar(1)',@p0=N'A' exec sp_executesql N'SELECT [t0].[ProductID] FROM [dbo].[Products] AS [t0] WHERE [t0].[ProductName] = @p0',N'@p0 nvarchar(2)',@p0=N'AA' Here is the overhead: The first query’s query plan cache is not reused by the second one:SELECT sys.syscacheobjects.cacheobjtype, sys.dm_exec_cached_plans.usecounts, sys.syscacheobjects.[sql] FROM sys.syscacheobjects INNER JOIN sys.dm_exec_cached_plans ON sys.syscacheobjects.bucketid = sys.dm_exec_cached_plans.bucketid; They actually use different query plans. Again, pay attention to the argument length in the [sql] column (@p0 nvarchar(2) / @p0 nvarchar(1)). Fortunately, in .NET 4.0 this is fixed:internal static class SqlTypeSystem { private abstract class ProviderBase : TypeSystemProvider { protected int? GetLargestDeclarableSize(SqlType declaredType) { SqlDbType sqlDbType = declaredType.SqlDbType; if (sqlDbType <= SqlDbType.Image) { switch (sqlDbType) { case SqlDbType.Binary: case SqlDbType.Image: return 8000; } return null; } if (sqlDbType == SqlDbType.NVarChar) { return 4000; // Max length for NVARCHAR. } if (sqlDbType != SqlDbType.VarChar) { return null; } return 8000; } } } In this above example, the translated SQL becomes:exec sp_executesql N'SELECT [t0].[ProductID] FROM [dbo].[Products] AS [t0] WHERE [t0].[ProductName] = @p0',N'@p0 nvarchar(4000)',@p0=N'A' exec sp_executesql N'SELECT [t0].[ProductID] FROM [dbo].[Products] AS [t0] WHERE [t0].[ProductName] = @p0',N'@p0 nvarchar(4000)',@p0=N'AA' So that they reuses the same query plan cache: Now the [usecounts] column is 2.

    Read the article

  • Live CD / Live USB much faster than full install

    - by user29347
    I've observed it on both laptops I own! HP Compaq nx6125 and Ubuntu 11.04 x64 - somewhat solved Lenovo Thinkpad T500 and Ubuntu 11.10 x64 - help needed! I'm still struggling with the Thinkpad to get performance level similar to that of 10 y.o. laptops... All in all a really serious issue with multiple versions of Ubuntu that renders computers with perfectly compatible hardware unusable, as far as out of the box experience is concerned. Troubleshooting resultant issues seems to be a hard case even for users with some experience with installing graphics drivers. EDIT: I can't really post additional details. Two different ubuntu versions, two laptops, two different set of graph. drivers (OS vs ATI prop.) - all with the same symptoms. Also I can't stress enough how massive the performance degradation is compared to a healthy system. For that reason I ask for input from people who may know roughly what are we dealing with here. I can post more details if we were to focus on my current Thinkpad T500. In that case my current system details: Lenovo Thinkpad T500 Ubuntu 11.10 x64 ATI Mobility Radeon HD 3650 (also see the "What I have already tried" section about Intel graphics tested) ATI Catalyst 11.10 drivers OCZ Agility 3 SSD but! same with the default driver for ATI the card same with the prop. driver for the ATI card from Jockey (Additional drivers applet) What I have already tried: 0. Switching to Intel integrated card (Intel GMA 4500M HD) with the default driver - same effects = may indicate not driver related problem but a problem with something of global influence like e.g. nomodeset or other I don't even know about. (What you can read above) ATI Catalyst 11.10 and radeon.modeset=0 boot parameter + disabled Wait for VBlank. Unity 2D Ubuntu 10.04 LTS tested (ubuntu-10.04.3-desktop-i386.iso): Both live USB and installed version blazing fast! (on the default drivers - without even installing the proprietary fglrx drivers). re2 a) seems to give me the only significant results (still poor) - perfect Unity elements performance with the same crawling stuttering/lagging when dragging windows around. re2 b) this happens often http://i17.photobucket.com/albums/b68/Bucic/ubuntuforumsorg/Screenshotat2011-10-28083140.png re2 c) Sometimes I am able to witness a normal performance when dragging a window around but only for a second or two. When I try to shake it longer it starts to lag and it will keep lagging like that with an increased probability of what you see in the sshot in point re2 b). re2 d) I can't establish the radeon.modeset=0 influence though. Once it seems to work be smooth with it, the other time - without it. Really can't tell.

    Read the article

  • ATG Live Webcast March 29: Diagnosing E-Business Suite JVM and Forms Performance Issues (Performance Series Part 4 of 4)

    - by BillSawyer
    The next webcast in our popular EBS series on performance management is going to be a showstopper.  Dave Suri, Project Lead, Applications Performance and Gustavo Jimenez, Senior Development Manager will discuss some of the steps involved in triaging and diagnosing E-Business Suite systems related to JVM and Forms components. Please join us for our next ATG Live Webcast on Mar. 29, 2012: Triage and Diagnostics for E-Business Suite JVM and Forms The topics covered in this webcast will be: Overall Menu/Sections Architecture Patches/Certified browsers/jdk versions JVM Tuning JVM Tools (jstat,eclipse mat, ibm tda) Forms Tools (strace/FRD) Java Concurrent Program options location Case studies Case Studies JVM Thread dump case for Oracle Advanced Product Catalog Forms FRD trace relating to Saving an SR Java Concurrent Program for BT Date:               Thursday, March 29, 2012Time:              8:00 AM - 9:00 AM Pacific Standard TimePresenters:  Dave Suri, Project Lead, Applications Performance                        Gustavo Jimenez, Senior Development ManagerWebcast Registration Link (Preregistration is optional but encouraged)To hear the audio feed:   Domestic Participant Dial-In Number:            877-697-8128    International Participant Dial-In Number:      706-634-9568    Additional International Dial-In Numbers Link:    Dial-In Passcode:                                              99342To see the presentation:    The Direct Access Web Conference details are:    Website URL: https://ouweb.webex.com    Meeting Number:  597073984 If you miss the webcast, or you have missed any webcast, don't worry -- we'll post links to the recording as soon as it's available from Oracle University.  You can monitor this blog for pointers to the replay. And, you can find our archive of our past webcasts and training here.If you have any questions or comments, feel free to email Bill Sawyer (Senior Manager, Applications Technology Curriculum) at BilldotSawyer-AT-Oracle-DOT-com. 

    Read the article

  • Best Practices of Performance Management Plan (PMP)

    - by Robert Story
    Upcoming WebcastTitle: Best Practices of Performance Management Plan (PMP)Date: April 22, 2010Time: 11 AM EST / 8 AM PST / 8.30 PM IST  Product Family: EBS HRMS SummaryThis webcast will cover the best practices of Performance Management Plan(PMP) in very common scenarios. The best practices will address major issues around plan dates, new hire, manager transfer and related events. The session will also cover HRMS Patching Strategy, Key References and various customer communication channels.A short, live demonstration (only if applicable) and question and answer period will be included.Click here to register for this session....... ....... ....... ....... ....... ....... .......The above webcast is a service of the E-Business Suite Communities in My Oracle Support.For more information on other webcasts, please reference the Oracle Advisor Webcast Schedule.Click here to visit the E-Business Communities in My Oracle Support Note that all links require access to My Oracle Support.

    Read the article

< Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >