Search Results

Search found 13403 results on 537 pages for 'epm performance tuning'.

Page 113/537 | < Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >

  • Network throughput issue (ARP-related)

    - by Joel Coel
    The small college where I work is having some very strange network issues. I'm looking for any advice or ideas here. We were fine over the summer, but the trouble began few days after students returned to campus in force for the fall term. Symptoms The main symptom is that internet access will work, but it's very slow... often to the point of timeouts. As an example, a typical result from Speedtest.net will return .4Mbps download, but allow 3 to 8 Mbps upload speed. Lesser symptoms may include severely limited performance transferring data to and from our file server, or even in some cases the inability to log in to the computer (cannot reach the domain controller). The issue crosses multiple vlans, and has effected devices on nearly every vlan we operate. The issue does not impact all machines on the network. An unaffected machine will typically see at least 11Mbps download from speedtest.net, and perhaps much more depending on larger campus traffic patterns at the time. There is one variation on the larger issue. We have one vlan where users were unable to log into nearly all of the machines at all. IT staff would log in using a local administrator account (or in some cases cached credentials), and from there a release/renew or pinging the gateway would allow the machine to work... for a while. Complicating this issue is that this vlan covers our computer labs, which use software called Deep Freeze to completely reset the hard drives after a reboot. It could just the same issue manifesting differently because of stale data on machines that have not permanently altered low-level info for weeks. We were able to solve this, however, by creating a new vlan and moving the labs over to the new vlan wholesale. Instigations Eventually we noticed that the effected machines all had recent dhcp leases. We can predict when a machine will become "slow" by watching when a dhcp lease comes up for renewal. We played with setting the lease time very short for a test vlan, but all that did was remove our ability to predict when the machine would become slow. Machines with static IPs have pretty much always worked normally. Manually releasing/renewing an address will never cause a machine to become slow. In fact, in some cases this process has fixed a machine in that state. Most of the time, though, it doesn't help. We also noticed that mobile machines like laptops are likely to become slow when they cross to new vlans. Wireless on campus is divided up into "zones", where each zone maps to a small set of buildings. Moving to a new building can place you in a zone, thereby causing you to get a new address. A machine resuming from sleep mode is also very likely to be slow. Mitigations Sometimes, but not always, clearing the arp cache on an effected machine will allow it to work normally again. As already mentioned, releasing/renewing a local machine's IP address can fix that machine, but it's not guaranteed. Pinging the default gateway can also sometimes help with a slow machine. What seems to help most to mitigate the issue is clearing the arp cache on our core layer-3 switch. This switch is used for our dhcp system as the default gateway on all vlans, and it handles inter-vlan routing. The model is a 3Com 4900SX. To try to mitigate the issue, we have the cache timeout set on the switch all the way down to the lowest possible time, but it hasn't helped. I also put together a script that runs every few minutes to automatically connect to the switch and reset the cache. Unfortunately, this does not always work, and can even cause some machines to end up in the slow state for a short time (though these seem to correct themselves after a few minutes). We currently have a scheduled job that runs every 10 minutes to force the core switch to clear it's ARP cache, but this is far from perfect or desirable. Reproduction We now have a test machine that we can force into the slow state at will. It is connected to a switch with ports set up for each of our vlans. We make the machine slow by connecting to different vlans, and after a new connection or two it will be slow. It's also worth noting in this section that this has happened before at the start of prior terms, but in the past the problem has gone away on it's own after a few days. It solved itself before we had a chance to do much diagnostic work... hence why we've allowed it to drag so long into the term this time 'round; the expectation was this would be a short-lived situation. Other Factors It's worth mentioning that we have had about half a dozen switches just outright fail over the last year. These are mainly 2003/2004-era 3Coms (mostly 4200's) that were all put in at about the same time. They should still be covered under warranty, buy HP has made getting service somewhat difficult. Mostly in power supplies that have failed, but in a couple cases we have used a power supply from a switch with a failed mainboard to bring a switch with a failed power supply back to life. We do have UPS devices on all but three of four switches now, but that was not the case when I started two and a half years ago. Severe budget constraints (we were on the Dept. of Ed's financially challenged institutions list a couple years back) have forced me to look to the likes of Netgear and TrendNet for replacements, but so far these low-end models seem to be holding their own. It's also worth mentioning that the big change on our network this summer was migrating from a single cross-campus wireless SSID to the zoned approach mentioned earlier. I don't think this is the source of the issue, as like I've said: we've seen this before. However, it's possible this is exacerbating the issue, and may be much of the reason it's been so hard to isolate. Diagnosis At first it seemed clear to us, given the timing and persistent nature of the problem, that the source of the issue was an infected (or malicious) student machine doing ARP cache poisoning. However, repeated attempts to isolate the source have failed. Those attempts include numerous wireshark packet traces, and even taking entire buildings offline for brief periods. We have not been able even to find a smoking gun bad ARP entry. My current best guess is an overloaded or failing core switch, but I'm not sure on how to test for this, and the cost of replacing it blindly is steep. Again, any ideas appreciated.

    Read the article

  • VS2012 Launch Event &ndash; Combating Bugs And Poor Performance In Production

    - by Tarun Arora
    I presented a session “A techies guide to combating bugs & poor performance in production” at the Microsoft IT Visual Studio Launch event.  The key message was to demonstrate what common production issues (non-reproducible bugs and poor performance) techie’s run into and how the tooling in Visual Studio can help you efficiently tackle these issues. Remember, a Techie without efficient tools is only half the good!                                                       A techies guide to combating bugs & poor performance in production from Avanade Enjoy!

    Read the article

  • Advantage Database Server: slow stored procedure performance.

    - by ie
    I have a question about a performance of stored procedures in the ADS. I created a simple database with the following structure: CREATE TABLE MainTable ( Id INTEGER PRIMARY KEY, Name VARCHAR(50), Value INTEGER ); CREATE UNIQUE INDEX MainTableName_UIX ON MainTable ( Name ); CREATE TABLE SubTable ( Id INTEGER PRIMARY KEY, MainId INTEGER, Name VARCHAR(50), Value INTEGER ); CREATE INDEX SubTableMainId_UIX ON SubTable ( MainId ); CREATE UNIQUE INDEX SubTableName_UIX ON SubTable ( Name ); CREATE PROCEDURE CreateItems ( MainName VARCHAR ( 20 ), SubName VARCHAR ( 20 ), MainValue INTEGER, SubValue INTEGER, MainId INTEGER OUTPUT, SubId INTEGER OUTPUT ) BEGIN DECLARE @MainName VARCHAR ( 20 ); DECLARE @SubName VARCHAR ( 20 ); DECLARE @MainValue INTEGER; DECLARE @SubValue INTEGER; DECLARE @MainId INTEGER; DECLARE @SubId INTEGER; @MainName = (SELECT MainName FROM __input); @SubName = (SELECT SubName FROM __input); @MainValue = (SELECT MainValue FROM __input); @SubValue = (SELECT SubValue FROM __input); @MainId = (SELECT MAX(Id)+1 FROM MainTable); @SubId = (SELECT MAX(Id)+1 FROM SubTable ); INSERT INTO MainTable (Id, Name, Value) VALUES (@MainId, @MainName, @MainValue); INSERT INTO SubTable (Id, Name, MainId, Value) VALUES (@SubId, @SubName, @MainId, @SubValue); INSERT INTO __output SELECT @MainId, @SubId FROM system.iota; END; CREATE PROCEDURE UpdateItems ( MainName VARCHAR ( 20 ), MainValue INTEGER, SubValue INTEGER ) BEGIN DECLARE @MainName VARCHAR ( 20 ); DECLARE @MainValue INTEGER; DECLARE @SubValue INTEGER; DECLARE @MainId INTEGER; @MainName = (SELECT MainName FROM __input); @MainValue = (SELECT MainValue FROM __input); @SubValue = (SELECT SubValue FROM __input); @MainId = (SELECT TOP 1 Id FROM MainTable WHERE Name = @MainName); UPDATE MainTable SET Value = @MainValue WHERE Id = @MainId; UPDATE SubTable SET Value = @SubValue WHERE MainId = @MainId; END; CREATE PROCEDURE SelectItems ( MainName VARCHAR ( 20 ), CalculatedValue INTEGER OUTPUT ) BEGIN DECLARE @MainName VARCHAR ( 20 ); @MainName = (SELECT MainName FROM __input); INSERT INTO __output SELECT m.Value * s.Value FROM MainTable m INNER JOIN SubTable s ON m.Id = s.MainId WHERE m.Name = @MainName; END; CREATE PROCEDURE DeleteItems ( MainName VARCHAR ( 20 ) ) BEGIN DECLARE @MainName VARCHAR ( 20 ); DECLARE @MainId INTEGER; @MainName = (SELECT MainName FROM __input); @MainId = (SELECT TOP 1 Id FROM MainTable WHERE Name = @MainName); DELETE FROM SubTable WHERE MainId = @MainId; DELETE FROM MainTable WHERE Id = @MainId; END; Actually, the problem I had - even so light stored procedures work very-very slow (about 50-150 ms) relatively to plain queries (0-5ms). To test the performance, I created a simple test (in F# using ADS ADO.NET provider): open System; open System.Data; open System.Diagnostics; open Advantage.Data.Provider; let mainName = "main name #"; let subName = "sub name #"; // INSERT let cmdTextScriptInsert = " DECLARE @MainId INTEGER; DECLARE @SubId INTEGER; @MainId = (SELECT MAX(Id)+1 FROM MainTable); @SubId = (SELECT MAX(Id)+1 FROM SubTable ); INSERT INTO MainTable (Id, Name, Value) VALUES (@MainId, :MainName, :MainValue); INSERT INTO SubTable (Id, Name, MainId, Value) VALUES (@SubId, :SubName, @MainId, :SubValue); SELECT @MainId, @SubId FROM system.iota;"; let cmdTextProcedureInsert = "CreateItems"; // UPDATE let cmdTextScriptUpdate = " DECLARE @MainId INTEGER; @MainId = (SELECT TOP 1 Id FROM MainTable WHERE Name = :MainName); UPDATE MainTable SET Value = :MainValue WHERE Id = @MainId; UPDATE SubTable SET Value = :SubValue WHERE MainId = @MainId;"; let cmdTextProcedureUpdate = "UpdateItems"; // SELECT let cmdTextScriptSelect = " SELECT m.Value * s.Value FROM MainTable m INNER JOIN SubTable s ON m.Id = s.MainId WHERE m.Name = :MainName;"; let cmdTextProcedureSelect = "SelectItems"; // DELETE let cmdTextScriptDelete = " DECLARE @MainId INTEGER; @MainId = (SELECT TOP 1 Id FROM MainTable WHERE Name = :MainName); DELETE FROM SubTable WHERE MainId = @MainId; DELETE FROM MainTable WHERE Id = @MainId;"; let cmdTextProcedureDelete = "DeleteItems"; let cnnStr = @"data source=D:\DB\test.add; ServerType=local; user id=adssys; password=***;"; let cnn = new AdsConnection(cnnStr); try cnn.Open(); let cmd = cnn.CreateCommand(); let parametrize ix prms = cmd.Parameters.Clear(); let addParam = function | "MainName" -> cmd.Parameters.Add(":MainName" , mainName + ix.ToString()) |> ignore; | "SubName" -> cmd.Parameters.Add(":SubName" , subName + ix.ToString() ) |> ignore; | "MainValue" -> cmd.Parameters.Add(":MainValue", ix * 3 ) |> ignore; | "SubValue" -> cmd.Parameters.Add(":SubValue" , ix * 7 ) |> ignore; | _ -> () prms |> List.iter addParam; let runTest testData = let (cmdType, cmdName, cmdText, cmdParams) = testData; let toPrefix cmdType cmdName = let prefix = match cmdType with | CommandType.StoredProcedure -> "Procedure-" | CommandType.Text -> "Script -" | _ -> "Unknown -" in prefix + cmdName; let stopWatch = new Stopwatch(); let runStep ix prms = parametrize ix prms; stopWatch.Start(); cmd.ExecuteNonQuery() |> ignore; stopWatch.Stop(); cmd.CommandText <- cmdText; cmd.CommandType <- cmdType; let startId = 1500; let count = 10; for id in startId .. startId+count do runStep id cmdParams; let elapsed = stopWatch.Elapsed; Console.WriteLine("Test '{0}' - total: {1}; per call: {2}ms", toPrefix cmdType cmdName, elapsed, Convert.ToInt32(elapsed.TotalMilliseconds)/count); let lst = [ (CommandType.Text, "Insert", cmdTextScriptInsert, ["MainName"; "SubName"; "MainValue"; "SubValue"]); (CommandType.Text, "Update", cmdTextScriptUpdate, ["MainName"; "MainValue"; "SubValue"]); (CommandType.Text, "Select", cmdTextScriptSelect, ["MainName"]); (CommandType.Text, "Delete", cmdTextScriptDelete, ["MainName"]) (CommandType.StoredProcedure, "Insert", cmdTextProcedureInsert, ["MainName"; "SubName"; "MainValue"; "SubValue"]); (CommandType.StoredProcedure, "Update", cmdTextProcedureUpdate, ["MainName"; "MainValue"; "SubValue"]); (CommandType.StoredProcedure, "Select", cmdTextProcedureSelect, ["MainName"]); (CommandType.StoredProcedure, "Delete", cmdTextProcedureDelete, ["MainName"])]; lst |> List.iter runTest; finally cnn.Close(); And I'm getting the following results: Test 'Script -Insert' - total: 00:00:00.0292841; per call: 2ms Test 'Script -Update' - total: 00:00:00.0056296; per call: 0ms Test 'Script -Select' - total: 00:00:00.0051738; per call: 0ms Test 'Script -Delete' - total: 00:00:00.0059258; per call: 0ms Test 'Procedure-Insert' - total: 00:00:01.2567146; per call: 125ms Test 'Procedure-Update' - total: 00:00:00.7442440; per call: 74ms Test 'Procedure-Select' - total: 00:00:00.5120446; per call: 51ms Test 'Procedure-Delete' - total: 00:00:01.0619165; per call: 106ms The situation with the remote server is much better, but still a great gap between plaqin queries and stored procedures: Test 'Script -Insert' - total: 00:00:00.0709299; per call: 7ms Test 'Script -Update' - total: 00:00:00.0161777; per call: 1ms Test 'Script -Select' - total: 00:00:00.0258113; per call: 2ms Test 'Script -Delete' - total: 00:00:00.0166242; per call: 1ms Test 'Procedure-Insert' - total: 00:00:00.5116138; per call: 51ms Test 'Procedure-Update' - total: 00:00:00.3802251; per call: 38ms Test 'Procedure-Select' - total: 00:00:00.1241245; per call: 12ms Test 'Procedure-Delete' - total: 00:00:00.4336334; per call: 43ms Is it any chance to improve the SP performance? Please advice. ADO.NET driver version - 9.10.2.9 Server version - 9.10.0.9 (ANSI - GERMAN, OEM - GERMAN) Thanks!

    Read the article

  • C++ custom exceptions: run time performance and passing exceptions from C++ to C

    - by skyeagle
    I am writing a custom C++ exception class (so I can pass exceptions occuring in C++ to another language via a C API). My initial plan of attack was to proceed as follows: //C++ myClass { public: myClass(); ~myClass(); void foo() // throws myException int foo(const int i, const bool b) // throws myException } * myClassPtr; // C API #ifdef __cplusplus extern "C" { #endif myClassPtr MyClass_New(); void MyClass_Destroy(myClassPtr p); void MyClass_Foo(myClassPtr p); int MyClass_FooBar(myClassPtr p, int i, bool b); #ifdef __cplusplus }; #endif I need a way to be able to pass exceptions thrown in the C++ code to the C side. The information I want to pass to the C side is the following: (a). What (b). Where (c). Simple Stack Trace (just the sequence of error messages in order they occured, no debugging info etc) I want to modify my C API, so that the API functions take a pointer to a struct ExceptionInfo, which will contain any exception info (if an exception occured) before consuming the results of the invocation. This raises two questions: Question 1 1. Implementation of each of the C++ methods exposed in the C API needs to be enclosed in a try/catch statement. The performance implications for this seem quite serious (according to this article): "It is a mistake (with high runtime cost) to use C++ exception handling for events that occur frequently, or for events that are handled near the point of detection." At the same time, I remember reading somewhere in my C++ days, that all though exception handling is expensive, it only becmes expensive when an exception actually occurs. So, which is correct?. what to do?. Is there an alternative way that I can trap errors safely and pass the resulting error info to the C API?. Or is this a minor consideration (the article after all, is quite old, and hardware have improved a bit since then). Question 2 I wuld like to modify the exception class given in that article, so that it contains a simple stack trace, and I need some help doing that. Again, in order to make the exception class 'lightweight', I think its a good idea not to include any STL classes, like string or vector (good idea/bad idea?). Which potentially leaves me with a fixed length C string (char*) which will be stack allocated. So I can maybe just keep appending messages (delimted by a unique separator [up to maximum length of buffer])... Its been a while since I did any serious C++ coding, and I will be grateful for the help. BTW, this is what I have come up with so far (I am intentionally, not deriving from std::exception because of the performance reasons mentioned in the article, and I am instead, throwing an integral exception (based on an exception enumeration): class fast_exception { public: fast_exception(int what, char const* file=0, int line=0) : what_(what), line_(line), file_(file) {/*empty*/} int what() const { return what_; } int line() const { return line_; } char const* file() const { return file_; } private: int what_; int line_; char const[MAX_BUFFER_SIZE] file_; }

    Read the article

  • Can I spread out a long running stored proc accross multiple CPU's?

    - by Russ
    [Also on SuperUser - http://superuser.com/questions/116600/can-i-spead-out-a-long-running-stored-proc-accross-multiple-cpus] I have a stored procedure in SQL server the gets, and decrypts a block of data. ( Credit cards in this case. ) Most of the time, the performance is tolerable, but there are a couple customers where the process is painfully slow, taking literally 1 minute to complete. ( Well, 59377ms to return from SQL Server to be exact, but it can vary by a few hundred ms based on load ) When I watch the process, I see that SQL is only using a single proc to perform the whole process, and typically only proc 0. Is there a way I can change my stored proc so that SQL can multi-thread the process? Is it even feasible to cheat and to break the calls in half, ( top 50%, bottom 50% ), and spread the load, as a gross hack? ( just spit-balling here ) My stored proc: USE [Commerce] GO /****** Object: StoredProcedure [dbo].[GetAllCreditCardsByCustomerId] Script Date: 03/05/2010 11:50:14 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER PROCEDURE [dbo].[GetAllCreditCardsByCustomerId] @companyId UNIQUEIDENTIFIER, @DecryptionKey NVARCHAR (MAX) AS SET NoCount ON DECLARE @cardId uniqueidentifier DECLARE @tmpdecryptedCardData VarChar(MAX); DECLARE @decryptedCardData VarChar(MAX); DECLARE @tmpTable as Table ( CardId uniqueidentifier, DecryptedCard NVarChar(Max) ) DECLARE creditCards CURSOR FAST_FORWARD READ_ONLY FOR Select cardId from CreditCards where companyId = @companyId and Active=1 order by addedBy desc --2 OPEN creditCards --3 FETCH creditCards INTO @cardId -- prime the cursor WHILE @@Fetch_Status = 0 BEGIN --OPEN creditCards DECLARE creditCardData CURSOR FAST_FORWARD READ_ONLY FOR select convert(nvarchar(max), DecryptByCert(Cert_Id('Oh-Nay-Nay'), EncryptedCard, @DecryptionKey)) FROM CreditCardData where cardid = @cardId order by valueOrder OPEN creditCardData FETCH creditCardData INTO @tmpdecryptedCardData -- prime the cursor WHILE @@Fetch_Status = 0 BEGIN print 'CreditCardData' print @tmpdecryptedCardData set @decryptedCardData = ISNULL(@decryptedCardData, '') + @tmpdecryptedCardData print '@decryptedCardData' print @decryptedCardData; FETCH NEXT FROM creditCardData INTO @tmpdecryptedCardData -- fetch next END CLOSE creditCardData DEALLOCATE creditCardData insert into @tmpTable (CardId, DecryptedCard) values ( @cardId, @decryptedCardData ) set @decryptedCardData = '' FETCH NEXT FROM creditCards INTO @cardId -- fetch next END select CardId, DecryptedCard FROM @tmpTable CLOSE creditCards DEALLOCATE creditCards

    Read the article

  • Can I spread out a long running stored proc accross multiple CPU's?

    - by Russ
    [Also on SuperUser - http://superuser.com/questions/116600/can-i-spead-out-a-long-running-stored-proc-accross-multiple-cpus] I have a stored procedure in SQL server the gets, and decrypts a block of data. ( Credit cards in this case. ) Most of the time, the performance is tolerable, but there are a couple customers where the process is painfully slow, taking literally 1 minute to complete. ( Well, 59377ms to return from SQL Server to be exact, but it can vary by a few hundred ms based on load ) When I watch the process, I see that SQL is only using a single proc to perform the whole process, and typically only proc 0. Is there a way I can change my stored proc so that SQL can multi-thread the process? Is it even feasible to cheat and to break the calls in half, ( top 50%, bottom 50% ), and spread the load, as a gross hack? ( just spit-balling here ) My stored proc: USE [Commerce] GO /****** Object: StoredProcedure [dbo].[GetAllCreditCardsByCustomerId] Script Date: 03/05/2010 11:50:14 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER PROCEDURE [dbo].[GetAllCreditCardsByCustomerId] @companyId UNIQUEIDENTIFIER, @DecryptionKey NVARCHAR (MAX) AS SET NoCount ON DECLARE @cardId uniqueidentifier DECLARE @tmpdecryptedCardData VarChar(MAX); DECLARE @decryptedCardData VarChar(MAX); DECLARE @tmpTable as Table ( CardId uniqueidentifier, DecryptedCard NVarChar(Max) ) DECLARE creditCards CURSOR FAST_FORWARD READ_ONLY FOR Select cardId from CreditCards where companyId = @companyId and Active=1 order by addedBy desc --2 OPEN creditCards --3 FETCH creditCards INTO @cardId -- prime the cursor WHILE @@Fetch_Status = 0 BEGIN --OPEN creditCards DECLARE creditCardData CURSOR FAST_FORWARD READ_ONLY FOR select convert(nvarchar(max), DecryptByCert(Cert_Id('Oh-Nay-Nay'), EncryptedCard, @DecryptionKey)) FROM CreditCardData where cardid = @cardId order by valueOrder OPEN creditCardData FETCH creditCardData INTO @tmpdecryptedCardData -- prime the cursor WHILE @@Fetch_Status = 0 BEGIN print 'CreditCardData' print @tmpdecryptedCardData set @decryptedCardData = ISNULL(@decryptedCardData, '') + @tmpdecryptedCardData print '@decryptedCardData' print @decryptedCardData; FETCH NEXT FROM creditCardData INTO @tmpdecryptedCardData -- fetch next END CLOSE creditCardData DEALLOCATE creditCardData insert into @tmpTable (CardId, DecryptedCard) values ( @cardId, @decryptedCardData ) set @decryptedCardData = '' FETCH NEXT FROM creditCards INTO @cardId -- fetch next END select CardId, DecryptedCard FROM @tmpTable CLOSE creditCards DEALLOCATE creditCards

    Read the article

  • push_back of STL list got bad performance?

    - by Leon Zhang
    I wrote a simple program to test STL list performance against a simple C list-like data structure. It shows bad performance at "push_back()" line. Any comments on it? $ ./test2 Build the type list : time consumed -> 0.311465 Iterate over all items: time consumed -> 0.00898 Build the simple C List: time consumed -> 0.020275 Iterate over all items: time consumed -> 0.008755 The source code is: #include <stdexcept> #include "high_resolution_timer.hpp" #include <list> #include <algorithm> #include <iostream> #define TESTNUM 1000000 /* The test struct */ struct MyType { int num; }; /* * C++ STL::list Test */ typedef struct MyType* mytype_t; void myfunction(mytype_t t) { } int test_stl_list() { std::list<mytype_t> mylist; util::high_resolution_timer t; /* * Build the type list */ t.restart(); for(int i = 0; i < TESTNUM; i++) { mytype_t aItem = (mytype_t) malloc(sizeof(struct MyType)); if(aItem == NULL) { printf("Error: while malloc\n"); return -1; } aItem->num = i; mylist.push_back(aItem); } std::cout << " Build the type list : time consumed -> " << t.elapsed() << std::endl; /* * Iterate over all item */ t.restart(); std::for_each(mylist.begin(), mylist.end(), myfunction); std::cout << " Iterate over all items: time consumed -> " << t.elapsed() << std::endl; return 0; } /* * a simple C list */ struct MyCList; struct MyCList{ struct MyType m; struct MyCList* p_next; }; int test_simple_c_list() { struct MyCList* p_list_head = NULL; util::high_resolution_timer t; /* * Build it */ t.restart(); struct MyCList* p_new_item = NULL; for(int i = 0; i < TESTNUM; i++) { p_new_item = (struct MyCList*) malloc(sizeof(struct MyCList)); if(p_new_item == NULL) { printf("ERROR : while malloc\n"); return -1; } p_new_item->m.num = i; p_new_item->p_next = p_list_head; p_list_head = p_new_item; } std::cout << " Build the simple C List: time consumed -> " << t.elapsed() << std::endl; /* * Iterate all items */ t.restart(); p_new_item = p_list_head; while(p_new_item->p_next != NULL) { p_new_item = p_new_item->p_next; } std::cout << " Iterate over all items: time consumed -> " << t.elapsed() << std::endl; return 0; } int main(int argc, char** argv) { if(test_stl_list() != 0) { printf("ERROR: error at testcase1\n"); return -1; } if(test_simple_c_list() != 0) { printf("ERROR: error at testcase2\n"); return -1; } return 0; }

    Read the article

  • Linq2SQL vs NHibernate performance (have I gone mad?)

    - by HeavyWave
    I have written the following tests to compare performance of Linq2SQL and NHibernate and I find results to be somewhat strange. Mappings are straight forward and identical for both. Both are running against a live DB. Although I'm not deleting Campaigns in case of Linq, but that shouldn't affect performance by more than 10 ms. Linq: [Test] public void Test1000ReadsWritesToAgentStateLinqPrecompiled() { Stopwatch sw = new Stopwatch(); Stopwatch swIn = new Stopwatch(); sw.Start(); for (int i = 0; i < 1000; i++) { swIn.Reset(); swIn.Start(); ReadWriteAndDeleteAgentStateWithLinqPrecompiled(); swIn.Stop(); Console.WriteLine("Run ReadWriteAndDeleteAgentState: " + swIn.ElapsedMilliseconds + " ms"); } sw.Stop(); Console.WriteLine("Total Time: " + sw.ElapsedMilliseconds + " ms"); Console.WriteLine("Average time to execute queries: " + sw.ElapsedMilliseconds / 1000 + " ms"); } private static readonly Func<AgentDesktop3DataContext, int, EntityModel.CampaignDetail> GetCampaignById = CompiledQuery.Compile<AgentDesktop3DataContext, int, EntityModel.CampaignDetail>( (ctx, sessionId) => (from cd in ctx.CampaignDetails join a in ctx.AgentCampaigns on cd.CampaignDetailId equals a.CampaignDetailId where a.AgentStateId == sessionId select cd).FirstOrDefault()); private void ReadWriteAndDeleteAgentStateWithLinqPrecompiled() { int id = 0; using (var ctx = new AgentDesktop3DataContext()) { EntityModel.AgentState agentState = new EntityModel.AgentState(); var campaign = new EntityModel.CampaignDetail { CampaignName = "Test" }; var campaignDisposition = new EntityModel.CampaignDisposition { Code = "123" }; campaignDisposition.Description = "abc"; campaign.CampaignDispositions.Add(campaignDisposition); agentState.CallState = 3; campaign.AgentCampaigns.Add(new AgentCampaign { AgentState = agentState }); ctx.CampaignDetails.InsertOnSubmit(campaign); ctx.AgentStates.InsertOnSubmit(agentState); ctx.SubmitChanges(); id = agentState.AgentStateId; } using (var ctx = new AgentDesktop3DataContext()) { var dbAgentState = ctx.GetAgentStateById(id); Assert.IsNotNull(dbAgentState); Assert.AreEqual(dbAgentState.CallState, 3); var campaignDetails = GetCampaignById(ctx, id); Assert.AreEqual(campaignDetails.CampaignDispositions[0].Description, "abc"); } using (var ctx = new AgentDesktop3DataContext()) { ctx.DeleteSessionById(id); } } NHibernate (the loop is the same): private void ReadWriteAndDeleteAgentState() { var id = WriteAgentState().Id; StartNewTransaction(); var dbAgentState = agentStateRepository.Get(id); Assert.IsNotNull(dbAgentState); Assert.AreEqual(dbAgentState.CallState, 3); Assert.AreEqual(dbAgentState.Campaigns[0].Dispositions[0].Description, "abc"); var campaignId = dbAgentState.Campaigns[0].Id; agentStateRepository.Delete(dbAgentState); NHibernateSession.Current.Transaction.Commit(); Cleanup(campaignId); NHibernateSession.Current.BeginTransaction(); } Results: NHibernate: Total Time: 9469 ms Average time to execute 13 queries: 9 ms Linq: Total Time: 127200 ms Average time to execute 13 queries: 127 ms Linq lost by 13.5 times! Event with precompiled queries (both read queries are precompiled). This can't be right, although I expected NHibernate to be faster, this is just too big of a difference, considering mappings are identical and NHibernate actually executes more queries against the DB.

    Read the article

  • ASP.NET retrieve Average CPU Usage

    - by Sam
    Last night I did a load test on a site. I found that one of my shared caches is a bottleneck. I'm using a ReaderWriterLockSlim to control the updates of the data. Unfortunately at one point there are ~200 requests trying to update the data at approximately the same time. This also coincided with CPU usage spikes. The data being updated is in the ASP.NET Cache. What I'd like to do is if the CPU usage is around 75%, I'd like to just skip the cache and hit the database on another machine. My problem is that I don't know how expensive it is to create a new performance counter to check the cpu usage. Also, if I would probably like the average cpu usage over the last 2 or 3 seconds. However, I can't sit there and calculate the cpu time as that would take longer than it's taking to update the cache currently. Is there an easy way to get the average CPU usage? Are there any drawbacks to this? I'm also considering totaling the wait count for the lock and then at a certain threshold switch over to the database. The concern I had with this approach would be that changing hardware might allow more locks with less of a strain on the system. And also finding the right balance for the threshold would be cumbersome and it doesn't take into account any other load on the machine. But it's a simple approach, and simple is 99% of the time better.

    Read the article

  • How to decide between using PLINQ and LINQ at runtime?

    - by Hamish Grubijan
    Or decide between a parallel and a sequential operation in general. It is hard to know without testing whether parallel or sequential implementation is best due to overhead. Obviously it will take some time to train "the decider" which method to use. I would say that this method cannot be perfect, so it is probabilistic in nature. The x,y,z do influence "the decider". I think a very naive implementation would be to give both 1/2 chance at the beginning and then start favoring them according to past performance. This disregards x,y,z, however. I suspect that this question would be better answered by academics than practitioners. Anyhow, please share your heuristic, your experience if any, your tips on this. Sample code: public interface IComputer { decimal Compute(decimal x, decimal y, decimal z); } public class SequentialComputer : IComputer { public decimal Compute( ... // sequential implementation } public class ParallelComputer : IComputer { public decimal Compute( ... // parallel implementation } public class HybridComputer : IComputer { private SequentialComputer sc; private ParallelComputer pc; private TheDecider td; // Helps to decide between the two. public HybridComputer() { sc = new SequentialComputer(); pc = new ParallelComputer(); td = TheDecider(); } public decimal Compute(decimal x, decimal y, decimal z) { decimal result; decimal time; if (td.PickOneOfTwo() == 0) { // Time this and save result into time. result = sc.Compute(...); } else { // Time this and save result into time. result = pc.Compute(); } td.Train(time); return result; } }

    Read the article

  • UIButton performance in UITableViewCell vs UIView

    - by marcel salathe
    I'd like to add a UIButton to a custom UITableViewCell (programmatically). This is easy to do, but I'm finding that the "performance" of the button in the cell is slow - that is, when I touch the button, there is quite a bit of delay until the button visually goes into the highlighted state. The same type of button on a regular UIView is very responsive in comparison. In order to isolate the problem, I've created two views - one is a simple UIView, the other is a UITableView with only one UITableViewCell. I've added buttons to both views (the UIView and the UITableViewCell), and the performance difference is quite striking. I've searched the web and read the Apple docs but haven't really found the cause of the problem. My guess is that it somehow has to do with the responder chain, but I can't quite put my finger on it. I must be doing something wrong, and I'd appreciate any help. Thanks. Demo code: ViewController.h #import <UIKit/UIKit.h> @interface ViewController : UIViewController <UITableViewDelegate, UITableViewDataSource> @property UITableView* myTableView; @property UIView* myView; ViewController.m #import "ViewController.h" #import "CustomCell.h" @implementation ViewController @synthesize myTableView, myView; - (id)initWithNibName:(NSString *)nibNameOrNil bundle:(NSBundle *)nibBundleOrNil { self = [super initWithNibName:nibNameOrNil bundle:nibBundleOrNil]; if (self) { [self initMyView]; [self initMyTableView]; } return self; } - (void) initMyView { UIView* newView = [[UIView alloc] initWithFrame:CGRectMake(0,0,[[UIScreen mainScreen] bounds].size.width,100)]; self.myView = newView; // button on regularView UIButton* myButton = [UIButton buttonWithType:UIButtonTypeRoundedRect]; [myButton addTarget:self action:@selector(pressedMyButton) forControlEvents:UIControlEventTouchUpInside]; [myButton setTitle:@"I'm fast" forState:UIControlStateNormal]; [myButton setFrame:CGRectMake(20.0, 10.0, 160.0, 30.0)]; [[self myView] addSubview:myButton]; } - (void) initMyTableView { UITableView *newTableView = [[UITableView alloc] initWithFrame:CGRectMake(0,100,[[UIScreen mainScreen] bounds].size.width,[[UIScreen mainScreen] bounds].size.height-100) style:UITableViewStyleGrouped]; self.myTableView = newTableView; self.myTableView.delegate = self; self.myTableView.dataSource = self; } -(void) pressedMyButton { NSLog(@"pressedMyButton"); } - (void)viewDidLoad { [super viewDidLoad]; [[self view] addSubview:self.myView]; [[self view] addSubview:self.myTableView]; } - (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView { return 1; } - (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section { return 1; } - (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath { CustomCell *customCell = [tableView dequeueReusableCellWithIdentifier:@"CustomCell"]; if (customCell == nil) { customCell = [[CustomCell alloc] initWithStyle:UITableViewCellStyleSubtitle reuseIdentifier:@"CustomCell"]; } return customCell; } @end CustomCell.h #import <UIKit/UIKit.h> @interface CustomCell : UITableViewCell @property (retain, nonatomic) UIButton* cellButton; @end CustomCell.m #import "CustomCell.h" @implementation CustomCell @synthesize cellButton; - (id)initWithStyle:(UITableViewCellStyle)style reuseIdentifier:(NSString *)reuseIdentifier { self = [super initWithStyle:style reuseIdentifier:reuseIdentifier]; if (self) { // button within cell cellButton = [UIButton buttonWithType:UIButtonTypeRoundedRect]; [cellButton addTarget:self action:@selector(pressedCellButton) forControlEvents:UIControlEventTouchUpInside]; [cellButton setTitle:@"I'm sluggish" forState:UIControlStateNormal]; [cellButton setFrame:CGRectMake(20.0, 10.0, 160.0, 30.0)]; [self addSubview:cellButton]; } return self; } - (void) pressedCellButton { NSLog(@"pressedCellButton"); } @end

    Read the article

  • What is the fastest cyclic synchronization in Java (ExecutorService vs. CyclicBarrier vs. X)?

    - by Alex Dunlop
    Which Java synchronization construct is likely to provide the best performance for a concurrent, iterative processing scenario with a fixed number of threads like the one outlined below? After experimenting on my own for a while (using ExecutorService and CyclicBarrier) and being somewhat surprised by the results, I would be grateful for some expert advice and maybe some new ideas. Existing questions here do not seem to focus primarily on performance, hence this new one. Thanks in advance! The core of the app is a simple iterative data processing algorithm, parallelized to the spread the computational load across 8 cores on a Mac Pro, running OS X 10.6 and Java 1.6.0_07. The data to be processed is split into 8 blocks and each block is fed to a Runnable to be executed by one of a fixed number of threads. Parallelizing the algorithm was fairly straightforward, and it functionally works as desired, but its performance is not yet what I think it could be. The app seems to spend a lot of time in system calls synchronizing, so after some profiling I wonder whether I selected the most appropriate synchronization mechanism(s). A key requirement of the algorithm is that it needs to proceed in stages, so the threads need to sync up at the end of each stage. The main thread prepares the work (very low overhead), passes it to the threads, lets them work on it, then proceeds when all threads are done, rearranges the work (again very low overhead) and repeats the cycle. The machine is dedicated to this task, Garbage Collection is minimized by using per-thread pools of pre-allocated items, and the number of threads can be fixed (no incoming requests or the like, just one thread per CPU core). V1 - ExecutorService My first implementation used an ExecutorService with 8 worker threads. The program creates 8 tasks holding the work and then lets them work on it, roughly like this: // create one thread per CPU executorService = Executors.newFixedThreadPool( 8 ); ... // now process data in cycles while( ...) { // package data into 8 work items ... // create one Callable task per work item ... // submit the Callables to the worker threads executorService.invokeAll( taskList ); } This works well functionally (it does what it should), and for very large work items indeed all 8 CPUs become highly loaded, as much as the processing algorithm would be expected to allow (some work items will finish faster than others, then idle). However, as the work items become smaller (and this is not really under the program's control), the user CPU load shrinks dramatically: blocksize | system | user | cycles/sec 256k 1.8% 85% 1.30 64k 2.5% 77% 5.6 16k 4% 64% 22.5 4096 8% 56% 86 1024 13% 38% 227 256 17% 19% 420 64 19% 17% 948 16 19% 13% 1626 Legend: - block size = size of the work item (= computational steps) - system = system load, as shown in OS X Activity Monitor (red bar) - user = user load, as shown in OS X Activity Monitor (green bar) - cycles/sec = iterations through the main while loop, more is better The primary area of concern here is the high percentage of time spent in the system, which appears to be driven by thread synchronization calls. As expected, for smaller work items, ExecutorService.invokeAll() will require relatively more effort to sync up the threads versus the amount of work being performed in each thread. But since ExecutorService is more generic than it would need to be for this use case (it can queue tasks for threads if there are more tasks than cores), I though maybe there would be a leaner synchronization construct. V2 - CyclicBarrier The next implementation used a CyclicBarrier to sync up the threads before receiving work and after completing it, roughly as follows: main() { // create the barrier barrier = new CyclicBarrier( 8 + 1 ); // create Runable for thread, tell it about the barrier Runnable task = new WorkerThreadRunnable( barrier ); // start the threads for( int i = 0; i < 8; i++ ) { // create one thread per core new Thread( task ).start(); } while( ... ) { // tell threads about the work ... // N threads + this will call await(), then system proceeds barrier.await(); // ... now worker threads work on the work... // wait for worker threads to finish barrier.await(); } } class WorkerThreadRunnable implements Runnable { CyclicBarrier barrier; WorkerThreadRunnable( CyclicBarrier barrier ) { this.barrier = barrier; } public void run() { while( true ) { // wait for work barrier.await(); // do the work ... // wait for everyone else to finish barrier.await(); } } } Again, this works well functionally (it does what it should), and for very large work items indeed all 8 CPUs become highly loaded, as before. However, as the work items become smaller, the load still shrinks dramatically: blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.7% 78% 6.1 16k 5.5% 52% 25 4096 9% 29% 64 1024 11% 15% 117 256 12% 8% 169 64 12% 6.5% 285 16 12% 6% 377 For large work items, synchronization is negligible and the performance is identical to V1. But unexpectedly, the results of the (highly specialized) CyclicBarrier seem MUCH WORSE than those for the (generic) ExecutorService: throughput (cycles/sec) is only about 1/4th of V1. A preliminary conclusion would be that even though this seems to be the advertised ideal use case for CyclicBarrier, it performs much worse than the generic ExecutorService. V3 - Wait/Notify + CyclicBarrier It seemed worth a try to replace the first cyclic barrier await() with a simple wait/notify mechanism: main() { // create the barrier // create Runable for thread, tell it about the barrier // start the threads while( ... ) { // tell threads about the work // for each: workerThreadRunnable.setWorkItem( ... ); // ... now worker threads work on the work... // wait for worker threads to finish barrier.await(); } } class WorkerThreadRunnable implements Runnable { CyclicBarrier barrier; @NotNull volatile private Callable<Integer> workItem; WorkerThreadRunnable( CyclicBarrier barrier ) { this.barrier = barrier; this.workItem = NO_WORK; } final protected void setWorkItem( @NotNull final Callable<Integer> callable ) { synchronized( this ) { workItem = callable; notify(); } } public void run() { while( true ) { // wait for work while( true ) { synchronized( this ) { if( workItem != NO_WORK ) break; try { wait(); } catch( InterruptedException e ) { e.printStackTrace(); } } } // do the work ... // wait for everyone else to finish barrier.await(); } } } Again, this works well functionally (it does what it should). blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.4% 80% 6.3 16k 4.6% 60% 30.1 4096 8.6% 41% 98.5 1024 12% 23% 202 256 14% 11.6% 299 64 14% 10.0% 518 16 14.8% 8.7% 679 The throughput for small work items is still much worse than that of the ExecutorService, but about 2x that of the CyclicBarrier. Eliminating one CyclicBarrier eliminates half of the gap. V4 - Busy wait instead of wait/notify Since this app is the primary one running on the system and the cores idle anyway if they're not busy with a work item, why not try a busy wait for work items in each thread, even if that spins the CPU needlessly. The worker thread code changes as follows: class WorkerThreadRunnable implements Runnable { // as before final protected void setWorkItem( @NotNull final Callable<Integer> callable ) { workItem = callable; } public void run() { while( true ) { // busy-wait for work while( true ) { if( workItem != NO_WORK ) break; } // do the work ... // wait for everyone else to finish barrier.await(); } } } Also works well functionally (it does what it should). blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.2% 81% 6.3 16k 4.2% 62% 33 4096 7.5% 40% 107 1024 10.4% 23% 210 256 12.0% 12.0% 310 64 11.9% 10.2% 550 16 12.2% 8.6% 741 For small work items, this increases throughput by a further 10% over the CyclicBarrier + wait/notify variant, which is not insignificant. But it is still much lower-throughput than V1 with the ExecutorService. V5 - ? So what is the best synchronization mechanism for such a (presumably not uncommon) problem? I am weary of writing my own sync mechanism to completely replace ExecutorService (assuming that it is too generic and there has to be something that can still be taken out to make it more efficient). It is not my area of expertise and I'm concerned that I'd spend a lot of time debugging it (since I'm not even sure my wait/notify and busy wait variants are correct) for uncertain gain. Any advice would be greatly appreciated.

    Read the article

  • Null-free "maps": Is a callback solution slower than tryGet()?

    - by David Moles
    In comments to "How to implement List, Set, and Map in null free design?", Steven Sudit and I got into a discussion about using a callback, with handlers for "found" and "not found" situations, vs. a tryGet() method, taking an out parameter and returning a boolean indicating whether the out parameter had been populated. Steven maintained that the callback approach was more complex and almost certain to be slower; I maintained that the complexity was no greater and the performance at worst the same. But code speaks louder than words, so I thought I'd implement both and see what I got. The original question was fairly theoretical with regard to language ("And for argument sake, let's say this language don't even have null") -- I've used Java here because that's what I've got handy. Java doesn't have out parameters, but it doesn't have first-class functions either, so style-wise, it should suck equally for both approaches. (Digression: As far as complexity goes: I like the callback design because it inherently forces the user of the API to handle both cases, whereas the tryGet() design requires callers to perform their own boilerplate conditional check, which they could forget or get wrong. But having now implemented both, I can see why the tryGet() design looks simpler, at least in the short term.) First, the callback example: class CallbackMap<K, V> { private final Map<K, V> backingMap; public CallbackMap(Map<K, V> backingMap) { this.backingMap = backingMap; } void lookup(K key, Callback<K, V> handler) { V val = backingMap.get(key); if (val == null) { handler.handleMissing(key); } else { handler.handleFound(key, val); } } } interface Callback<K, V> { void handleFound(K key, V value); void handleMissing(K key); } class CallbackExample { private final Map<String, String> map; private final List<String> found; private final List<String> missing; private Callback<String, String> handler; public CallbackExample(Map<String, String> map) { this.map = map; found = new ArrayList<String>(map.size()); missing = new ArrayList<String>(map.size()); handler = new Callback<String, String>() { public void handleFound(String key, String value) { found.add(key + ": " + value); } public void handleMissing(String key) { missing.add(key); } }; } void test() { CallbackMap<String, String> cbMap = new CallbackMap<String, String>(map); for (int i = 0, count = map.size(); i < count; i++) { String key = "key" + i; cbMap.lookup(key, handler); } System.out.println(found.size() + " found"); System.out.println(missing.size() + " missing"); } } Now, the tryGet() example -- as best I understand the pattern (and I might well be wrong): class TryGetMap<K, V> { private final Map<K, V> backingMap; public TryGetMap(Map<K, V> backingMap) { this.backingMap = backingMap; } boolean tryGet(K key, OutParameter<V> valueParam) { V val = backingMap.get(key); if (val == null) { return false; } valueParam.value = val; return true; } } class OutParameter<V> { V value; } class TryGetExample { private final Map<String, String> map; private final List<String> found; private final List<String> missing; public TryGetExample(Map<String, String> map) { this.map = map; found = new ArrayList<String>(map.size()); missing = new ArrayList<String>(map.size()); } void test() { TryGetMap<String, String> tgMap = new TryGetMap<String, String>(map); for (int i = 0, count = map.size(); i < count; i++) { String key = "key" + i; OutParameter<String> out = new OutParameter<String>(); if (tgMap.tryGet(key, out)) { found.add(key + ": " + out.value); } else { missing.add(key); } } System.out.println(found.size() + " found"); System.out.println(missing.size() + " missing"); } } And finally, the performance test code: public static void main(String[] args) { int size = 200000; Map<String, String> map = new HashMap<String, String>(); for (int i = 0; i < size; i++) { String val = (i % 5 == 0) ? null : "value" + i; map.put("key" + i, val); } long totalCallback = 0; long totalTryGet = 0; int iterations = 20; for (int i = 0; i < iterations; i++) { { TryGetExample tryGet = new TryGetExample(map); long tryGetStart = System.currentTimeMillis(); tryGet.test(); totalTryGet += (System.currentTimeMillis() - tryGetStart); } System.gc(); { CallbackExample callback = new CallbackExample(map); long callbackStart = System.currentTimeMillis(); callback.test(); totalCallback += (System.currentTimeMillis() - callbackStart); } System.gc(); } System.out.println("Avg. callback: " + (totalCallback / iterations)); System.out.println("Avg. tryGet(): " + (totalTryGet / iterations)); } On my first attempt, I got 50% worse performance for callback than for tryGet(), which really surprised me. But, on a hunch, I added some garbage collection, and the performance penalty vanished. This fits with my instinct, which is that we're basically talking about taking the same number of method calls, conditional checks, etc. and rearranging them. But then, I wrote the code, so I might well have written a suboptimal or subconsicously penalized tryGet() implementation. Thoughts?

    Read the article

  • Preloading Winforms using a Stack and Hidden Form

    - by msarchet
    I am currently working on a project where we have a couple very control heavy user controls that are being used inside a MDI Controller. This is a Line of Business app and it is very data driven. The problem that we were facing was the aforementioned controls would load very very slowly, we dipped our toes into the waters of multi-threading for the control loading but that was not a solution for a plethora of reasons. Our solution to increasing the performance of the controls ended up being to 'pre-load' the forms onto a hidden window, create a stack of the existing forms, and pop off of the stack as the user requested a form. Now the current issue that I'm seeing that will arise as we push this 'fix' out to our testers, and the ultimately our users is this: Currently the 'hidden' window that contains the preloaded forms is visible in task manager, and can be shut down thus causing all of the controls to be lost. Then you have to create them on the fly losing the performance increase. Secondly, when the user uses up the stack we lose the performance increase (current solution to this is discussed below). For the first problem, is there a way to hide this window from task manager, perhaps by creating a parent form that encapsulates both the main form for the program and the hidden form? Our current solution to the second problem is to have an inactivity timer that when it fires checks the stacks for the forms, and loads a new form onto the stack if it isn't full. However this still has the potential of causing a hang in the UI while it creates the forms. A possible solutions for this would be to put 'used' forms back onto the stack, but I feel like there may be a better way. EDIT: For control design clarification From the comments I have realized there is a lack of clarity on what exactly the control is doing. Here is a detailed explanation of one of the controls. I have defined for this control loading time as the time it takes from when a user performs an action that would open a control, until the time a control is accessible to be edited. The control is for entering Prescriptions for a patient in the system, it has about 5 tabbed groups with a total of about 180 controls. The user selects to open a new Prescription control from inside the main program, this control is loaded into the MDI Child area of the Main Form (which is a DevExpress Ribbon Control). From the time the user clicks New (or loads an existing record) until the control is visible. The list of actions that happens in the program is this: The stack is checked for the existence of a control. If the control exists it is popped off of the stack. The control is rendered on screen. This is what takes 2 seconds The control then is populated with a blank object, or with existing data. The control is ready to use. The average percentage of loading time, across about 10 different machines, with different hardware the control rendering takes about 85 - 95 percent of the control loading time. Without using the stack the control takes about 2 seconds to load, with the stack it takes about .8 seconds, this second time is acceptable. I have looked at Henry's link and I had previously already implemented the applicable suggestions. Again I re-iterate my question as What is the best method to move controls to and from the stack with as little UI interruption as possible?

    Read the article

  • Random Complete System Unresponsiveness Running Mathematical Functions

    - by Computer Guru
    I have a program that loads a file (anywhere from 10MB to 5GB) a chunk at a time (ReadFile), and for each chunk performs a set of mathematical operations (basically calculates the hash). After calculating the hash, it stores info about the chunk in an STL map (basically <chunkID, hash>) and then writes the chunk itself to another file (WriteFile). That's all it does. This program will cause certain PCs to choke and die. The mouse begins to stutter, the task manager takes 2 min to show, ctrl+alt+del is unresponsive, running programs are slow.... the works. I've done literally everything I can think of to optimize the program, and have triple-checked all objects. What I've done: Tried different (less intensive) hashing algorithms. Switched all allocations to nedmalloc instead of the default new operator Switched from stl::map to unordered_set, found the performance to still be abysmal, so I switched again to Google's dense_hash_map. Converted all objects to store pointers to objects instead of the objects themselves. Caching all Read and Write operations. Instead of reading a 16k chunk of the file and performing the math on it, I read 4MB into a buffer and read 16k chunks from there instead. Same for all write operations - they are coalesced into 4MB blocks before being written to disk. Run extensive profiling with Visual Studio 2010, AMD Code Analyst, and perfmon. Set the thread priority to THREAD_MODE_BACKGROUND_BEGIN Set the thread priority to THREAD_PRIORITY_IDLE Added a Sleep(100) call after every loop. Even after all this, the application still results in a system-wide hang on certain machines under certain circumstances. Perfmon and Process Explorer show minimal CPU usage (with the sleep), no constant reads/writes from disk, few hard pagefaults (and only ~30k pagefaults in the lifetime of the application on a 5GB input file), little virtual memory (never more than 150MB), no leaked handles, no memory leaks. The machines I've tested it on run Windows XP - Windows 7, x86 and x64 versions included. None have less than 2GB RAM, though the problem is always exacerbated under lower memory conditions. I'm at a loss as to what to do next. I don't know what's causing it - I'm torn between CPU or Memory as the culprit. CPU because without the sleep and under different thread priorities the system performances changes noticeably. Memory because there's a huge difference in how often the issue occurs when using unordered_set vs Google's dense_hash_map. What's really weird? Obviously, the NT kernel design is supposed to prevent this sort of behavior from ever occurring (a user-mode application driving the system to this sort of extreme poor performance!?)..... but when I compile the code and run it on OS X or Linux (it's fairly standard C++ throughout) it performs excellently even on poor machines with little RAM and weaker CPUs. What am I supposed to do next? How do I know what the hell it is that Windows is doing behind the scenes that's killing system performance, when all the indicators are that the application itself isn't doing anything extreme? Any advice would be most welcome.

    Read the article

  • Why don't I just build the whole web app in Javascript and Javascript HTML Templates?

    - by viatropos
    I'm getting to the point on an app where I need to start caching things, and it got me thinking... In some parts of the app, I render table rows (jqGrid, slickgrid, etc.) or fancy div rows (like in the New Twitter) by grabbing pure JSON and running it through something like Mustache, jquery.tmpl, etc. In other parts of the app, I just render the info in pure HTML (server-side HAML templates), and if there's searching/paginating, I just go to a new URL and load a new HTML page. Now the problem is in caching and maintainability. On one hand I'm thinking, if everything was built using Javascript HTML Templates, then my app would serve just an HTML layout/shell, and a bunch of JSON. If you look at the Facebook and Twitter HTML source, that's basically what they're doing (95% json/javascript, 5% html). This would make it so my app only needed to cache JSON (pages, actions, and/or records). Which means you'd hit the cache no matter if you were some remote api developer accessing a JSON api, or the strait web app. That is, I don't need 2 caches, one for the JSON, one for the HTML. That seems like it'd cut my cache store down in half, and streamline things a little bit. On the other hand, I'm thinking, from what I've seen/experienced, generating static HTML server-side, and caching that, seems to be much better performance wise cross-browser; you get the graphics instantly and don't have to wait that split-second for javascript to render it. StackOverflow seems to do everything in plain HTML, and you can tell... everything appears at once. Notice how though on twitter.com, the page is blank for .5-1 seconds, and the page chunks in: the javascript has to render the json. The downside with this is that, for anything dynamic (like endless scrolling, or grids), I'd have to create javascript templates anyway... so now I have server-side HAML templates, client-side javascript templates, and a lot more to cache. My question is, is there any consensus on how to approach this? What are the benefits and drawbacks from your experience of mixing the two versus going 100% with one over the other? Update: Some reasons that factor into why I haven't yet made the decision to go with 100% javascript templating are: Performance. Haven't formally tested, but from what I've seen, raw html renders faster and more fluidly than javascript-generated html cross-browser. Plus, I'm not sure how mobile devices handle dynamic html performance-wise. Testing. I have a lot of integration tests that work well with static HTML, so switching to javascript-only would require 1) more focused pure-javascript testing (jasmine), and 2) integrating javascript into capybara integration tests. This is just a matter of time and work, but it's probably significant. Maintenance. Getting rid of HAML. I love HAML, it's so easy to write, it prints pretty HTML... It makes code clean, it makes maintenance easy. Going with javascript, there's nothing as concise. SEO. I know google handles the ajax /#!/path, but haven't grasped how this will affect other search engines and how older browsers handle it. Seems like it'd require a significant setup.

    Read the article

  • Improve WPF Rendering Performance (WrapPanel in ItemsControl)

    - by Wonko the Sane
    Hello All, I have an ItemsSource that appears to have poor performance when adding even a fairly small ObservableCollection to it. The ItemsPanel is a WrapPanel, and the ItemTemplate is essentially a Border containing another Border painted with an ImageBrush. The ItemsControl is wrapped inside a ScrollViewer. After some investigation using WpfPerf, it would appear that most of the "what the heck is it doing?" time is spent on WrapPanel.Measure after creating the collection that is being bound. As I've mentioned, it's a fairly small collection - generally less than 100 items. If nothing else, I'd like to be able to put a "Please Wait" on the screen (during the collection creation portion as well), but I am not sure how to know when the rendering is complete. Any thoughts would be greatly appreciated! Thanks, wTs

    Read the article

  • Is Objective C fast enough for DSP/audio programming

    - by morgancodes
    I've been making some progress with audio programming for iPhone. Now I'm doing some performance tuning, trying to see if I can squeeze more out of this little machine. Running Shark, I see that a significant part of my cpu power (16%) is getting eaten up by objc_msgSend. I understand I can speed this up somewhat by storing pointers to functions (IMP) rather than calling them using [object message] notation. But if I'm going to go through all this trouble, I wonder if I might just be better off using C++. Any thoughts on this?

    Read the article

  • SQL - .NET - SqlParameters - AddWithValue - Are there any negative performance implications when Par

    - by hamlin11
    http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlparametercollection.addwithvalue.aspx I'm used to adding sql parameters to a sqlCommand using the add() function. This allows me to specify the type of the sqlParameter, but it requires another line to set the value. It's nice to use the AddWithValue function, but it skips the "specify the parameter type" step. I'm guessing this causes the parameters to be sent over as strings contained within single quotes (''), but I'm not sure. Is this the case, and does this cause significantly slower performance of the stored procedures? Note: I understand that it is nice to validate user data on the .NET side of things by specifying the data type for params -- I'm only concerned about reflection-type overhead of AddWithValue either on the .NET or SQL side.

    Read the article

  • C# WMI, Performance Counters, & SNMP Oh My!

    - by Keith
    I have a C# windows service which listens to a MSMQ and sends each message out as an email. Since there's no UI, I'd like to offer an ability to monitor this service to see things such as # messages in queue, # emails sent (by message type perhaps), # of errors, etc. What is the best/recommended way to accomplish this? Is it WMI or performance counters? Is this data viewed using PerfMon or WMI CIM Studio? Does any approach allow one to monitor the service real-time as well as providing historical analysis? I can dig into the details myself but would appreciate some broad guidance to help demystify this subject.

    Read the article

  • Need advice on speeding up tableViewCell photo loading performance

    - by ambertch
    I have around 20 tableview cells that each contain a number (2-5) thumbnail sized pictures (they are VERY small Facebook profile pictures, ex. http://profile.ak.fbcdn.net/hprofile-ak-sf2p/hs254.snc3/23133_201668_2989_q.jpg). Each picture is an UIImageView added to the cell's contentview. Scrolling performance is poor, and measuring the draw time I've found the UIImage rendering is the bottleneck. I've researched/thought of some solutions but as I am new to iphone development I am not sure which strategy to pursue: preload all the images and retrieve them from disk instead of URL when drawing cells (I'm not sure if cell drawing will still be slow, so I want to hold off on the time investment here) Have the cells display a placeholder image from disk, while the picture is asynchronously loaded (this seems to be the best solution, but I'm currently not sure exactly how to do best do this) There's the fast drawing recommendation from Tweetie, but I don't know that will have much affect if it turns out my overhead is in network loading (http://blog.atebits.com/2008/12/fast-scrolling-in-tweetie-with-uitableview/) Thoughts/implementation advice? Thanks!

    Read the article

  • PhysX for massive performance via GPU ?

    - by devdude
    I recently compared some of the physics engine out there for simulation and game development. Some are free, some are opensource, some are commercial (1 is even very commercial $$$$). Havok, Ode, Newton (aka oxNewton), Bullet, PhysX and "raw" build-in physics in some 3D engines. At some stage I came to conclusion or question: Why should I use anything but NVidia PhysX if I can make use of its amazing performance (if I need it) due to GPU processing ? With future NVidia cards I can expect further improvement independent of the regular CPU generation steps. The SDK is free and it is available for Linux as well. Of course it is a bit of vendor lock-in and it is not opensource. Whats your view or experience ? If you would start right now with development, would you agree with the above ? cheers

    Read the article

  • Improving code and UI Performance

    - by Kobojunkie
    I am dealing with a situation that I need some help with here. I need to improve performance on functionality that records and updates UI with user selection info. What my code current does is 'This is called to update the Database each time the user makes a new selection on the UI Private Sub OnFilterChanged(String newReviewValueToAdd) AddRecentViewToDB(newReviewValueToAdd) UpdateRecentViewsUI() PageReviewGrid.Rebind()'Call Grid Rebind End Sub 'This is the code that handles updating the UI with the Updated selection Private Sub UpdateRecentViewsUI() Dim rlNode As RadTreeNode = radTree.FindNodeByValue("myreviewnode") Dim Obj As Setting Dim treenode As RadTreeNode For i As Integer = 0 To Count - 1 Obj = Setting.Review.Item(i) treenode = New RadTreeNode(datetime.now.ToString,i.ToString()) treenode.ToolTip = obj.GetFilter radNode1.Nodes.Add(treenode) Next End Sub Private Sub UpdateRecentViewsUI() Dim pnlNav As RadPanelItem = rpbMyLoans.FindItemByValue("rpiMLNavTree") Dim radTree As RadTreeView = CType(pnlNav.FindControl("rtMyLoansNav"), RadTreeView) Dim rlNode As RadTreeNode = radTree.FindNodeByValue("MLRS") rlNode.Nodes.Clear() Dim objRS As SharedCode.WATSUserSettings.MyLoansView Dim objRTN As RadTreeNode For intItem As Integer = 0 To GetUserSettings.MyLoansRecentViews.Count - 1 objRS = GetUserSettings.MyLoansRecentViews.Item(intItem) objRTN = New RadTreeNode(objRS.LastUpdate.ToString, intItem.ToString) objRTN.ToolTip = objRS.getFilterString rlNode.Nodes.Add(objRTN) Next End Sub

    Read the article

  • AJAX vs AHAH Is there a performance advantage?

    - by LanguaFlash
    My concern is performance, is there a reason to to send the client XML instead of valid HTML? Like most things, I am sure it is application dependent. My specific situation is where there is substantial content being inserted into the web page that has been pulled from a database. What are the advantages of either approach? Is the size of the content even a concern? Or, in the case of using XML, will the time for the Javascript to process the XML into HTML counterbalance the extra time that would have been required to send HTML to start with? Thanks, Jeff

    Read the article

  • High-Performance In-Browser Networking

    - by Jon Purdy
    (Similar in spirit to but different in practice from this question.) Is there any cross-browser-compatible, in-browser technology that allows a high-performance perstistent network connection between a server application and a client written in, say, Javascript? Think XmlHttpRequest on caffeine. I am working on a visualisation system that's restricted to at most a few users at once, and the server is pretty robust, so it can handle as much as it needs to. I would like to allow the client to have access to video streamed from the server at a minimum of about 20 frames per second, regardless of what their graphics hardware capabilities are. Simply put: is this doable without resorting to Flash or Java?

    Read the article

< Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >