Search Results

Search found 19055 results on 763 pages for 'high performance'.


  • What's the performance penalty of weak_ptr?

    - by Kornel Kisielewicz
    I'm currently designing an object structure for a game, and the most natural organization in my case became a tree. Being a great fan of smart pointers, I use shared_ptrs exclusively. However, in this case the children in the tree will need access to their parent (for example, beings on a map need to be able to access the map data -- ergo the data of their parent). The direction of ownership is of course that a map owns its beings, so it holds shared pointers to them. To access the map data from within a being, however, we need a pointer to the parent -- the smart-pointer way is to use a non-owning reference, ergo a weak_ptr. However, I once read that locking a weak_ptr is an expensive operation -- maybe that's not true anymore -- but considering that the weak_ptr will be locked very often, I'm concerned that this design is doomed to poor performance. Hence the question: what is the performance penalty of locking a weak_ptr? How significant is it?

    Read the article

  • Performance of Managed C++ vs. Unmanaged/Native C++

    - by bsobaid
    I am writing a very high performance application that handles and processes hundreds of events every millisecond. Is unmanaged C++ faster than managed C++, and why? Managed C++ targets the CLR instead of the OS, and the CLR takes care of memory management, which simplifies the code and is probably also more efficient than code written by "a programmer" in unmanaged C++ -- or is there some other reason? When using managed code, how can one avoid dynamic memory allocation, which causes a performance hit, if it is all transparent to the programmer and handled by the CLR? So, coming back to my question: is managed C++ more efficient in terms of speed than unmanaged C++, and why?

    Read the article

  • Which toString() method should be used, performance-wise?

    - by Mrityunjay
    Hi, I am working on a project for performance enhancement. I had one doubt: during a process we tend to trace the current state of the DTOs and entities used, so we have included a toString() method in all POJOs for that purpose. I have now implemented toString() in three different ways, which follow:

        public String toString() {
            return "POJO :" + this.getClass().getName() + " RollNo :" + this.rollNo + " Name :" + this.name;
        }

        public String toString() {
            StringBuffer buff = new StringBuffer("POJO :").append(this.getClass().getName())
                    .append(" RollNo :").append(this.rollNo)
                    .append(" Name :").append(this.name);
            return buff.toString();
        }

        public String toString() {
            StringBuilder builder = new StringBuilder("POJO :").append(this.getClass().getName())
                    .append(" RollNo :").append(this.rollNo)
                    .append(" Name :").append(this.name);
            return builder.toString();
        }

    Can anyone please help me find out which one is best and should be used for enhancing performance?
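    For what it's worth, a crude way to compare the variants is a small timing harness like the sketch below. It is my own illustration, not code from the question: the Pojo class, field values and iteration count are made up, and a serious comparison would use a proper benchmarking harness such as JMH. Note also that javac compiles the '+' concatenation form into StringBuilder appends, so the first and third versions usually perform almost identically, while StringBuffer only adds synchronization overhead.

        import java.util.function.Supplier;

        public class ToStringTiming {

            // Hypothetical POJO standing in for the real DTO/entity classes.
            static class Pojo {
                int rollNo = 7;
                String name = "example";

                String viaConcat() {
                    return "POJO :" + getClass().getName() + " RollNo :" + rollNo + " Name :" + name;
                }

                String viaBuilder() {
                    return new StringBuilder("POJO :").append(getClass().getName())
                            .append(" RollNo :").append(rollNo)
                            .append(" Name :").append(name).toString();
                }
            }

            // Times one million calls and keeps the results "live" so the JIT cannot discard them.
            static long timeMillis(Supplier<String> s) {
                long start = System.nanoTime();
                int sink = 0;
                for (int i = 0; i < 1_000_000; i++) {
                    sink += s.get().length();
                }
                if (sink == -1) System.out.println(sink);
                return (System.nanoTime() - start) / 1_000_000;
            }

            public static void main(String[] args) {
                Pojo p = new Pojo();
                timeMillis(p::viaConcat);   // warm-up so the JIT compiles both paths before measuring
                timeMillis(p::viaBuilder);
                System.out.println("concat  : " + timeMillis(p::viaConcat) + " ms");
                System.out.println("builder : " + timeMillis(p::viaBuilder) + " ms");
            }
        }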

    Read the article

  • What are some good code optimization methods?

    - by esac
    I would like to understand good code optimization methods and methodology. How do I keep from doing premature optimization if I am already thinking about performance? How do I find the bottlenecks in my code? How do I make sure that over time my program does not become any slower? What are some common performance errors to avoid (e.g., I know it is bad in some languages to return from inside the catch portion of a try/catch block)?
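    On the "how do I make sure my program does not become slower over time" part, one lightweight habit is to put a coarse time budget around a few representative operations and fail loudly when the budget is blown. The sketch below is my own illustration in Java, not something from the question; the workload and budget are arbitrary, and a profiler remains the right tool for actually locating bottlenecks.

        import java.util.Arrays;
        import java.util.Random;

        public class PerfGuard {

            // Runs the task and throws if it exceeds the given budget in milliseconds.
            static void assertUnder(String label, long budgetMillis, Runnable task) {
                long start = System.nanoTime();
                task.run();
                long ms = (System.nanoTime() - start) / 1_000_000;
                if (ms > budgetMillis) {
                    throw new AssertionError(label + " took " + ms + " ms, budget was " + budgetMillis + " ms");
                }
                System.out.println(label + ": " + ms + " ms (budget " + budgetMillis + " ms)");
            }

            public static void main(String[] args) {
                assertUnder("sorting a million ints", 500, () -> {
                    int[] data = new Random(7).ints(1_000_000).toArray();
                    Arrays.sort(data);
                });
            }
        }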

    Read the article

  • C# debug vs release performance

    - by sagie
    Hi. I've encountered the following paragraph: “Debug vs Release setting in the IDE when you compile your code in Visual Studio makes almost no difference to performance… the generated code is almost the same. The C# compiler doesn’t really do any optimisation. The C# compiler just spits out IL… and at the runtime it’s the JITer that does all the optimisation. The JITer does have a Debug/Release mode and that makes a huge difference to performance. But that doesn’t key off whether you run the Debug or Release configuration of your project, that keys off whether a debugger is attached.” The source is here and the podcast is here. Can someone direct me to a Microsoft article that can actually prove this?

    Read the article

  • OpenGL Performance Questions

    - by Daniel
    This subject, as with any optimisation problem, gets hit on a lot, but I just couldn't find what I (think) I want. A lot of tutorials, and even SO questions, have similar tips, generally covering:

        - Use GL face culling (the OpenGL function, not the scene logic)
        - Only send one matrix to the GPU (the combined projection-model-view matrix), thereby reducing the MVP calculations from per vertex to once per model (as it should be)
        - Use interleaved vertices
        - Minimize GL calls as much as possible, batching where appropriate

    ...and possibly a few/many others. I am (for curiosity's sake) rendering 28 million triangles in my application using several vertex buffers. I have tried all of the above techniques (to the best of my knowledge) and received almost no performance change. Whilst I am getting around 40 FPS in my implementation, which is by no means problematic, I am still curious as to where these optimisation 'tips' actually come into use. My CPU is idling at around 20-50% during rendering, therefore I assume I am GPU bound for increasing performance. Note: I am looking into gDEBugger at the moment. Cross-posted at Game Development.

    Read the article

  • server performance: multiple external connections and performance

    - by websiteguru
    I am creating a PHP script that requires the server to make several cURL requests per run. I'll be running this script through cron every 3 minutes. I'm looking to maximize the number of cURL requests I can make in a 24-hour period. What I am wondering is whether it would be better, from a performance standpoint, to get a dedicated server or several small shared hosting accounts. With the problem being the number of external connections and not system resources, I'm wondering which is the best approach.

    Read the article

  • javascript object access performance

    - by youdontmeanmuch
    In JavaScript, when you're getting a property of an object, is there a performance penalty to getting the whole object vs. only getting a property of that object? Also, keep in mind I'm not talking about DOM access; these are pure, simple JavaScript objects. For example, is there some kind of performance difference between the following two pieces of code?

    Assumed to be faster, but not sure:

        var length = some.object[key].length;
        if (length === condition) {
            // Do something that doesn't need anything inside of some.object[key]
        } else {
            var object = some.object[key];
            // Do something that requires stuff inside of some.object[key]
        }

    I think this would be slower, but I'm not sure if it matters:

        var object = some.object[key];
        if (object.length === condition) {
            // Do something that doesn't need anything inside of some.object[key]
        } else {
            // Do something that requires stuff inside of some.object[key]
        }

    Read the article

  • Android -- Object Creation/Memory Allocation vs. Performance

    - by borg17of20
    Hello all, this is probably an easy one. I have about 20 TextViews/ImageViews in my current project that I access like this:

        ((TextView) multiLayout.findViewById(R.id.GameBoard_Multi_Answer1_Text)).setText("");
        // or
        ((ImageView) multiLayout.findViewById(R.id.GameBoard_Multi_Answer1_Right)).setVisibility(View.INVISIBLE);

    My question is this: am I better off, from a performance standpoint, just assigning these to object variables? Further, am I losing some performance to the constant "search" process that goes on as part of the findViewById(...) method? (i.e. does findViewById(...) use some sort of hashtable/hashmap for look-ups, or does it implement an iterative search over the view hierarchy?) At present, my program never uses more than 2.5MB of RAM, so will assigning 20 or so more object variables drastically affect this? I don't think so, but I figured I'd ask. Thanks.
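    As a point of reference, findViewById() walks the view hierarchy rather than hitting a hash table, so the usual pattern is to look each view up once and keep the reference in a field. The sketch below only illustrates that pattern -- the activity, layout resource and ids are hypothetical placeholders, not taken from the question, and the snippet naturally only runs inside an Android project.

        import android.app.Activity;
        import android.os.Bundle;
        import android.view.View;
        import android.widget.ImageView;
        import android.widget.TextView;

        public class GameBoardActivity extends Activity {

            // View references cached once instead of being searched for on every update.
            private TextView answer1Text;
            private ImageView answer1Right;

            @Override
            protected void onCreate(Bundle savedInstanceState) {
                super.onCreate(savedInstanceState);
                setContentView(R.layout.game_board_multi);   // hypothetical layout resource

                // One hierarchy traversal each at start-up...
                answer1Text = (TextView) findViewById(R.id.game_board_answer1_text);
                answer1Right = (ImageView) findViewById(R.id.game_board_answer1_right);
            }

            private void clearAnswer1() {
                // ...and no repeated traversals afterwards.
                answer1Text.setText("");
                answer1Right.setVisibility(View.INVISIBLE);
            }
        }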

    Read the article

  • java statistics collection for performance evaluation

    - by user384706
    What is the most efficient way to collect and report performance statistics from an application? If I have an application that uses a series of network APIs, and I want to report statistics at runtime, e.g.

        Method doA() was called 3 times and consumed on avg 500ms
        Method doB() was called 5 times and consumed on avg 1200ms
        etc.

    then I thought of using a well-defined data structure (or collection) that each thread updates per remote call, and this can be used for the report. But I think that it will make performance worse, because of the time spent on statistics collection. Am I correct? How would I proceed if I used a background thread for this, while the other threads that make the remote calls were unaware of this collection gathering? Thanks
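    One common low-overhead approach (my own sketch, not from the question) is to have the calling threads bump lock-free counters keyed by method name, and let a background reporter thread read and print them periodically; the per-call cost is then just a couple of atomic updates. LongAdder and computeIfAbsent need Java 8+; on older JVMs an AtomicLong plays the same role.

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.atomic.LongAdder;

        public class CallStats {

            private static final class Stat {
                final LongAdder calls = new LongAdder();
                final LongAdder totalNanos = new LongAdder();
            }

            private static final Map<String, Stat> STATS = new ConcurrentHashMap<>();

            // Called by worker threads around each remote call.
            public static void record(String method, long elapsedNanos) {
                Stat s = STATS.computeIfAbsent(method, k -> new Stat());
                s.calls.increment();
                s.totalNanos.add(elapsedNanos);
            }

            // Called from a background reporter thread, e.g. once a minute.
            public static void report() {
                STATS.forEach((method, s) -> {
                    long n = s.calls.sum();
                    long avgMs = (n == 0) ? 0 : s.totalNanos.sum() / n / 1_000_000;
                    System.out.printf("Method %s() was called %d times and consumed on avg %d ms%n",
                            method, n, avgMs);
                });
            }

            public static void main(String[] args) {
                long start = System.nanoTime();
                // ... the real code would call doA() here ...
                record("doA", System.nanoTime() - start);
                report();
            }
        }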

    Read the article

  • Best way to handle MySQL date for performance with thousands of users

    - by bitLost
    I am currently part of a team designing a site that will potentially have thousands of users who will be doing a number of date-related searches. During the design phase we have been trying to determine which makes more sense for performance optimization: should we store the datetime field as a MySQL DATETIME, or should we break it up into a number of fields (year, month, day, hour, minute, ...)? The question is, with a large data set and a potentially large set of users, would we gain performance by breaking the datetime into multiple fields and saving ourselves the reliance on MySQL date functions? Or is MySQL already optimized for this?

    Read the article

  • Performance considerations of a large hard-coded array in the .cs file

    - by terence
    I'm writing some code where performance is important. In one part of it, I have to compare a large set of pre-computed data against dynamic values. Currently, I'm storing that pre-computed data in a giant array in the .cs file:

        Data[] data = { /* my data set */ };

    The data set is about 90KB, or roughly 13k elements. I was wondering if there's any downside to doing this, as opposed to loading it in from an external file? I'm not entirely sure how C# works internally, so I just wanted to be aware of any performance issues I might encounter with this method.

    Read the article

  • PHP include(): File size & performance

    - by Tom
    An inexperienced PHP question: I've got a PHP script file that I need to include on different pages, lots of times in lots of places. I have the option of either breaking the included file down into several smaller files and including these on an as-needed basis... or I could just keep it all together in a single PHP file. I'm wondering if there's any performance impact of using a larger vs. smaller file for include() in this context? For example, is there any performance difference between a 200KB file and a 20KB file? Thank you.

    Read the article

  • What's the format of Real World Performance Day?

    - by william.hardie
    A question that has cropped up a lot of late is "what's the format of Real World Performance Day?" Not an unreasonable question, you might think. Sure enough, a quick check of the Independent Oracle User Group's website tells us a bit about the Real World Performance Day event, but there is no formal agenda. This was one of the questions I posed to Tom Kyte (one of the main presenters) in our recent podcast. Tom tells us that this isn't your traditional event where one speaker follows another with loads of slides. In fact, the Real World Performance Day features Tom and fellow Oracle performance experts - Andrew Holdsworth and Graham Wood - continuously on stage throughout the day. All three will be discussing database performance challenges and solutions from development, architectural design and management perspectives. There will be multi-terabyte demos on show, fewer traditional slides, and more interactive debate and discussion. Tune in and hear what else Tom has to say about this fairly unique event!

    Read the article

  • Analysing Group & Individual Member Performance - RUP

    - by user23871
    I am writing a report which requires an analysis of the performance of each individual team member. This is for a software development project developed using the Unified Process (UP). I was just wondering if there are any existing group & individual appraisal metrics in use, so I don't have to reinvent the wheel... EDIT: This is by no means correct, but something like: Individual Contribution (IC) = time spent (individual) / time spent (total); Performance = ? (it should use individual contribution (IC) combined with something else to gain a measure of overall performance). Maybe I am talking complete hash, and I know it's generally really difficult to analyse performance with numbers, but are there any mathematicians out there who can lend a hand, or who know a somewhat more accurate method of analysing performance than arbitrary marking (e.g. 8 out of 10)?

    Read the article

  • Strange performance behaviour for 64 bit modulo operation

    - by codymanix
    The last three of these method calls take approximately double the time of the first four. The only difference is that their arguments no longer fit in an integer. But should this matter? The parameter is declared to be long, so it should use long for the calculation anyway. Does the modulo operation use another algorithm for numbers > max int? I am using an AMD Athlon 64 3200+, WinXP SP3 and VS2008.

        Stopwatch sw = new Stopwatch();
        TestLong(sw, int.MaxValue - 3l);
        TestLong(sw, int.MaxValue - 2l);
        TestLong(sw, int.MaxValue - 1l);
        TestLong(sw, int.MaxValue);
        TestLong(sw, int.MaxValue + 1l);
        TestLong(sw, int.MaxValue + 2l);
        TestLong(sw, int.MaxValue + 3l);
        Console.ReadLine();

        static void TestLong(Stopwatch sw, long num)
        {
            long n = 0;
            sw.Reset();
            sw.Start();
            for (long i = 3; i < 20000000; i++)
            {
                n += num % i;
            }
            sw.Stop();
            Console.WriteLine(sw.Elapsed);
        }

    EDIT: I have now tried the same with C, and the issue does not occur there; all modulo operations take the same time, in release and in debug mode, with and without optimizations turned on:

        #include "stdafx.h"
        #include "time.h"
        #include "limits.h"

        static void TestLong(long long num)
        {
            long long n = 0;
            clock_t t = clock();
            for (long long i = 3; i < 20000000LL * 100; i++)
            {
                n += num % i;
            }
            printf("%d - %lld\n", clock() - t, n);
        }

        int main()
        {
            printf("%i %i %i %i\n\n", sizeof(int), sizeof(long), sizeof(long long), sizeof(void*));
            TestLong(3);
            TestLong(10);
            TestLong(131);
            TestLong(INT_MAX - 1L);
            TestLong(UINT_MAX + 1LL);
            TestLong(INT_MAX + 1LL);
            TestLong(LLONG_MAX - 1LL);
            getchar();
            return 0;
        }

    EDIT2: Thanks for the great suggestions. I found that neither .NET nor C (in debug as well as in release mode) uses a single CPU instruction to calculate the remainder; both call a helper function instead. In the C program I could get its name, which is "_allrem". It also displayed full source comments for this file, so I found the information that this algorithm special-cases 32-bit divisors rather than dividends, which was the case in the .NET application. I also found out that the performance of the C program really is only affected by the value of the divisor, not the dividend. Another test showed that the performance of the remainder function in the .NET program depends on both the dividend and the divisor. BTW: even simple additions of long long values are calculated by consecutive add and adc instructions. So even if my processor calls itself 64-bit, it really isn't :(

    EDIT3: I have now run the C app on a Windows 7 x64 edition, compiled with Visual Studio 2010. The funny thing is, the performance behaviour stays the same, although now (I checked the assembly source) true 64-bit instructions are used.

    Read the article

  • C# performance varying due to memory

    - by user1107474
    Hope this is a valid post here; it's a combination of C# issues and hardware. I am benchmarking our server because we have found problems with the performance of our quant library (written in C#). I have simulated the same performance issues with some simple C# code performing very heavy memory usage. The code below is in a function which is spawned from a threadpool, up to a maximum of 32 threads (because our server has 4 CPUs x 8 cores each). This is all on .NET 3.5. The problem is that we are getting wildly differing performance. I run the function below 1000 times. The average time taken for the code to run could be, say, 3.5s, but the fastest will only be 1.2s and the slowest will be 7s -- for the exact same function! I have graphed the memory usage against the timings and there doesn't appear to be any correlation with the GC kicking in. One thing I did notice is that when running in a single thread the timings are identical and there is no wild deviation. I have also tested CPU-bound algorithms and the timings are identical too. This has made us wonder if the memory bus just cannot cope. I was wondering, could this be another .NET or C# problem, or is it something related to our hardware? Would the experience be the same if I had used C++, or Java? We are using 4x Intel X7550 with 32GB RAM. Is there any way around this problem in general?

        Stopwatch watch = new Stopwatch();
        watch.Start();
        List<byte> list1 = new List<byte>();
        List<byte> list2 = new List<byte>();
        List<byte> list3 = new List<byte>();
        int Size1 = 10000000;
        int Size2 = 2 * Size1;
        int Size3 = Size1;
        for (int i = 0; i < Size1; i++)
        {
            list1.Add(57);
        }
        for (int i = 0; i < Size2; i = i + 2)
        {
            list2.Add(56);
        }
        for (int i = 0; i < Size3; i++)
        {
            byte temp = list1.ElementAt(i);
            byte temp2 = list2.ElementAt(i);
            list3.Add(temp);
            list2[i] = temp;
            list1[i] = temp2;
        }
        watch.Stop();

    (The code is just meant to stress the memory.) I would include the threadpool code, but we used a non-standard threadpool library. EDIT: I have reduced Size1 to 100000, which basically doesn't use much memory, and I still get a lot of jitter. This suggests it's not the amount of memory being transferred, but the frequency of the memory grabs?

    Read the article

  • AS 400 Performance from .Net iSeries Provider

    - by Nathan
    Hey all, first off, I am not an AS400 guy - at all. So please forgive me for asking any noobish questions here. Basically, I am working on a .NET application that needs to access the AS400 for some real-time data. Although I have the system working, I am getting very different performance results between queries. Typically, when I make the first request against a SPROC on the AS400, it takes ~14 seconds to get the full data set. After that initial call, any subsequent calls usually only take ~1 second to return. This performance improvement remains for ~20 minutes or so before it takes 14 seconds again. The interesting part is that, if the stored procedure is executed directly in iSeries Navigator, it always returns within milliseconds (no change in response time). I wonder if it is a caching / execution-plan issue, but I can only apply my SQL Server logic to the AS400, which is not always a match. Any suggestions on what I can do to receive a more consistent response time, or simply insight as to why the AS400 is acting in this manner when I am using the iSeries Data Provider for .NET? Is there a better access method that I should use? Just in case, here's the code I am using to connect to the AS400:

        Dim Conn As New IBM.Data.DB2.iSeries.iDB2Connection(ConnectionString)
        Dim Cmd As New IBM.Data.DB2.iSeries.iDB2Command("SPROC_NAME_HERE", Conn)
        Cmd.CommandType = CommandType.StoredProcedure

        Using Conn
            Conn.Open()
            Dim Reader = Cmd.ExecuteReader()
            Using Reader
                While Reader.Read()
                    'Do Something
                End While
                Reader.Close()
            End Using
            Conn.Close()
        End Using

    Read the article

  • MySQL performance - 100Mb ethernet vs 1Gb ethernet

    - by Rob Penridge
    Hi all, I've just started a new job and noticed that the analysts' computers are connected to the network at 100Mbps. The ODBC queries we run against the MySQL server can easily return 500MB+, and it seems at times when the servers are under high load the DBAs kill low-priority jobs as they are taking too long to run. My question is this: how much of this server time is spent executing the request, and how much time is spent returning the data to the client? Could the query speeds be improved by upgrading the network connections to 1Gbps? (Updated with the why): The database in question was built to accommodate reporting needs and contains massive amounts of data. We usually work with subsets of this data at a granular level in external applications such as SAS or Excel, hence the reason for the large amounts of data being transmitted. The queries are not poorly structured - they are very simple, and the appropriate joins/indexes etc. are being used. I've removed 'query' from the title of the post as I realised this question is more to do with general MySQL performance than query-related performance. I was kind of hoping that someone with a gigabit connection might be able to actually quantify some results for me here by running a query that returns a decent amount of data, then limiting their connection speed to 100Mb and rerunning the same query. Hopefully this could be done in an environment where loads are reasonably stable so as not to skew the results. If ethernet speed can improve the situation, I want some quantifiable results to help argue my case for upgrading the network connections. Thanks, Rob

    Read the article

  • iPhone openGLES performance tuning

    - by genesys
    Hey there! I've been trying for quite a while now to optimize the framerate of my game without really making progress. I'm running on the newest iPhone SDK and have an iPhone 3G 3.1.2 device. I invoke around 150 draw calls, rendering about 1900 triangles in total (all objects are textured using two texture layers and multitexturing; most textures come from the same texture-atlas texture stored in PVRTC 2bpp compressed format). This renders on my phone at around 30 fps, which appears to me to be way too low for only 1900 triangles. I tried many things to optimize the performance, including batching the objects together, transforming the vertices on the CPU and rendering them in a single draw call. This yields 8 draw calls (as opposed to 150 draw calls), but performance is about the same (fps drop to around 26 fps). I'm using 32-byte vertices stored in an interleaved array (12 bytes position, 12 bytes normals, 8 bytes uv). I'm rendering triangle lists and the vertices are ordered in tri-strip order. I did some profiling, but I don't really know how to interpret it. Sampling with Instruments yields this result: http://neo.cycovery.com/instruments_sampling.gif telling me that a lot of time is spent in "mach_msg_trap". I googled for it and it seems this function is called in order to wait for some other things. But wait for what? Instruments with the OpenGL module yields this result: http://neo.cycovery.com/intstruments_openglES_debug.gif but here I really have no idea what those numbers are telling me. Profiling with Shark didn't tell me much either: http://neo.cycovery.com/shark_profile_release.gif The largest number is 10%, spent by DrawTriangles - and the whole rest is spent in very small-percentage functions. Can anyone tell me what else I could do in order to figure out the bottleneck, and help me interpret this profiling information? Thanks a lot!

    Read the article

  • Java performance problem with LinkedBlockingQueue

    - by lofthouses
    Hello, this is my first post on Stack Overflow... I hope someone can help me. I have a big performance regression with the Java 6 LinkedBlockingQueue. In the first thread I generate some objects which I push into the queue; in the second thread I pull these objects out. The performance regression occurs when the take() method of the LinkedBlockingQueue is called frequently. I monitored the whole program and the take() method claimed the most time overall, and the throughput goes from ~58MB/s to 0.9MB/s... The queue's put and take methods are called via static methods on this class:

        public class C_myMessageQueue {

            private static final LinkedBlockingQueue<C_myMessageObject> x_queue =
                    new LinkedBlockingQueue<C_myMessageObject>(50000);

            /**
             * @param message
             * @throws InterruptedException
             * @throws NullPointerException
             */
            public static void addMyMessage(C_myMessageObject message)
                    throws InterruptedException, NullPointerException {
                x_queue.put(message);
            }

            /**
             * @return the first message of the message queue
             * @throws InterruptedException
             */
            public static C_myMessageObject getMyMessage() throws InterruptedException {
                return x_queue.take();
            }
        }

    How can I tune the take() method to achieve at least 25MB/s, or is there another class I can use which will block when the "queue" is full or empty? Kind regards, Bart. P.S.: sorry for my bad English, I'm from Germany ;)
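    If the per-message cost of take() is the bottleneck, one standard mitigation is to consume in batches with drainTo(), so the consumer pays the queue's locking overhead once per batch instead of once per message. The sketch below is only my illustration with placeholder types and sizes, not code from the post; the lambda needs Java 8+, so on Java 6 an anonymous Runnable would be used instead.

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.LinkedBlockingQueue;

        public class BatchingConsumer {

            private static final int TOTAL = 100_000;
            private static final LinkedBlockingQueue<String> QUEUE = new LinkedBlockingQueue<String>(50000);

            public static void main(String[] args) throws InterruptedException {
                // Producer thread filling the queue with dummy messages.
                Thread producer = new Thread(() -> {
                    try {
                        for (int i = 0; i < TOTAL; i++) {
                            QUEUE.put("message-" + i);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                producer.start();

                List<String> batch = new ArrayList<>();
                int consumed = 0;
                while (consumed < TOTAL) {
                    batch.clear();
                    batch.add(QUEUE.take());      // block until at least one message is available
                    QUEUE.drainTo(batch, 1023);   // then grab up to 1023 more that are already queued
                    consumed += batch.size();     // ... process the whole batch here ...
                }
                producer.join();
                System.out.println("consumed " + consumed + " messages");
            }
        }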

    Read the article

  • Reporting System architecture for better performance

    - by pauloya
    Hi, we have a product that runs SQL Server Express 2005 and uses mainly ASP.NET. The database has around 200 tables, with a few (4 or 5) that can grow from 300 to 5000 rows per day and keep a history of 5 years, so they can grow to have 10 million rows. We have built a reporting platform that allows customers to build reports based on templates, fields and filters. We have faced performance problems almost since the beginning. We try to keep report display times under 10 seconds, but some of them go up to 25 seconds (especially for those customers with a long history). We keep checking indexes and trying to improve the queries, but we get the feeling that there's only so much we can do. Of course, the fact that the queries are generated dynamically doesn't help with the optimization. We also added a few tables that keep redundant data, but then we have the added problem of maintaining this data up to date, and SQL Express also has a limit on the size of databases. We are now at a point where we have to decide if we want to give up real-time reports, or maybe cut the history, to be able to get better performance. I would like to ask what the recommended approach is for this kind of system. Also, should we start looking at third-party tools/platforms? I know OLAP can be an option, but can we make it work on SQL Server Express, or at least with a license that is cheap enough to distribute to thousands of deployments? Thanks

    Read the article

  • Poor LLVM JIT performance

    - by Paul J. Lucas
    I have a legacy C++ application that constructs a tree of C++ objects. I want to use LLVM to call class constructors to create said tree. The generated LLVM code is fairly straightforward and looks like repeated sequences of:

        ; ...
        %11 = getelementptr [11 x i8*]* %Value_array1, i64 0, i64 1
        %12 = call i8* @T_string_M_new_A_2Pv(i8* %heap, i8* getelementptr inbounds ([10 x i8]* @0, i64 0, i64 0))
        %13 = call i8* @T_QueryLoc_M_new_A_2Pv4i(i8* %heap, i8* %12, i32 1, i32 1, i32 4, i32 5)
        %14 = call i8* @T_GlobalEnvironment_M_getItemFactory_A_Pv(i8* %heap)
        %15 = call i8* @T_xs_integer_M_new_A_Pvl(i8* %heap, i64 2)
        %16 = call i8* @T_ItemFactory_M_createInteger_A_3Pv(i8* %heap, i8* %14, i8* %15)
        %17 = call i8* @T_SingletonIterator_M_new_A_4Pv(i8* %heap, i8* %2, i8* %13, i8* %16)
        store i8* %17, i8** %11, align 8
        ; ...

    where each T_ function is a C "thunk" that calls some C++ constructor, e.g.:

        void* T_string_M_new_A_2Pv( void *v_value ) {
            string *const value = static_cast<string*>( v_value );
            return new string( value );
        }

    The thunks are necessary, of course, because LLVM knows nothing about C++. The T_ functions are added to the ExecutionEngine in use via ExecutionEngine::addGlobalMapping(). When this code is JIT'd, the performance of the JIT'ing itself is very poor. I've generated a call graph using kcachegrind. I don't understand all the numbers (and this PDF seems not to include commas where it should), but if you look at the left fork, the bottom two ovals: Schedule... is called 16K times and setHeightToAtLeas... is called 37K times. On the right fork, RAGreed... is called 35K times. Those are far too many calls to anything for what's mostly a simple sequence of call LLVM instructions. Something seems horribly wrong. Any ideas on how to improve the performance of the JIT'ing?

    Read the article
