Search Results

Search found 23545 results on 942 pages for 'parallel task library'.


add uchar values in ushort array with sse2 or sse3

- by pompolus

I have an unsigned short dst[16][16] matrix and a larger unsigned char src[m][n] matrix. I need to read a 16x16 submatrix from src and add it to dst, using SSE2 or SSE3. In an older implementation, I was sure that my summed values were never greater than 256, so I could do this:

    for (int row = 0; row < 16; ++row)
    {
        __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
        dst[row] = _mm_add_epi8(dst[row], subMat);
        src += W; // Step to next row i need to add
    }

where W is an offset to reach the desired rows. This code works, but now the values in src are larger and the sums could be greater than 256, so I need to store them as ushort. I've tried this:

    for (int row = 0; row < 16; ++row)
    {
        __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
        dst[row] = _mm_add_epi16(dst[row], subMat);
        src += W; // Step to next row i need to add
    }

but it doesn't work. I'm not so good with SSE, so any help will be appreciated.

Android: How to make launcher always open the main activity instead of child activity? (or otherwise

- by yuku

I have activities A and B. A is the one with the LAUNCHER intent-filter (i.e. the activity that is started when we click the app icon on the home screen). A launches B using startActivity(new Intent(A.this, B.class)). When the user has the B activity open, puts my application into the background, and my application's process is later killed, B is opened instead of A the next time the user starts my application. This causes a force close in my app, because A is the activity that initializes the resources my app needs, and when B tries to access the uninitialized resources, it crashes. Do you have any suggestions about what I should do in this situation?

How can i connect two or more machines via tcp cable to form a network grid?

- by Gath

How can I connect two or more machines to form a network grid, and how can I distribute the workload across the machines? What operating systems do I need to run on the machines, and what application should I use to manage the load balancing? NB: I read somewhere that Google uses cheap machines to perform this feat. How do they connect two network cards ('teaming') and distribute load across the machines? Good practical examples with actual code samples would help me most. Pointers to good sites where I can read about this will be highly appreciated.

Executing functions parallelly in PHP

- by binaryLV

Hi! Can PHP call a function and not wait for it to return? So something like this:

    function callback($pause, $arg) {
        sleep($pause);
        echo $arg, "\n";
    }

    header('Content-Type: text/plain');
    fast_call_user_func_array('callback', array(3, 'three'));
    fast_call_user_func_array('callback', array(2, 'two'));
    fast_call_user_func_array('callback', array(1, 'one'));

would output

    one (after 1 second)
    two (after 2 seconds)
    three (after 3 seconds)

rather than

    three (after 3 seconds)
    two (after 3 + 2 = 5 seconds)
    one (after 3 + 2 + 1 = 6 seconds)

The main script is intended to run as a permanent process (TCP server). The callback() function would receive data from the client, execute an external PHP script and then do something based on other arguments that are passed to callback(). The problem is that the main script must not wait for the external PHP script to finish. The result of the external script is important, so exec('php -f file.php &') is not an option.

Why are Asynchronous processes not called Synchronous?

- by Balk

So I'm a little confused by this terminology. Everyone refers to "Asynchronous" computing as running different processes on separate threads, which gives the illusion that these processes are running at the same time. This is not the definition of the word asynchronous. a·syn·chro·nous –adjective 1. not occurring at the same time. 2. (of a computer or other electrical machine) having each operation started only after the preceding operation is completed. What am I not understanding here?

How to copy a folder recursively with out overwriting the previous one

I have linked my project with CruiseControl, so whenever a build happens I want to copy the bin folder to a separate destination folder named with a version number. That is, when the project is built a second time, I don't want to replace the bin folder of the first build; I want to save the new one under another version number. How can I do that? Right now I have worked out how to copy the folder, but it overwrites the previous one. I don't want that to happen, so please help me implement this kind of versioning.

Setting the cores to use in Parallelism

- by Ben

Hi, I have a feeling the answer to this is no, but using .NET 4.0's parallelism support, can you set the number of cores on which to run? I.e. if you're running a quad core, can you set your application to use only 2 of them? Thanks

MPI Odd/Even Compare-Split Deadlock

- by erebel55

I'm trying to write an MPI version of a program that runs an odd/even compare-split operation on n randomly generated elements. Process 0 should generate the elements and send nlocal of them to each of the other processes (keeping the first nlocal for itself). From here, process 0 should print out its results after running the CompareSplit algorithm. Then it should receive the results from the other processes' runs of the algorithm. Finally, it should print out the results that it has just received. I have a large chunk of this already done, but I'm getting a deadlock that I can't seem to fix. I would greatly appreciate any hints that people could give me. Here is my code: http://pastie.org/3742474 Right now I'm pretty sure that the deadlock is coming from the Send/Recv at lines 134 and 151. I've tried changing the Send to use "tag" instead of myrank for the tag parameter, but when I did that I just kept getting an "MPI_ERR_TAG: invalid tag" error for some reason. Obviously I would also run the algorithm within process 0, but I took that part out for now, until I figure out what is going wrong. Any help is appreciated.

Can a Perl subroutine return data but keep processing?

- by Perl QuestionAsker

Is there any way to have a subroutine send data back while still processing? For instance (this example used simply to illustrate) - a subroutine reads a file. While it is reading through the file, if some condition is met, then "return" that line and keep processing. I know there are those that will answer - why would you want to do that? and why don't you just ...?, but I really would like to know if this is possible. Thank you so much in advance.

Home based business would like customers to schedule via website the time, day and date they want to take a class.

- by Alessandro Machi

I'm using Google Blogger. I want to add thumbnail images of the different classes I will be offering in my home film/video/sound/lighting studio. The idea is that the prospective student visits my website, sees a class they want to take, and clicks the thumbnail to first read a descriptive article about the class, at which point they can schedule the class for the time, day, and date of their choosing between the hours of 5am and 9pm, 365 days a year. As soon as the student has entered the time, day and date of the class they want, they would go to a checkout page to purchase the class time. The student would then be sent an email confirmation along with the exact location, the class name, and the time and date they selected. I was thinking of using Dwolla for the checkout page because Dwolla offers either no fee or 25 cents per payment transaction, but I'm not sure I can hook up to them easily enough. My blog site is not finished by a long shot. I still have to input all of the class thumbnail images along with descriptions, but if you need to see what the page looks like, the web address is http://www.myalexlogic.com Google Blogger allows third-party code to be added within movable gadgets.

How to detect invalid user input in a Batch File?

- by user2975367

I want to use a batch file to ask for a password to continue. I have very simple code that works:

    @echo off
    :Begin
    cls
    echo.
    echo Enter Password
    set /p pass=
    if %pass%==Password goto Start
    :Start
    cls
    echo What would you like me to do? (Date/Chrome/Lock/Shutdown/Close)
    set /p task=
    if %task%==Date goto Task=Date
    if %task%==Chrome goto Task=Chrome
    if %task%==Lock goto Task=Lock
    if %task%==Shutdown goto Task=Shutdown
    if %task%==Close goto Task=Close

I need to detect when the user has entered an invalid password. I have spent an hour researching but found nothing. I'm not advanced in any way, so please try to keep it very simple, like the code above. Please help me.

How to execute unknown functions from dynamic load libraries?

- by activenightly

It's easy to load functions from dynamic libraries when you know the function at design time. Just do something like this:

    int (*fn)(int);
    l0 = dlopen("./libfoo.so", RTLD_LAZY);
    if (!l0) {
        fprintf(stderr, "l0 %s\n", dlerror());
        return 1;
    }
    fn = (int (*)(int))dlsym(l0, "foo");
    if ((error = dlerror()) != NULL) {
        fprintf(stderr, "fn:%s\n", error);
        return 1;
    }
    x = (*fn)(y);
    ...

How do you execute a library function when it's unknown at design time? At runtime you have a function name, an array of argument pointers and an array of argument sizes:

    char* fn_name="foo";
    int foo_argc;
    void* foo_argv[];
    int foo_argv_size[];

In a scripting language this is a piece of cake, but how do you implement it nicely in C++?

How to add a child node to an xml file given an xpath locating the parent node?

- by Nam Gi VU

I have an xml file and need to add nodes to it using MSBuild. Please help.

Simple C++ container class that is thread-safe for writing

- by conradlee

I am writing a multi-threaded program using OpenMP in C++. At one point my program forks into many threads, each of which needs to add "jobs" to some container that keeps track of all added jobs. Each job can just be a pointer to some object. Basically, I just need to add pointers to some container from several threads at the same time. Is there a simple solution that performs well? After some googling, I found that STL containers are not thread-safe. Some Stack Overflow threads address this question, but none forms a consensus on a simple solution.

Can I use MPI_Probe to probe messages sent by any collective operation?

- by takwing

In my code I have a server process repeatedly probing for incoming messages, which come in two types. One type of the two will be sent once by each process to give hint to the server process about its termination. I was wondering if it is valid to use MPI_Broadcast to broadcast these termination messages and use MPI_Probe to probe their arrivals. I tried using this combination but it failed. This failure might have been caused by some other things. So I would like anyone who knows about this to confirm. Cheers.

Take advantage of multiple cores executing SQL statements

- by willvv

I have a small application that reads XML files and inserts the information into a SQL DB. There are ~300,000 files to import, each one with ~1000 records. I started the application on 20% of the files and it has been running for 18 hours now; I hope I can improve this time for the rest of the files. I'm not using a multi-threaded approach, but since the computer I'm running the process on has 4 cores, I was thinking of doing so to improve performance (although I guess the main problem is the I/O and not only the processing). I was thinking of using the BeginExecuteNonQuery() method on the SqlCommand object I create for each insertion, but I don't know if I should limit the maximum number of simultaneous threads (nor do I know how to do it). What's your advice to get the best CPU utilization? Thanks

how to do thread in matlab?

- by hai

How do I use threads in MATLAB? I want to run one function on two variables simultaneously. How can I do it?

How to specify the equivalent of ppn in SGE queuing system?

- by debugger

Hello all, is there a way to specify the ppn (or equivalent) in SGE? I don't want to use all CPUs in one node, so that I have more memory per core. (In PBS you would do -l nodes=16:ppn=2, for example.) Thanks.

[C++] Needed: A simple C++ container (stack, linked list) that is thread-safe for writing

- by conradlee

I am writing a multi-threaded program using OpenMP in C++. At one point my program forks into many threads, each of which needs to add "jobs" to some container that keeps track of all added jobs. Each job can just be a pointer to some object. Basically, I just need to add pointers to some container from several threads at the same time. Is there a simple solution that performs well? After some googling, I found that STL containers are not thread-safe. Some Stack Overflow threads address this question, but none forms a consensus on a simple solution.

Not really a question...but i need help

- by Dan F.

I have to write a procedure in Oracle PL/SQL. I have to verify that the time interval between start_date and end_date of a new row that I create does not intersect the intervals between start_date and end_date of the other rows. I need to check each existing row for that condition, and if a row violates it, the loop should stop and a message such as "The interval of time given is not correct" should be displayed. I don't know how to write loops in Oracle PL/SQL and I would appreciate your help.

Parallelism in .NET – Part 5, Partitioning of Work

- by Reed

When parallelizing any routine, we start by decomposing the problem. Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element. This process is called partitioning. Partitioning our tasks is a challenging feat. There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle. Trying to work the perfect balance between the two extremes is the goal for which we should aim. Luckily, the Task Parallel Library automatically handles much of this process. However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions. First off, I’d like to say that this is a more advanced topic. It is perfectly acceptable to use the parallel constructs in the framework without considering the partitioning taking place. The default behavior in the Task Parallel Library is very well-behaved, even for unusual work loads, and should rarely be adjusted. I have found few situations where the default partitioning behavior in the TPL is not as good or better than my own hand-written partitioning routines, and recommend using the defaults unless there is a strong, measured, and profiled reason to avoid using them. However, understanding partitioning, and how the TPL partitions your data, helps in understanding the proper usage of the TPL. I indirectly mentioned partitioning while discussing aggregation. Typically, our systems will have a limited number of Processing Elements (PE), which is the terminology used for hardware capable of processing a stream of instructions. For example, in a standard Intel i7 system, there are four processor cores, each of which has two potential hardware threads due to Hyperthreading. This gives us a total of 8 PEs – theoretically, we can have up to eight operations occurring concurrently within our system. 
In order to fully exploit this power, we need to partition our work into Tasks. A task is a simple set of instructions that can be run on a PE. Ideally, we want to have at least one task per PE in the system, since fewer tasks means that some of our processing power will be sitting idle. A naive implementation would be to just take our data, and partition it with one element in our collection being treated as one task. When we loop through our collection in parallel, using this approach, we’d just process one item at a time, then reuse that thread to process the next, etc. There’s a flaw in this approach, however. It will tend to be slower than necessary, often slower than processing the data serially. The problem is that there is overhead associated with each task. When we take a simple foreach loop body and implement it using the TPL, we add overhead. First, we change the body from a simple statement to a delegate, which must be invoked. In order to invoke the delegate on a separate thread, the delegate gets added to the ThreadPool’s current work queue, and the ThreadPool must pull this off the queue, assign it to a free thread, then execute it. If our collection had one million elements, the overhead of trying to spawn one million tasks would destroy our performance. The answer, here, is to partition our collection into groups, and have each group of elements treated as a single task. By adding a partitioning step, we can break our total work into small enough tasks to keep our processors busy, but large enough tasks to avoid overburdening the ThreadPool. There are two clear, opposing goals here: Always try to keep each processor working, but also try to keep the individual partitions as large as possible. When using Parallel.For, the partitioning is always handled automatically. At first, partitioning here seems simple. 
A naive implementation would merely split the total element count up by the number of PEs in the system, and assign a chunk of data to each processor. Many hand-written partitioning schemes work in exactly this manner. This perfectly balanced, static partitioning scheme works very well if the amount of work is constant for each element. However, this is rarely the case. Often, the length of time required to process an element grows as we progress through the collection, especially if we’re doing numerical computations. In this case, the first PEs will finish early, and sit idle waiting on the last chunks to finish. Sometimes, work can decrease as we progress, since previous computations may be used to speed up later computations. In this situation, the first chunks will be working far longer than the last chunks. In order to balance the workload, many implementations create many small chunks, and reuse threads. This adds overhead, but does provide better load balancing, which in turn improves performance. The Task Parallel Library handles this more elaborately. Chunks are determined at runtime, and start small. They grow slowly over time, getting larger and larger. This tends to lead to a near optimum load balancing, even in odd cases such as increasing or decreasing workloads.

Parallel.ForEach is a bit more complicated, however. When working with a generic IEnumerable<T>, the number of items required for processing is not known in advance, and must be discovered at runtime. In addition, since we don’t have direct access to each element, the scheduler must enumerate the collection to process it. Since IEnumerable<T> is not thread safe, it must lock on elements as it enumerates, create temporary collections for each chunk to process, and schedule this out. By default, it uses a partitioning method similar to the one described above.
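The trade-off between static blocks and small, growing chunks can be illustrated with a toy simulation. This is a minimal sketch in Python, not the TPL's actual algorithm: the growth rule (double the chunk size every three chunks, capped at `max_size`), the per-chunk overhead, and the cost model (work grows linearly with the index) are all assumptions chosen for illustration.

```python
import heapq

def static_makespan(costs, workers):
    """One contiguous block per worker; the slowest block sets the finish time."""
    n = len(costs)
    size = -(-n // workers)  # ceiling division
    return max(sum(costs[i:i + size]) for i in range(0, n, size))

def growing_chunks(n, start=1, grow_every=3, factor=2, max_size=16):
    """Yield (begin, end) index ranges whose size starts small and grows up to a cap."""
    begin, size, issued = 0, start, 0
    while begin < n:
        end = min(begin + size, n)
        yield begin, end
        begin = end
        issued += 1
        if issued % grow_every == 0:
            size = min(size * factor, max_size)

def dynamic_makespan(costs, workers, per_chunk_overhead=0.0):
    """Whichever worker is free first takes the next chunk; return the last finish time."""
    free_at = [0.0] * workers  # min-heap of the times at which each worker becomes free
    heapq.heapify(free_at)
    for begin, end in growing_chunks(len(costs)):
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + per_chunk_overhead + sum(costs[begin:end]))
    return max(free_at)

# Hypothetical workload: element i costs i time units, so later elements are slower.
costs = [float(i) for i in range(1, 1001)]
static = static_makespan(costs, workers=4)
dynamic = dynamic_makespan(costs, workers=4, per_chunk_overhead=5.0)
print(f"static blocks: {static:.0f}, growing chunks: {dynamic:.0f}")
```

With this increasing workload the static split is dominated by the last block, while the chunked scheme finishes closer to the ideal of one quarter of the total work, even though it pays an overhead for every chunk.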
We can see this directly by looking at the Visual Partitioning sample shipped by the Task Parallel Library team, and available as part of the Samples for Parallel Programming. When we run the sample with four cores and the default Load Balancing partitioning scheme, the output is a picture made up of colored bands (figure not reproduced here), where each color represents a processing core. You can see that, when we start (at the top), we begin with very small bands of color. As the routine progresses through the Parallel.ForEach, the chunks get larger and larger (seen by larger and larger stripes). Most of the time, this is fantastic behavior, and most likely will outperform any custom-written partitioning. However, if your routine is not scaling well, it may be due to a failure in the default partitioning to handle your specific case.

With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner. There is the option to use an overload of Parallel.ForEach which takes a Partitioner<T> instance. The Partitioner<T> class is an abstract class which allows for both static and dynamic partitioning. By overriding Partitioner<T>.SupportsDynamicPartitions, you can specify whether a dynamic approach is available. If not, your custom Partitioner<T> subclass would override GetPartitions(int), which returns a list of IEnumerator<T> instances. These are then used by the Parallel class to split work up amongst processors. When dynamic partitioning is available, GetDynamicPartitions() is used, which returns an IEnumerable<T>; each enumerator obtained from it serves as one partition. If you do decide to implement your own Partitioner<T>, keep in mind the goals and tradeoffs of different partitioning strategies, and design appropriately. The Samples for Parallel Programming project includes a ChunkPartitioner class in the ParallelExtensionsExtras project. This provides example code for implementing your own custom allocation strategies, including a static allocator of a given chunk size.
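To make the idea of a fixed-chunk-size allocator concrete, here is a minimal sketch in Python rather than C#. It is not the ChunkPartitioner from the samples; `ChunkedSource` and `process_in_chunks` are hypothetical names invented for this illustration. It shows the two ingredients described above: enumeration of a non-thread-safe source happens under a lock, and each worker receives a temporary list (the chunk) that it processes without holding the lock.

```python
import threading
from itertools import islice

class ChunkedSource:
    """Hands out fixed-size chunks of an iterable to multiple worker threads."""

    def __init__(self, iterable, chunk_size):
        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")
        self._it = iter(iterable)
        self._chunk_size = chunk_size
        self._lock = threading.Lock()  # plain iterators are not thread-safe

    def next_chunk(self):
        """Return the next chunk as a list, or an empty list when exhausted."""
        with self._lock:
            return list(islice(self._it, self._chunk_size))

def process_in_chunks(iterable, chunk_size, workers, func):
    """Apply func to every element, with each thread pulling one chunk at a time."""
    source = ChunkedSource(iterable, chunk_size)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            chunk = source.next_chunk()
            if not chunk:
                return
            local = [func(x) for x in chunk]  # the work itself runs outside any lock
            with results_lock:
                results.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results  # chunk order depends on thread scheduling

squares = process_in_chunks(range(10), chunk_size=3, workers=4, func=lambda x: x * x)
print(sorted(squares))
```

The sketch is about the structure of the partitioning only. In CPython, threads running pure Python code do not execute in parallel, so it demonstrates how chunks are handed out, not a speedup.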
Although implementing your own Partitioner<T> is possible, as I mentioned above, this is rarely required or useful in practice. The default behavior of the TPL is very good, often better than any hand-written partitioning strategy.

How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blog post about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific to a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'.

I've been in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds.
Still worth it if you're a framework developer, but it's not much compared to the time spent inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, well-performing application failed.

Performance tuning basics and rules

Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step or make assumptions: these steps help you find the reason for a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or assuming things will be bad/slow without doing analysis, will lead to the path of premature optimization and won't actually solve your problems, only create new ones.

The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast.
With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem.

The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution.

One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software as possible, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor.

Let's look at the four steps to see how they guide us through the quest to find and fix performance problems.

Isolate

The first thing you need to do is to isolate the areas in your application which are assumed to be slow.
For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.

Analyze

Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eats up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating part is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. Check what the code does along the way and verify whether it's correct. Analyze whether you have implemented the right algorithms in your code for this particular area.
Remember we're looking at one area at a time, which means we're ignoring all other code paths and following just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing, which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and an overview of what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and, for example, how many times they are called. There are two different kinds of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers.

.NET profiling

.NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, and how many times that code was called by other code, and thus reveal where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them.
Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area.

O/R mapper and RDBMS profiling

.NET profilers give you a good insight into the .NET side of things, but not into the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand how much each part of the O/R mapper and database contributes to the total time taken for task T, we need different tools. There are two kinds of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by Hibernating Rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating Rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf), which work the same way as LLBLGen Prof. For RDBMS profilers, you have to check whether the RDBMS vendor has a profiler. For example, for SQL Server the profiler is shipped with SQL Server, and for Oracle it's built into the RDBMS; however, there are also 3rd party tools.
Which tool you use isn't really important. What's important is that you get insight into which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage: they measure the time it took to execute the query from the application's perspective, so they also include the time it took to transport data across the network. This is important because a query which returns a massive resultset, or a resultset with large blob/clob/ntext/image fields, takes more time to get transported across the network than a small resultset, and a database profiler doesn't take this into account most of the time.

Another, more low-level tool to use in this case is tracing. Not all O/R mappers support it (LLBLGen Pro and NHibernate do), but most offer some form of tracing or logging system which you can use to collect the SQL generated and executed, and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight into what's going on.

Interpret

After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid, which parts actually participate and whether the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation.

All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e.
don't contribute significantly to the total time taken) can be ignored from now on; we won't look at them. The parts which contribute a lot to the total time taken are the ones to look at. We first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example, if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task depends on the time it takes to fetch the data from the database. If just 1 query was executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data.

Interpreting means that you follow the path from beginning to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spent inside the database and almost no time is spent in the .NET code, meaning the RDBMS part contributes the most to the total time taken; the rest is, compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide whether further action in this area is necessary, based on what the analysis results show: if the analysis results were unexpected and there is room for improvement in the area which contributes the most to the total time taken, action should be taken. If not, you can only accept the situation and move on.

In all cases, document your decision together with the analysis you've done.
If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that when someone else looks at the application in the future and starts asking questions, you can answer them properly, and that new analysis is only necessary if the situation has changed.

Fix

After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you actively correct the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (trading memory consumption for performance) to avoid unnecessarily re-querying data and re-consuming the results.

After applying a change, it's key that you redo the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether they moved the problem to a different part of the application. Don't fall into the trap of doing a partial analysis: do the full analysis again, both .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all.

Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to leave the strict-OO path and execute queries directly on the RDBMS. These are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers, or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with.
What is important is that you document why you made a choice or compromise: which analysis data and which interpretation led you to the choice made. This is key for good maintainability in the years to come.

Most common performance problems with O/R mappers

Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step.

SELECT N+1 (lazy-loading specific). Lazy loading triggers performance bottlenecks. Consider a list of Orders bound to a grid. You have a field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) the Customer row for each row. This means that for the single list you'll get not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this, use eager loading with a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem.

Prefetch paths using many path nodes, sorting or limiting (eager-loading problem). Prefetch paths can help with performance, but as 1 query is executed per node, the amount of data fetched in a child node can be bigger than you think. Also consider that the data in every node is merged on the client with the parent. This is fast, but it can still take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries.

Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types.
With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to make.

Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the number of rows returned), which is often key when processing large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure to switch on server-side paging on the datasourcecontrol used. In this case, paging on the grid alone is not enough: it can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep in mind that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have been (e.g. due to a join): the datareader will then do DISTINCT filtering on the client. This is a little slower, but paging is still performed on the datareader, so it won't fetch all rows even if the query suggests it does.

Fetching massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spent on data transport across the network. Use this optimization if you see a big difference between the query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call.

Doing client-side aggregates/scalar calculations by consuming a lot of data.
If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all into memory and then traverse the data in-memory to calculate a value.

Using .ToList() constructs inside linq queries. It might be that you use .ToList() somewhere in a Linq query, which makes the query run partially in-memory. Example:

var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c;

This will actually fetch all customers into memory and filter them in-memory, as the linq query is defined on an IEnumerable<T> and not on an IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run.

Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kinds of operations are likely not supported, because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache.

Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression than to fetch the entities into memory first, update them in a loop and save them afterwards.
It might however be a compromise you don't want to make, as it works around the idea of having an object graph in memory which is manipulated, and instead makes the code fully aware there's an RDBMS somewhere.

Conclusion

Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key to success, as is knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.
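The SELECT N+1 item in the list above can be made concrete with a small query-counting sketch. It is not LLBLGen Pro code: it uses Python's built-in sqlite3 module so it runs anywhere, and the table names, column names and function names (make_db, lazy_grid, prefetch_grid, count_selects) are all hypothetical. The "prefetch" variant imitates the one-query-per-node idea with a plain IN list and a client-side merge, which is an assumption about how such a fetch could look, not a description of what any particular O/R mapper generates.

```python
import sqlite3


def make_db(order_count=5):
    """Build an in-memory database with one customer per order (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, company_name TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
    for i in range(1, order_count + 1):
        conn.execute("INSERT INTO customer VALUES (?, ?)", (i, f"Company {i}"))
        conn.execute("INSERT INTO orders VALUES (?, ?)", (i, i))
    conn.commit()
    return conn


def count_selects(conn, work):
    """Run work(conn) and return (result, number of SELECT statements executed)."""
    seen = []
    conn.set_trace_callback(seen.append)  # records every SQL statement that runs
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in seen if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


def lazy_grid(conn):
    """SELECT N+1: one query for the orders, then one extra query per order row."""
    rows = []
    orders = conn.execute("SELECT id, customer_id FROM orders ORDER BY id").fetchall()
    for order_id, customer_id in orders:
        name = conn.execute(
            "SELECT company_name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        rows.append((order_id, name))
    return rows


def prefetch_grid(conn):
    """Eager loading, prefetch style: one query per node, merged on the client."""
    orders = conn.execute("SELECT id, customer_id FROM orders ORDER BY id").fetchall()
    ids = sorted({customer_id for _, customer_id in orders})
    placeholders = ",".join("?" * len(ids))
    customers = dict(
        conn.execute(
            f"SELECT id, company_name FROM customer WHERE id IN ({placeholders})", ids
        ).fetchall()
    )
    return [(order_id, customers[customer_id]) for order_id, customer_id in orders]


if __name__ == "__main__":
    db = make_db(5)
    _, lazy_queries = count_selects(db, lazy_grid)
    _, eager_queries = count_selects(db, prefetch_grid)
    print("lazy:", lazy_queries, "SELECTs; prefetch:", eager_queries, "SELECTs")
```

Both functions return the same rows, but the lazy version's query count grows with the number of orders while the prefetch-style version stays at two. That growth is the pattern to look for in an O/R mapper or RDBMS profiler trace.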

Read the article
Error Installing COM+ (Error Code: 80131501)

- by Regina Foo

I've written a class library that reads from an XML file and returns the result as a string. But when I try to install it as a COM+ component, an error occurs (Error Code: 80131501). I checked the event log and the details of the error are:

Installation of 'C:\Users\User\Documents\Visual Studio 2005\Projects\InteropSOA\InteropSOA\bin\Debug\InteropSOA.dll' into '{28E82165-AD74-4E16-90C9-0C5CE7DA97AA}' failed with an exception: System.EnterpriseServices.RegistrationException: FATAL: Could not find component 'InteropSOA.ConfigReader' we just installed.
at System.EnterpriseServices.RegistrationDriver.InstallAssembly(RegistrationConfig regConfig, Object obSync)
at System.EnterpriseServices.RegistrationHelper.InstallAssemblyFromConfig(RegistrationConfig& regConfig)
at System.EnterpriseServices.RegistrationHelper.InstallAssembly(String assembly, String& application, String partition, String& tlb, InstallationFlags installFlags)
at System.EnterpriseServices.Internal.ComManagedImportUtil.InstallAssembly(String asmpath, String parname, String appname)

Below are the steps I took while developing the class library:

Added "System.EnterpriseServices" to References.
Imported the reference into the class.
Declared the class as "ServicedComponent".
Set project properties ("Make assembly COM-visible" checked, "Register for COM Interop" checked, signed the assembly with a strong key file name).

Here is my code:

using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.EnterpriseServices;

namespace InteropSOA
{
    public class ConfigReader : ServicedComponent
    {
        // xml file name
        private string strFileName;
        // type of request
        private string strRequest = "";
        // response string
        private string strResponse = "";

        // declarations for XPath
        private XPathDocument doc;
        private XPathNavigator nav;
        private XPathExpression expr;
        private XPathNodeIterator iterator;
        private XmlTextReader reader;
        private XmlDocument xmlDoc;

        public ConfigReader(string strFile, string request)
        {
            this.strFileName = strFile;
            this.strRequest = request;
        }

        public ConfigReader()
        {
            // default constructor
        }

        // reader for console program
        public void ReadXML()
        {
            doc = new XPathDocument(strFileName);
            nav = doc.CreateNavigator();
            // compile XPath expression
            expr = nav.Compile("/Msg/" + strRequest + "/*");
            iterator = nav.Select(expr);
            // iterate over the node set
            try
            {
                while (iterator.MoveNext())
                {
                    XPathNavigator nav2 = iterator.Current.Clone();
                    strResponse += nav2.Value + "|";
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            strResponse = strResponse.Substring(0, strResponse.Length-1);
            Console.WriteLine("Response string = " + strResponse);
        }

        public void WriteXML(string strRequest, string strElement, string strValue)
        {
            reader = new XmlTextReader(strFileName);
            xmlDoc = new XmlDocument();
            xmlDoc.Load(reader);
            reader.Close();
            XmlNode node;
            XmlElement root = xmlDoc.DocumentElement;
            node = root.SelectSingleNode("/Msg/" + strRequest + "/" + strElement);
            node.InnerText = strValue;
            xmlDoc.Save(strFileName);
        }

        // reader for ASP.NET
        public string ReadXMLElement()
        {
            doc = new XPathDocument(strFileName);
            nav = doc.CreateNavigator();
            // compile XPath expression
            expr = nav.Compile("/Msg/" + strRequest + "/*");
            iterator = nav.Select(expr);
            // iterate over the node set
            try
            {
                while (iterator.MoveNext())
                {
                    XPathNavigator nav2 = iterator.Current.Clone();
                    strResponse += nav2.Value + "|";
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            strResponse = strResponse.Substring(0, strResponse.Length - 1);
            return strResponse;
        }
    }
}

Read the article
iPhone Mobile Safari, How many max parallel http connections?

- by user316994

I would like to use parallel AJAX HTTP requests with iPhone Mobile Safari (OS4). What is the max number of parallel connections?

Read the article
Can I use a static library compiled with gcc 3.4.2 with gcc 4.2.2?

- by shergill

I have a static library that is compiled with gcc 3.4.2. I am building a shared library that relies on this static lib. I will be building this shared library (.so) with gcc 4.2.2. I was wondering what are the potential pitfalls of using the 3.4.2 static library in a gcc 4.2.2 shared library?

Read the article


