Search Results

Search found 562 results on 23 pages for 'profiling'.

Page 19/23 | < Previous Page | 15 16 17 18 19 20 21 22 23  | Next Page >

  • Unexpected performance curve from CPython merge sort

    - by vkazanov
    I have implemented a naive merge sorting algorithm in Python. Algorithm and test code is below: import time import random import matplotlib.pyplot as plt import math from collections import deque def sort(unsorted): if len(unsorted) <= 1: return unsorted to_merge = deque(deque([elem]) for elem in unsorted) while len(to_merge) > 1: left = to_merge.popleft() right = to_merge.popleft() to_merge.append(merge(left, right)) return to_merge.pop() def merge(left, right): result = deque() while left or right: if left and right: elem = left.popleft() if left[0] > right[0] else right.popleft() elif not left and right: elem = right.popleft() elif not right and left: elem = left.popleft() result.append(elem) return result LOOP_COUNT = 100 START_N = 1 END_N = 1000 def test(fun, test_data): start = time.clock() for _ in xrange(LOOP_COUNT): fun(test_data) return time.clock() - start def run_test(): timings, elem_nums = [], [] test_data = random.sample(xrange(100000), END_N) for i in xrange(START_N, END_N): loop_test_data = test_data[:i] elapsed = test(sort, loop_test_data) timings.append(elapsed) elem_nums.append(len(loop_test_data)) print "%f s --- %d elems" % (elapsed, len(loop_test_data)) plt.plot(elem_nums, timings) plt.show() run_test() As much as I can see everything is OK and I should get a nice N*logN curve as a result. But the picture differs a bit: Things I've tried to investigate the issue: PyPy. The curve is ok. Disabled the GC using the gc module. Wrong guess. Debug output showed that it doesn't even run until the end of the test. Memory profiling using meliae - nothing special or suspicious. ` I had another implementation (a recursive one using the same merge function), it acts the similar way. The more full test cycles I create - the more "jumps" there are in the curve. So how can this behaviour be explained and - hopefully - fixed? UPD: changed lists to collections.deque UPD2: added the full test code UPD3: I use Python 2.7.1 on a Ubuntu 11.04 OS, using a quad-core 2Hz notebook. I tried to turn of most of all other processes: the number of spikes went down but at least one of them was still there.

    Read the article

  • Why is setting HTML5's CanvasPixelArray values is ridiculously slow and how can I do it faster?

    - by Nixuz
    I am trying to do some dynamic visual effects using the HTML 5 canvas' pixel manipulation, but I am running into a problem where setting pixels in the CanvasPixelArray is ridiculously slow. For example if I have code like: imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ imageData.data[index] = buffer[i]; imageData.data[index + 1] = buffer[i]; imageData.data[index + 2] = buffer[i]; } ctx.putImageData(imageData, 0, 0); Profiling with Chrome reveals, it runs 44% slower than the following code where CanvasPixelArray is not used. tempArray = new Array(500 * 500 * 4); imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ tempArray[index] = buffer[i]; tempArray[index + 1] = buffer[i]; tempArray[index + 2] = buffer[i]; } ctx.putImageData(imageData, 0, 0); My guess is that the reason for this slowdown is due to the conversion between the Javascript doubles and the internal unsigned 8bit integers, used by the CanvasPixelArray. Is this guess correct? Is there anyway to reduce the time spent setting values in the CanvasPixelArray?

    Read the article

  • How can I apply a PSSM efficiently?

    - by flies
    I am fitting for position specific scoring matrices (PSSM aka Position Specific Weight Matrices). The fit I'm using is like simulated annealing, where I the perturb the PSSM, compare the prediction to experiment and accept the change if it improves agreement. This means I apply the PSSM millions of times per fit; performance is critical. In my particular problem, I'm applying a PSSM for an object of length L (~8 bp) at every position of a DNA sequence of length M (~30 bp) (so there are M-L+1 valid positions). I need an efficient algorithm to apply a PSSM. Can anyone help improve performance? My best idea is to convert the DNA into some kind of a matrix so that applying the PSSM is matrix multiplication. There are efficient linear algebra libraries out there (e.g. BLAS), but I'm not sure how best to turn an M-length DNA sequence into a matrix M x 4 matrix and then apply the PSSM at each position. The solution needs to work for higher order/dinucleotide terms in the PSSM - presumably this means representing the sequence-matrix for mono-nucleotides and separately for dinucleotides. My current solution iterates over each position m, then over each letter in word from m to m+L-1, adding the corresponding term in the matrix. I'm storing the matrix as a multi-dimensional STL vector, and profiling has revealed that a lot of the computation time is just accessing the elements of the PSSM (with similar performance bottlenecks accessing the DNA sequence). If someone has an idea besides matrix multiplication, I'm all ears.

    Read the article

  • ConcurrentLinkedQueue$Node remains in heap after remove()

    - by action8
    I have a multithreaded app writing and reading a ConcurrentLinkedQueue, which is conceptually used to back entries in a list/table. I originally used a ConcurrentHashMap for this, which worked well. A new requirement required tracking the order entries came in, so they could be removed in oldest first order, depending on some conditions. ConcurrentLinkedQueue appeared to be a good choice, and functionally it works well. A configurable amount of entries are held in memory, and when a new entry is offered when the limit is reached, the queue is searched in oldest-first order for one that can be removed. Certain entries are not to be removed by the system and wait for client interaction. What appears to be happening is I have an entry at the front of the queue that occurred, say 100K entries ago. The queue appears to have the limited number of configured entries (size() == 100), but when profiling, I found that there were ~100K ConcurrentLinkedQueue$Node objects in memory. This appears to be by design, just glancing at the source for ConcurrentLinkedQueue, a remove merely removes the reference to the object being stored but leaves the linked list in place for iteration. Finally my question: Is there a "better" lazy way to handle a collection of this nature? I love the speed of the ConcurrentLinkedQueue, I just cant afford the unbounded leak that appears to be possible in this case. If not, it seems like I'd have to create a second structure to track order and may have the same issues, plus a synchronization concern.

    Read the article

  • Performance of Java matrix math libraries?

    - by dfrankow
    We are computing something whose runtime is bound by matrix operations. (Some details below if interested.) This experience prompted the following question: Do folk have experience with the performance of Java libraries for matrix math (e.g., multiply, inverse, etc.)? For example: JAMA: http://math.nist.gov/javanumerics/jama/ COLT: http://acs.lbl.gov/~hoschek/colt/ Apache commons math: http://commons.apache.org/math/ I searched and found nothing. Details of our speed comparison: We are using Intel FORTRAN (ifort (IFORT) 10.1 20070913). We have reimplemented it in Java (1.6) using Apache commons math 1.2 matrix ops, and it agrees to all of its digits of accuracy. (We have reasons for wanting it in Java.) (Java doubles, Fortran real*8). Fortran: 6 minutes, Java 33 minutes, same machine. jvisualm profiling shows much time spent in RealMatrixImpl.{getEntry,isValidCoordinate} (which appear to be gone in unreleased Apache commons math 2.0, but 2.0 is no faster). Fortran is using Atlas BLAS routines (dpotrf, etc.). Obviously this could depend on our code in each language, but we believe most of the time is in equivalent matrix operations. In several other computations that do not involve libraries, Java has not been much slower, and sometimes much faster.

    Read the article

  • Demangling typeclass functions in GHC profiler output

    - by Paul Kuliniewicz
    When profiling a Haskell program written in GHC, the names of typeclass functions are mangled in the .prof file to distinguish one instance's implementations of them from another. How can I demangle these names to find out which type's instance it is? For example, suppose I have the following program, where types Fast and Slow both implement Show: import Data.List (foldl') sum' = foldl' (+) 0 data Fast = Fast instance Show Fast where show _ = show $ sum' [1 .. 10] data Slow = Slow instance Show Slow where show _ = show $ sum' [1 .. 100000000] main = putStrLn (show Fast ++ show Slow) I compile with -prof -auto-all -caf-all and run with +RTS -p. In the .prof file that gets generated, I see that the top cost centers are: COST CENTRE MODULE %time %alloc show_an9 Main 71.0 83.3 sum' Main 29.0 16.7 And in the tree, I likewise see (omitting irrelevant lines): individual inherited COST CENTRE MODULE no. entries %time %alloc %time %alloc main Main 232 1 0.0 0.0 100.0 100.0 show_an9 Main 235 1 71.0 83.3 100.0 100.0 sum' Main 236 0 29.0 16.7 29.0 16.7 show_anx Main 233 1 0.0 0.0 0.0 0.0 How do I figure out that show_an9 is Slow's implementation of show and not Fast's?

    Read the article

  • Appropriate SQL Server Permissions for Developers

    - by BJ Safdie
    After a couple of Google searches and a quick look at questions here, I cannot seem to find what I thought would be a cookbook answer for SQL Server permissions. As I often see in small shops, most developers here were using an admin account for SQL Server while developing. I want to set up roles and permissions that I can assign to developers so that we can get our jobs done, but also do so with the minimum permissions required. Can anyone offer advice on what SQL Server permissions to assign? Components: SQL Server 2008 SQL Server Reporting Services (SSRS) 2008 SQL Server Integration Services (SSIS) 2008 Platforms: Production Staging/QA Development/Integration We are running "Mixed Mode" security because of some legacy apps and networks, but are moving to Windows Auth. I am not sure if that really affects the role set up. I plan to set up access for Developers to Prod and Staging/QA DBs as Read-Only. However, I still want developers to retain the ability to run Profiling. We need Deployment accounts with higher privilege levels. We are currently trying to figure out exactly what privileges we need for SSIS package deployments. Within the Development Server, Developers need broad privileges. However, I am not sure that just making them all admins is really the best choice. It's hard to believe that no one has published a decent example script that sets up these kinds of roles with a good set of appropriate permissions for developers and deployers. We can probably figure this all out by locking things down and then adding permissions as we discover the need, but that will be way too big a PITA for everyone. Can anyone point me to, or provide, a good exemplar for permissions for these kinds of roles on these kinds of platforms?

    Read the article

  • C++ interpreter conceptual problem

    - by Jan Wilkins
    I've built an interpreter in C++ for a language created by me. One main problem in the design was that I had two different types in the language: number and string. So I have to pass around a struct like: class myInterpreterValue { myInterpreterType type; int intValue; string strValue; } Objects of this class are passed around million times a second during e.g.: a countdown loop in my language. Profiling pointed out: 85% of the performance is eaten by the allocation function of the string template. This is pretty clear to me: My interpreter has bad design and doesn't use pointers enough. Yet, I don't have an option: I can't use pointers in most cases as I just have to make copies. How to do something against this? Is a class like this a better idea? vector<string> strTable; vector<int> intTable; class myInterpreterValue { myInterpreterType type; int locationInTable; } So the class only knows what type it represents and the position in the table This however again has disadvantages: I'd have to add temporary values to the string/int vector table and then remove them again, this would eat a lot of performance again. Help, how do interpreters of languages like Python or Ruby do that? They somehow need a struct that represents a value in the language like something that can either be int or string.

    Read the article

  • Parallel version of loop not faster than serial version

    - by Il-Bhima
    I'm writing a program in C++ to perform a simulation of particular system. For each timestep, the biggest part of the execution is taking up by a single loop. Fortunately this is embarassingly parallel, so I decided to use Boost Threads to parallelize it (I'm running on a 2 core machine). I would expect at speedup close to 2 times the serial version, since there is no locking. However I am finding that there is no speedup at all. I implemented the parallel version of the loop as follows: Wake up the two threads (they are blocked on a barrier). Each thread then performs the following: Atomically fetch and increment a global counter. Retrieve the particle with that index. Perform the computation on that particle, storing the result in a separate array Wait on a job finished barrier The main thread waits on the job finished barrier. I used this approach since it should provide good load balancing (since each computation may take differing amounts of time). I am really curious as to what could possibly cause this slowdown. I always read that atomic variables are fast, but now I'm starting to wonder whether they have their performance costs. If anybody has some ideas what to look for or any hints I would really appreciate it. I've been bashing my head on it for a week, and profiling has not revealed much.

    Read the article

  • Pros and cons of each JEE server for developing within Eclipse

    - by Thorbjørn Ravn Andersen
    Eclipse JEE has a lot of server adapters allowing development against many different application servers like JBoss, Glassfish and WebSphere. Frequently you can benefit from using another server for developing new features than for production, simply because it may be able to deploy changes much faster and when the functionality is in place, you can iron out bugs for the production platform. Unfortunately finding that server is a time consuming process, where others experience is invaluable. If you have experience with any server with an Eclipse Server Adapter, please add your findings and your recommendation. I believe that the following is of interest: Does saving a file trigger an update in the server, giving save edit+reload browser functionality? How fast is a deployment? (Saved a JSP? Java class? Static file?) Can the actual server be downloaded by the Server Adapter Wizard allowing for easy installation? Are there known bugs and issues with suitable work-arounds? Is debugging fully supported? Is profiling? Would you recommend this server? Note: Eclipse can also work with Tomcat but that is a web container, which cannot deploy EAR files.

    Read the article

  • Why would this Lua optimization hack help?

    - by Ian Boyd
    i'm looking over a document that describes various techniques to improve performance of Lua script code, and i'm shocked that such tricks would be required. (Although i'm quoting Lua, i've seen similar hacks in Javascript). Why would this optimization be required: For instance, the code for i = 1, 1000000 do local x = math.sin(i) end runs 30% slower than this one: local sin = math.sin for i = 1, 1000000 do local x = sin(i) end They're re-declaring sin function locally. Why would this be helpful? It's the job of the compiler to do that anyway. Why is the programmer having to do the compiler's job? i've seen similar things in Javascript; and so obviously there must be a very good reason why the interpreting compiler isn't doing its job. What is it? i see it repeatedly in the Lua environment i'm fiddling in; people redeclaring variables as local: local strfind = strfind local strlen = strlen local gsub = gsub local pairs = pairs local ipairs = ipairs local type = type local tinsert = tinsert local tremove = tremove local unpack = unpack local max = max local min = min local floor = floor local ceil = ceil local loadstring = loadstring local tostring = tostring local setmetatable = setmetatable local getmetatable = getmetatable local format = format local sin = math.sin What is going on here that people have to do the work of the compiler? Is the compiler confused by how to find format? Why is this an issue that a programmer has to deal with? Why would this not have been taken care of in 1993? i also seem to have hit a logical paradox: Optimizatin should not be done without profiling Lua has no ability to be profiled Lua should not be optimized

    Read the article

  • Interesting LinqToSql behaviour

    - by Ben Robinson
    We have a database table that stores the location of some wave files plus related meta data. There is a foreign key (employeeid) on the table that links to an employee table. However not all wav files relate to an employee, for these records employeeid is null. We are using LinqToSQl to access the database, the query to pull out all non employee related wav file records is as follows: var results = from Wavs in db.WaveFiles where Wavs.employeeid == null; Except this returns no records, despite the fact that there are records where employeeid is null. On profiling sql server i discovered the reason no records are returned is because LinqToSQl is turning it into SQL that looks very much like: SELECT Field1, Field2 //etc FROM WaveFiles WHERE 1=0 Obviously this returns no rows. However if I go into the DBML designer and remove the association and save. All of a sudden the exact same LINQ query turns into SELECT Field1, Field2 //etc FROM WaveFiles WHERE EmployeeID IS NULL I.e. if there is an association then LinqToSql assumes that all records have a value for the foreign key (even though it is nullable and the property appears as a nullable int on the WaveFile entity) and as such deliverately constructs a where clause that will return no records. Does anyone know if there is a way to keep the association in LinqToSQL but stop this behaviour. A workaround i can think of quickly is to have a calculated field called IsSystemFile and set it to 1 if employeeid is null and 0 otherwise. However this seems like a bit of a hack to work around strange behaviour of LinqToSQl and i would rather do something in the DBML file or define something on the foreign key constraint that will prevent this behaviour.

    Read the article

  • Non standard interaction among two tables to avoid very large merge

    - by riko
    Suppose I have two tables A and B. Table A has a multi-level index (a, b) and one column (ts). b determines univocally ts. A = pd.DataFrame( [('a', 'x', 4), ('a', 'y', 6), ('a', 'z', 5), ('b', 'x', 4), ('b', 'z', 5), ('c', 'y', 6)], columns=['a', 'b', 'ts']).set_index(['a', 'b']) AA = A.reset_index() Table B is another one-column (ts) table with non-unique index (a). The ts's are sorted "inside" each group, i.e., B.ix[x] is sorted for each x. Moreover, there is always a value in B.ix[x] that is greater than or equal to the values in A. B = pd.DataFrame( dict(a=list('aaaaabbcccccc'), ts=[1, 2, 4, 5, 7, 7, 8, 1, 2, 4, 5, 8, 9])).set_index('a') The semantics in this is that B contains observations of occurrences of an event of type indicated by the index. I would like to find from B the timestamp of the first occurrence of each event type after the timestamp specified in A for each value of b. In other words, I would like to get a table with the same shape of A, that instead of ts contains the "minimum value occurring after ts" as specified by table B. So, my goal would be: C: ('a', 'x') 4 ('a', 'y') 7 ('a', 'z') 5 ('b', 'x') 7 ('b', 'z') 7 ('c', 'y') 8 I have some working code, but is terribly slow. C = AA.apply(lambda row: ( row[0], row[1], B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))), axis=1).set_index(['a', 'b']) Profiling shows the culprit is obviously B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))). However, standard solutions using merge/join would take too much RAM in the long run. Consider that now I have 1000 a's, assume constant the average number of b's per a (probably 100-200), and consider that the number of observations per a is probably in the order of 300. In production I will have 1000 more a's. 1,000,000 x 200 x 300 = 60,000,000,000 rows may be a bit too much to keep in RAM, especially considering that the data I need is perfectly described by a C like the one I discussed above. How would I improve the performance?

    Read the article

  • Auditing front end performance on web application

    - by user1018494
    I am currently trying to performance tune the UI of a company web application. The application is only ever going to be accessed by staff, so the speed of the connection between the server and client will always be considerably more than if it was on the internet. I have been using performance auditing tools such as Y Slow! and Google Chrome's profiling tool to try and highlight areas that are worth targeting for investigation. However, these tools are written with the internet in mind. For example, the current suggestions from a Google Chrome audit of the application suggests is as follows: Network Utilization Combine external CSS (Red warning) Combine external JavaScript (Red warning) Enable gzip compression (Red warning) Leverage browser caching (Red warning) Leverage proxy caching (Amber warning) Minimise cookie size (Amber warning) Parallelize downloads across hostnames (Amber warning) Serve static content from a cookieless domain (Amber warning) Web Page Performance Remove unused CSS rules (Amber warning) Use normal CSS property names instead of vendor-prefixed ones (Amber warning) Are any of these bits of advice totally redundant given the connection speed and usage pattern? The users will be using the application frequently throughout the day, so it doesn't matter if the initial hit is large (when they first visit the page and build their cache) so long as a minimal amount of work is done on future page views. For example, is it worth the effort of combining all of our CSS and JavaScript files? It may speed up the initial page view, but how much of a difference will it really make on subsequent page views throughout the working day? I've tried searching for this but all I keep coming up with is the standard internet facing performance advice. Any advice on what to focus my performance tweaking efforts on in this scenario, or other auditing tool recommendations, would be much appreciated.

    Read the article

  • "variable tracking" is eating my compile time!

    - by wowus
    I have an auto-generated file which looks something like this... static void do_SomeFunc1(void* parameter) { // Do stuff. } // Continues on for another 4000 functions... void dispatch(int id, void* parameter) { switch(id) { case ::SomeClass1::id: return do_SomeFunc1(parameter); case ::SomeClass2::id: return do_SomeFunc2(parameter); // This continues for the next 4000 cases... } } When I build it like this, the build time is enormous. If I inline all the functions automagically into their respective cases using my script, the build time is cut in half. GCC 4.5.0 says ~50% of the build time is being taken up by "variable tracking" when I use -ftime-report. What does this mean and how can I speed compilation while still maintaining the superior cache locality of pulling out the functions from the switch? EDIT: Interestingly enough, the build time has exploded only on debug builds, as per the following profiling information of the whole project (which isn't just the file in question, but still a good metric; the file in question takes the most time to build): Debug: 8 minutes 50 seconds Release: 4 minutes, 25 seconds

    Read the article

  • how does MySQL implement the "group by"?

    - by user188916
    I read from the MySQL Reference Manual and find that when it can take use of index,it just do index scan,other it will create tmp tables and do things like filesort. And I also read from other article that the "Group By" result will sort by group by columns by default,if "order by null" clause added,it won't don filesort. The difference can be found from the "explain ..." clause. so my problem is:what is the difference between "group by" clause that with "order by null" and which doesn't have? I try to use profiling to see what mysql do on the background,and only see result like: result for group clause without order by null: |preparing | 0.000016 | | Creating tmp table | 0.000048 | | executing | 0.000009 | | Copying to tmp table | 0.000109 | **| Sorting result | 0.000023 |** | Sending data | 0.000027 | result for clause with "order by null": preparing | 0.000016 | | Creating tmp table | 0.000052 | | executing | 0.000009 | | Copying to tmp table | 0.000114 | | Sending data | 0.000028 | So I guess what MySQL do when the "order by null" added,it does not use filesort algorithm,maybe when it creates the tmp table,it uses index as well,and then use the index to do group by operation,when completed,it just read result from the table rows and does not sort the result. But my original opinion is that MySQL can use quicksort to sort the items and then do group by,so the result will be sorted as well. Any opinion appreciated,thanks.

    Read the article

  • What is the best way to find a processed memory allocations in terms of C# objects

    - by Shantaram
    I have written various C# console based applications, some of them long running some not, which can over time have a large memory foot print. When looking at the windows perofrmance monitor via the task manager, the same question keeps cropping up in my mind; how do I get a break down of the number objects by type that are contributing to this footprint; and which of those are f-reachable and those which aren't and hence can be collected. On numerous occasions I've performed a code inspection to ensure that I am not unnecessarily holding on objects longer than required and disposing of objects with the using construct. I have also recently looked at employing the CG.Collect method when I have released a large number of objects (for example held in a collection which has just been cleared). However, I am not so sure that this made that much difference, so I threw that code away. I am guessing that there are tools in sysinternals suite than can help to resolve these memory type quiestions but I am not sure which and how to use them. The alternative would be to pay for a third party profiling tool such as JetBrains dotTrace; but I need to make sure that I've explored the free options first before going cap in hand to my manager.

    Read the article

  • Generated images fail to load in browser

    - by notJim
    I've got a page on a webapp that has about 13 images that are generated by my application, which is written in the Kohana PHP framework. The images are actually graphs. They are cached so they are only generated once, but the first time the user visits the page, and the images all have to be generated, about half of the images don't load in the browser. Once the page has been requested once and images are cached, they all load successfully. Doing some ad-hoc testing, if I load an individual image in the browser, it takes from 450-700 ms to load with an empty cache (I checked this using Google Chrome's resource tracking feature). For reference, it takes around 90-150 ms to load a cached image. Even if the image cache is empty, I have the data and some of the application's startup tasks cached, so that after the first request, none of that data needs to be fetched. My questions are: Why are the images failing to load? It seems like the browser just decides not to download the image after a certain point, rather than waiting for them all to finish loading. What can I do to get them to load the first time, with an empty cache? Obviously one option is to decrease the load times, and I could figure out how to do that by profiling the app, but are there other options? As I mentioned, the app is in the Kohana PHP framework, and it's running on Apache. As an aside, I've solved this problem for now by fetching the page as soon as the data is available (it comes from a batch process), so that the images are always cached by the time the user sees them. That feels like a kludgey solution to me, though, and I'm curious about what's actually going on.

    Read the article

  • Determining if an unordered vector<T> has all unique elements

    - by Hooked
    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how this might be done: The first using a set: template <class T> bool is_unique(vector<T> X) { set<T> Y(X.begin(), X.end()); return X.size() == Y.size(); } The second looping over the elements: template <class T> bool is_unique2(vector<T> X) { typename vector<T>::iterator i,j; for(i=X.begin();i!=X.end();++i) { for(j=i+1;j!=X.end();++j) { if(*i == *j) return 0; } } return 1; } I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique. Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?

    Read the article

  • CRM 2011 - Set/Retrieve work hours programmatically

    - by Philip Rich
    I am attempting to retrieve a resources work hours to perform some logic I require. I understand that the CRM scheduling engine is a little clunky around such things, but I assumed that I would be able to find out how the working hours were stored in the DB eventually... So a resource has associated calendars and those calendars have associated calendar rules and inner calendars etc. It is possible to look at the start/end and frequency of aforementioned calendar rules and query their codes to work out whether a resource is 'working' during a given period. However, I have not been able to find the actual working hours, the 9-5 shall we say in any field in the DB. I even tried some SQL profiling while I was creating a new schedule for a resource via the UI, but the results don't show any work hours passing to SQL. For those with the patience the intercepted SQL statement is below:- EXEC Sp_executesql N'update [CalendarRuleBase] set [ModifiedBy]=@ModifiedBy0, [EffectiveIntervalEnd]=@EffectiveIntervalEnd0, [Description]=@Description0, [ModifiedOn]=@ModifiedOn0, [GroupDesignator]=@GroupDesignator0, [IsSelected]=@IsSelected0, [InnerCalendarId]=@InnerCalendarId0, [TimeZoneCode]=@TimeZoneCode0, [CalendarId]=@CalendarId0, [IsVaried]=@IsVaried0, [Rank]=@Rank0, [ModifiedOnBehalfBy]=NULL, [Duration]=@Duration0, [StartTime]=@StartTime0, [Pattern]=@Pattern0 where ([CalendarRuleId] = @CalendarRuleId0)', N'@ModifiedBy0 uniqueidentifier,@EffectiveIntervalEnd0 datetime,@Description0 ntext,@ModifiedOn0 datetime,@GroupDesignator0 ntext,@IsSelected0 bit,@InnerCalendarId0 uniqueidentifier,@TimeZoneCode0 int,@CalendarId0 uniqueidentifier,@IsVaried0 bit,@Rank0 int,@Duration0 int,@StartTime0 datetime,@Pattern0 ntext,@CalendarRuleId0 uniqueidentifier', @ModifiedBy0='EB04662A-5B38-E111-9889-00155D79A113', @EffectiveIntervalEnd0='2012-01-13 00:00:00', @Description0=N'Weekly Single Rule', @ModifiedOn0='2012-03-12 16:02:08', @GroupDesignator0=N'FC5769FC-4DE9-445d-8F4E-6E9869E60857', @IsSelected0=1, @InnerCalendarId0='3C806E79-7A49-4E8D-B97E-5ED26700EB14', @TimeZoneCode0=85, @CalendarId0='E48B1ABF-329F-425F-85DA-3FFCBB77F885', @IsVaried0=0, @Rank0=2, @Duration0=1440, @StartTime0='2000-01-01 00:00:00', @Pattern0=N'FREQ=WEEKLY;INTERVAL=1;BYDAY=SU,MO,TU,WE,TH,FR,SA', @CalendarRuleId0='0A00DFCF-7D0A-4EE3-91B3-DADFCC33781D' The key parts in the statement are the setting of the pattern:- @Pattern0=N'FREQ=WEEKLY;INTERVAL=1;BYDAY=SU,MO,TU,WE,TH,FR,SA' However, as mentioned, no indication of the work hours set. Am I thinking about this incorrectly or is CRM doing something interesting around these work hours? Any thoughts greatly appreciated, thanks.

    Read the article

  • Browser timing out attempting to load images

    - by notJim
    I've got a page on a webapp that has about 13 images that are generated by my application, which is written in the Kohana PHP framework. The images are actually graphs. They are cached so they are only generated once, but the first time the user visits the page, and the images all have to be generated, about half of the images don't load in the browser. Once the page has been requested once and images are cached, they all load successfully. Doing some ad-hoc testing, if I load an individual image in the browser, it takes from 450-700 ms to load with an empty cache (I checked this using Google Chrome's resource tracking feature). For reference, it takes around 90-150 ms to load a cached image. Even if the image cache is empty, I have the data and some of the application's startup tasks cached, so that after the first request, none of that data needs to be fetched. My questions are: Why are the images failing to load? It seems like the browser just decides not to download the image after a certain point, rather than waiting for them all to finish loading. What can I do to get them to load the first time, with an empty cache? Obviously one option is to decrease the load times, and I could figure out how to do that by profiling the app, but are there other options? As I mentioned, the app is in the Kohana PHP framework, and it's running on Apache. As an aside, I've solved this problem for now by fetching the page as soon as the data is available (it comes from a batch process), so that the images are always cached by the time the user sees them. That feels like a kludgey solution to me, though, and I'm curious about what's actually going on.

    Read the article

  • Determing if an unordered vector<T> has all unique elements

    - by Hooked
    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how this might be done: The first using a set: template <class T> bool is_unique(vector<T> X) { set<T> Y(X.begin(), X.end()); return X.size() == Y.size(); } The second looping over the elements: template <class T> bool is_unique2(vector<T> X) { typename vector<T>::iterator i,j; for(i=X.begin();i!=X.end();++i) { for(j=i+1;j!=X.end();++j) { if(*i == *j) return 0; } } return 1; } I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique. Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?

    Read the article

  • Do I need to using locking against integers in c++ threads

    - by Shane MacLaughlin
    The title says it all really. If I am accessing a single integer type (e.g. long, int, bool, etc...) in multiple threads, do I need to use a synchronisation mechanism such as a mutex to lock them. My understanding is that as atomic types, I don't need to lock access to a single thread, but I see a lot of code out there that does use locking. Profiling such code shows that there is a significant performance hit for using locks, so I'd rather not. So if the item I'm accessing corresponds to a bus width integer (e.g. 4 bytes on a 32 bit processor) do I need to lock access to it when it is being used across multiple threads? Put another way, if thread A is writing to integer variable X at the same time as thread B is reading from the same variable, is it possible that thread B could end up a few bytes of the previous value mixed in with a few bytes of the value being written? Is this architecture dependent, e.g. ok for 4 byte integers on 32 bit systems but unsafe on 8 byte integers on 64 bit systems? Edit: Just saw this related post which helps a fair bit.

    Read the article

  • R optimization: How can I avoid a for loop in this situation?

    - by chrisamiller
    I'm trying to do a simple genomic track intersection in R, and running into major performance problems, probably related to my use of for loops. In this situation, I have pre-defined windows at intervals of 100bp and I'm trying to calculate how much of each window is covered by the annotations in mylist. Graphically, it looks something like this: 0 100 200 300 400 500 600 windows: |-----|-----|-----|-----|-----|-----| mylist: |-| |-----------| So I wrote some code to do just that, but it's fairly slow and has become a bottleneck in my code: ##window for each 100-bp segment windows <- numeric(6) ##second track mylist = vector("list") mylist[[1]] = c(1,20) mylist[[2]] = c(120,320) ##do the intersection for(i in 1:length(mylist)){ st <- floor(mylist[[i]][1]/100)+1 sp <- floor(mylist[[i]][2]/100)+1 for(j in st:sp){ b <- max((j-1)*100, mylist[[i]][1]) e <- min(j*100, mylist[[i]][2]) windows[j] <- windows[j] + e - b + 1 } } print(windows) [1] 20 81 101 21 0 0 Naturally, this is being used on data sets that are much larger than the example I provide here. Through some profiling, I can see that the bottleneck is in the for loops, but my clumsy attempt to vectorize it using *apply functions resulted in code that runs an order of magnitude more slowly. I suppose I could write something in C, but I'd like to avoid that if possible. Can anyone suggest another approach that will speed this calculation up?

    Read the article

  • Writing a report

    - by wvd
    Hello all, Since some time I've been investigating more time into profiling things better, really think about how to do a thing and why. Now I'm going to start a new project, where I will be writing a report about. The report will be about anything what I wrote in the project, why, and I'll be investigating some things and do particular research about them. I've seen some reports, such as game programming in Haskell using FRP. However, after reading several reports they all seem to be build different. I have a few questions about writing a report: 1] What are the things I really should include, and what are the things I really shouldn't include? 2] Is it useful to include graphs about different methods/approaches to a several problem, where you only included one into your project, to show WHY you didn't include the other methods. Or should I just explain the method/approach used into the project. 3] Should I only be writing the report after I've completed the project, or should I also write pages about what I expect, how I'm going to build the software? Thanks, William van Doorn

    Read the article

< Previous Page | 15 16 17 18 19 20 21 22 23  | Next Page >