Search Results

Search found 6387 results on 256 pages for 'cpu allocation'.

Page 234/256 | < Previous Page | 230 231 232 233 234 235 236 237 238 239 240 241  | Next Page >

  • Multithreaded linked list traversal

    - by Rob Bryce
    Given a (doubly) linked list of objects (C++), I have an operation that I would like multithread, to perform on each object. The cost of the operation is not uniform for each object. The linked list is the preferred storage for this set of objects for a variety of reasons. The 1st element in each object is the pointer to the next object; the 2nd element is the previous object in the list. I have solved the problem by building an array of nodes, and applying OpenMP. This gave decent performance. I then switched to my own threading routines (based off Windows primitives) and by using InterlockedIncrement() (acting on the index into the array), I can achieve higher overall CPU utilization and faster through-put. Essentially, the threads work by "leap-frog'ing" along the elements. My next approach to optimization is to try to eliminate creating/reusing the array of elements in my linked list. However, I'd like to continue with this "leap-frog" approach and somehow use some nonexistent routine that could be called "InterlockedCompareDereference" - to atomically compare against NULL (end of list) and conditionally dereference & store, returning the dereferenced value. I don't think InterlockedCompareExchangePointer() will work since I cannot atomically dereference the pointer and call this Interlocked() method. I've done some reading and others are suggesting critical sections or spin-locks. Critical sections seem heavy-weight here. I'm tempted to try spin-locks but I thought I'd first pose the question here and ask what other people are doing. I'm not convinced that the InterlockedCompareExchangePointer() method itself could be used like a spin-lock. Then one also has to consider acquire/release/fence semantics... Ideas? Thanks!

    Read the article

  • Determining if an unordered vector<T> has all unique elements

    - by Hooked
    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how this might be done: The first using a set: template <class T> bool is_unique(vector<T> X) { set<T> Y(X.begin(), X.end()); return X.size() == Y.size(); } The second looping over the elements: template <class T> bool is_unique2(vector<T> X) { typename vector<T>::iterator i,j; for(i=X.begin();i!=X.end();++i) { for(j=i+1;j!=X.end();++j) { if(*i == *j) return 0; } } return 1; } I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique. Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?

    Read the article

  • An Erroneous SQL Query makes browser hang until script timeout exceeded

    - by Jimbo
    I have an admin page in a Classic ASP web application that allows the admin user to run queries against the database (SQL Server 2000) Whats really strange is that if the query you send has an error in it (an invalid table join, a column you've forgotten to group by etc) the BROWSER hangs (CPU usage goes to maximum) until the SERVER script timeout is exceeded and then spits out a timeout exceeded error (server and browser are on different machines, so not sure how this happens!) I have tried this in IE 8 and FF 3 with the same result. If you run that same query (with errors) directly from SQL Enterprise Manager, it returns the real error immediately. Is this a security feature? Does anyone know how to turn it off? It even happens when the connection to the database is using 'sa' credentials so I dont think its a security setting :( Dim oRS Set oRS = Server.CreateObject("ADODB.Recordset") oRS.ActiveConnection = sConnectionString // run the query - this is for the admin only so doesnt check for sql safe commands etc. oRS.Open Request.Form("txtSQL") If Not oRS.EOF Then // list the field names from the recordset For i = 0 to oRS.Fields.Count - 1 Response.Write oRS.Fields(i).name & "&nbsp;" Next // show the data for each record in the recordset While Not oRS.EOF For i = 0 to oRS.Fields.Count - 1 Response.Write oRS.Fields(i).value & "&nbsp;" Next Response.Write "<br />" oRS.Movenext() Wend End If

    Read the article

  • What is the most common way to use a middleware in node with express and connect

    - by Bernhard
    Thinking about the correct way, how to make use of middlewares in a node.js web project using express and connect which is growing up at the moment. Of course there are middlewares right now wich has to pass or extend requests globally but in a lot of cases there are special jobs like prepare incoming data and in this case the middleware would only work for a set of http-methods and routes. I've a component based architecture and each component brings it's own middleware layer which can implement those for requests this component can handle. On app startup any required component is loaded and prepared. Is it a good idea to bind the middleware code execution to URLs to keep cpu load lower or is it better to use middlewares only for global purposes? Here's some dummy how an url related middleware look like. app.use(function(req, res, next) { // Check if requested route is a part of the current component // or if the middleware should be passed on any request if (APP.controller.groups.Component.isExpectedRoute(req) || APP.controller.groups.Component.getConfig().MIDDLEWARE_PASS_ALL === true) { // Execute the midleware code here console.log('This is a route which should be afected by middleware'); ... next(); }else{ next(); } });

    Read the article

  • Can knowing C actually hurt the code you write in higher level languages?

    - by Jurily
    The question seems settled, beaten to death even. Smart people have said smart things on the subject. To be a really good programmer, you need to know C. Or do you? I was enlightened twice this week. The first one made me realize that my assumptions don't go further than my knowledge behind them, and given the complexity of software running on my machine, that's almost non-existent. But what really drove it home was this Slashdot comment: The end result is that I notice the many naive ways in which traditional C "bare metal" programmers assume that higher level languages are implemented. They make bad "optimization" decisions in projects they influence, because they have no idea how a compiler works or how different a good runtime system may be from the naive macro-assembler model they understand. Then it hit me: C is just one more abstraction, like all others. Even the CPU itself is only an abstraction! I've just never seen it break, because I don't have the tools to measure it. I'm confused. Has my mind been mutilated beyond recovery, like Dijkstra said about BASIC? Am I living in a constant state of premature optimization? Is there hope for me, now that I realized I know nothing about anything? Is there anything to know, even? And why is it so fascinating, that everything I've written in the last five years might have been fundamentally wrong? To sum it up: is there any value in knowing more than the API docs tell me? EDIT: Made CW. Of course this also means now you must post examples of the interpreter/runtime optimizing better than we do :)

    Read the article

  • Implement a threading to prevent UI block on a bug in an async function

    - by Marcx
    I think I ran up againt a bug in an async function... Precisely the getDirectoryListingAsync() of the File class... This method is supposted to return an object containing the lists of files in a specified folder. I found that calling this method on a direcory with a lot of files (in my tests more than 20k files), after few seconds there is a block on the UI until the process is completed... I think that this method is separated in two main block: 1) get the list of files 2) create the array with the details of the files The point 1 seems to be async (for a few second the ui is responsive), then when the process pass from point 1 to point 2 the block of the UI occurs until the complete event is dispathed... Here's some (simple) code: private function checkFiles(dir:File):void { if (dir.exists) { dir.addEventListener( FileListEvent.DIRECTORY_LISTING, listaImmaginiLocale); dir.getDirectoryListingAsync(); // after this point, for the firsts seconds the UI respond well (point 1), // few seconds later (point 2) the UI is frozen } } private function listaImmaginiLocale( event:FileListEvent ):void { // from this point on the UI is responsive again... } Actually in my projects there are some function that perform an heavy cpu usage and to prevent the UI block I implemented a simple function that after some iteration will wait giving time to UI to be refreshed. private var maxIteration:int = 150000; private function sampleFunct(offset:int = 0) :void { if (offset < maxIteration) { // do something // call the recursive function using a timeout.. // if the offset in multiple by 1000 the function will wait 15 millisec, // otherwise it will be called immediately // 1000 is a random number for the pourpose of this example, but I usually change the // value based on how much heavy is the function itself... setTimeout(function():void{aaa(++offset);}, (offset%1000?15:0)); } } Using this method I got a good responsive UI without afflicting performance... I'd like to implement it into the getDirectoryListingAsync method but I don't know if it's possibile how can I do it where is the file to edit or extend.. Any suggestion???

    Read the article

  • Link failure with either abnormal memory consumption or LNK1106 in Visual Studio 2005.

    - by Corvin
    Hello, I am trying to build a solution for windows XP in Visual Studio 2005. This solution contains 81 projects (static libs, exe's, dlls) and is being successfully used by our partners. I copied the solution bundle from their repository and tried setting it up on 3 similar machines of people in our group. I was successful on two machines and the solution failed to build on my machine. The build on my machine encountered two problems: During a simple build creation of the biggest static library (about 522Mb in debug mode) would fail with the message "13libd\ui1d.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x20101879" Full solution rebuild creates this library, however when it comes to linking the library to main .exe file, devenv.exe spawns link.exe which consumes about 80Mb of physical memory and 250MB of virtual and spawns another link.exe, which does the same. This goes on until the system runs out of memory. On PCs of my colleagues where successful build could be performed, there is only one link.exe process which uses all the memory required for linking (about 500Mb physical). There is a plenty of hard drive space on my machine and the file system is NTFS. All three of our systems are similar - Core2Quad processors, 4Gb of RAM, Windows XP SP3. We are using Visual studio installed from the same source. I tried using a different RAM and CPU, using dedicated graphics adapter to eliminate possibility of video memory sharing influencing the build, putting solution files to different location, using different versions of VS 2005 (Professional, Standard and Team Suite), changing the amount of available virtual memory, running memtest86 and building the project from scratch (i.e. a clean bundle). I have read what MSDN says about LNK1106, none of the cases apply to me except for maybe "out of heap space", however I am not sure how I should fight this. The only idea that I have left is reinstalling the OS, however I am not sure that it would help and I am not sure that my situation wouldn't repeat itself on a different machine. Would anyone have any sort of advice for me? Thanks

    Read the article

  • A generic C++ library that provides QtConcurrent functionality?

    - by Lucas
    QtConcurrent is awesome. I'll let the Qt docs speak for themselves: QtConcurrent includes functional programming style APIs for parallel list processing, including a MapReduce and FilterReduce implementation for shared-memory (non-distributed) systems, and classes for managing asynchronous computations in GUI applications. For instance, you give QtConcurrent::map() an iterable sequence and a function that accepts items of the type stored in the sequence, and that function is applied to all the items in the collection. This is done in a multi-threaded manner, with a thread pool equal to the number of logical CPU's on the system. There are plenty of other function in QtConcurrent, like filter(), filteredReduced() etc. The standard CompSci map/reduce functions and the like. I'm totally in love with this, but I'm starting work on an OSS project that will not be using the Qt framework. It's a library, and I don't want to force others to depend on such a large framework like Qt. I'm trying to keep external dependencies to a minimum (it's the decent thing to do). I'm looking for a generic C++ framework that provides me with the same/similar high-level primitives that QtConcurrent does. AFAIK boost has nothing like this (I may be wrong though). boost::thread is very low-level compared to what I'm looking for. I know C# has something very similar with their Parallel Extensions so I know this isn't a Qt-only idea. What do you suggest I use?

    Read the article

  • JAR files, don't they just bloat and slow Java down?

    - by Josamoto
    Okay, the question might seem dumb, but I'm asking it anyways. After struggling for hours to get a Spring + BlazeDS project up and running, I discovered that I was having problems with my project as a result of not including the right dependencies for Spring etc. There were .jars missing from my WEB-INF/lib folder, yes, silly me. After a while, I managed to get all the .jar files where they belong, and it comes at a whopping 12.5MB at that, and there's more than 30 of them! Which concerns me, but it probably and hopefully shouldn't be concerned. How does Java operate in terms of these JAR files, they do take up quite a bit of hard drive space, taking into account that it's compressed and compiled source code. So that can really quickly populate a lot of RAM and in an instant. My questions are: Does Java load an entire .jar file into memory when say for instance a class in that .jar is instantiated? What about stuff that's in the .jar that never gets used. Do .jars get cached somehow, for optimized application performance? When a single .jar is loaded, I understand that the thing sits in memory and is available across multiple HTTP requests (i.e. for the lifetime of the server instance running), unlike PHP where objects are created on the fly with each request, is this assumption correct? When using Spring, I'm thinking, I had to include all those fiddly .jars, wouldn't I just be better off just using native Java, with say at least and ORM solution like Hibernate? So far, Spring just took extra time configuring, extra hard drive space, extra memory, cpu consumption, so I'm concerned that the framework is going to cost too much application performance just to get for example, IoC implemented with my BlazeDS server. There still has to come ORM, a unit testing framework and bits and pieces here and there. It's just so easy to bloat up a project quickly and irresponsibly easily. Where do I draw the line?

    Read the article

  • c++ floating point precision loss: 3015/0.00025298219406977296

    - by SigTerm
    The problem. Microsoft Visual C++ 2005 compiler, 32bit windows xp sp3, amd 64 x2 cpu. Code: double a = 3015.0; double b = 0.00025298219406977296; //*((unsigned __int64*)(&a)) == 0x40a78e0000000000 //*((unsigned __int64*)(&b)) == 0x3f30945640000000 double f = a/b;//3015/0.00025298219406977296; the result of calculation (i.e. "f") is 11917835.000000000 (*((unsigned __int64*)(&f)) == 0x4166bb4160000000) although it should be 11917834.814763514 (i.e. *((unsigned __int64*)(&f)) == 0x4166bb415a128aef). I.e. fractional part is lost. Unfortunately, I need fractional part to be correct. Questions: 1) Why does this happen? 2) How can I fix the problem? Additional info: 0) The result is taken directly from "watch" window (it wasn't printed, and I didn't forget to set printing precision). I also provided hex dump of floating point variable, so I'm absolutely sure about calculation result. 1) The disassembly of f = a/b is: fld qword ptr [a] fdiv qword ptr [b] fstp qword ptr [f] 2) f = 3015/0.00025298219406977296; yields correct result (f == 11917834.814763514 , *((unsigned __int64*)(&f)) == 0x4166bb415a128aef ), but it looks like in this case result is simply calculated during compile-time: fld qword ptr [__real@4166bb415a128aef (828EA0h)] fstp qword ptr [f] So, how can I fix this problem? P.S. I've found a temporary workaround (i need only fractional part of division, so I simply use f = fmod(a/b)/b at the moment), but I still would like to know how to fix this problem properly - double precision is supposed to be 16 decimal digits, so such calculation isn't supposed to cause problems.

    Read the article

  • Handling large (object) datasets with PHP

    - by Aron Rotteveel
    I am currently working on a project that extensively relies on the EAV model. Both entities as their attributes are individually represented by a model, sometimes extending other models (or at least, base models). This has worked quite well so far since most areas of the application only rely on filtered sets of entities, and not the entire dataset. Now, however, I need to parse the entire dataset (IE: all entities and all their attributes) in order to provide a sorting/filtering algorithm based on the attributes. The application currently consists of aproximately 2200 entities, each with aproximately 100 attributes. Every entity is represented by a single model (for example Client_Model_Entity) and has a protected property called $_attributes, which is an array of Attribute objects. Each entity object is about 500KB, which results in an incredible load on the server. With 2000 entities, this means a single task would take 1GB of RAM (and a lot of CPU time) in order to work, which is unacceptable. Are there any patterns or common approaches to iterating over such large datasets? Paging is not really an option, since everything has to be taken into account in order to provide the sorting algorithm.

    Read the article

  • 3 questions about JSON-P format.

    - by Tristan
    Hi, I have to write a script which will be hosted on differents domains. This script has to get information from my server. So, stackoverflow's user told me that i have to use JSON-P format, which, after research, is what i'm going to do. (the data provided in JSON-P is for displaying some information hosted on my server on other website) How do I output JSON-P from my server ? Is it the same as the json_encode function from PHP How do i design the tree pattern for the output JSON-P (you know, like : ({"name" : "foo", "id" : "xxxxx", "blog" : "http://xxxxxx.com"}); can I steal this from my XML output ? (http://bit.ly/9kzBDP) Each time a visitor browse a website on which my widget is it'll make a request on my server, requesting the JSON-P data to display on the client side. It'll increase dramatically the CPU load (1 visitor on the website who will have the script = 1 SQL request on my server to output data), so is there any way to 'caching' the JSON-P information output to refresh it only one or twice a day and stores it into a 'file' (in which extension?). BUT on the other hand i would say that requesting the JSON-P data directly (without caching it) is a plus, because, websites which will integrates the script only want to display THEIR information and not the whole data. So, making a script with something like that: jQuery.getJSON("http://www.something.com/json-p/outpout?filter=5&callback=?", function(data) { ................); }); Where filter= the information the website wants to display What do you think ? Thank you very much ;)

    Read the article

  • What hash algorithms are parallelizable? Optimizing the hashing of large files utilizing on multi-co

    - by DanO
    I'm interested in optimizing the hashing of some large files (optimizing wall clock time). The I/O has been optimized well enough already and the I/O device (local SSD) is only tapped at about 25% of capacity, while one of the CPU cores is completely maxed-out. I have more cores available, and in the future will likely have even more cores. So far I've only been able to tap into more cores if I happen to need multiple hashes of the same file, say an MD5 AND a SHA256 at the same time. I can use the same I/O stream to feed two or more hash algorithms, and I get the faster algorithms done for free (as far as wall clock time). As I understand most hash algorithms, each new bit changes the entire result, and it is inherently challenging/impossible to do in parallel. Are any of the mainstream hash algorithms parallelizable? Are there any non-mainstream hashes that are parallelizable (and that have at least a sample implementation available)? As future CPUs will trend toward more cores and a leveling off in clock speed, is there any way to improve the performance of file hashing? (other than liquid nitrogen cooled overclocking?) or is it inherently non-parallelizable?

    Read the article

  • Want to Receive dynamic length data from a message queue in IPC?

    - by user1089679
    Here I have to send and receive dynamic data using a SysV message queue. so in structure filed i have dynamic memory allocation char * because its size may be varies. so how can i receive this type of message at receiver side. Please let me know how can i send dynamic length of data with message queue. I am getting problem in this i posted my code below. send.c /*filename : send.c *To compile : gcc send.c -o send */ #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <string.h> #include <sys/types.h> #include <sys/ipc.h> #include <sys/msg.h> struct my_msgbuf { long mtype; char *mtext; }; int main(void) { struct my_msgbuf buf; int msqid; key_t key; static int count = 0; char temp[5]; int run = 1; if ((key = ftok("send.c", 'B')) == -1) { perror("ftok"); exit(1); } printf("send.c Key is = %d\n",key); if ((msqid = msgget(key, 0644 | IPC_CREAT)) == -1) { perror("msgget"); exit(1); } printf("Enter lines of text, ^D to quit:\n"); buf.mtype = 1; /* we don't really care in this case */ int ret = -1; while(run) { count++; buf.mtext = malloc(50); strcpy(buf.mtext,"Hi hello test message here"); snprintf(temp, sizeof (temp), "%d",count); strcat(buf.mtext,temp); int len = strlen(buf.mtext); /* ditch newline at end, if it exists */ if (buf.mtext[len-1] == '\n') buf.mtext[len-1] = '\0'; if (msgsnd(msqid, &buf, len+1, IPC_NOWAIT) == -1) /* +1 for '\0' */ perror("msgsnd"); if(count == 100) run = 0; usleep(1000000); } if (msgctl(msqid, IPC_RMID, NULL) == -1) { perror("msgctl"); exit(1); } return 0; } receive.c /* filename : receive.c * To compile : gcc receive.c -o receive */ #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <sys/types.h> #include <sys/ipc.h> #include <sys/msg.h> struct my_msgbuf { long mtype; char *mtext; }; int main(void) { struct my_msgbuf buf; int msqid; key_t key; if ((key = ftok("send.c", 'B')) == -1) { /* same key as send.c */ perror("ftok"); exit(1); } if ((msqid = msgget(key, 0644)) == -1) { /* connect to the queue */ perror("msgget"); exit(1); } printf("test: ready to receive messages, captain.\n"); for(;;) { /* receive never quits! */ buf.mtext = malloc(50); if (msgrcv(msqid, &buf, 50, 0, 0) == -1) { perror("msgrcv"); exit(1); } printf("test: \"%s\"\n", buf.mtext); } return 0; }

    Read the article

  • trouble calculating offset index into 3D array

    - by Derek
    Hello, I am writing a CUDA kernel to create a 3x3 covariance matrix for each location in the rows*cols main matrix. So that 3D matrix is rows*cols*9 in size, which i allocated in a single malloc accordingly. I need to access this in a single index value the 9 values of the 3x3 covariance matrix get their values set according to the appropriate row r and column c from some other 2D arrays. In other words - I need to calculate the appropriate index to access the 9 elements of the 3x3 covariance matrix, as well as the row and column offset of the 2D matrices that are inputs to the value, as well as the appropriate index for the storage array. i have tried to simplify it down to the following: //I am calling this kernel with 1D blocks who are 512 cols x 1row. TILE_WIDTH=512 int bx = blockIdx.x; int by = blockIdx.y; int tx = threadIdx.x; int ty = threadIdx.y; int r = by + ty; int c = bx*TILE_WIDTH + tx; int offset = r*cols+c; int ndx = r*cols*rows + c*cols; if((r < rows) && (c < cols)){ //this IF statement is trying to avoid the case where a threadblock went bigger than my original array..not sure if correct d_cov[ndx + 0] = otherArray[offset]; d_cov[ndx + 1] = otherArray[offset] d_cov[ndx + 2] = otherArray[offset] d_cov[ndx + 3] = otherArray[offset] d_cov[ndx + 4] = otherArray[offset] d_cov[ndx + 5] = otherArray[offset] d_cov[ndx + 6] = otherArray[offset] d_cov[ndx + 7] = otherArray[offset] d_cov[ndx + 8] = otherArray[offset] } When I check this array with the values calculated on the CPU, which loops over i=rows, j=cols, k = 1..9 The results do not match up. in other words d_cov[i*rows*cols + j*cols + k] != correctAnswer[i][j][k] Can anyone give me any tips on how to sovle this problem? Is it an indexing problem, or some other logic error?

    Read the article

  • Why is CoRegisterClassObject creating two extra threads?

    - by Stijn Sanders
    I'm trying to fix a problem that only recently happens on a number of machine's on a VPN. They each run a client application I wrote that exposes a COM automation object. For some strange reason I haven't been able to discover yet, one thread in the application takes up all of the available CPU time, slowing other operation on the machine. In observing the application's strange behaviour, I've noticed it's the third thread started, and if I debug on my machine I notice the first call to CoRegisterClassObject created two extra threads. If the second of these two threads is the one that gets into an infinite loop, I'm not at all shure how to fix this. Where could I check next about what's wrong? Could it have started by one of the recent patches rolled out by Microsoft this last 'patch tuesday'? I had a go with ProcessExplorer to extract a stack trace of the thread: ntoskrnl.exe!ExReleaseResourceLite+0x1a3 ntoskrnl.exe!PsGetContextThread+0x329 WLDAP32.dll!Ordinal325+0x1231 WLDAP32.dll!Ordinal325+0x129e WLDAP32.dll!Ordinal325+0x1178 ntdll.dll!LdrInitializeThunk+0x24 ntdll.dll!LdrShutdownThread+0xe9 kernel32.dll!ExitThread+0x3e kernel32.dll!FreeLibraryAndExitThread+0x1e ole32.dll!StringFromGUID2+0x65d kernel32.dll!GetModuleFileNameA+0x1ba

    Read the article

  • How to copy files without slowing down my app?

    - by Kevin Gebhardt
    I have a bunch of little files in my assets which need to be copied to the SD-card on the first start of my App. The copy code i got from here placed in an IntentService works like a charm. However, when I start to copy many litte files, the whole app gets increddible slow (I'm not really sure why by the way), which is a really bad experience for the user on first start. As I realised other apps running normal in that time, I tried to start a child process for the service, which didn't work, as I can't acess my assets from another process as far as I understood. Has anybody out there an idea how a) to copy the files without blocking my app b) to get through to my assets from a private process (process=":myOtherProcess" in Manifest) or c) solve the problem in a complete different way Edit: To make this clearer: The copying allready takes place in a seperate thread (started automaticaly by IntentService). The problem is not to separate the task of copying but that the copying in a dedicated thread somehow affects the rest of the app (e.g. blocking to many app-specific resources?) but not other apps (so it's not blocking the whole CPU or someting) Edit2: Problem solved, it turns out, there wasn't really a problem. See my answer below.

    Read the article

  • Many users, many cpus, no delays. Good for cloud?

    - by Eric
    I wish to set up a CPU-intensive time-important query service for users on the internet. A usage scenario is described below. Is cloud computing the right way to go for such an implementation? If so, what cloud vendor(s) cater to this type of application? I ask specifically, in terms of: 1) pricing 2) latency resulting from: - slow CPUs, instance creations, JIT compiles, etc.. - internal management and communication of processes inside the cloud (e.g. a queuing process and a calculation process) - communication between cloud and end user 3) ease of deployment A usage scenario I am expecting is: - A typical user sends a query (XML of size around 1K) once every 30 seconds on average. - Each query requires a numerical computation of average time 0.2 sec and max time 1 sec on a 1 GHz Pentium. The computation requires no data other than the query itself and is performed by the same piece of code each time. - The delay a user experiences between sending a query and receiving a response should be on average no more than 2 seconds and in general no more than 5 seconds. - A background save to a DB of the response should occur (not time critical) - There can be up to 30000 simultaneous users - i.e., on average 1000 queries a second, each requiring an average 0.2 sec calculation, so that would necessitate around 200 CPUs. Currently I'm look at GAE Java (for quicker deployment and less IT hassle) and EC2 (Speed and price optimization) as options. Where can I learn more about the right way to set ups such a system? past threads, different blogs, books, etc.. BTW, if my terminology is wrong or confusing, please let me know. I'd greatly appreciate any help.

    Read the article

  • Determing if an unordered vector<T> has all unique elements

    - by Hooked
    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how this might be done: The first using a set: template <class T> bool is_unique(vector<T> X) { set<T> Y(X.begin(), X.end()); return X.size() == Y.size(); } The second looping over the elements: template <class T> bool is_unique2(vector<T> X) { typename vector<T>::iterator i,j; for(i=X.begin();i!=X.end();++i) { for(j=i+1;j!=X.end();++j) { if(*i == *j) return 0; } } return 1; } I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique. Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?

    Read the article

  • C#: Thread.CurrentThread.CurrentUICulture not working consistently

    - by xTRUMANx
    I've been working on a pet project on the weekends to learn more about C# and have encountered an odd problem when working with localization. To be more specific, the problem I have is with System.Threading.Thread.CurrentThread.CurrentUICulture. I've set up my app so that the user can quickly change the language of the app by clicking a menu item. The menu item in turn, saves the two-letter code for the language (e.g. "en", "fr", etc.) in a user setting called 'Language' and then restarts the application. Properties.Settings.Default.Language = "en"; Properties.Settings.Default.Save(); Application.Restart(); When the application is started up, the first line of code in the Form's constructor (even before InitializeComponent()) fetches the Language string from the settings and sets the CurrentUICulture like so: public Form1() { Thread.CurrentThread.CurrentUICulture = new CultureInfo(Properties.Settings.Default.Language); InitializeComponent(); } The thing is, this doesn't work consistently. Sometimes, all works well and the application loads the correct language based on the string saved in the settings file. Other times, it doesn't, and the language remains the same after the application is restarted. At first I thought that I didn't save the language before restarting the application but that is definitely not the case. When the correct language fails to load, if I were to close the application and run it again, the correct language would come up correctly. So this implies that the Language string has been saved but the CurrentUICulture assignment in my form constructor is having no effect sometimes. Any help? Is there something I'm missing of how threading works in C#? This could be machine-specific, so if it makes any difference I'm using Pentium Dual-Core CPU.

    Read the article

  • Drawing line graphics leads Flash to spiral out of control!

    - by drpepper
    Hi, I'm having problems with some AS3 code that simply draws on a Sprite's Graphics object. The drawing happens as part of a larger procedure called on every ENTER_FRAME event of the stage. Flash neither crashes nor returns an error. Instead, it starts running at 100% CPU and grabs all the memory that it can, until I kill the process manually or my computer buckles under the pressure when it gets up to around 2-3 GB. This will happen at a random time, and without any noticiple slowdown beforehand. WTF? Has anyone seen anything like this? PS: I used to do the drawing within a MOUSE_MOVE event handler, which brought this problem on even faster. PPS: I'm developing on Linux, but reproduced the same problem on Windows. UPDATE: You asked for some code, so here we are. The drawing function looks like this: public static function drawDashedLine(i_graphics : Graphics, i_from : Point, i_to : Point, i_on : Number, i_off : Number) : void { const vecLength : Number = Point.distance(i_from, i_to); i_graphics.moveTo(i_from.x, i_from.y); var dist : Number = 0; var lineIsOn : Boolean = true; while(dist < vecLength) { dist = Math.min(vecLength, dist + (lineIsOn ? i_on : i_off)); const p : Point = Point.interpolate(i_from, i_to, 1 - dist / vecLength); if(lineIsOn) i_graphics.lineTo(p.x, p.y); else i_graphics.moveTo(p.x, p.y); lineIsOn = !lineIsOn; } } and is called like this (m_graphicsLayer is a Sprite): m_graphicsLayer.graphics.clear(); if (m_destinationPoint) { m_graphicsLayer.graphics.lineStyle(2, m_fixedAim ? 0xff0000 : 0x333333, 1); drawDashedLine(m_graphicsLayer.graphics, m_initialPos, m_destinationPoint, 10, 10); }

    Read the article

  • Stable/repeatable random sort (MySQL, Rails)

    - by Matt Rogish
    I'd like to paginate through a randomly sorted list of ActiveRecord models (rows from MySQL database). However, this randomization needs to persist on a per-session basis, so that other people that visit the website also receive a random, paginate-able list of records. Let's say there are enough entities (tens of thousands) that storing the randomly sorted ID values in either the session or a cookie is too large, so I must temporarily persist it in some other way (MySQL, file, etc.). Initially I thought I could create a function based on the session ID and the page ID (returning the object IDs for that page) however since the object ID values in MySQL are not sequential (there are gaps), that seemed to fall apart as I was poking at it. The nice thing is that it would require no/minimal storage but the downsides are that it is likely pretty complex to implement and probably CPU intensive. My feeling is I should create an intersection table, something like: random_sorts( sort_id, created_at, user_id NULL if guest) random_sort_items( sort_id, item_id, position ) And then simply store the 'sort_id' in the session. Then, I can paginate the random_sorts WHERE sort_id = n ORDER BY position LIMIT... as usual. Of course, I'd have to put some sort of a reaper in there to remove them after some period of inactivity (based on random_sorts.created_at). Unfortunately, I'd have to invalidate the sort as new objects were created (and/or old objects being removed, although deletion is very rare). And, as load increases the size/performance of this table (even properly indexed) drops. It seems like this ought to be a solved problem but I can't find any rails plugins that do this... Any ideas? Thanks!!

    Read the article

  • Was Visual Studio 2008 or 2010 written to use multi cores?

    - by Erx_VB.NExT.Coder
    basically i want to know if the visual studio IDE and/or compiler in 2010 was written to make use of a multi core environment (i understand we can target multi core environments in 08 and 10, but that is not my question). i am trying to decide on if i should get a higher clock dual core or a lower clock quad core, as i want to try and figure out which processor will give me the absolute best possible experience with Visual Studio 2010 (ide and background compiler). if they are running the most important section (background compiler and other ide tasks) in one core, then the core will get cut off quicker if running a quad core, esp if background compiler is the heaviest task, i would imagine this would b e difficult to seperate in more then one process, so even if it uses multi cores you might still be better off with going for a higher clock cpu if the majority of the processing is still bound to occur in one core (ie the most significant part of the VS environment). i am a vb programmer, they've made great performance improvements in beta 2, congrats, but i would love to be able to use VS seamlessly... anyone have any ideas? thanks, erx

    Read the article

  • Effective optimization strategies on modern C++ compilers

    - by user168715
    I'm working on scientific code that is very performance-critical. An initial version of the code has been written and tested, and now, with profiler in hand, it's time to start shaving cycles from the hot spots. It's well-known that some optimizations, e.g. loop unrolling, are handled these days much more effectively by the compiler than by a programmer meddling by hand. Which techniques are still worthwhile? Obviously, I'll run everything I try through a profiler, but if there's conventional wisdom as to what tends to work and what doesn't, it would save me significant time. I know that optimization is very compiler- and architecture- dependent. I'm using Intel's C++ compiler targeting the Core 2 Duo, but I'm also interested in what works well for gcc, or for "any modern compiler." Here are some concrete ideas I'm considering: Is there any benefit to replacing STL containers/algorithms with hand-rolled ones? In particular, my program includes a very large priority queue (currently a std::priority_queue) whose manipulation is taking a lot of total time. Is this something worth looking into, or is the STL implementation already likely the fastest possible? Along similar lines, for std::vectors whose needed sizes are unknown but have a reasonably small upper bound, is it profitable to replace them with statically-allocated arrays? I've found that dynamic memory allocation is often a severe bottleneck, and that eliminating it can lead to significant speedups. As a consequence I'm interesting in the performance tradeoffs of returning large temporary data structures by value vs. returning by pointer vs. passing the result in by reference. Is there a way to reliably determine whether or not the compiler will use RVO for a given method (assuming the caller doesn't need to modify the result, of course)? How cache-aware do compilers tend to be? For example, is it worth looking into reordering nested loops? Given the scientific nature of the program, floating-point numbers are used everywhere. A significant bottleneck in my code used to be conversions from floating point to integers: the compiler would emit code to save the current rounding mode, change it, perform the conversion, then restore the old rounding mode --- even though nothing in the program ever changed the rounding mode! Disabling this behavior significantly sped up my code. Are there any similar floating-point-related gotchas I should be aware of? One consequence of C++ being compiled and linked separately is that the compiler is unable to do what would seem to be very simple optimizations, such as move method calls like strlen() out of the termination conditions of loop. Are there any optimization like this one that I should look out for because they can't be done by the compiler and must be done by hand? On the flip side, are there any techniques I should avoid because they are likely to interfere with the compiler's ability to automatically optimize code? Lastly, to nip certain kinds of answers in the bud: I understand that optimization has a cost in terms of complexity, reliability, and maintainability. For this particular application, increased performance is worth these costs. I understand that the best optimizations are often to improve the high-level algorithms, and this has already been done.

    Read the article

  • C++ pimpl idiom wastes an instruction vs. C style?

    - by Rob
    (Yes, I know that one machine instruction usually doesn't matter. I'm asking this question because I want to understand the pimpl idiom, and use it in the best possible way; and because sometimes I do care about one machine instruction.) In the sample code below, there are two classes, Thing and OtherThing. Users would include "thing.hh". Thing uses the pimpl idiom to hide it's implementation. OtherThing uses a C style – non-member functions that return and take pointers. This style produces slightly better machine code. I'm wondering: is there a way to use C++ style – ie, make the functions into member functions – and yet still save the machine instruction. I like this style because it doesn't pollute the namespace outside the class. Note: I'm only looking at calling member functions (in this case, calc). I'm not looking at object allocation. Below are the files, commands, and the machine code, on my Mac. thing.hh: class ThingImpl; class Thing { ThingImpl *impl; public: Thing(); int calc(); }; class OtherThing; OtherThing *make_other(); int calc(OtherThing *); thing.cc: #include "thing.hh" struct ThingImpl { int x; }; Thing::Thing() { impl = new ThingImpl; impl->x = 5; } int Thing::calc() { return impl->x + 1; } struct OtherThing { int x; }; OtherThing *make_other() { OtherThing *t = new OtherThing; t->x = 5; } int calc(OtherThing *t) { return t->x + 1; } main.cc (just to test the code actually works...) #include "thing.hh" #include <cstdio> int main() { Thing *t = new Thing; printf("calc: %d\n", t->calc()); OtherThing *t2 = make_other(); printf("calc: %d\n", calc(t2)); } Makefile: all: main thing.o : thing.cc thing.hh g++ -fomit-frame-pointer -O2 -c thing.cc main.o : main.cc thing.hh g++ -fomit-frame-pointer -O2 -c main.cc main: main.o thing.o g++ -O2 -o $@ $^ clean: rm *.o rm main Run make and then look at the machine code. On the mac I use otool -tv thing.o | c++filt. On linux I think it's objdump -d thing.o. Here is the relevant output: Thing::calc(): 0000000000000000 movq (%rdi),%rax 0000000000000003 movl (%rax),%eax 0000000000000005 incl %eax 0000000000000007 ret calc(OtherThing*): 0000000000000010 movl (%rdi),%eax 0000000000000012 incl %eax 0000000000000014 ret Notice the extra instruction because of the pointer indirection. The first function looks up two fields (impl, then x), while the second only needs to get x. What can be done?

    Read the article

< Previous Page | 230 231 232 233 234 235 236 237 238 239 240 241  | Next Page >