Search Results

Search found 5756 results on 231 pages for 'cpu utilization'.

Page 210/231 | < Previous Page | 206 207 208 209 210 211 212 213 214 215 216 217  | Next Page >

  • trouble calculating offset index into 3D array

    - by Derek
    Hello, I am writing a CUDA kernel to create a 3x3 covariance matrix for each location in the rows*cols main matrix. So that 3D matrix is rows*cols*9 in size, which i allocated in a single malloc accordingly. I need to access this in a single index value the 9 values of the 3x3 covariance matrix get their values set according to the appropriate row r and column c from some other 2D arrays. In other words - I need to calculate the appropriate index to access the 9 elements of the 3x3 covariance matrix, as well as the row and column offset of the 2D matrices that are inputs to the value, as well as the appropriate index for the storage array. i have tried to simplify it down to the following: //I am calling this kernel with 1D blocks who are 512 cols x 1row. TILE_WIDTH=512 int bx = blockIdx.x; int by = blockIdx.y; int tx = threadIdx.x; int ty = threadIdx.y; int r = by + ty; int c = bx*TILE_WIDTH + tx; int offset = r*cols+c; int ndx = r*cols*rows + c*cols; if((r < rows) && (c < cols)){ //this IF statement is trying to avoid the case where a threadblock went bigger than my original array..not sure if correct d_cov[ndx + 0] = otherArray[offset]; d_cov[ndx + 1] = otherArray[offset] d_cov[ndx + 2] = otherArray[offset] d_cov[ndx + 3] = otherArray[offset] d_cov[ndx + 4] = otherArray[offset] d_cov[ndx + 5] = otherArray[offset] d_cov[ndx + 6] = otherArray[offset] d_cov[ndx + 7] = otherArray[offset] d_cov[ndx + 8] = otherArray[offset] } When I check this array with the values calculated on the CPU, which loops over i=rows, j=cols, k = 1..9 The results do not match up. in other words d_cov[i*rows*cols + j*cols + k] != correctAnswer[i][j][k] Can anyone give me any tips on how to sovle this problem? Is it an indexing problem, or some other logic error?

    Read the article

  • Queuing using table or MSMQ?

    - by Lieven Cardoen
    A part of the application I'm working on is an swf that shows a test with some 80 questions. Each question is saved to a sql server through WebORB and asp.net. If a candidate finisheds the test, the session needs to be validated. Problem now is that sometimes 350 candidates finish their test at the same moment, and cpu on webserver and sql server explodes (350 validations concurrently). Now, how would I implement queuing here? In the database, there's a table that has a record for each session. One column holds the status. 1 is finished, 2 is validated. I could implement queuing in two ways (as I see it, maybe you have other propositions): A process that checks the table for records with status 1. If it finds one, it validates the session. So, sessions are validated one after one. If a candidate finishes its session, a message is sent to a MSMQ queue. Another process listens to the queue and validates sessions one after one. Now, What would be the best approach? Where do you start the process that will validate sessions? In your global.asax (application_start)? As a windows service? As an exe on the root of the website that is started in application_start? To me, using the table and looking for records with status 1 seems the easiest way.

    Read the article

  • Simple POSIX threads question

    - by Andy
    Hi, I have this POSIX thread: void subthread(void) { while(!quit_thread) { // do something ... // don't waste cpu cycles if(!quit_thread) usleep(500); } // free resources ... // tell main thread we're done quit_thread = FALSE; } Now I want to terminate subthread() from my main thread. I've tried the following: quit_thread = TRUE; // wait until subthread() has cleaned its resources while(quit_thread); But it does not work! The while() clause does never exit although my subthread clearly sets quit_thread to FALSE after having freed its resources! If I modify my shutdown code like this: quit_thread = TRUE; // wait until subthread() has cleaned its resources while(quit_thread) usleep(10); Then everything is working fine! Could someone explain to me why the first solution does not work and why the version with usleep(10) suddenly works? I know that this is not a pretty solution. I could use semaphores/signals for this but I'd like to learn something about multithreading, so I'd like to know why my first solution doesn't work. Thanks!

    Read the article

  • Deploy to web container, bundle web container or embed web container...

    - by Jason
    I am developing an application that needs to be as simple as possible to install for the end user. While the end users will likely be experience Linux users (or sales engineers), they don't really know anything about Tomcat, Jetty, etc, nor do I think they should. So I see 3 ways to deploy our applications. I should also state that this is the first app that I have had to deploy that had a web interface, so I haven't really faced this question before. First is to deploy the application into an existing web container. Since we only deploy to Suse or RedHat this seems easy enough to do. However, we're not big on the idea of multiple apps running in one web container. It makes it harder to take down just one app. The next option is to just bundle Tomcat or Jetty and have the startup/shutdown scripts launch our bundled web container. Or 3rd, embed.. This will probably provide the same user experience as the second option. I'm curious what others do when faced with this problem to make it as fool proof as possible on the end user. I've almost ruled out deploying into an existing web container as we often like to set per application resource limits and CPU affinity, which I believe would affect all apps deployed into a web container/app server and not just a specific application. Thank you.

    Read the article

  • Different programming languages possibilities

    - by b-gen-jack-o-neill
    Hello. This should be very simple question. There are many programming languages out there, compiled into machine code or managed code. I first started with ASM back in high school. Assembler is very nice, since you know what exactly CPU does. Next, (as you can see from my other questions here) I decided to learn C and C++. I choosed C becouse from what I read it is the language with output most close to assembler-written programs. But, what I want to know is, can any other Windows programming language out there call win32 API? To be exact, like C has its special header and functions for win32 api interactions, is this assumed to be some important part of programming language? Or are there any languages that have no support for calling win32 API, or just use console to IO and some functions for basic file IO? Becouse, for Windows programming with graphic output, it is essential to have acess to win32 API. I know this question might seem silly, but still please, help me, I ask for study porposes. Thanks.

    Read the article

  • c++ floating point precision loss: 3015/0.00025298219406977296

    - by SigTerm
    The problem. Microsoft Visual C++ 2005 compiler, 32bit windows xp sp3, amd 64 x2 cpu. Code: double a = 3015.0; double b = 0.00025298219406977296; //*((unsigned __int64*)(&a)) == 0x40a78e0000000000 //*((unsigned __int64*)(&b)) == 0x3f30945640000000 double f = a/b;//3015/0.00025298219406977296; the result of calculation (i.e. "f") is 11917835.000000000 (*((unsigned __int64*)(&f)) == 0x4166bb4160000000) although it should be 11917834.814763514 (i.e. *((unsigned __int64*)(&f)) == 0x4166bb415a128aef). I.e. fractional part is lost. Unfortunately, I need fractional part to be correct. Questions: 1) Why does this happen? 2) How can I fix the problem? Additional info: 0) The result is taken directly from "watch" window (it wasn't printed, and I didn't forget to set printing precision). I also provided hex dump of floating point variable, so I'm absolutely sure about calculation result. 1) The disassembly of f = a/b is: fld qword ptr [a] fdiv qword ptr [b] fstp qword ptr [f] 2) f = 3015/0.00025298219406977296; yields correct result (f == 11917834.814763514 , *((unsigned __int64*)(&f)) == 0x4166bb415a128aef ), but it looks like in this case result is simply calculated during compile-time: fld qword ptr [__real@4166bb415a128aef (828EA0h)] fstp qword ptr [f] So, how can I fix this problem? P.S. I've found a temporary workaround (i need only fractional part of division, so I simply use f = fmod(a/b)/b at the moment), but I still would like to know how to fix this problem properly - double precision is supposed to be 16 decimal digits, so such calculation isn't supposed to cause problems.

    Read the article

  • Standard term for a thread I/O reorder buffer?

    - by Crashworks
    I have a case where many threads all concurrently generate data that is ultimately written to one long, serial file. I need to somehow serialize these writes so that the file gets written in the right order. ie, I have an input queue of 2048 jobs j0..jn, each of which produces a chunk of data oi. The jobs run in parallel on, say, eight threads, but the output blocks have to appear in the file in the same order as the corresponding input blocks — the output file has to be in the order o0o1o2... The solution to this is pretty self evident: I need some kind of buffer that accumulates and writes the output blocks in the correct order, similar to a CPU reorder buffer in Tomasulo's algorithm, or to the way that TCP reassembles out-of-order packets before passing them to the application layer. Before I go code it, I'd like to do a quick literature search to see if there are any papers that have solved this problem in a particularly clever or efficient way, since I have severe realtime and memory constraints. I can't seem to find any papers describing this though; a Scholar search on every permutation of [threads, concurrent, reorder buffer, reassembly, io, serialize] hasn't yielded anything useful. I feel like I must just not be searching the right terms. Is there a common academic name or keyword for this kind of pattern that I can search on?

    Read the article

  • Why is CoRegisterClassObject creating two extra threads?

    - by Stijn Sanders
    I'm trying to fix a problem that only recently happens on a number of machine's on a VPN. They each run a client application I wrote that exposes a COM automation object. For some strange reason I haven't been able to discover yet, one thread in the application takes up all of the available CPU time, slowing other operation on the machine. In observing the application's strange behaviour, I've noticed it's the third thread started, and if I debug on my machine I notice the first call to CoRegisterClassObject created two extra threads. If the second of these two threads is the one that gets into an infinite loop, I'm not at all shure how to fix this. Where could I check next about what's wrong? Could it have started by one of the recent patches rolled out by Microsoft this last 'patch tuesday'? I had a go with ProcessExplorer to extract a stack trace of the thread: ntoskrnl.exe!ExReleaseResourceLite+0x1a3 ntoskrnl.exe!PsGetContextThread+0x329 WLDAP32.dll!Ordinal325+0x1231 WLDAP32.dll!Ordinal325+0x129e WLDAP32.dll!Ordinal325+0x1178 ntdll.dll!LdrInitializeThunk+0x24 ntdll.dll!LdrShutdownThread+0xe9 kernel32.dll!ExitThread+0x3e kernel32.dll!FreeLibraryAndExitThread+0x1e ole32.dll!StringFromGUID2+0x65d kernel32.dll!GetModuleFileNameA+0x1ba

    Read the article

  • How much is too much memory allocation in NDK?

    - by Maximus
    The NDK download page notes that, "Typical good candidates for the NDK are self-contained, CPU-intensive operations that don't allocate much memory, such as signal processing, physics simulation, and so on." I came from a C background and was excited to try to use the NDK to operate most of my OpenGL ES functions and any native functions related to physics, animation of vertices, etc... I'm finding that I'm relying quite a bit on Native code and wondering if I may be making some mistakes. I've had no trouble with testing at this point, but I'm curious if I may run into problems in the future. For example, I have game struct defined (somewhat like is seen in the San-Angeles example). I'm loading vertex information for objects dynamically (just what is needed for an active game area) so there's quite a bit of memory allocation happening for vertices, normals, texture coordinates, indices and texture graphic data... just to name the essentials. I'm quite careful about freeing what is allocated between game areas. Would I be safer setting some caps on array sizes or should I charge bravely forward as I'm going now?

    Read the article

  • What hash algorithms are parallelizable? Optimizing the hashing of large files utilizing on multi-co

    - by DanO
    I'm interested in optimizing the hashing of some large files (optimizing wall clock time). The I/O has been optimized well enough already and the I/O device (local SSD) is only tapped at about 25% of capacity, while one of the CPU cores is completely maxed-out. I have more cores available, and in the future will likely have even more cores. So far I've only been able to tap into more cores if I happen to need multiple hashes of the same file, say an MD5 AND a SHA256 at the same time. I can use the same I/O stream to feed two or more hash algorithms, and I get the faster algorithms done for free (as far as wall clock time). As I understand most hash algorithms, each new bit changes the entire result, and it is inherently challenging/impossible to do in parallel. Are any of the mainstream hash algorithms parallelizable? Are there any non-mainstream hashes that are parallelizable (and that have at least a sample implementation available)? As future CPUs will trend toward more cores and a leveling off in clock speed, is there any way to improve the performance of file hashing? (other than liquid nitrogen cooled overclocking?) or is it inherently non-parallelizable?

    Read the article

  • are mobile can be used as a devices to develop application

    - by Richa Media and services
    I say that we can work @ mobile using a technique and work with any IDE and use any OS without problem like work on visual studio and use Window 7 How it possible ? 1. We use Mobile like a CPU and Use a monitor to watch code. We use samsung's techniques to display on monitor. it's give signal to monitor wirelessly to display code on monitor 2 We use Wireless keyboard and Mouse (if user like USB then he also use USB keyboard and mouse) 3. We use a component inside of mobile to control all devices like internet , wi-fi bluethoth. by component user easily setup , control and use feature. 4 don't be confused. i am sure to say that we not use mobile to watch code on mobile screen and Mobile 's keyboard because it's too smaller to work so we use Monitor (LCD) to display code and a keyboard to work comfortably and freely. 5. what are you think if you see a developer who work using this way 6. it is not impossible. give me some feedback and suggestion about your thinking on this technologies.

    Read the article

  • DirectX: Game loop order, draw first and then handle input?

    - by Ricket
    I was just reading through the DirectX documentation and encountered something interesting in the page for IDirect3DDevice9::BeginScene : To enable maximal parallelism between the CPU and the graphics accelerator, it is advantageous to call IDirect3DDevice9::EndScene as far ahead of calling present as possible. I've been accustomed to writing my game loop to handle input and such, then draw. Do I have it backwards? Maybe the game loop should be more like this: (semi-pseudocode, obviously) while(running) { d3ddev->Clear(...); d3ddev->BeginScene(); // draw things d3ddev->EndScene(); // handle input // do any other processing // play sounds, etc. d3ddev->Present(NULL, NULL, NULL, NULL); } According to that sentence of the documentation, this loop would "enable maximal parallelism". Is this commonly done? Are there any downsides to ordering the game loop like this? I see no real problem with it after the first iteration... And I know the best way to know the actual speed increase of something like this is to actually benchmark it, but has anyone else already tried this and can you attest to any actual speed increase?

    Read the article

  • Was Visual Studio 2008 or 2010 written to use multi cores?

    - by Erx_VB.NExT.Coder
    basically i want to know if the visual studio IDE and/or compiler in 2010 was written to make use of a multi core environment (i understand we can target multi core environments in 08 and 10, but that is not my question). i am trying to decide on if i should get a higher clock dual core or a lower clock quad core, as i want to try and figure out which processor will give me the absolute best possible experience with Visual Studio 2010 (ide and background compiler). if they are running the most important section (background compiler and other ide tasks) in one core, then the core will get cut off quicker if running a quad core, esp if background compiler is the heaviest task, i would imagine this would b e difficult to seperate in more then one process, so even if it uses multi cores you might still be better off with going for a higher clock cpu if the majority of the processing is still bound to occur in one core (ie the most significant part of the VS environment). i am a vb programmer, they've made great performance improvements in beta 2, congrats, but i would love to be able to use VS seamlessly... anyone have any ideas? thanks, erx

    Read the article

  • Can knowing C actually hurt the code you write in higher level languages?

    - by Jurily
    The question seems settled, beaten to death even. Smart people have said smart things on the subject. To be a really good programmer, you need to know C. Or do you? I was enlightened twice this week. The first one made me realize that my assumptions don't go further than my knowledge behind them, and given the complexity of software running on my machine, that's almost non-existent. But what really drove it home was this Slashdot comment: The end result is that I notice the many naive ways in which traditional C "bare metal" programmers assume that higher level languages are implemented. They make bad "optimization" decisions in projects they influence, because they have no idea how a compiler works or how different a good runtime system may be from the naive macro-assembler model they understand. Then it hit me: C is just one more abstraction, like all others. Even the CPU itself is only an abstraction! I've just never seen it break, because I don't have the tools to measure it. I'm confused. Has my mind been mutilated beyond recovery, like Dijkstra said about BASIC? Am I living in a constant state of premature optimization? Is there hope for me, now that I realized I know nothing about anything? Is there anything to know, even? And why is it so fascinating, that everything I've written in the last five years might have been fundamentally wrong? To sum it up: is there any value in knowing more than the API docs tell me? EDIT: Made CW. Of course this also means now you must post examples of the interpreter/runtime optimizing better than we do :)

    Read the article

  • JAR files, don't they just bloat and slow Java down?

    - by Josamoto
    Okay, the question might seem dumb, but I'm asking it anyways. After struggling for hours to get a Spring + BlazeDS project up and running, I discovered that I was having problems with my project as a result of not including the right dependencies for Spring etc. There were .jars missing from my WEB-INF/lib folder, yes, silly me. After a while, I managed to get all the .jar files where they belong, and it comes at a whopping 12.5MB at that, and there's more than 30 of them! Which concerns me, but it probably and hopefully shouldn't be concerned. How does Java operate in terms of these JAR files, they do take up quite a bit of hard drive space, taking into account that it's compressed and compiled source code. So that can really quickly populate a lot of RAM and in an instant. My questions are: Does Java load an entire .jar file into memory when say for instance a class in that .jar is instantiated? What about stuff that's in the .jar that never gets used. Do .jars get cached somehow, for optimized application performance? When a single .jar is loaded, I understand that the thing sits in memory and is available across multiple HTTP requests (i.e. for the lifetime of the server instance running), unlike PHP where objects are created on the fly with each request, is this assumption correct? When using Spring, I'm thinking, I had to include all those fiddly .jars, wouldn't I just be better off just using native Java, with say at least and ORM solution like Hibernate? So far, Spring just took extra time configuring, extra hard drive space, extra memory, cpu consumption, so I'm concerned that the framework is going to cost too much application performance just to get for example, IoC implemented with my BlazeDS server. There still has to come ORM, a unit testing framework and bits and pieces here and there. It's just so easy to bloat up a project quickly and irresponsibly easily. Where do I draw the line?

    Read the article

  • Flash browser game - HTTP + PHP vs Socket + Something else

    - by Maurycy Zarzycki
    I am developing a non-real time browser RPG game (think Kingdom of Loathing) which would be played from within a Flash app. At first I just wanted to make the communication with server using simply URLLoader to tell PHP what I am doing, and using $_SESSION to store data needed in-between request. I wonder if it wouldn't be better to base it on a socket connection, an app residing on a server written in Java or Python. The problem is I have never ever written such an app so I have no idea how much I'd have to "shift" my thoughts from simple responding do request (like PHP) to continuously working application. I won't hide I am also concerned about the memory and CPU usage of such Server app, when for example there would be hundreds of users connected. I've done some research. I have tried to do some research, but thanks to my nil knowledge on the sockets subject I haven't found anything helpful. So, considering the fact I don't need real time data exchange, will it be wise to develop the server side part as socket server, not in plain ol' PHP?

    Read the article

  • Windows 2008 VPS hosting experiences

    - by Luke Bennett
    Whilst similar questions exist, I couldn't find any which quite match my request. I'm looking for hosting for some personal .NET projects which for various reasons I do not want to host on our servers at work. I need to be able to host multiple sites and for that reason I'm thinking of a VPS with RDP access for the time being - don't fancy shared hosting as I feel that doesn't offer me the flexbility and control I'm looking for. What experiences do people have of Windows 2008 VPS providers? I've come across a few possibilities although it seems a lot of places are still on Windows 2003 with 2008 'coming soon'. Is VPS the best way to go? Eventually (depending on how the projects take off) I intend to get a dedicated box but at this stage it's not cost-effective. Also, what are people's experiences of running SQL Server Express on a VPS? What would you say the minimum requirements are for CPU/memory? I know it's not going to be anywhere near as performant as SQL Server 2005/8 running on a dedicated box but I'm hoping it will be an acceptable starting point. Any other tips/advice also welcome! Edit: Forgot to mention, I'm ideally looking for UK hosting although I'm open to alternatives.

    Read the article

  • I need to make a multithreading program (python)

    - by Andreawu98
    import multiprocessing import time from itertools import product out_file = open("test.txt", 'w') P = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p','q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',] N = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] M = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] c = int(input("Insert the number of digits you want: ")) n = int(input("If you need number press 1: ")) m = int(input("If you need upper letters press 1: ")) i = [] if n == 1: P = P + N if m == 1: P = P + M then = time.time() def worker(): for i in product(P, repeat=c): #check every possibilities k = '' for z in range(0, c): # k = k + str(i[z]) # print each possibility in a txt without parentesis or comma out_file.write( k + '\n') # out_file.close() now = time.time() diff = str(now - then) # To see how long does it take print(diff) worker() time.sleep(10) # just to check console The code check every single possibility and print it out in a test.txt file. It works but I really can't understand how can I speed it up. I saw it use 1 core out of my quad core CPU so I thought Multi-threading might work even though I don't know how. Please help me. Sorry for my English, I am from Italy.

    Read the article

  • A generic C++ library that provides QtConcurrent functionality?

    - by Lucas
    QtConcurrent is awesome. I'll let the Qt docs speak for themselves: QtConcurrent includes functional programming style APIs for parallel list processing, including a MapReduce and FilterReduce implementation for shared-memory (non-distributed) systems, and classes for managing asynchronous computations in GUI applications. For instance, you give QtConcurrent::map() an iterable sequence and a function that accepts items of the type stored in the sequence, and that function is applied to all the items in the collection. This is done in a multi-threaded manner, with a thread pool equal to the number of logical CPU's on the system. There are plenty of other function in QtConcurrent, like filter(), filteredReduced() etc. The standard CompSci map/reduce functions and the like. I'm totally in love with this, but I'm starting work on an OSS project that will not be using the Qt framework. It's a library, and I don't want to force others to depend on such a large framework like Qt. I'm trying to keep external dependencies to a minimum (it's the decent thing to do). I'm looking for a generic C++ framework that provides me with the same/similar high-level primitives that QtConcurrent does. AFAIK boost has nothing like this (I may be wrong though). boost::thread is very low-level compared to what I'm looking for. I know C# has something very similar with their Parallel Extensions so I know this isn't a Qt-only idea. What do you suggest I use?

    Read the article

  • 3 questions about JSON-P format.

    - by Tristan
    Hi, I have to write a script which will be hosted on differents domains. This script has to get information from my server. So, stackoverflow's user told me that i have to use JSON-P format, which, after research, is what i'm going to do. (the data provided in JSON-P is for displaying some information hosted on my server on other website) How do I output JSON-P from my server ? Is it the same as the json_encode function from PHP How do i design the tree pattern for the output JSON-P (you know, like : ({"name" : "foo", "id" : "xxxxx", "blog" : "http://xxxxxx.com"}); can I steal this from my XML output ? (http://bit.ly/9kzBDP) Each time a visitor browse a website on which my widget is it'll make a request on my server, requesting the JSON-P data to display on the client side. It'll increase dramatically the CPU load (1 visitor on the website who will have the script = 1 SQL request on my server to output data), so is there any way to 'caching' the JSON-P information output to refresh it only one or twice a day and stores it into a 'file' (in which extension?). BUT on the other hand i would say that requesting the JSON-P data directly (without caching it) is a plus, because, websites which will integrates the script only want to display THEIR information and not the whole data. So, making a script with something like that: jQuery.getJSON("http://www.something.com/json-p/outpout?filter=5&callback=?", function(data) { ................); }); Where filter= the information the website wants to display What do you think ? Thank you very much ;)

    Read the article

  • Link failure with either abnormal memory consumption or LNK1106 in Visual Studio 2005.

    - by Corvin
    Hello, I am trying to build a solution for windows XP in Visual Studio 2005. This solution contains 81 projects (static libs, exe's, dlls) and is being successfully used by our partners. I copied the solution bundle from their repository and tried setting it up on 3 similar machines of people in our group. I was successful on two machines and the solution failed to build on my machine. The build on my machine encountered two problems: During a simple build creation of the biggest static library (about 522Mb in debug mode) would fail with the message "13libd\ui1d.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x20101879" Full solution rebuild creates this library, however when it comes to linking the library to main .exe file, devenv.exe spawns link.exe which consumes about 80Mb of physical memory and 250MB of virtual and spawns another link.exe, which does the same. This goes on until the system runs out of memory. On PCs of my colleagues where successful build could be performed, there is only one link.exe process which uses all the memory required for linking (about 500Mb physical). There is a plenty of hard drive space on my machine and the file system is NTFS. All three of our systems are similar - Core2Quad processors, 4Gb of RAM, Windows XP SP3. We are using Visual studio installed from the same source. I tried using a different RAM and CPU, using dedicated graphics adapter to eliminate possibility of video memory sharing influencing the build, putting solution files to different location, using different versions of VS 2005 (Professional, Standard and Team Suite), changing the amount of available virtual memory, running memtest86 and building the project from scratch (i.e. a clean bundle). I have read what MSDN says about LNK1106, none of the cases apply to me except for maybe "out of heap space", however I am not sure how I should fight this. The only idea that I have left is reinstalling the OS, however I am not sure that it would help and I am not sure that my situation wouldn't repeat itself on a different machine. Would anyone have any sort of advice for me? Thanks

    Read the article

  • own drawImage / drawLine in OpenGL

    - by Chrise
    I'm implementing some native 2D-draw functions in my graphics engine for android, but now there's another question coming up, when I observe the performance of my program. At the moment I'm implementing a drawLine/drawImage function. In summary, there are following different values for drawing each different line / image: the color the alpha value the width of the line rotation (only for images) size/scale (also for images) blending method (subrtract, add, normal-alpha) Now, when an imageLine is drawn, I put the CPU-calculated vertex-positions and uv-values for 6 vertices (2 triangles), into a Floatbuffer and draw it immediately with drawArrays, after passing information for drawing (color,alpha, etc.) via uniforms to the shader. When I draw an image, the pre-set VBO is directly drawn after passing information. The first fact I recognized, is: of course drawing Images is much faster, than imagelines (beacuse of VBOs), but also: I cannot pre-put vertex-data into a VBO for imageLines, because imageLines have no static shape like normal images (varying linelength, varying linewidth and the vertex positions of x1,y1 and x2,y2 change too often) That's why I use a normal Floatbuffer, instead of a VBO. So my question is: What's the best way for managing images, and other 2D-graphics functions. For me it's some kind of important, that the user of the engine is able to draw as many images/2D graphics as possible, without loosing to much performance. You can find the functions for drawing images, imagelines, rects, quads, etc. here: https://github.com/Chrise55/LLama3D/blob/master/Llama3DLibrary/src/com/llama3d/object/graphics/image/ImageBase.java Here an example how it looks with many images (testing artificial neural networks), it works fine, but already little bit slow with that many images... :(

    Read the article

  • PHP, MySQL, Memcache / Ajax Scaling Problem

    - by Jeff Andersen
    I'm building a ajax tic tac toe game in PHP/MySQL. The premise of the game is to be able to share a url like mygame.com/123 with your friends and you play multiple simultaneous games. The way I have it set up is that a file (reload.php) is being called every 3 seconds while the user is viewing their game board space. This reload.php builds their game boards and the output (html) replaces their current game board (thus showing games in which it is their turn) Initially I built it entirely with PHP/MySQL and had zero caching. A friend gave me a suggestion to try doing all of the temporary/quick read information through memcache (storing moves, and ID matchups) and then building the game boards from this information. My issue is that, both solutions encounter a wall when there is roughly 30-40 active users with roughly 40-50 games running. It is running on a VPS from VPS.net with 2 nodes. (Dedicated CPU: 1.2GHz, RAM: 752MB) Each call to reload.php peforms 3 selects and 2 insert queries. The size of the data being pulled is negligible. The same actions happen on index.php to build the boards for the initial visit. Now that the backstory is done, my question is: Would there be a bottleneck in that each user is polling the same file every 3 seconds to rebuild their gameboards, and that all users are sitting on index.php from which the AJAX calls are made within the HTML. If so, is it possible to spread the users' calls out over a set of files designated to building the game boards (eg. reload1.php 2, 3 etc) and direct users to the appropriate file. Would this relieve the pressure? A long winded explanation; however, I didn't have anywhere else to ask. Thanks very much for any insight.

    Read the article

  • Determing if an unordered vector<T> has all unique elements

    - by Hooked
    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how this might be done: The first using a set: template <class T> bool is_unique(vector<T> X) { set<T> Y(X.begin(), X.end()); return X.size() == Y.size(); } The second looping over the elements: template <class T> bool is_unique2(vector<T> X) { typename vector<T>::iterator i,j; for(i=X.begin();i!=X.end();++i) { for(j=i+1;j!=X.end();++j) { if(*i == *j) return 0; } } return 1; } I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique. Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?

    Read the article

  • How do software events work internally?

    - by Duddle
    Hello! I am a student of Computer Science and have learned many of the basic concepts of what is going on "under the hood" while a computer program is running. But recently I realized that I do not understand how software events work efficiently. In hardware, this is easy: instead of the processor "busy waiting" to see if something happened, the component sends an interrupt request. But how does this work in, for example, a mouse-over event? My guess is as follows: if the mouse sends a signal ("moved"), the operating system calculates its new position p, then checks what program is being drawn on the screen, tells that program position p, then the program itself checks what object is at p, checks if any event handlers are associated with said object and finally fires them. That sounds terribly inefficient to me, since a tiny mouse movement equates to a lot of cpu context switches (which I learned are relatively expensive). And then there are dozens of background applications that may want to do stuff of their own as well. Where is my intuition failing me? I realize that even "slow" 500MHz processors do 500 million operations per second, but still it seems too much work for such a simple event. Thanks in advance!

    Read the article

< Previous Page | 206 207 208 209 210 211 212 213 214 215 216 217  | Next Page >