Search Results

Search found 3265 results on 131 pages for 'parallel coordinates'.

  • No speed-up with useless printf's using OpenMP

    - by t2k32316
    I just wrote my first OpenMP program that parallelizes a simple for loop. I ran the code on my dual core machine and saw some speed up when going from 1 thread to 2 threads. However, I ran the same code on a school linux server and saw no speed-up. After trying different things, I finally realized that removing some useless printf statements caused the code to have significant speed-up. Below is the main part of the code that I parallelized: #pragma omp parallel for private(i) for(i = 2; i <= n; i++) { printf("useless statement"); prime[i-2] = is_prime(i); } I guess that the implementation of printf has significant overhead that OpenMP must be duplicating with each thread. What causes this overhead and why can OpenMP not overcome it?

    Read the article

  • object won't die (still references to it that I can't find)

    - by user288558
    I'm using parallel-python and start a new job server in a function. After the function ends, the server still exists even though I didn't return it from the function (I used weakref to test this). I guess there are still references to this object somewhere. My two theories: it starts threads, and it logs to the root logger. My questions: Can I somehow find out in which namespace there is still a reference to this object? I have the weakref reference. Does anyone know how to detach a logger? What other debugging suggestions do people have? Here is my test code:

        def pptester():
            js = pp.Server(ppservers=nodes)
            js.set_ncpus(0)
            fh = file('tmp.tmp.tmp', 'w')
            tmp = []
            for i in range(200):
                tmp.append(js.submit(ppworktest, (), (), ('os', 'subprocess')))
            js.print_stats()
            return weakref.ref(js)

    Thanks in advance, Wolfgang

    Read the article

  • Running Awk command on a cluster

    - by alex
    How do you execute a Unix shell command (an awk script, a pipe, etc.) on a cluster in parallel (step 1) and collect the results back on a central node (step 2)? Hadoop seems to be huge overkill with its 600k LOC, and its performance is terrible (it takes minutes just to initialize a job). I don't need shared memory or something like MPI/OpenMP, as I don't need to synchronize or share anything, and I don't need a distributed VM or anything as complex. Google's SawZall seems to work only with Google's proprietary MapReduce API. Some distributed shell packages I found failed to compile. But there must be a simple way to run a data-centric batch job on a cluster, something as close as possible to the native OS, maybe using Unix RPC calls. I liked rsync's simplicity, but it seems to update remote nodes sequentially, and you can't use it for executing scripts as far as I know. Switching to Plan 9 or some other network-oriented OS looks like more overkill. I'm looking for a simple, distributed way to run awk scripts or similar, as close as possible to the data, with minimal initialization overhead, in a nothing-shared, nothing-synchronized fashion. Thanks, Alex

    Read the article

  • RabbitMQ serializing messages from queue with multiple consumers

    - by Refefer
    Hi there, I'm having a problem where I have a queue set up in shared mode and multiple consumers bound to it. The issue is that it appears that rabbitmq is serializing the messages, that is, only one consumer at a time is able to run. I need this to be parallel, however, I can't seem to figure out how. Each consumer is running in its own process. There are plenty of messages in the queue. I'm using py-amqplib to interface with RabbitMQ. Any thoughts?

    Read the article

  • Linker library for OpenMP for Snow Leopard?

    - by unknownthreat
    Currently, I am trying out OpenMP on XCode 3.2.2 on Snow Leopard: #include <omp.h> #include <iostream> #include <stdio.h> int main (int argc, char * const argv[]) { #pragma omp parallel printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(), omp_get_num_threads()); return 0; } I didn't include any linking libraries yet, so the linker complains: "_omp_get_thread_num", referenced from: _main in main.o "_omp_get_num_threads", referenced from: _main in main.o OK, fine, no problem, I take a look in the existing framework, looking for keywords such as openmp or omp... here comes the problem, where is the linking library? Or should I say, what is the name of the linking library for openMP? Is it dylib, framework or what? Or do I need to get it from somewhere first?

    Read the article

  • Expert system for writing programs?

    - by aaa
    I am brainstorming an idea of developing a high level software to manipulate matrix algebra equations, tensor manipulations to be exact, to produce optimized C++ code using several criteria such as sizes of dimensions, available memory on the system, etc. Something which is similar in spirit to tensor contraction engine, TCE, but specifically oriented towards producing optimized rather than general code. The end result desired is software which is expert in producing parallel program in my domain. Does this sort of development fall on the category of expert systems? What other projects out there work in the same area of producing code given the constraints?

    Read the article

  • Cilk or Cilk++ or OpenMP

    - by Aman Deep Gautam
    I'm creating a multi-threaded application on Linux. Here is the scenario: suppose I have x instances of a class BloomFilter and y GB of data (more than the available memory). I need to test membership of this data in each of the Bloom filter instances. It is pretty clear that parallel programming will help speed up the task; moreover, since I am only reading the data, it can be shared across all processes or threads. I am confused about which one to use: Cilk, Cilk++ or OpenMP (which one is better?). I am also unsure whether to go for multithreading or multiprocessing.

    Read the article

  • What's a good algorithm for searching arrays N and M, in order to find elements in N that also exist

    - by GenTiradentes
    I have two arrays, N and M. they are both arbitrarily sized, though N is usually smaller than M. I want to find out what elements in N also exist in M, in the fastest way possible. To give you an example of one possible instance of the program, N is an array 12 units in size, and M is an array 1,000 units in size. I want to find which elements in N also exist in M. (There may not be any matches.) The more parallel the solution, the better. I used to use a hash map for this, but it's not quite as efficient as I'd like it to be. Typing this out, I just thought of running a binary search of M on sizeof(N) independent threads. (Using CUDA) I'll see how this works, though other suggestions are welcome.
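
    The two approaches mentioned above (a hash-based lookup and an independent binary search per element of N) can be sketched as follows. This is a minimal CPU-side illustration, not a CUDA implementation; the function names are made up for the example, and the binary-search variant assumes M is already sorted.

```python
from bisect import bisect_left


def common_by_set(n_items, m_items):
    """Elements of N that also occur in M, in N's order."""
    m_set = set(m_items)
    return [x for x in n_items if x in m_set]


def common_by_bisect(n_items, m_sorted):
    """Same result, but assumes m_sorted is sorted in ascending order."""
    found = []
    for x in n_items:
        # Each lookup is independent of the others, so this loop body
        # is the part that could be handed to one thread per element.
        i = bisect_left(m_sorted, x)
        if i < len(m_sorted) and m_sorted[i] == x:
            found.append(x)
    return found


N = [5, 42, 7, 1000]
M = list(range(0, 2000, 2))  # 1,000 even numbers, already sorted

matches_set = common_by_set(N, M)
matches_bisect = common_by_bisect(N, M)
```

    Because every lookup in common_by_bisect reads M without modifying it, the per-element searches need no synchronization, which is what makes the one-thread-per-element idea plausible.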

    Read the article

  • Physical Cores vs Virtual Cores in Parallelism

    - by Code Curiosity
    When it comes to virtualization, I have been deliberating on the relationship between physical cores and virtual cores, especially in how it affects applications employing parallelism. For example, in a VM scenario, if there are fewer physical cores than virtual cores (if that's possible), what effect or limits does that place on the application's parallel processing? I'm asking because, in my environment, the physical architecture is not disclosed. Is there still much advantage to parallelizing if the application lives on a dual-core VM hosted on a single-core physical machine?

    Read the article

  • Activate thread synchronously

    - by mayap
    Hi all, I'm using the .NET 4.0 parallel library. The tasks I execute ask to run other tasks, sometimes synchronously and sometimes asynchronously, depending on conditions that are not known in advance. For an async call, I simply create a new task and that's it. I don't know how to handle a sync call: how do I run it on the same thread, given that sync tasks may also ask to execute sync tasks recursively? This whole issue is pretty new to me. Thanks in advance.

    Read the article

  • Ideas on frameworks in .NET that can be used for job processing and notifications

    - by Rajat Mehta
    Scenario: We have one instance of a WCF Windows service which exposes contracts like AddNewJob(Job job), GetJobs(JobQuery query), etc. This service is consumed by 70-100 instances of a client, which is a Windows Forms based .NET app. Typically the service receives 50-100 inward calls per minute to add or query jobs that are stored in a table on SQL Server. The same service is also responsible for processing these jobs in real time. It queries the database every 5 seconds, picks up the queued jobs and starts processing them. A job has 7 states: Queued, Pre-processing, Processing, Post-processing, Completed, Failed, Locked. Another responsibility of this service is to update all clients on every state change of every job. This means 200+ callbacks to clients per second. Question: This whole implementation is done using WCF duplex bindings and works perfectly fine with a small number of parallel jobs. The problem arises when we scale it up to 1000 jobs at a time. The notifications don't work as expected, it leads to memory overflow, etc. Is there any standard framework that can provide a clean infrastructure for handling this scenario? Apologies for the long explanation!

    Read the article

  • Stack usage with MMX intrinsics and Microsoft C++

    - by arik-funke
    I have an inline assembler loop that cumulatively adds elements from an int32 data array with MMX instructions. In particular, it uses the fact that the MMX registers can accommodate 16 int32s to calculate 16 different cumulative sums in parallel. I would now like to convert this piece of code to MMX intrinsics, but I am afraid that I will suffer a performance penalty because one cannot explicitly instruct the compiler to use the 8 MMX registers to accumulate 16 independent sums. Can anybody comment on this and maybe propose a way to convert the piece of code below to use intrinsics?

        == inline assembler (only part within the loop) ==
        paddd mm0, [esi+edx+8*0] ; add first & second pair of int32 elements
        paddd mm1, [esi+edx+8*1] ; add third & fourth pair of int32 elements ...
        paddd mm2, [esi+edx+8*2]
        paddd mm3, [esi+edx+8*3]
        paddd mm4, [esi+edx+8*4]
        paddd mm5, [esi+edx+8*5]
        paddd mm6, [esi+edx+8*6]
        paddd mm7, [esi+edx+8*7] ; add 15th & 16th pair of int32 elements

    esi points to the beginning of the data array. edx provides the offset in the data array for the current loop iteration. The data array is arranged such that the elements for the 16 independent sums are interleaved.

    Read the article

  • How do I optimize this postfix expression tree for speed?

    - by Peter Stewart
    Thanks to the help I received in this post: I have a nice, concise recursive function to traverse a tree in postfix order: deque <char*> d; void Node::postfix() { if (left != __nullptr) { left->postfix(); } if (right != __nullptr) { right->postfix(); } d.push_front(cargo); return; }; This is an expression tree. The branch nodes are operators randomly selected from an array, and the leaf nodes are values or the variable 'x', also randomly selected from an array. char *values[10]={"1.0","2.0","3.0","4.0","5.0","6.0","7.0","8.0","9.0","x"}; char *ops[4]={"+","-","*","/"}; As this will be called billions of times during a run of the genetic algorithm of which it is a part, I'd like to optimize it for speed. I have a number of questions on this topic which I will ask in separate postings. The first is: how can I get access to each 'cargo' as it is found. That is: instead of pushing 'cargo' onto a deque, and then processing the deque to get the value, I'd like to start processing it right away. I don't yet know about parallel processing in c++, but this would ideally be done concurrently on two different processors. In python, I'd make the function a generator and access succeeding 'cargo's using .next(). But I'm using c++ to speed up the python implementation. I'm thinking that this kind of tree has been around for a long time, and somebody has probably optimized it already. Any Ideas? Thanks
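
    The generator idea mentioned above (consuming each 'cargo' as soon as it is found rather than after the whole traversal) looks roughly like this in Python. It is a sketch of the Python side only, not of the C++ port; the OPS table and the evaluate helper are illustrative additions, and the tokens are yielded in plain postfix order rather than pushed to the front of a deque.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


class Node:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right

    def postfix(self):
        """Yield cargos in postfix order, one at a time."""
        if self.left is not None:
            yield from self.left.postfix()
        if self.right is not None:
            yield from self.right.postfix()
        yield self.cargo


def evaluate(tokens, x):
    """Consume a postfix token stream with a stack; 'x' is the variable."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(x if tok == "x" else float(tok))
    return stack.pop()


# (x + 2.0) * 3.0
tree = Node("*", Node("+", Node("x"), Node("2.0")), Node("3.0"))
order = list(tree.postfix())
value = evaluate(tree.postfix(), x=4.0)
```

    Here evaluate starts working on the first token before the traversal has finished, which is the behaviour the question asks for; the two steps still run on one thread in this sketch.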

    Read the article

  • Thread management advice - Is TPL a good idea?

    - by Ian
    I'm hoping to get some advice on thread management and, hopefully, the Task Parallel Library, because I'm not sure I've been going down the correct route. It's probably best that I give an outline of what I'm trying to do. Given a Problem, I need to generate a Solution using a heuristic-based algorithm. I start off by calculating a base solution; I don't think this operation can be parallelised, so we don't need to worry about it. Once the initial solution has been generated, I want to trigger n threads, which attempt to find a better solution. These threads need to do a couple of things: They need to be initialized with a different 'optimization metric'. In other words, they are attempting to optimize different things, with a precedence level set within code. This means they all run slightly different calculation engines. I'm not sure if I can do this with the TPL. If one of the threads finds a better solution than the currently best known solution (which needs to be shared across all threads), then it needs to update the best solution and force a number of other threads to restart (again, this depends on the precedence levels of the optimization metrics). I may also wish to combine certain calculations across threads (e.g. keep a union of probabilities for a certain approach to the problem). This is probably more optional, though. The whole system obviously needs to be thread safe, and I want it to run as fast as possible. I tried an implementation that involved managing my own threads, shutting them down, etc., but it started getting quite complicated, and I'm now wondering if the TPL might be better. Can anyone offer any general guidance? Thanks...
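
    The "best known solution shared across all threads" requirement can be illustrated with a small lock-protected holder. The sketch is in Python rather than .NET, and BestSolution, worker and the scoring scheme are all hypothetical; it shows only the compare-and-update step, not the restarting of lower-precedence threads.

```python
import threading


class BestSolution:
    """Thread-safe holder for the best (value, score) pair seen so far."""

    def __init__(self, value, score):
        self._lock = threading.Lock()
        self._value = value
        self._score = score

    def offer(self, value, score):
        """Store the candidate if its score is strictly higher; return True if stored."""
        with self._lock:
            if score > self._score:
                self._value, self._score = value, score
                return True
            return False

    def snapshot(self):
        with self._lock:
            return self._value, self._score


def worker(best, worker_id):
    # Hypothetical search: each worker proposes five candidates.
    for step in range(5):
        best.offer(f"w{worker_id}-s{step}", worker_id * 10 + step)


best = BestSolution(value="base", score=-1)
threads = [threading.Thread(target=worker, args=(best, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
value, score = best.snapshot()
```

    Doing the comparison and the assignment under the same lock is the important part: checking the score outside the lock and updating inside it would let two threads overwrite each other's improvement.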

    Read the article

  • What hash algorithms are parallelizable? Optimizing the hashing of large files utilizing on multi-co

    - by DanO
    I'm interested in optimizing the hashing of some large files (optimizing wall clock time). The I/O has been optimized well enough already and the I/O device (local SSD) is only tapped at about 25% of capacity, while one of the CPU cores is completely maxed-out. I have more cores available, and in the future will likely have even more cores. So far I've only been able to tap into more cores if I happen to need multiple hashes of the same file, say an MD5 AND a SHA256 at the same time. I can use the same I/O stream to feed two or more hash algorithms, and I get the faster algorithms done for free (as far as wall clock time). As I understand most hash algorithms, each new bit changes the entire result, and it is inherently challenging/impossible to do in parallel. Are any of the mainstream hash algorithms parallelizable? Are there any non-mainstream hashes that are parallelizable (and that have at least a sample implementation available)? As future CPUs will trend toward more cores and a leveling off in clock speed, is there any way to improve the performance of file hashing? (other than liquid nitrogen cooled overclocking?) or is it inherently non-parallelizable?
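
    One common way around the sequential dependency is to hash fixed-size chunks independently and then hash the list of chunk digests. A minimal sketch with hashlib and a thread pool follows; the chunk size and the function names are arbitrary choices for the example, and the resulting digest is a different construction that does NOT equal the plain SHA-256 of the same data. It relies on CPython's hashlib releasing the GIL while hashing large buffers, so threads can use several cores.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB leaves; an arbitrary choice for this sketch


def _leaf_digest(chunk):
    return hashlib.sha256(chunk).digest()


def chunked_sha256(data, chunk_size=CHUNK_SIZE, workers=4):
    """SHA-256 over the concatenated SHA-256 digests of fixed-size chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the root hash is deterministic.
        leaf_digests = list(pool.map(_leaf_digest, chunks))
    root = hashlib.sha256()
    for digest in leaf_digests:
        root.update(digest)
    return root.hexdigest()


data = bytes(range(256)) * 8192  # 2 MiB of sample data, i.e. two chunks
digest = chunked_sha256(data)
```

    Both sides of a comparison have to agree on the chunk size, since changing it changes the digest. For real files the chunks would be read from disk rather than sliced from an in-memory buffer.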

    Read the article

  • OpenMP: Get total number of running threads

    - by Konrad Rudolph
    I need to know the total number of threads that my application has spawned via OpenMP. Unfortunately, the omp_get_num_threads() function does not work here since it only yields the number of threads in the current team. However, my code runs recursively (divide and conquer, basically) and I want to spawn new threads as long as there are still idle processors, but no more. Is there a way to get around the limitations of omp_get_num_threads and get the total number of running threads? If more detail is required, consider the following pseudo-code that models my workflow quite closely:

        function divide_and_conquer(Job job, int total_num_threads):
            if job.is_leaf():
                # Recurrence base case.
                job.process()
                return

            left, right = job.divide()

            current_num_threads = omp_get_num_threads()

            if current_num_threads < total_num_threads: # (1)
                #pragma omp parallel num_threads(2)
                    #pragma omp section
                    divide_and_conquer(left, total_num_threads)

                    #pragma omp section
                    divide_and_conquer(right, total_num_threads)
            else:
                divide_and_conquer(left, total_num_threads)
                divide_and_conquer(right, total_num_threads)

            job = merge(left, right)

    If I call this code with a total_num_threads value of 4, the conditional annotated with (1) will always evaluate to true (because each thread team will contain at most two threads) and thus the code will always spawn two new threads, no matter how many threads are already running at a higher level. I am searching for a platform-independent way of determining the total number of threads that are currently running in my application.

    Read the article

  • The best way to predict performance without actually porting the code?

    - by ardiyu07
    I believe there are people who have had the same experience as me: having to give an (estimated) performance report for porting a program from sequential to parallel on some designated multicore hardware, with very little time given. For instance, if a 10K LoC sequential program executes on an Intel i7-3770k (not vectorized) in 100 ms, how long would it take to run if one ported the code to a Tesla C2075 with NVIDIA CUDA, given that all kinds of parallelizing optimization techniques were applied? (But you're only given 2-4 days to report the performance; assume that you don't know the algorithm at all. Or perhaps it would be safer to just assume that it's impossible to finish the job in that time.) Therefore, I'm wondering: what would most likely be the fastest way to produce such a performance report? Is it safe to calculate it solely from the hardware's capabilities, such as peak GFLOPS and memory bandwidth? Is there a mathematical way to calculate it? If there is, please demonstrate your method with the corresponding problem description and algorithm, and also the target hardware's specifications. Or perhaps a tool already exists to (roughly) estimate the result of porting code? (Please don't answer: 'kill yourself is the fastest way.')

    Read the article

  • Why are there 3 conflicting OpenCV camera calibration formulas?

    - by John
    I'm having a problem with OpenCV's various parameterizations of coordinates used for camera calibration purposes. The problem is that three different sources of information on image distortion formulae apparently give three non-equivalent descriptions of the parameters and equations involved: (1) In their book "Learning OpenCV…" Bradski and Kaehler write regarding lens distortion (page 376): xcorrected = x * ( 1 + k1 * r^2 + k2 * r^4 + k3 * r^6 ) + [ 2 * p1 * x * y + p2 * ( r^2 + 2 * x^2 ) ], ycorrected = y * ( 1 + k1 * r^2 + k2 * r^4 + k3 * r^6 ) + [ p1 * ( r^2 + 2 * y^2 ) + 2 * p2 * x * y ], where r = sqrt( x^2 + y^2 ). Presumably, (x, y) are the coordinates of pixels in the uncorrected captured image corresponding to world-point objects with coordinates (X, Y, Z), camera-frame referenced, for which xcorrected = fx * ( X / Z ) + cx and ycorrected = fy * ( Y / Z ) + cy, where fx, fy, cx, and cy are the camera's intrinsic parameters. So, having (x, y) from a captured image, we can obtain the desired coordinates ( xcorrected, ycorrected ) to produce an undistorted image of the captured world scene by applying the above first two correction expressions. However... (2) The complication arises as we look at the OpenCV 2.0 C Reference entry under the Camera Calibration and 3D Reconstruction section. For ease of comparison we start with all world-point (X, Y, Z) coordinates being expressed with respect to the camera's reference frame, just as in #1. Consequently, the transformation matrix [ R | t ] is of no concern. In the C reference, it is expressed that: x' = X / Z, y' = Y / Z, x'' = x' * ( 1 + k1 * r'^2 + k2 * r'^4 + k3 * r'^6 ) + [ 2 * p1 * x' * y' + p2 * ( r'^2 + 2 * x'^2 ) ], y'' = y' * ( 1 + k1 * r'^2 + k2 * r'^4 + k3 * r'^6 ) + [ p1 * ( r'^2 + 2 * y'^2 ) + 2 * p2 * x' * y' ], where r' = sqrt( x'^2 + y'^2 ), and finally that u = fx * x'' + cx, v = fy * y'' + cy. 
As one can see these expressions are not equivalent to those presented in #1, with the result that the two sets of corrected coordinates ( xcorrected, ycorrected ) and ( u, v ) are not the same. Why the contradiction? It seems to me the first set makes more sense as I can attach physical meaning to each and every x and y in there, while I find no physical meaning in x' = X / Z and y' = Y / Z when the camera focal length is not exactly 1. Furthermore, one cannot compute x' and y' for we don't know (X, Y, Z). (3) Unfortunately, things get even murkier when we refer to the writings in Intel's Open Source Computer Vision Library Reference Manual's section Lens Distortion (page 6-4), which states in part: "Let ( u, v ) be true pixel image coordinates, that is, coordinates with ideal projection, and ( u ~, v ~ ) be corresponding real observed (distorted) image coordinates. Similarly, ( x, y ) are ideal (distortion-free) and ( x ~, y ~ ) are real (distorted) image physical coordinates. Taking into account two expansion terms gives the following: x ~ = x * ( 1 + k1 * r^2 + k2 * r^4 ) + [ 2 p1 * x * y + p2 * ( r^2 + 2 * x^2 ) ] y ~ = y * ( 1 + k1 * r^2 + k2 * r^4 ] + [ 2 p2 * x * y + p2 * ( r^2 + 2 * y^2 ) ], where r = sqrt( x^2 + y^2 ). ... "Because u ~ = cx + fx * u and v ~ = cy + fy * v , … the resultant system can be rewritten as follows: u ~ = u + ( u – cx ) * [ k1 * r^2 + k2 * r^4 + 2 * p1 * y + p2 * ( r^2 / x + 2 * x ) ] v ~ = v + ( v – cy ) * [ k1 * r^2 + k2 * r^4 + 2 * p2 * x + p1 * ( r^2 / y + 2 * y ) ] The latter relations are used to undistort images from the camera." Well, it would appear that the expressions involving x ~ and y ~ coincided with the two expressions given at the top of this writing involving xcorrected and ycorrected. However, x ~ and y ~ do not refer to corrected coordinates, according to the given description. 
I don't understand the distinction between the meaning of the coordinates ( x ~, y ~ ) and ( u ~, v ~ ), or for that matter, between the pairs ( x, y ) and ( u, v ). From their descriptions it appears their only distinction is that ( x ~, y ~ ) and ( x, y ) refer to 'physical' coordinates while ( u ~, v ~ ) and ( u, v ) do not. What is this distinction all about? Aren't they all physical coordinates? I'm lost! Thanks for any input!
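
    To make the chain of coordinates in formulation (2) concrete, here is a small sketch that follows the equations exactly as quoted from the C reference: camera-frame point, then normalized coordinates (x', y'), then distorted normalized coordinates (x'', y''), then pixel coordinates (u, v). The function names and the sample intrinsics are made up for the illustration, and nothing here calls OpenCV itself.

```python
def distort_normalized(xp, yp, k1, k2, k3, p1, p2):
    """Map ideal normalized coordinates (x', y') to distorted ones (x'', y'')."""
    r2 = xp * xp + yp * yp
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xpp = xp * radial + 2.0 * p1 * xp * yp + p2 * (r2 + 2.0 * xp * xp)
    ypp = yp * radial + p1 * (r2 + 2.0 * yp * yp) + 2.0 * p2 * xp * yp
    return xpp, ypp


def project(X, Y, Z, fx, fy, cx, cy, dist):
    """Camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    xp, yp = X / Z, Y / Z                 # focal length is NOT applied yet
    xpp, ypp = distort_normalized(xp, yp, *dist)
    return fx * xpp + cx, fy * ypp + cy   # focal length and centre applied last


no_distortion = (0.0, 0.0, 0.0, 0.0, 0.0)
ideal_pixel = project(1.0, 2.0, 4.0, 800.0, 800.0, 320.0, 240.0, no_distortion)
observed_pixel = project(1.0, 2.0, 4.0, 800.0, 800.0, 320.0, 240.0,
                         (0.1, 0.0, 0.0, 0.0, 0.0))
```

    Read this way, the polynomial acts on dimensionless normalized coordinates and maps the ideal position to the distorted (observed) one, with fx, fy, cx and cy applied only afterwards; that may account for part of the apparent disagreement with a reading in which x and y are pixel coordinates.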

    Read the article

  • Understanding normal maps on terrain

    - by JohnB
    I'm having trouble understanding some of the math behind normal map textures. Even though I've got it to work using borrowed code, I want to understand it. I have a terrain based on a heightmap. I generate a mesh of triangles at load time and render that mesh. For each vertex I need to calculate a normal, a tangent, and a bitangent. My understanding is as follows; have I got this right? The normal is a unit vector facing outwards from the surface of the triangle. For a vertex, I take the average of the normals of the triangles using that vertex. The tangent is a unit vector in the direction of the 'u' coordinate of the texture map. As my texture u,v coordinates follow the x and y coordinates of the terrain, my understanding is that this vector is simply the vector along the surface in the x direction, so I should be able to calculate it as the difference between vertices in the x direction (normalized). The bitangent is a unit vector in the direction of the 'v' coordinate of the texture map. By the same reasoning, this vector is simply the vector along the surface in the y direction, so I should be able to calculate it as the difference between vertices in the y direction (normalized). However, the code I have borrowed seems much more complicated than this and takes into account the actual values of u and v at each vertex, which I don't understand the need for, as they increase in exactly the same direction as x and y. I implemented what I described above, and it simply doesn't work: the normals are clearly not working for lighting. Have I misunderstood something? Or can someone explain to me the physical meaning of the tangent and bitangent vectors when applied to a mesh generated from a heightmap like this, when the u and v texture coordinates map along the x and y directions? Thanks for any help understanding this.
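
    The understanding described above can be written down directly for a regular grid. The sketch below assumes the height is stored along z as height[y][x], that u follows x and v follows y, and that the grid is at least 2x2; it uses central differences instead of averaging triangle normals, which is a simplification and not the borrowed code's method.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangent_frame(height, x, y, spacing=1.0):
    """Return (normal, tangent, bitangent) at grid point (x, y)."""
    rows, cols = len(height), len(height[0])
    # Central differences, clamped to the grid at the borders.
    x0, x1 = max(x - 1, 0), min(x + 1, cols - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, rows - 1)
    dhdx = (height[y][x1] - height[y][x0]) / ((x1 - x0) * spacing)
    dhdy = (height[y1][x] - height[y0][x]) / ((y1 - y0) * spacing)
    tangent = normalize((1.0, 0.0, dhdx))    # along the surface in +x (u)
    bitangent = normalize((0.0, 1.0, dhdy))  # along the surface in +y (v)
    normal = normalize(cross(tangent, bitangent))
    return normal, tangent, bitangent


ramp = [[0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0]]
normal, tangent, bitangent = tangent_frame(ramp, 1, 1)
```

    Note that the tangent and bitangent computed this way are not perpendicular to each other on a slope that rises in both x and y, and the sign of the cross product depends on the handedness of the coordinate system, so a flipped normal is one thing worth checking if lighting looks wrong.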

    Read the article

  • How can I improve my isometric tile-picking algorithm?

    - by Cypher
    I've spent the last few days researching isometric tile-picking algorithms (converting screen coordinates to tile coordinates), and have found a lot of the math beyond my grasp. I have come fairly close and what I have is workable, but I would like to improve on this algorithm as it's a little off and seems to pick down and to the right of the mouse pointer. I've uploaded a video to help visualize the current implementation: http://youtu.be/EqwWcq1zuaM My isometric rendering algorithm is based on what is found at this stackoverflow question's answer, with the exception that my x and y axes are inverted (x increases down-right, while y increases up-right). Here is where I am converting from screen to tiles:

        // these next few lines convert the mouse pointer position from screen
        // coordinates to tile-grid coordinates. cameraOffset captures the current
        // mouse location and takes into consideration the camera's position on screen.
        System.Drawing.Point cameraOffset = new System.Drawing.Point( 0, 0 );
        cameraOffset.X = mouseLocation.X + (int)camera.Left;
        cameraOffset.Y = ( mouseLocation.Y + (int)camera.Top );

        // the camera-aware mouse coordinates are then further converted in an attempt
        // to select only the "tile" portion of the grid tiles, instead of the entire
        // rectangle. this algorithm gets close, but could use improvement.
        mouseTileLocation.X = ( cameraOffset.X + 2 * cameraOffset.Y ) / Global.TileWidth;
        mouseTileLocation.Y = -( ( 2 * cameraOffset.Y - cameraOffset.X ) / Global.TileWidth );

    Things to make note of: mouseLocation is a System.Drawing.Point that represents the screen coordinates of the mouse pointer. cameraOffset is the screen position of the mouse pointer that includes the position of the game camera. mouseTileLocation is a System.Drawing.Point that is supposed to represent the tile coordinates of the mouse pointer. If you check out the above link to YouTube, you'll notice that the picking algorithm is off a bit. How can I improve on this?
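
    For comparison, here is a sketch of an exact screen-to-tile inverse for one common diamond layout. It is NOT the asker's projection: the tile size, the axis directions (x increasing down-right, y increasing down-left) and the choice of measuring from the top corner of each diamond are all assumptions made for the example. Two details that often cause a constant picking offset are shown: measuring from the diamond's corner rather than the sprite's top-left corner, and using floor instead of integer division that truncates toward zero.

```python
import math

TILE_W = 64  # assumed tile width in pixels
TILE_H = 32  # assumed tile height in pixels


def tile_to_screen(tx, ty):
    """Screen position of the top corner of tile (tx, ty)'s diamond."""
    return ((tx - ty) * TILE_W / 2, (tx + ty) * TILE_H / 2)


def screen_to_tile(sx, sy):
    """Inverse of tile_to_screen for any point inside a diamond."""
    a = sx / (TILE_W / 2)
    b = sy / (TILE_H / 2)
    # floor() keeps the mapping correct for negative coordinates, where
    # truncating division would round toward zero and pick a neighbour.
    return (math.floor((a + b) / 2), math.floor((b - a) / 2))


def tile_centre(tx, ty):
    cx, cy = tile_to_screen(tx, ty)
    return (cx, cy + TILE_H / 2)


picked = screen_to_tile(*tile_centre(3, 5))
picked_negative = screen_to_tile(*tile_centre(-1, 0))
```

    With truncation instead of floor, the second lookup would land on tile (0, 0) rather than (-1, 0), which gives the kind of consistent one-tile drift described in the question; whether that is the actual cause there cannot be told from the excerpt.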

    Read the article

  • Performing a Depth First Search iteratively using async/parallel processing?

    - by Prabhu
    Here is a method that does a DFS search and returns a list of all items given a top level item id. How could I modify this to take advantage of parallel processing? Currently, the call to get the sub items is made one by one for each item in the stack. It would be nice if I could get the sub items for multiple items in the stack at the same time, and populate my return list faster. How could I do this (either using async/await or TPL, or anything else) in a thread safe manner? private async Task<IList<Item>> GetItemsAsync(string topItemId) { var items = new List<Item>(); var topItem = await GetItemAsync(topItemId); Stack<Item> stack = new Stack<Item>(); stack.Push(topItem); while (stack.Count > 0) { var item = stack.Pop(); items.Add(item); var subItems = await GetSubItemsAsync(item.SubId); foreach (var subItem in subItems) { stack.Push(subItem); } } return items; } I was thinking of something along these lines, but it's not coming together: var tasks = stack.Select(async item => { items.Add(item); var subItems = await GetSubItemsAsync(item.SubId); foreach (var subItem in subItems) { stack.Push(subItem); } }).ToList(); if (tasks.Any()) await Task.WhenAll(tasks); The language I'm using is C#.
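
    One way to get concurrency here is to fetch the children of every pending item at once and process the tree level by level. The sketch below shows the idea with Python's asyncio rather than C#; CHILDREN and get_sub_items are hypothetical stand-ins for the real data source, and the result comes back in breadth-first order instead of the depth-first order of the original loop.

```python
import asyncio

# Hypothetical in-memory tree standing in for the remote item store.
CHILDREN = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}


async def get_sub_items(item_id):
    await asyncio.sleep(0)  # stands in for the I/O-bound call
    return CHILDREN[item_id]


async def get_items(top_item_id):
    items = []
    frontier = [top_item_id]
    while frontier:
        items.extend(frontier)
        # Fetch the children of every item in the current level concurrently.
        results = await asyncio.gather(*(get_sub_items(i) for i in frontier))
        frontier = [child for children in results for child in children]
    return items


result = asyncio.run(get_items("root"))
```

    Only the coroutine that awaits the gathered results touches the items and frontier lists, so no locking is needed; the concurrent part is limited to the fetches themselves.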

    Read the article

  • How can I determine the first visible tile in an isometric perspective?

    - by alekop
    I am trying to render the visible portion of a diamond-shaped isometric map. The "world" coordinate system is a 2D Cartesian system, with the coordinates increasing diagonally (in terms of the view coordinate system) along the axes. The "view" coordinates are simply mouse offsets relative to the upper left corner of the view. My rendering algorithm works by drawing diagonal spans, starting from the upper right corner of the view and moving diagonally to the right and down, advancing to the next row when it reaches the right view edge. When the rendering loop reaches the lower left corner, it stops. There are functions to convert a point from view coordinates to world coordinates and then to map coordinates. Everything works when rendering from tile 0,0, but as the view scrolls around, the rendering needs to start from a different tile. I can't figure out how to determine which tile is closest to the upper right corner. At the moment I am simply converting the coordinates of the upper right corner to map coordinates. This works as long as the view origin (upper right corner) is inside the world, but when approaching the edges of the map the starting tile coordinate obviously becomes invalid. I guess this boils down to asking "how can I find the intersection between the world X axis and the view X axis?"

    Read the article

  • Creating a voxel chunk with a VBO - How to translate the coordinates of each block and add it to the VBO chunk?

    - by sunsunsunsunsun
    I'm trying to make a voxel engine similar to Minecraft as a learning experience and a way to learn some OpenGL. I have created a chunk class and I want to put all of the vertices for the whole chunk into a single VBO. Previously I put each block into its own VBO and made one render call per block. I am a bit confused about how to translate the coordinates of each block in the chunk when I'm putting all vertices into one VBO. This is what I have at the moment:

        public void putVertices(float tx, float ty, float tz) {
            float l_length = 1.0f;
            float l_height = 1.0f;
            float l_width = 1.0f;
            vertexPositionData.put(new float[]{
                // top face (y = +l_height)
                xOffset + l_length + tx, l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, l_height + ty, zOffset + l_width + tz,
                xOffset + l_length + tx, l_height + ty, zOffset + l_width + tz,
                // bottom face (y = -l_height)
                xOffset + l_length + tx, -l_height + ty, zOffset + l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + -l_width + tz,
                xOffset + l_length + tx, -l_height + ty, zOffset + -l_width + tz,
                // front face (z = +l_width)
                xOffset + l_length + tx, l_height + ty, zOffset + l_width + tz,
                xOffset + -l_length + tx, l_height + ty, zOffset + l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + l_width + tz,
                xOffset + l_length + tx, -l_height + ty, zOffset + l_width + tz,
                // back face (z = -l_width)
                xOffset + l_length + tx, -l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, l_height + ty, zOffset + -l_width + tz,
                xOffset + l_length + tx, l_height + ty, zOffset + -l_width + tz,
                // left face (x = -l_length)
                xOffset + -l_length + tx, l_height + ty, zOffset + l_width + tz,
                xOffset + -l_length + tx, l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + -l_width + tz,
                xOffset + -l_length + tx, -l_height + ty, zOffset + l_width + tz,
                // right face (x = +l_length)
                xOffset + l_length + tx, l_height + ty, zOffset + -l_width + tz,
                xOffset + l_length + tx, l_height + ty, zOffset + l_width + tz,
                xOffset + l_length + tx, -l_height + ty, zOffset + l_width + tz,
                xOffset + l_length + tx, -l_height + ty, zOffset + -l_width + tz
            });
        }

        public void createChunk() {
            // 24 vertices per block, 3 floats per vertex
            vertexPositionData = BufferUtils.createFloatBuffer((24*3)*activateBlocks);
            Random random = new Random();
            for (int x = 0; x < CHUNK_SIZE; x++) {
                for (int y = 0; y < CHUNK_SIZE; y++) {
                    for (int z = 0; z < CHUNK_SIZE; z++) {
                        if (blocks[x][y][z].getActive()) {
                            putVertices(x*2.0f, y*2.0f, z*2.0f);
                        }
                    }
                }
            }
        }

    What's an easy way to translate the vertices of each block into its correct position? I was previously using glTranslatef with each call to render a block, but this won't work now. What I am doing now also does not work: the blocks all render in stacks on top of each other, and it looks like this: http://i.imgur.com/NyFtBTI.png Thanks
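    The translation step described in the question can be sketched on its own, outside OpenGL. The following is a minimal illustration, not the poster's class: `ChunkMeshSketch`, `CORNERS`, `FACES`, `putBlock`, `buildChunk`, `BLOCK_SIZE` and the chunk-origin parameters are all hypothetical names, and it uses `java.nio.FloatBuffer.allocate` instead of LWJGL's `BufferUtils.createFloatBuffer` so that it runs without any library. It assumes a cube with half-extent 1 (so blocks are spaced 2 apart, as in the question) and applies an origin offset on all three axes.

```java
import java.nio.FloatBuffer;

// Hypothetical sketch: bakes each block's translation into its vertices
// while filling one shared buffer for the whole chunk.
final class ChunkMeshSketch {
    static final float BLOCK_SIZE = 2.0f;        // cube spans -1..+1 on each axis
    static final int FLOATS_PER_BLOCK = 24 * 3;  // 6 faces * 4 vertices * xyz

    // The 8 corners of a cube centred on the origin.
    static final float[][] CORNERS = {
        {-1, -1, -1}, { 1, -1, -1}, { 1,  1, -1}, {-1,  1, -1},
        {-1, -1,  1}, { 1, -1,  1}, { 1,  1,  1}, {-1,  1,  1},
    };

    // Corner indices for each quad, in the same face order as the question.
    static final int[][] FACES = {
        {2, 3, 7, 6}, // top    (y = +1)
        {5, 4, 0, 1}, // bottom (y = -1)
        {6, 7, 4, 5}, // front  (z = +1)
        {1, 0, 3, 2}, // back   (z = -1)
        {7, 3, 0, 4}, // left   (x = -1)
        {2, 6, 5, 1}, // right  (x = +1)
    };

    // Appends one cube whose centre is at (tx, ty, tz).
    static void putBlock(FloatBuffer buf, float tx, float ty, float tz) {
        for (int[] face : FACES) {
            for (int corner : face) {
                float[] c = CORNERS[corner];
                buf.put(c[0] + tx).put(c[1] + ty).put(c[2] + tz);
            }
        }
    }

    // Builds the vertex data for every active block of a chunk whose
    // first block is centred at (originX, originY, originZ).
    static FloatBuffer buildChunk(boolean[][][] active,
                                  float originX, float originY, float originZ) {
        int count = 0;
        for (boolean[][] plane : active)
            for (boolean[] row : plane)
                for (boolean cell : row)
                    if (cell) count++;

        FloatBuffer buf = FloatBuffer.allocate(count * FLOATS_PER_BLOCK);
        for (int x = 0; x < active.length; x++)
            for (int y = 0; y < active[x].length; y++)
                for (int z = 0; z < active[x][y].length; z++)
                    if (active[x][y][z])
                        putBlock(buf,
                                 originX + x * BLOCK_SIZE,
                                 originY + y * BLOCK_SIZE,
                                 originZ + z * BLOCK_SIZE);

        buf.flip(); // rewind so a reader sees the data from position 0 to limit
        return buf;
    }

    public static void main(String[] args) {
        boolean[][][] active = new boolean[2][2][2];
        active[0][0][0] = true;
        active[1][1][1] = true;
        FloatBuffer buf = buildChunk(active, 0f, 0f, 0f);
        System.out.println("floats written: " + buf.remaining());
    }
}
```

    Keeping the corner table separate from the per-block offset makes it easy to check that all three components of every vertex receive the translation. The `flip()` call is included because a buffer that has just been filled has its position at the end; whether that matters in the question's code depends on how the buffer is later uploaded, which the excerpt does not show.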

    Read the article

  • Does my TPL partitioner cause a deadlock?

    - by Scott Chamberlain
    I am starting to write my first parallel applications. This partitioner will enumerate over an IDataReader, pulling chunkSize records at a time from the data source.

        protected class DataSourcePartitioner : System.Collections.Concurrent.Partitioner<object[]>
        {
            private readonly System.Data.IDataReader _Input;
            private readonly int _ChunkSize;

            public DataSourcePartitioner(System.Data.IDataReader input, int chunkSize = 10000)
                : base()
            {
                if (chunkSize < 1)
                    throw new ArgumentOutOfRangeException("chunkSize");
                _Input = input;
                _ChunkSize = chunkSize;
            }

            public override bool SupportsDynamicPartitions { get { return true; } }

            public override IList<IEnumerator<object[]>> GetPartitions(int partitionCount)
            {
                var dynamicPartitions = GetDynamicPartitions();
                var partitions = new IEnumerator<object[]>[partitionCount];
                for (int i = 0; i < partitionCount; i++)
                {
                    partitions[i] = dynamicPartitions.GetEnumerator();
                }
                return partitions;
            }

            public override IEnumerable<object[]> GetDynamicPartitions()
            {
                return new ListDynamicPartitions(_Input, _ChunkSize);
            }

            private class ListDynamicPartitions : IEnumerable<object[]>
            {
                private System.Data.IDataReader _Input;
                int _ChunkSize;
                private object _ChunkLock = new object();

                public ListDynamicPartitions(System.Data.IDataReader input, int chunkSize)
                {
                    _Input = input;
                    _ChunkSize = chunkSize;
                }

                public IEnumerator<object[]> GetEnumerator()
                {
                    while (true)
                    {
                        List<object[]> chunk = new List<object[]>(_ChunkSize);
                        // Read one chunk from the shared reader.
                        lock(_Input)
                        {
                            for (int i = 0; i < _ChunkSize; ++i)
                            {
                                if (!_Input.Read())
                                    break;
                                var values = new object[_Input.FieldCount];
                                _Input.GetValues(values);
                                chunk.Add(values);
                            }
                            if (chunk.Count == 0)
                                yield break;
                        }
                        var chunkEnumerator = chunk.GetEnumerator();
                        lock(_ChunkLock) //Will this cause a deadlock?
                        {
                            while (chunkEnumerator.MoveNext())
                            {
                                yield return chunkEnumerator.Current;
                            }
                        }
                    }
                }

                IEnumerator IEnumerable.GetEnumerator()
                {
                    return ((IEnumerable<object[]>)this).GetEnumerator();
                }
            }
        }

    I wanted the IEnumerable object it passes back to be thread safe (the .NET example was, so I am assuming PLINQ and TPL could need it). Will the lock on _ChunkLock near the bottom help provide thread safety, or will it cause a deadlock? From the documentation I could not tell whether the lock would be released on the yield return. Also, if there is built-in functionality in .NET that will do what I am trying to do, I would much rather use that. And if you find any other problems with the code, I would appreciate hearing about them.
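    The chunking idea in the question (take a lock only while pulling a batch from a shared, non-thread-safe source, then let each consumer work through its own batch) can be sketched independently of the TPL. The example below is written in Java purely for illustration and does not reproduce the .NET `Partitioner` API; `ChunkedSource`, `nextChunk` and `sumWithWorkers` are hypothetical names, and a plain `Iterator` stands in for the `IDataReader`. It is one possible arrangement under those assumptions, not a statement of how the poster's code behaves.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: hands out chunks from a shared iterator.
final class ChunkedSource<T> {
    private final Iterator<T> input;   // shared and not thread-safe
    private final int chunkSize;
    private final Object lock = new Object();

    ChunkedSource(Iterator<T> input, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize");
        this.input = input;
        this.chunkSize = chunkSize;
    }

    // Returns up to chunkSize items; an empty list means the source is drained.
    List<T> nextChunk() {
        List<T> chunk = new ArrayList<>(chunkSize);
        synchronized (lock) {          // held only while touching the shared input
            while (chunk.size() < chunkSize && input.hasNext()) {
                chunk.add(input.next());
            }
        }
        return chunk;                  // owned by the caller from here on
    }

    // Demo consumer: several threads drain the source and add up the values.
    static long sumWithWorkers(ChunkedSource<Integer> source, int workers) {
        AtomicLong total = new AtomicLong();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                List<Integer> chunk;
                while (!(chunk = source.nextChunk()).isEmpty()) {
                    long local = 0;
                    for (int value : chunk) local += value; // no lock needed here
                    total.addAndGet(local);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 25; i++) data.add(i);
        ChunkedSource<Integer> source = new ChunkedSource<>(data.iterator(), 10);
        System.out.println("sum = " + sumWithWorkers(source, 4));
    }
}
```

    In this arrangement each chunk is a list that only one worker ever sees, so iterating it needs no second lock; the single lock around the shared input is what keeps the reads consistent.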

    Read the article
