tbb - Developer IT

how to implement intel's tbb::blocked_range2d in C++

- by Hristo

I'm trying to parallelize nested for loops with parellel_for() and the blocked_range2d from Intel's TBB using C++. The for loops look like this: for(int i = 0; i < N; ++i) { for(int j = 0; j < E[i]; ++j) { for(int k = 0; k < T; ++k) { score[k] += delta(i, trRating[k][i], exRating[j][i]); } } } ... and I am trying to do the following: class LoopBody { private: int *myscore; public: LoopBody(int *score) { myscore = score; } void operator()(const blocked_range2d<int> &r) const { for(int i = r.rows().begin(); i != r.rows().end(); ++i); for(int j = 0; j < E[i]; ++j) { for(int k = r.cols().begin(); k != r.cols().end(); ++k) { myscore[k] += foo(...); // uses i,j,k to look up indices in arrays } } } } }; void computeScores(int score[]) { parallel_for(blocked_range2d<int>(0, N, 0, T), LoopBody(score)); } ... but I am getting the following compile errors: error: identifier "i" is undefined for(int j = 0; j < E[i]; ++j) { ^ error: expected an identifier }; ^ I'm not really sure if I am doing this the right way, but any advice is appreciated. Also, this is my first time using Intel's TBB so I really don't know anything about it. Any ideas how make this work? Thanks, Hristo

Read the article

How can I build multiple processes with TBB?

- by Jackie

Now I plan to parallelize my sequential solver. I hope I could run several copies of my solver(maybe with different parameters) in parallel simultaneously on a multi-core computer. can I do this with TBB? The reason I ask this question is that the book says: do not introduce anything in your code that will not allow single-thread execution. Any experts can explain this issue? Thanks.

Read the article

How to sort TBB concurrent_vector or concurrent_queue?

- by Jackie

Now I have a solver in that I need to keep a set of self-defined data type objects in a concurrent_vector or queue. It has to be concurrent because the objects come from different threads.With this concurrent container, I hope to sort these objects, eliminate duplicates and send them back when other threads need them. However, I know TBB offers concurrent_vector and concurrent_queue which can be read and written concurrently from different threads. But how to sort the objects inside a container? Does everyone know how to do that? Thanks.

Read the article

How do I link against Intel TBB on Mac OS X with GCC?

- by SilverSun

I can't for the life of me figure out how to compile and link against the Intel TBB library on my Mac. I've run the commercial installer and the tbbvars.sh script but I can't figure this out. I have a feeling it is something really obvious and it's just been a bit too long since I've done this kind of thing. tbb_test.cpp #include <tbb/concurrent_queue.h> int main() { tbb::concurrent_queue<int> q; } g++ tbb_test.cpp -I /Library/Frameworks/TBB.framework/Headers -ltbb ...can't find the symbols. Cheers!

Read the article

How does one modify the thread scheduling behavior when using Threading Building Blocks (TBB)?

- by J Teller

Does anyone know how to modify the thread scheduling (specifically affinity) when using TBB? Doing a high level analysis on a simple parallel-for application, it seems like TBB is specifying the underlying threads' affinity in a way that reduces performance. Specifically, the cores I'm running on have hyper-threading enabled, and it looks like TBB is affinitizing threads to the same core even if there is a different core left completely unloaded. FWIW, I realize it's likely that TBB is doing the "right thing" and that changing the threads' affinity will only reduce performance. I'd just like to experiment with it to see if that's really the case.

Read the article

Combining TBB and IPP with Intel Parallel Studio

When performance really counts Parallel computing - Intel Parallel Studio - Hardware - Programming - Components

Read the article

Parallel Programming Essentials via the Intel TBB

An article meant to introduce and expand upon the Intel Threading Building Blocks threading library

Read the article

C++ Intel TBB : sortie de la version 3 de la bibliothèque open source pour le développement parallè

La bibliothèque open source TBB d'Intel pour programmer en parallèle vient de sortir en version 3 Intel vient d'annoncer aujourd'hui la sortie de la troisième version de sa bibliothèque TBB (thread building blocks). Cette bibliothèque C++, disponible en open source, a pour objectif de permettre de programmer en parallèle, afin d'accéder aux ressources des machines multi-coeurs actuels. Citation: Today, Intel released Intel® Threading Building Blocks (Intel® TBB) 3.0, a high-level parallel programming toolkit that ...

Read the article

Parallel Programming. Boost's MPI, OpenMP, TBB, or something else?

- by unknownthreat

Hello, I am totally a novice in parallel programming, but I do know how to program C++. Now, I am looking around for parallel programming library. I just want to give it a try, just for fun, and right now, I found 3 APIs, but I am not sure which one should I stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who have experienced with any of these 3 API (or any other parallelism API), could you please tell me the difference between these? Are there any factor to consider, like AMD or Intel architecture?

Read the article

Multi-Core Programming. Boost's MPI, OpenMP, TBB, or something else?

- by unknownthreat

Hello, I am totally a novice in Multi-Core Programming, but I do know how to program C++. Now, I am looking around for Multi-Core Programming library. I just want to give it a try, just for fun, and right now, I found 3 APIs, but I am not sure which one should I stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who have experienced with any of these 3 API (or any other API), could you please tell me the difference between these? Are there any factor to consider, like AMD or Intel architecture?

Read the article

A question about some code from TBB book, Thanks.

- by Jackie

I am reading the book: Intel Threading Building Blocks. I often have difficulties understanding them. For example,the following code is from the book(page 112): Node* AllocateNode() { Node* n; FreeListMutexType::scoped_lock lock; lock.acquire(FreeListMutex); n=FreeList; if(n) Freelist=n->next; lock.release(); if(!n) n=new Node(); return n; } There is other introduction regarding this code. I can not understand it. What does it means? How can I understand this book better? Thanks.

Read the article

A question about TBB/C++ code

- by Jackie

I am reading The thread building block book. I do not understand this piece of code: FibTask& a=*new(allocate_child()) FibTask(n-1,&x); FibTask& b=*new(allocate_child()) FibTask(n-2,&y); What do these directive mean? class object reference and new work together? Thanks for explanation. The following code is the defination of this class FibTask. class FibTask: public task { public: const long n; long* const sum; FibTask(long n_,long* sum_):n(n_),sum(sum_) {} task* execute() { if(n FibTask& a=*new(allocate_child()) FibTask(n-1,&x); FibTask& b=*new(allocate_child()) FibTask(n-2,&y); set_ref_count(3); spawn(b); spawn_and_wait_for_all(a); *sum=x+y; } return 0; } };

Read the article

Fast inter-thread communication mechanism

- by Stan

I need a fast inter-thread communication mechanism for passing work (void*) from TBB tasks to several workers which are running blocking operations. Currently I'm looking into using pipe()+libevent. Is there a faster and more elegant alternative for use with Intel Threading Building Blocks?

Read the article

Is it safe to spin on a volatile variable in user-mode threads?

- by yongsun

I'm not quite sure if it's safe to spin on a volatile variable in user-mode threads, to implement a light-weight spin_lock, I looked at the tbb source code, tbb_machine.h:170, //! Spin WHILE the value of the variable is equal to a given value /** T and U should be comparable types. */ template<typename T, typename U> void spin_wait_while_eq( const volatile T& location, U value ) { atomic_backoff backoff; while( location==value ) backoff.pause(); } And there is no fences in atomic_backoff class as I can see. While from other user-mode spin_lock implementation, most of them use CAS (Compare and Swap).

Read the article

How do i write tasks? (parallel code)

- by acidzombie24

I am impressed with intel thread building blocks. I like how i should write task and not thread code and i like how it works under the hood with my limited understanding (task are in a pool, there wont be 100 threads on 4cores, a task is not guaranteed to run because it isnt on its own thread and may be far into the pool. But it may be run with another related task so you cant do bad things like typical thread unsafe code). I wanted to know more about writing task. I like the 'Task-based Multithreading - How to Program for 100 cores' video here http://www.gdcvault.com/sponsor.php?sponsor_id=1 (currently second last link. WARNING it isnt 'great'). My fav part was 'solving the maze is better done in parallel' which is around the 48min mark (you can click the link on the left side. That part is really all you need to watch if any). However i like to see more code examples and some API of how to write task. Does anyone have a good resource? I have no idea how a class or pieces of code may look after pushing it onto a pool or how weird code may look when you need to make a copy of everything and how much of everything is pushed onto a pool.

Read the article

How to approach parallel processing of messages?

- by Dan

I am redesigning the messaging system for my app to use intel threading building blocks and am stumped trying to decide between two possible approaches. Basically, I have a sequence of message objects and for each message type, a sequence of handlers. For each message object, I apply each handler registered for that message objects type. The sequential version would be something like this (pseudocode): for each message in message_sequence <- SEQUENTIAL for each handler in (handler_table for message.type) apply handler to message <- SEQUENTIAL The first approach which I am considering processes the message objects in turn (sequentially) and applies the handlers concurrently. Pros: predictable ordering of messages (ie, we are guaranteed a FIFO processing order) (potentially) lower latency of processing each message Cons: more processing resources available than handlers for a single message type (bad parallelization) bad use of processor cache since message objects need to be copied for each handler to use large overhead for small handlers The pseudocode of this approach would be as follows: for each message in message_sequence <- SEQUENTIAL parallel_for each handler in (handler_table for message.type) apply handler to message <- PARALLEL The second approach is to process the messages in parallel and apply the handlers to each message sequentially. Pros: better use of processor cache (keeps the message object local to all handlers which will use it) small handlers don't impose as much overhead (as long as there are other handlers also to be run) more messages are expected than there are handlers, so the potential for parallelism is greater Cons: Unpredictable ordering - if message A is sent before message B, they may both be processed at the same time, or B may finish processing before all of A's handlers are finished (order is non-deterministic) The pseudocode is as follows: parallel_for each message in message_sequence <- PARALLEL for each handler in (handler_table for message.type) apply handler to message <- SEQUENTIAL The second approach has more advantages than the first, but non-deterministic ordering is a big disadvantage.. Which approach would you choose and why? Are there any other approaches I should consider (besides the obvious third approach: parallel messages and parallel handlers, which has the disadvantages of both and no real redeeming factors as far as I can tell)? Thanks!

Read the article

How do i write task? (parallel code)

- by acidzombie24

I am impressed with intel thread building blocks. I like how i should write task and not thread code and i like how it works under the hood with my limited understanding (task are in a pool, there wont be 100 threads on 4cores, a task is not guaranteed to run because it isnt on its own thread and may be far into the pool. But it may be run with another related task so you cant do bad things like typical thread unsafe code). I wanted to know more about writing task. I like the 'Task-based Multithreading - How to Program for 100 cores' video here http://www.gdcvault.com/sponsor.php?sponsor_id=1 (currently second last link. WARNING it isnt 'great'). My fav part was 'solving the maze is better done in parallel' which is around the 48min mark (you can click the link on the left side). However i like to see more code examples and some API of how to write task. Does anyone have a good resource? I have no idea how a class or pieces of code may look after pushing it onto a pool or how weird code may look when you need to make a copy of everything and how much of everything is pushed onto a pool.

Read the article

OpenGL, how to set a monochrome texture to a colored shape?

- by Santiago

I'm developing on Android with OpenGL ES, I draw some cubes and I change its colors with glColor4f. Now, what I want is to give a more realistic effect on the cubes, so I create a monochromatic 8bit depth, 64x64 pixel size PNG file. I loaded on a texture, and here is my problem, witch is the way to combine the color and the texture to get a colorized and textured cubes onto the screen? I'm not an expert on OpenGL, I tried this: On create: public void asignBitmap(GL10 gl, Bitmap bitmap) { int[] textures = new int[1]; gl.glGenTextures(1, textures, 0); mTexture = textures[0]; gl.glBindTexture(GL10.GL_TEXTURE_2D, mTexture); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_S, GL10.GL_CLAMP_TO_EDGE); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_T, GL10.GL_CLAMP_TO_EDGE); gl.glTexEnvf(GL10.GL_TEXTURE_ENV, GL10.GL_TEXTURE_ENV_MODE, GL10.GL_REPLACE); GLUtils.texImage2D(GL10.GL_TEXTURE_2D, 0, GL10.GL_ALPHA, bitmap, 0); ByteBuffer tbb = ByteBuffer.allocateDirect(texCoords.length * 4); tbb.order(ByteOrder.nativeOrder()); mTexBuffer = tbb.asFloatBuffer(); for (int i = 0; i < 48; i++) mTexBuffer.put(texCoords[i]); mTexBuffer.position(0); } And OnDraw: public void draw(GL10 gl, int alphawires) { gl.glColor4f(1.0f, 0.0f, 0.0f, 0.5f); //RED gl.glBindTexture(GL10.GL_TEXTURE_2D, mTexture); gl.glBlendFunc(GL10.GL_SRC_ALPHA, GL10.GL_ONE_MINUS_SRC_ALPHA); gl.glEnable(GL10.GL_TEXTURE_2D); gl.glEnable(GL10.GL_BLEND); gl.glEnableClientState(GL10.GL_TEXTURE_COORD_ARRAY); gl.glTexCoordPointer(2, GL10.GL_FLOAT, 0, mTexBuffer); //Set the face rotation gl.glFrontFace(GL10.GL_CW); //Point to our buffers gl.glVertexPointer(3, GL10.GL_FLOAT, 0, vertexBuffer); //Enable the vertex and color state gl.glEnableClientState(GL10.GL_VERTEX_ARRAY); //Draw the vertices as triangles, based on the Index Buffer information gl.glDrawElements(GL10.GL_TRIANGLES, 36, GL10.GL_UNSIGNED_BYTE, indexBuffer); //Disable the client state before leaving gl.glDisableClientState(GL10.GL_VERTEX_ARRAY); gl.glDisableClientState(GL10.GL_TEXTURE_COORD_ARRAY); gl.glDisable(GL10.GL_BLEND); gl.glDisable(GL10.GL_TEXTURE_2D); } I'm even not sure if I have to use a blend option, because I don't need transparency, but is a plus :)

Read the article

OpenGL, how to set a monocrome texture to a colored shape?

- by Santiago

I'm developing on Android with OpenGL ES, I draw some cubes and I change its colors with glColor4f. Now, what I want is to give a more realistic effect on the cubes, so I create a monochromatic 8bit depth, 64x64 pixel size PNG file. I loaded on a texture, and here is my problem, witch is the way to combine the color and the texture to get a colorized and textured cubes onto the screen? I'm not an expert on OpenGL, I tried this: On create: public void asignBitmap(GL10 gl, Bitmap bitmap) { int[] textures = new int[1]; gl.glGenTextures(1, textures, 0); mTexture = textures[0]; gl.glBindTexture(GL10.GL_TEXTURE_2D, mTexture); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_S, GL10.GL_CLAMP_TO_EDGE); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_T, GL10.GL_CLAMP_TO_EDGE); gl.glTexEnvf(GL10.GL_TEXTURE_ENV, GL10.GL_TEXTURE_ENV_MODE, GL10.GL_REPLACE); GLUtils.texImage2D(GL10.GL_TEXTURE_2D, 0, GL10.GL_ALPHA, bitmap, 0); ByteBuffer tbb = ByteBuffer.allocateDirect(texCoords.length * 4); tbb.order(ByteOrder.nativeOrder()); mTexBuffer = tbb.asFloatBuffer(); for (int i = 0; i < 48; i++) mTexBuffer.put(texCoords[i]); mTexBuffer.position(0); } And OnDraw: public void draw(GL10 gl, int alphawires) { gl.glColor4f(1.0f, 0.0f, 0.0f, 0.5f); //RED gl.glBindTexture(GL10.GL_TEXTURE_2D, mTexture); gl.glBlendFunc(GL10.GL_SRC_ALPHA, GL10.GL_ONE_MINUS_SRC_ALPHA); gl.glEnable(GL10.GL_TEXTURE_2D); gl.glEnable(GL10.GL_BLEND); gl.glEnableClientState(GL10.GL_TEXTURE_COORD_ARRAY); gl.glTexCoordPointer(2, GL10.GL_FLOAT, 0, mTexBuffer); //Set the face rotation gl.glFrontFace(GL10.GL_CW); //Point to our buffers gl.glVertexPointer(3, GL10.GL_FLOAT, 0, vertexBuffer); //Enable the vertex and color state gl.glEnableClientState(GL10.GL_VERTEX_ARRAY); //Draw the vertices as triangles, based on the Index Buffer information gl.glDrawElements(GL10.GL_TRIANGLES, 36, GL10.GL_UNSIGNED_BYTE, indexBuffer); //Disable the client state before leaving gl.glDisableClientState(GL10.GL_VERTEX_ARRAY); gl.glDisableClientState(GL10.GL_TEXTURE_COORD_ARRAY); gl.glDisable(GL10.GL_BLEND); gl.glDisable(GL10.GL_TEXTURE_2D); } I'm even not sure if I have to use a blend option, because I don't need transparency, but is a plus :) Thank's

Read the article

can't get texture to work

- by user583713

It been a while since I use android opengl but for what ever reason I get a white squre box and not the texture I what on the screen. Oh I do not think this would matter but just in case I put a linerlayout view first then the surfaceview on but anyway Here my code: public class GameEngine { private float vertices[]; private float textureUV[]; private int[] textureId = new int[1]; private FloatBuffer vertextBuffer; private FloatBuffer textureBuffer; private short indices[] = {0,1,2,2,1,3}; private ShortBuffer indexBuffer; private float x, y, z; private float rot, rotX, rotY, rotZ; public GameEngine() { } public void setEngine(float x, float y, float vertices[]){ this.x = x; this.y = y; this.vertices = vertices; ByteBuffer vbb = ByteBuffer.allocateDirect(this.vertices.length * 4); vbb.order(ByteOrder.nativeOrder()); vertextBuffer = vbb.asFloatBuffer(); vertextBuffer.put(this.vertices); vertextBuffer.position(0); vertextBuffer.clear(); ByteBuffer ibb = ByteBuffer.allocateDirect(indices.length * 2); ibb.order(ByteOrder.nativeOrder()); indexBuffer = ibb.asShortBuffer(); indexBuffer.put(indices); indexBuffer.position(0); indexBuffer.clear(); } public void draw(GL10 gl){ gl.glLoadIdentity(); gl.glTranslatef(x, y, z); gl.glRotatef(rot, rotX, rotY, rotZ); gl.glBindTexture(GL10.GL_TEXTURE_2D, textureId[0]); gl.glEnableClientState(GL10.GL_VERTEX_ARRAY); gl.glEnableClientState(GL10.GL_TEXTURE_COORD_ARRAY); gl.glVertexPointer(2, GL10.GL_FLOAT, 0, vertextBuffer); gl.glTexCoordPointer(2, GL10.GL_FLOAT, 0, textureBuffer); gl.glDrawElements(GL10.GL_TRIANGLES, indices.length, GL10.GL_UNSIGNED_SHORT, indexBuffer); gl.glDisableClientState(GL10.GL_VERTEX_ARRAY); gl.glDisableClientState(GL10.GL_TEXTURE_COORD_ARRAY); } public void LoadTexture(float textureUV[], GL10 gl, InputStream is) throws IOException{ this.textureUV = textureUV; ByteBuffer tbb = ByteBuffer.allocateDirect(this.textureUV.length * 4); tbb.order(ByteOrder.nativeOrder()); textureBuffer = tbb.asFloatBuffer(); textureBuffer.put(this.textureUV); textureBuffer.position(0); textureBuffer.clear(); Bitmap bitmap = null; try{ bitmap = BitmapFactory.decodeStream(is); }finally{ try{ is.close(); is = null; gl.glGenTextures(1, textureId,0); gl.glBindTexture(GL10.GL_TEXTURE_2D, textureId[0]); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR); //Different possible texture parameters, e.g. GL10.GL_CLAMP_TO_EDGE gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_S, GL10.GL_REPEAT); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_T, GL10.GL_REPEAT); //Use the Android GLUtils to specify a two-dimensional texture image from our bitmap GLUtils.texImage2D(GL10.GL_TEXTURE_2D, 0, bitmap, 0); gl.glBlendFunc(GL10.GL_SRC_ALPHA, GL10.GL_ONE_MINUS_SRC_ALPHA); //Clean up gl.glBindTexture(GL10.GL_TEXTURE_2D, textureId[0]); bitmap.recycle(); }catch(IOException e){ } } } public void setVector(float x, float y, float z){ this.x = x; this.y = y; this.z = z; } public void setRot(float rot, float x, float y, float z){ this.rot = rot; this.rotX = x; this.rotY = y; this.rotZ = z; } public float getZ(){ return z; } }

Read the article

problem with my texture coordinates on a square.

- by Evan Kimia

Im very new to OpenGL ES, and have been doing a tutorial to build a square. The square is made, and now im trying to map a 256 by 256 image onto it. The problem is, im only seeing a very zoomed in portion of this bitmap; Im fairly certain my texture coords are whats wrong here. Thanks! package se.jayway.opengl.tutorial; import java.io.IOException; import java.io.InputStream; import java.nio.ByteBuffer; import java.nio.ByteOrder; import java.nio.FloatBuffer; import java.nio.ShortBuffer; import javax.microedition.khronos.opengles.GL10; import android.content.Context; import android.graphics.Bitmap; import android.graphics.BitmapFactory; import android.opengl.GLUtils; public class Square { // Our vertices. private float vertices[] = { -1.0f, 1.0f, 0.0f, // 0, Top Left -1.0f, -1.0f, 0.0f, // 1, Bottom Left 1.0f, -1.0f, 0.0f, // 2, Bottom Right 1.0f, 1.0f, 0.0f, // 3, Top Right }; //Our texture. private float texture[] = { 0.0f, 0.0f, 0.0f, 0.0f, 1.0f, 0.0f, 1.0f, 1.0f, 0.0f, 1.0f, 0.0f, 0.0f, }; // The order we like to connect them. private short[] indices = { 0, 1, 2, 0, 2, 3 }; // Our vertex buffer. private FloatBuffer vertexBuffer; // Our index buffer. private ShortBuffer indexBuffer; //texture buffer. private FloatBuffer textureBuffer; //Our texture pointer. private int[] textures = new int[1]; public Square() { // a float is 4 bytes, therefore we multiply the number if // vertices with 4. ByteBuffer vbb = ByteBuffer.allocateDirect(vertices.length * 4); vbb.order(ByteOrder.nativeOrder()); vertexBuffer = vbb.asFloatBuffer(); vertexBuffer.put(vertices); vertexBuffer.position(0); // a float is 4 bytes, therefore we multiply the number of // vertices with 4. ByteBuffer tbb = ByteBuffer.allocateDirect(texture.length * 4); vbb.order(ByteOrder.nativeOrder()); textureBuffer = tbb.asFloatBuffer(); textureBuffer.put(texture); textureBuffer.position(0); // short is 2 bytes, therefore we multiply the number if // vertices with 2. ByteBuffer ibb = ByteBuffer.allocateDirect(indices.length * 2); ibb.order(ByteOrder.nativeOrder()); indexBuffer = ibb.asShortBuffer(); indexBuffer.put(indices); indexBuffer.position(0); } /** * This function draws our square on screen. * @param gl */ public void draw(GL10 gl) { // Counter-clockwise winding. gl.glFrontFace(GL10.GL_CCW); // Enable face culling. gl.glEnable(GL10.GL_CULL_FACE); // What faces to remove with the face culling. gl.glCullFace(GL10.GL_BACK); //Bind our only previously generated texture in this case gl.glBindTexture(GL10.GL_TEXTURE_2D, textures[0]); // Enabled the vertices buffer for writing and to be used during // rendering. gl.glEnableClientState(GL10.GL_VERTEX_ARRAY); //Enable texture buffer array gl.glEnableClientState(GL10.GL_TEXTURE_COORD_ARRAY); // Specifies the location and data format of an array of vertex // coordinates to use when rendering. gl.glVertexPointer(3, GL10.GL_FLOAT, 0, vertexBuffer); gl.glTexCoordPointer(2, GL10.GL_FLOAT, 0, textureBuffer); gl.glDrawElements(GL10.GL_TRIANGLES, indices.length, GL10.GL_UNSIGNED_SHORT, indexBuffer); // Disable the vertices buffer. gl.glDisableClientState(GL10.GL_VERTEX_ARRAY); //Disable the texture buffer. gl.glDisableClientState(GL10.GL_TEXTURE_COORD_ARRAY); // Disable face culling. gl.glDisable(GL10.GL_CULL_FACE); } /** * Load the textures * * @param gl - The GL Context * @param context - The Activity context */ public void loadGLTexture(GL10 gl, Context context) { //Get the texture from the Android resource directory InputStream is = context.getResources().openRawResource(R.drawable.test); Bitmap bitmap = null; try { //BitmapFactory is an Android graphics utility for images bitmap = BitmapFactory.decodeStream(is); } finally { //Always clear and close try { is.close(); is = null; } catch (IOException e) { } } //Generate one texture pointer... gl.glGenTextures(1, textures, 0); //...and bind it to our array gl.glBindTexture(GL10.GL_TEXTURE_2D, textures[0]); //Create Nearest Filtered Texture gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR); //Different possible texture parameters, e.g. GL10.GL_CLAMP_TO_EDGE gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_S, GL10.GL_REPEAT); gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_T, GL10.GL_REPEAT); //Use the Android GLUtils to specify a two-dimensional texture image from our bitmap GLUtils.texImage2D(GL10.GL_TEXTURE_2D, 0, bitmap, 0); //Clean up bitmap.recycle(); } }

Read the article

How can you name the Dataset's Tables you return in a stored proc ?

- by Brann

I've got the following stored procedure Create procedure psfoo () AS select * from tbA select * from tbB I'm then accessing the data this way : Sql Command mySqlCommand = new SqlCommand("psfoo" , DbConnection) DataSet ds = new DataSet(); mySqlCommand.CommandType = CommandType.StoredProcedure; SqlDataAdapter mySqlDataAdapter = new SqlDataAdapter(); mySqlDataAdapter.SelectCommand = mySqlCommand; mySqlDataAdapter.Fill(ds); Now, when I want to access my tables, I have to do this : DataTable datatableA = ds.Tables[0]; DataTable datatableB = ds.Tables[1]; the dataset Tables property also got an accessor by string (instead of int). Is it possible so specify the name of the tables in the SQL code, so that I can instead write this : DataTable datatableA = ds.Tables["NametbA"]; DataTable datatableB = ds.Tables["NametbB"]; I'm using SQL server 2008, if that makes a difference.

Read the article

Running C++ AMP kernels on the CPU

- by Daniel Moth

One of the FAQs we receive is whether C++ AMP can be used to target the CPU. For targeting multi-core we have a technology we released with VS2010 called PPL, which has had enhancements for VS 11 – that is what you should be using! FYI, it also has a Linux implementation via Intel's TBB which conforms to the same interface. When you choose to use C++ AMP, you choose to take advantage of massively parallel hardware, through accelerators like the GPU. Having said that, you can always use the accelerator class to check if you are running on a system where the is no hardware with a DirectX 11 driver, and decide what alternative code path you wish to follow. In fact, if you do nothing in code, if the runtime does not find DX11 hardware to run your code on, it will choose the WARP accelerator which will run your code on the CPU, taking advantage of multi-core and SSE2 (depending on the CPU capabilities WARP also uses SSE3 and SSE 4.1 – it does not currently use AVX and on such systems you hopefully have a DX 11 GPU anyway). A few things to know about WARP It is our fallback CPU solution, not intended as a primary target of C++ AMP. WARP stands for Windows Advanced Rasterization Platform and you can read old info on this MSDN page on WARP. What is new in Windows 8 Developer Preview is that WARP now supports DirectCompute, which is what C++ AMP builds on. It is not currently clear if we will have a CPU fallback solution for non-Windows 8 platforms when we ship. When you create a WARP accelerator, its is_emulated property returns true. WARP does not currently support double precision. BTW, when we refer to WARP, we refer to this accelerator described above. If we use lower case "warp", that refers to a bunch of threads that run concurrently in lock step and share the same instruction. In the VS 11 Developer Preview, the size of warp in our Ref emulator is 4 – Ref is another emulator that runs on the CPU, but it is extremely slow not intended for production, just for debugging. Comments about this post by Daniel Moth welcome at the original blog.

Read the article

C++ thread safety - exchange data between worker and controller

- by peterchen

I still feel a bit unsafe about the topic and hope you folks can help me - For passing data (configuration or results) between a worker thread polling something and a controlling thread interested in the most recent data, I've ended up using more or less the following pattern repeatedly: Mutex m; tData * stage; // temporary, accessed concurrently // send data, gives up ownership, receives old stage if any tData * Send(tData * newData) { ScopedLock lock(m); swap(newData, stage); return newData; } // receiving thread fetches latest data here tData * Fetch(tData * prev) { ScopedLock lock(m); if (stage != 0) { // ... release prev prev = stage; stage = 0; } return prev; // now current } Note: This is not supposed to be a full producer-consumer queue, only the msot recent data is relevant. Also, I've skimmed ressource management somewhat here. When necessary I'm using two such stages: one to send config changes to the worker, and for sending back results. Now, my questions assuming that ScopedLock implements a full memory barrier: do stage and/or workerData need to be volatile? is volatile necessary for tData members? can I use smart pointers instead of the raw pointers - say boost::shared_ptr? Anything else that can go wrong? I am basically trying to avoid "volatile infection" spreading into tData, and minimize lock contention (a lock free implementation seems possible, too). However, I'm not sure if this is the easiest solution. ScopedLock acts as a full memory barrier. Since all this is more or less platform dependent, let's say Visual C++ x86 or x64, though differences/notes for other platforms are welcome, too. (a prelimenary "thanks but" for recommending libraries such as Intel TBB - I am trying to understand the platform issues here)

Read the article

How are you taking advantage of Multicore?

- by tgamblin

As someone in the world of HPC who came from the world of enterprise web development, I'm always curious to see how developers back in the "real world" are taking advantage of parallel computing. This is much more relevant now that all chips are going multicore, and it'll be even more relevant when there are thousands of cores on a chip instead of just a few. My questions are: How does this affect your software roadmap? I'm particularly interested in real stories about how multicore is affecting different software domains, so specify what kind of development you do in your answer (e.g. server side, client-side apps, scientific computing, etc). What are you doing with your existing code to take advantage of multicore machines, and what challenges have you faced? Are you using OpenMP, Erlang, Haskell, CUDA, TBB, UPC or something else? What do you plan to do as concurrency levels continue to increase, and how will you deal with hundreds or thousands of cores? If your domain doesn't easily benefit from parallel computation, then explaining why is interesting, too. Finally, I've framed this as a multicore question, but feel free to talk about other types of parallel computing. If you're porting part of your app to use MapReduce, or if MPI on large clusters is the paradigm for you, then definitely mention that, too. Update: If you do answer #5, mention whether you think things will change if there get to be more cores (100, 1000, etc) than you can feed with available memory bandwidth (seeing as how bandwidth is getting smaller and smaller per core). Can you still use the remaining cores for your application?

Search Results

Search found 30 results on 2 pages for 'tbb'.

Page 1/2 | 1 2 | Next Page >

- by Hristo

- by Jackie

- by Jackie

- by SilverSun

- by J Teller

- by unknownthreat

- by unknownthreat

- by Jackie

- by Jackie

- by Stan

- by yongsun

- by acidzombie24

- by Dan

- by acidzombie24

- by Santiago

- by Santiago

- by user583713

- by Evan Kimia

- by Brann

- by Daniel Moth

- by peterchen

- by tgamblin

1 2 | Next Page >