Search Results

Search found 266 results on 11 pages for 'multiplication'.

Page 4/11 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 | Next Page >

Creating a dataframe in pandas by multiplying two series together

- by Aoife

Say I have two series in pandas, series A and series B. How do I create a dataframe in which all of those values are multiplied together, i.e. with series A down the left hand side and series B along the top. Basically the same concept as this, where series A would be the yellow on the left and series B the yellow along the top, and all the values in between would be filled in by multiplication: http://www.google.co.uk/imgres?imgurl=http://www.vaughns-1-pagers.com/computer/multiplication-tables/times-table-12x12.gif&imgrefurl=http://www.vaughns-1-pagers.com/computer/multiplication-tables.htm&h=533&w=720&sz=58&tbnid=9B8R_kpUloA4NM:&tbnh=90&tbnw=122&zoom=1&usg=__meqZT9kIAMJ5b8BenRzF0l-CUqY=&docid=j9BT8tUCNtg--M&sa=X&ei=bkBpUpOWOI2p0AWYnIHwBQ&ved=0CE0Q9QEwBg Thanks!

Read the article
How to get it working in O(n)?

- by evermean

I came across an interview task/question that really got me thinking ... so here it goes: You have an array A[N] of N numbers. You have to compose an array Output[N] such that Output[i] will be equal to multiplication of all the elements of A[N] except A[i]. For example Output[0] will be multiplication of A[1] to A[N-1] and Output[1] will be multiplication of A[0] and from A[2] to A[N-1]. Solve it without division operator and in O(n). I really tried to come up with a solution but I always end up with a complexity of O(n^2). Perhaps the is anyone smarter than me who can tell me an algorithm that works in O(n) or at least give me a hint...

Read the article
Speed up lighting in deferred shading

- by kochol

I implemented a simple deferred shading renderer. I use 3 G-Buffer for storing position (R32F), normal (G16R16F) and albedo (ARGB8). I use sphere map algorithm to store normals in world space. Currently I use inverse of view * projection matrix to calculate the position of each pixel from stored depth value. First I want to avoid per pixel matrix multiplication for calculating the position. Is there another way to store and calculate position in G-Buffer without the need of matrix multiplication Store the normal in view space Every lighting in my engine is in world space and I want do the lighting in view space to speed up my lighting pass. I want an optimized lighting pass for my deferred engine.

Read the article
Memory read/write access efficiency

- by wolfPack88

I've heard conflicting information from different sources, and I'm not really sure which one to believe. As such, I'll post what I understand and ask for corrections. Let's say I want to use a 2D matrix. There are three ways that I can do this (at least that I know of). 1: int i; char **matrix; matrix = malloc(50 * sizeof(char *)); for(i = 0; i < 50; i++) matrix[i] = malloc(50); 2: int i; int rowSize = 50; int pointerSize = 50 * sizeof(char *); int dataSize = 50 * 50; char **matrix; matrix = malloc(dataSize + pointerSize); char *pData = matrix + pointerSize - rowSize; for(i = 0; i < 50; i++) { pData += rowSize; matrix[i] = pData; } 3: //instead of accessing matrix[i][j] here, we would access matrix[i * 50 + j] char *matrix = malloc(50 * 50); In terms of memory usage, my understanding is that 3 is the most efficient, 2 is next, and 1 is least efficient, for the reasons below: 3: There is only one pointer and one allocation, and therefore, minimal overhead. 2: Once again, there is only one allocation, but there are now 51 pointers. This means there is 50 * sizeof(char *) more overhead. 1: There are 51 allocations and 51 pointers, causing the most overhead of all options. In terms of performance, once again my understanding is that 3 is the most efficient, 2 is next, and 1 is least efficient. Reasons being: 3: Only one memory access is needed. We will have to do a multiplication and an addition as opposed to two additions (as in the case of a pointer to a pointer), but memory access is slow enough that this doesn't matter. 2: We need two memory accesses; once to get a char *, and then to the appropriate char. Only two additions are performed here (once to get to the correct char * pointer from the original memory location, and once to get to the correct char variable from wherever the char * points to), so multiplication (which is slower than addition) is not required. However, on modern CPUs, multiplication is faster than memory access, so this point is moot. 1: Same issues as 2, but now the memory isn't contiguous. This causes cache misses and extra page table lookups, making it the least efficient of the lot. First and foremost: Is this correct? Second: Is there an option 4 that I am missing that would be even more efficient?

Read the article
View space lighting in deferred shading

- by kochol

I implemented a simple deferred shading renderer. I use 3 G-Buffer for storing position (R32F), normal (G16R16F) and albedo (ARGB8). I use sphere map algorithm to store normals in world space. Currently I use inverse of view * projection matrix to calculate the position of each pixel from stored depth value. First I want to avoid per pixel matrix multiplication for calculating the position. Is there another way to store and calculate position in G-Buffer without the need of matrix multiplication Store the normal in view space Every lighting in my engine is in world space and I want do the lighting in view space to speed up my lighting pass. I want an optimized lighting pass for my deferred engine.

Read the article
How do I correctly multiply an XMMATRIX by a scalar?

- by user43129

Using DirectXMath and its XMMATRIX structure in C++ and Direct X 11, how does one multiply that matrix structure by a single float scalar? I want to implement the operation B = A * f; where A and B are XMMATRIX and f is a float. I found all sorts of functions to multiply a matrix by another matrix or a vector. I found all sorts of functions to construct matrices. I could find no scalar multiplication! Why is there no such function? Is there no use case? Did I miss something? How do I implement scalar multiplication?

Read the article
Prolog - generate correct bracketing

- by Henrik Bak

I'd like to get some help in the following exam problem, i have no idea how to do this: Input: a list of numbers, eg.: [1,2,3,4] Output: every possible correct bracketing. Eg.: (in case of input [1,2,3,4]): ((1 2) (3 4)) ((1 (2 3)) 4) (1 ((2 3) 4)) (1 (2 (3 4))) (((1 2) 3) 4) Bracketing here is like a method with two arguments, for example multiplication - then the output is the possible multiplication orders. Please help, i'm stuck with this one. Any help is appreciated, thanks!

Read the article
How computer multiplies 2 numbers?

- by ckv

How does a computer perform a multiplication on 2 numbers say 100 * 55. My guess was that the computer did repeated addition to achieve multiplication. Of course this could be the case for integer numbers. However for floating point numbers there must be some other logic. Note: This was asked in an interview.

Read the article
Scheduling thread tiles with C++ AMP

- by Daniel Moth

This post assumes you are totally comfortable with, what some of us call, the simple model of C++ AMP, i.e. you could write your own matrix multiplication. We are now ready to explore the tiled model, which builds on top of the non-tiled one. Tiling the extent We know that when we pass a grid (which is just an extent under the covers) to the parallel_for_each call, it determines the number of threads to schedule and their index values (including dimensionality). For the single-, two-, and three- dimensional cases you can go a step further and subdivide the threads into what we call tiles of threads (others may call them thread groups). So here is a single-dimensional example: extent<1> e(20); // 20 units in a single dimension with indices from 0-19 grid<1> g(e); // same as extent tiled_grid<4> tg = g.tile<4>(); …on the 3rd line we subdivided the single-dimensional space into 5 single-dimensional tiles each having 4 elements, and we captured that result in a concurrency::tiled_grid (a new class in amp.h). Let's move on swiftly to another example, in pictures, this time 2-dimensional: So we start on the left with a grid of a 2-dimensional extent which has 8*6=48 threads. We then have two different examples of tiling. In the first case, in the middle, we subdivide the 48 threads into tiles where each has 4*3=12 threads, hence we have 2*2=4 tiles. In the second example, on the right, we subdivide the original input into tiles where each has 2*2=4 threads, hence we have 4*3=12 tiles. Notice how you can play with the tile size and achieve different number of tiles. The numbers you pick must be such that the original total number of threads (in our example 48), remains the same, and every tile must have the same size. Of course, you still have no clue why you would do that, but stick with me. First, we should see how we can use this tiled_grid, since the parallel_for_each function that we know expects a grid. Tiled parallel_for_each and tiled_index It turns out that we have additional overloads of parallel_for_each that accept a tiled_grid instead of a grid. However, those overloads, also expect that the lambda you pass in accepts a concurrency::tiled_index (new in amp.h), not an index<N>. So how is a tiled_index different to an index? A tiled_index object, can have only 1 or 2 or 3 dimensions (matching exactly the tiled_grid), and consists of 4 index objects that are accessible via properties: global, local, tile_origin, and tile. The global index is the same as the index we know and love: the global thread ID. The local index is the local thread ID within the tile. The tile_origin index returns the global index of the thread that is at position 0,0 of this tile, and the tile index is the position of the tile in relation to the overall grid. Confused? Here is an example accompanied by a picture that hopefully clarifies things: array_view<int, 2> data(8, 6, p_my_data); parallel_for_each(data.grid.tile<2,2>(), [=] (tiled_index<2,2> t_idx) restrict(direct3d) { /* todo */ }); Given the code above and the picture on the right, what are the values of each of the 4 index objects that the t_idx variables exposes, when the lambda is executed by T (highlighted in the picture on the right)? If you can't work it out yourselves, the solution follows: t_idx.global = index<2> (6,3) t_idx.local = index<2> (0,1) t_idx.tile_origin = index<2> (6,2) t_idx.tile = index<2> (3,1) Don't move on until you are comfortable with this… the picture really helps, so use it. Tiled Matrix Multiplication Example – part 1 Let's paste here the C++ AMP matrix multiplication example, bolding the lines we are going to change (can you guess what the changes will be?) 01: void MatrixMultiplyTiled_Part1(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M, N, vC); 07: parallel_for_each(c.grid, 08: [=](index<2> idx) restrict(direct3d) { 09: 10: int row = idx[0]; int col = idx[1]; 11: float sum = 0.0f; 12: for(int i = 0; i < W; i++) 13: sum += a(row, i) * b(i, col); 14: c[idx] = sum; 15: }); 16: } To turn this into a tiled example, first we need to decide our tile size. Let's say we want each tile to be 16*16 (which assumes that we'll have at least 256 threads to process, and that c.grid.extent.size() is divisible by 256, and moreover that c.grid.extent[0] and c.grid.extent[1] are divisible by 16). So we insert at line 03 the tile size (which must be a compile time constant). 03: static const int TS = 16; ...then we need to tile the grid to have tiles where each one has 16*16 threads, so we change line 07 to be as follows 07: parallel_for_each(c.grid.tile<TS,TS>(), ...that means that our index now has to be a tiled_index with the same characteristics as the tiled_grid, so we change line 08 08: [=](tiled_index<TS, TS> t_idx) restrict(direct3d) { ...which means, without changing our core algorithm, we need to be using the global index that the tiled_index gives us access to, so we insert line 09 as follows 09: index<2> idx = t_idx.global; ...and now this code just works and it is tiled! Closing thoughts on part 1 The process we followed just shows the mechanical transformation that can take place from the simple model to the tiled model (think of this as step 1). In fact, when we wrote the matrix multiplication example originally, the compiler was doing this mechanical transformation under the covers for us (and it has additional smarts to deal with the cases where the total number of threads scheduled cannot be divisible by the tile size). The point is that the thread scheduling is always tiled, even when you use the non-tiled model. But with this mechanical transformation, we haven't gained anything… Hint: our goal with explicitly using the tiled model is to gain even more performance. In the next post, we'll evolve this further (beyond what the compiler can automatically do for us, in this first release), so you can see the full usage of the tiled model and its benefits… Comments about this post by Daniel Moth welcome at the original blog.

Read the article
SQL SERVER – Basic Calculation and PEMDAS Order of Operation

- by pinaldave

After thinking a long time, I have decided to write about this blog post. I had no plan to create a blog post about this subject but the amount of conversation this one has created on my Facebook page, I decided to bring up a few of the question and concerns discussed on the Facebook page. There are more than 10,000 comments here so far. There are lots of discussion about what should be the answer. Well, as far as I can tell there is a big debate going on on Facebook, for educational purpose you should go ahead and read some of the comments. They are very interesting and for sure teach some new stuff. Even though some of the comments are clearly wrong they have made some good points and I believe it for sure develops some logic. Here is my take on this subject. I believe the answer is 9 as I follow PEMDAS Order of Operation. PEMDAS stands for parentheses, exponents, multiplication, division, addition, subtraction. PEMDAS is commonly known as BODMAS in India. BODMAS stands for Brackets, Orders (ie Powers and Square Roots, etc), Division, Multiplication, Addition and Subtraction. PEMDAS and BODMAS are almost same and both of them follow the operation order from LEFT to RIGHT. Let us try to simplify above statement using the PEMDAS or BODMAS (whatever you prefer to call). Step 1: 6 ÷ 2 (1+2) (parentheses first) Step 2: = 6 ÷ 2 * (1+2) (adding multiplication sign for further clarification) Step 3: = 6 ÷ 2* (3) (single digit in parentheses – simplify using operator) Step 4: = 6 ÷ 2 * 3 (Remember next Operation should be LEFT to RIGHT) Step 5: = 3 * 3 (because 6 ÷ 2 = 3; remember LEFT to RIGHT) Step 6: = 9 (final answer) Some often find Step 4 confusing and often ended up multiplying 2 and 3 resulting Step 5 to be 6 ÷ 6, this is incorrect because in this case we did not follow the order of LEFT to RIGHT. When we do not follow the order of operation from LEFT to RIGHT we end up with the answer 1 which is incorrect. Let us see what SQL Server returns as a result. I executed following statement in SQL Server Management Studio SELECT 6/2*(1+2) It is clear that SQL Server also thinks that the answer should be 9. Let us go ahead and ask Google what will be the answer of above question in Google I have searched for the following term: 6/2(1+2) The result also says the answer should be 9. If you want a further reference here is a great video which describes why the answer should be 9 and not 1. And here is a fantastic conversation on Google Groups. Well, now what is your take on this subject? You are welcome to share constructive feedback and your answer may be different from my answer. NOTE: A healthy conversation about this subject is indeed encouraged but if there is a single bad word or comment is flaming it will be deleted without any notification (it does not matter how valuable information it contains). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
java number exceeds long.max_value - how to detect?

- by jurchiks

I'm having problems detecting if a sum/multiplication of two numbers exceeds the maximum value of a long integer. Example code: long a = 2 * Long.MAX_VALUE; System.out.println("long.max * smth > long.max... or is it? a=" + a); This gives me -2, while I would expect it to throw a NumberFormatException... Is there a simple way of making this work? Because I have some code that does multiplications in nested IF blocks or additions in a loop and I would hate to add more IFs to each IF or inside the loop. Edit: oh well, it seems that this answer from another question is the most appropriate for what I need: http://stackoverflow.com/a/9057367/540394 I don't want to do boxing/unboxing as it adds unnecassary overhead, and this way is very short, which is a huge plus to me. I'll just write two short functions to do these checks and return the min or max long. Edit2: here's the function for limiting a long to its min/max value according to the answer I linked to above: /** * @param a : one of the two numbers added/multiplied * @param b : the other of the two numbers * @param c : the result of the addition/multiplication * @return the minimum or maximum value of a long integer if addition/multiplication of a and b is less than Long.MIN_VALUE or more than Long.MAX_VALUE */ public static long limitLong(long a, long b, long c) { return (((a > 0) && (b > 0) && (c <= 0)) ? Long.MAX_VALUE : (((a < 0) && (b < 0) && (c >= 0)) ? Long.MIN_VALUE : c)); } Tell me if you think this is wrong.

Read the article
Python — Time complexity of built-in functions versus manually-built functions in finite fields

- by stackuser

Generally, I'm wondering about the advantages versus disadvantages of using the built-in arithmetic functions versus rolling your own in Python. Specifically, I'm taking in GF(2) finite field polynomials in string format, converting to base 2 values, performing arithmetic, then output back into polynomials as string format. So a small example of this is in multiplication: Rolling my own: def multiply(a,b): bitsa = reversed("{0:b}".format(a)) g = [(b<<i)*int(bit) for i,bit in enumerate(bitsa)] return reduce(lambda x,y: x+y,g) Versus the built-in: def multiply(a,b): # a,b are GF(2) polynomials in binary form .... return a*b #returns product of 2 polynomials in gf2 Currently, operations like multiplicative inverse (with for example 20 bit exponents) take a long time to run in my program as it's using all of Python's built-in mathematical operations like // floor division and % modulus, etc. as opposed to making my own division, remainder, etc. I'm wondering how much of a gain in efficiency and performance I can get by building these manually (as shown above). I realize the gains are dependent on how well the manual versions are built, that's not the question. I'd like to find out 'basically' how much advantage there is over the built-in's. So for instance, if multiplication (as in the example above) is well-suited for base 10 (decimal) arithmetic but has to jump through more hoops to change bases to binary and then even more hoops in operating (so it's lower efficiency), that's what I'm wondering. Like, I'm wondering if it's possible to bring the time down significantly by building them myself in ways that maybe some professionals here have already come across.

Read the article
Which opcodes are faster at the CPU level?

- by Geotarget

In every programming language there are sets of opcodes that are recommended over others. I've tried to list them here, in order of speed. Bitwise Integer Addition / Subtraction Integer Multiplication / Division Comparison Control flow Float Addition / Subtraction Float Multiplication / Division Where you need high-performance code, C++ can be hand optimized in assembly, to use SIMD instructions or more efficient control flow, data types, etc. So I'm trying to understand if the data type (int32 / float32 / float64) or the operation used (*, +, &) affects performance at the CPU level. Is a single multiply slower on the CPU than an addition? In MCU theory you learn that speed of opcodes is determined by the number of CPU cycles it takes to execute. So does it mean that multiply takes 4 cycles and add takes 2? Exactly what are the speed characteristics of the basic math and control flow opcodes? If two opcodes take the same number of cycles to execute, then both can be used interchangeably without any performance gain / loss? Any other technical details you can share regarding x86 CPU performance is appreciated

Read the article
How do I implement deceleration for the player character?

- by tesselode

Using delta time with addition and subtraction is easy. player.speed += 100 * dt However, multiplication and division complicate things a bit. For example, let's say I want the player to double his speed every second. player.speed = player.speed * 2 * dt I can't do this because it'll slow down the player (unless delta time is really high). Division is the same way, except it'll speed things way up. How can I handle multiplication and division with delta time? Edit: it looks like my question has confused everyone. I really just wanted to be able to implement deceleration without this horrible mass of code: else if speed > 0 then speed = speed - 20 * dt if speed < 0 then speed = 0 end end if speed < 0 then speed = speed + 20 * dt if speed > 0 then speed = 0 end end end Because that's way bigger than it needs to be. So far a better solution seems to be: speed = speed - speed * whatever_number * dt

Read the article
Java single Array best choice for accessing pixels for manipulation?

- by Petrol

I am just watching this tutorial https://www.youtube.com/watch?v=HwUnMy_pR6A and the guy (who seems to be pretty competent) is using a single array to store and access the pixels of his to-be-rendered image. I was wondering if this really is the best way to do this. The alternative of Multi-Array does have one pointer more, but Arrays do have an O(1) for accessing each index and calculating the index in a single array seems to take one addition and one multiplication operation per pixel. And if Multi-Arrays really are bad, can't you use something with Hashing to avoid those addition and multiplication operations? EDIT: here is his code... public class Screen { private int width, height; public int[] pixels; public Screen(int width, int height) { this.width = width; this.height = height; // creating array the size of one index/int for every pixel // single array has better performance than multi-array pixels = new int[width * height]; } public void render() { for (int y = 0; y < height; y++) { for (int x = 0; x < width; x++) { pixels[x + y * width] = 0xff00ff; } } } }

Read the article
When to use Shift operators << >> in C# ?

- by Junior Mayhé

I was studying shift operators in C#, trying to find out when to use them in my code. I found an answer but for Java, you could: a) Make faster integer multiplication and division operations: *4839534 * 4* can be done like this: 4839534 << 2 or 543894 / 2 can be done like this: 543894 1 Shift operations much more faster than multiplication for most of processors. b) Reassembling byte streams to int values c) For accelerating operations with graphics since Red, Green and Blue colors coded by separate bytes. d) Packing small numbers into one single long... For b, c and d I can't imagine here a real sample. Does anyone know if we can accomplish all these items in C#? Is there more practical use for shift operators in C#?

Read the article
Shouldn't this cause an Overflow? It doesn't!

- by Cyberherbalist

What's up with this, anyway? I do a simple multiplication: Int64 x = 11111111111; Int64 y = 11111111111; Int64 z = x * y; And at the end of the multiplication, z shows a value of: -5670418394979206991 This has clearly overflowed, but no exception is raised. I'd like one to be raised, but... Note that this is on Windows Phone 7, but I don't think this has any bearing on the issue. Or does it?

Read the article
How to pass operators as parameters

- by Rodion Ingles

I have to load an array of doubles from a file, multiply each element by a value in a table (different values for different elements), do some work on it, invert the multiplication (that is, divide) and then save the data back to file. Currently I implement the multiplication and division process in two separate methods. Now there is some extra work behind the scenes but apart from the specific statements where the multiplication/division occurs, the rest of the code is identical. As you can imagine, with this approach you have to be very careful making any changes. The surrounding code is not trivial, so its either a case of manually editing each method or copying changes from one method to the other and remembering to change the * and / operators. After too many close calls I am fed up of this and would like to make a common function which implements the common logic and two wrapper functions which pass which operator to use as a parameter. My initial approach was to use function pointers: MultiplyData(double data) { TransformData(data, &(operator *)); } DivideData(double data) { TransformData(data, &(operator /)); } TransformData(double data, double (*func)(double op1, double op2)) { /* Do stuff here... */ } However, I can't pass the operators as pointers (is this because it is an operator on a native type?), so I tried to use function objects. Initially I thought that multiplies and divides functors in <functional> would be ideal: MultiplyData(double data) { std::multiplies<double> multFunct; TransformData(data, &multFunct); } DivideData(double data) { std::divides<double> divFunct; TransformData(data, &divFunct); } TransformData(double data, std::binary_function<double, double, double> *funct) { /* Do stuff here... */ } As you can see I was trying to use a base class pointer to pass the functor polymorphically. The problem is that std::binary_function does not declare an operator() member for the child classes to implement. Is there something I am missing, or is the solution to implement my own functor heirarchy (which really seems more trouble than it is worth)?

Read the article
How do I indicate that a class doesn't support certain operators?

- by romeovs

I'm writing a class that represents an ordinal scale, but has no logical zero-point (eg time). This scale should permit addition and substraction (operator+, operator+=, ...) but not multiplication. Yet, I always felt it to be a good practice that when one overloads one operator of a certain group (in this case the math operators), one should also overload all the others that belong to that group. In this case that would mean I should need to overload the multiplication and division operators also, because if a user can use A+B he would probable expect to be able the other operators. Is there a method that I can use to throw an error for this at compiler time? The easiest method would be just no to overload the operators operator*, ... yet it would seem appropriate to add a bit more explaination than operator* is not know for class "time". Or is this something that I really should not care about (RTFM user)?

Read the article
Fast ceiling of an integer division in C / C++

- by andand

Given integer values x and y, C and C++ returns as the quotient q = x/y the floor of the floating point valued equivalent. I'm interestd in a method of returning the ceiling instead? For example, ceil(10/5) = 2 and ceil(11/5) = 3. The obvious approach involves something like: q = x / y; if (q * y < x) ++q; This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?

Read the article
Why PHP (script) serves more requests than CGI (compiled)?

- by Lucas Batistussi

I developed the following CGI script and run on Apache 2 (http://localhost/test.chtml). I did same script in PHP (http://localhost/verifica.php). Later I performed Apache benchmark using Apache Benchmark tool. The results are showed in images. include #include <stdlib.h> int main(void) { printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1",13,10); printf("<TITLE>Multiplication results</TITLE>\n"); printf("<H3>Multiplication results</H3>\n"); return 0; } Someone can explain me why PHP serves more requests than CGI script?

Read the article
Inverse projection: question about w coordinate

- by fayeWilly

I have to perform in shader an inverse projection from a u/v of a render target. What I do is: Get NDC as 2*(u,v,depth) - 1 Then world space as tmp = (P*V)^-1 * (NDC,1.0); world space = tmp/tmp.w; This apparently works, but I am confused about the w division there. Why this work? Shouldn't be a multiplication by a w somewhere (as in the "forward" pipeline there is the perpsective division?) Thank you, Faye

Read the article
Is this an effective monetization method for an Android game? [on hold]

- by Matthew Page

The short version: I plan to make an Android puzzle game where the user tries to get 3-6 numbers to their predetermined goal numbers. The free version of the app will have three predetermined levels (easy, medium, hard). The full version ($0.99, probably) will have a level generator where there will be unlimited easy, medium, or hard levels, as well as a custom difficulty option where users can set specific vales to the number of numbers to equate to their goal, the number of buttons to use, etc. Users will also have the option to get a one-time "hint" for a fee of $0.49, or unlimited hints for a one-time fee of $2.99. The long version: Mechanics of Game and Victory The application is a number puzzle. When the user begins a new game, depending on the input by the user, between 3 and 6 numbers show up on the top of the screen, and between 3 and 6 buttons show up on the bottom of the screen. The buttons all have two options: to increase every number the same way, or decrease every number the same way. The buttons either use addition / subtraction, multiplication / division, or exponents / roots, all depending on the number displayed on the button. Addition buttons are green, multiplication buttons are blue, and exponential buttons are red. The user wins when all of the numbers displayed on the screen equate to their goal number, displayed below each number. Monetization If the user is playing the full (priced) version of the app, upon the start of the game, the user will be confronted with a dialogue asking for the number of buttons and the number of numbers to equate in the game. Then, based on the user input, a random puzzle will be generated. If the user is playing the free version of the app, the user will be asked to either play an “easy”, “hard”, or “expert” puzzle. A pre-determined puzzle from each category will be used in the game. If the user has played that puzzle before, a dialogue will show saying this to the user and advertising the full version of the app. The full version of the app will also be advertised upon the successful or in successful completion of a puzzle. Upon exiting this advertisement, another full screen advertisement will appear from a third party. Also, the solution to the puzzle should be stored by the program, and if the user pays a small fee, he/she can see a hint to the solution to the program. In the free version of the app, the user may use their first hint for free. Also, the user can use unlimited hints for a slightly larger fee. Is this an effective monetization method?

Read the article
Give a session on C++ AMP – here is how

- by Daniel Moth

Ever since presenting on C++ AMP at the AMD Fusion conference in June, then the Gamefest conference in August, and the BUILD conference in September, I've had numerous requests about my material from folks that want to re-deliver the same session. The C++ AMP session I put together has evolved over the 3 presentations to its final form that I used at BUILD, so that is the one I recommend you base yours on. Please get the slides and the recording from channel9 (I'll refer to slide numbers below). This is how I've been presenting the C++ AMP session: Context (slide 3, 04:18-08:18) Start with a demo, on my dual-GPU machine. I've been using the N-Body sample (for VS 11 Developer Preview). (slide 4) Use an nvidia slide that has additional examples of performance improvements that customers enjoy with heterogeneous computing. (slide 5) Talk a bit about the differences today between CPU and GPU hardware, leading to the fact that these will continue to co-exist and that GPUs are great for data parallel algorithms, but not much else today. One is a jack of all trades and the other is a number cruncher. (slide 6) Use the APU example from amd, as one indication that the hardware space is still in motion, emphasizing that the C++ AMP solution is a data parallel API, not a GPU API. It has a future proof design for hardware we have yet to see. (slide 7) Provide more meta-data, as blogged about when I first introduced C++ AMP. Code (slide 9-11) Introduce C++ AMP coding with a simplistic array-addition algorithm – the slides speak for themselves. (slide 12-13) index<N>, extent<N>, and grid<N>. (Slide 14-16) array<T,N>, array_view<T,N> and comparison between them. (Slide 17) parallel_for_each. (slide 18, 21) restrict. (slide 19-20) actual restrictions of restrict(direct3d) – the slides speak for themselves. (slide 22) bring it altogether with a matrix multiplication example. (slide 23-24) accelerator, and accelerator_view. (slide 26-29) Introduce tiling incl. tiled matrix multiplication [tiling probably deserves a whole session instead of 6 minutes!]. IDE (slide 34,37) Briefly touch on the concurrency visualizer. It supports GPU profiling, but enhancements specific to C++ AMP we hope will come at the Beta timeframe, which is when I'll be spending more time talking about it. (slide 35-36, 51:54-59:16) Demonstrate the GPU debugging experience in VS 11. Summary (slide 39) Re-iterate some of the points of slide 7, and add the point that the C++ AMP spec will be open for other compiler vendors to implement, even on other platforms (in fact, Microsoft is actively working on that). (slide 40) Links to content – see slide – including where all your questions should go: http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads. "But I don't have time for a full blown session, I only need 2 (or just 1, or 3) C++ AMP slides to use in my session on related topic X" If all you want is a small number of slides, you can take some from the session above and customize them. But because I am so nice, I have created some slides for you, including talking points in the notes section. Download them here. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
Give a session on C++ AMP – here is how

- by Daniel Moth

Ever since presenting on C++ AMP at the AMD Fusion conference in June, then the Gamefest conference in August, and the BUILD conference in September, I've had numerous requests about my material from folks that want to re-deliver the same session. The C++ AMP session I put together has evolved over the 3 presentations to its final form that I used at BUILD, so that is the one I recommend you base yours on. Please get the slides and the recording from channel9 (I'll refer to slide numbers below). This is how I've been presenting the C++ AMP session: Context (slide 3, 04:18-08:18) Start with a demo, on my dual-GPU machine. I've been using the N-Body sample (for VS 11 Developer Preview). (slide 4) Use an nvidia slide that has additional examples of performance improvements that customers enjoy with heterogeneous computing. (slide 5) Talk a bit about the differences today between CPU and GPU hardware, leading to the fact that these will continue to co-exist and that GPUs are great for data parallel algorithms, but not much else today. One is a jack of all trades and the other is a number cruncher. (slide 6) Use the APU example from amd, as one indication that the hardware space is still in motion, emphasizing that the C++ AMP solution is a data parallel API, not a GPU API. It has a future proof design for hardware we have yet to see. (slide 7) Provide more meta-data, as blogged about when I first introduced C++ AMP. Code (slide 9-11) Introduce C++ AMP coding with a simplistic array-addition algorithm – the slides speak for themselves. (slide 12-13) index<N>, extent<N>, and grid<N>. (Slide 14-16) array<T,N>, array_view<T,N> and comparison between them. (Slide 17) parallel_for_each. (slide 18, 21) restrict. (slide 19-20) actual restrictions of restrict(direct3d) – the slides speak for themselves. (slide 22) bring it altogether with a matrix multiplication example. (slide 23-24) accelerator, and accelerator_view. (slide 26-29) Introduce tiling incl. tiled matrix multiplication [tiling probably deserves a whole session instead of 6 minutes!]. IDE (slide 34,37) Briefly touch on the concurrency visualizer. It supports GPU profiling, but enhancements specific to C++ AMP we hope will come at the Beta timeframe, which is when I'll be spending more time talking about it. (slide 35-36, 51:54-59:16) Demonstrate the GPU debugging experience in VS 11. Summary (slide 39) Re-iterate some of the points of slide 7, and add the point that the C++ AMP spec will be open for other compiler vendors to implement, even on other platforms (in fact, Microsoft is actively working on that). (slide 40) Links to content – see slide – including where all your questions should go: http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads. "But I don't have time for a full blown session, I only need 2 (or just 1, or 3) C++ AMP slides to use in my session on related topic X" If all you want is a small number of slides, you can take some from the session above and customize them. But because I am so nice, I have created some slides for you, including talking points in the notes section. Download them here. Comments about this post by Daniel Moth welcome at the original blog.

Read the article

Search Results

Search found 266 results on 11 pages for 'multiplication'.

Page 4/11 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 | Next Page >

- by Aoife

- by evermean

- by kochol

- by wolfPack88

- by kochol

- by user43129

- by Henrik Bak

- by ckv

- by Daniel Moth

- by pinaldave

- by jurchiks

- by stackuser

- by Geotarget

- by tesselode

- by Petrol

- by Junior Mayhé

- by Cyberherbalist

- by Rodion Ingles

- by romeovs

- by andand

- by Lucas Batistussi

- by fayeWilly

- by Matthew Page

- by Daniel Moth

- by Daniel Moth

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 | Next Page >