Search Results

Search found 2037 results on 82 pages for 'matrix multiplication'.

  • Optimize code performance when odd/even threads are doing different things in CUDA

    - by Ashraf
    Hi all! I have two large vectors and I am trying to do a kind of element-wise multiplication, where an even-numbered element in the first vector is multiplied by the next odd-numbered element in the second vector, and where the odd-numbered element in the first vector is multiplied by the preceding even-numbered element in the second vector. Example:

        vector 1 is V1(1) V1(2) V1(3) V1(4)
        vector 2 is V2(1) V2(2) V2(3) V2(4)

        V1(1) * V2(2)
        V1(3) * V2(4)
        V1(2) * V2(1)
        V1(4) * V2(3)

    I have written CUDA code to do this (Pds has the elements of the first vector in shared memory, Nds the second vector):

        // instead of using %2, I check the first bit to decide if the number is odd/even -- faster
        if ((tx & 0x0001) == 0x0000)
            Nds[tx+1] = Pds[tx] * Nds[tx+1];
        else
            Nds[tx-1] = Pds[tx] * Nds[tx-1];
        __syncthreads();

    Is there any way to further accelerate this code or avoid divergence? Thanks

  • C++ double division by 0.0 versus DBL_MIN

    - by wonsungi
    When finding the inverse square root of a double, is it better to clamp invalid non-positive inputs at 0.0 or DBL_MIN? (In my example below, double b may end up being negative due to floating point rounding errors and because the laws of physics are slightly fudged in the game.) Both division by 0.0 and DBL_MIN produce the same outcome in the game because 1/0.0 and 1/DBL_MIN are effectively infinity. My intuition says DBL_MIN is the better choice, but would there be any case for using 0.0? Perhaps sqrt(0.0), 1/0.0 and multiplication by 1.#INF000000000000 execute faster because they are special cases.

        double b = 1 - v.length_squared()/(c*c);
        #ifdef CLAMP_BY_0
        if (b < 0.0) b = 0.0;
        #endif
        #ifdef CLAMP_BY_DBL_MIN
        if (b <= 0.0) b = DBL_MIN;
        #endif
        double lorentz_factor = 1/sqrt(b);

    double division in MSVC:

        1/0.0 = 1.#INF000000000000
        1/DBL_MIN = 4.4942328371557898e+307

  • Using Python tuples as vectors

    - by Etaoin
    I need to represent immutable vectors in Python ("vectors" as in linear algebra, not as in programming). The tuple seems like an obvious choice. The trouble is when I need to implement things like addition and scalar multiplication. If a and b are vectors, and c is a number, the best I can think of is this:

        tuple(map(lambda x,y: x + y, a, b)) # add vectors 'a' and 'b'
        tuple(map(lambda x: x * c, a))      # multiply vector 'a' by scalar 'c'

    which seems inelegant; there should be a clearer, simpler way to get this done -- not to mention avoiding the call to tuple, since map returns a list. Is there a better option?
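One possible way to tidy up the two expressions in this question is sketched below, assuming Python 3 (where map returns an iterator, so some call to tuple is still needed). The helper names vec_add and vec_scale are hypothetical and do not come from the question.

```python
def vec_add(a, b):
    """Return the element-wise sum of two equal-length tuples."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # zip pairs up corresponding components; the generator feeds tuple()
    return tuple(x + y for x, y in zip(a, b))


def vec_scale(c, a):
    """Return vector 'a' multiplied by scalar 'c'."""
    return tuple(c * x for x in a)


a = (1, 2, 3)
b = (4, 5, 6)
total = vec_add(a, b)
doubled = vec_scale(2, a)
```

Because the results are plain tuples, they stay immutable and hashable; whether this counts as more elegant than the map/lambda version is a matter of taste.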

  • Django QuerySet ordering by expression

    - by Andrew
    How can I use order_by like order_by('field1'*'field2')? For example, I have items with prices listed in different currencies, so to order the items I have to do a currency conversion.

        class Currency(models.Model):
            code = models.CharField(max_length=3, primary_key=True)
            rateToUSD = models.DecimalField(max_digits=20,decimal_places=10)

        class Item(models.Model):
            priceRT = models.DecimalField(max_digits=15, decimal_places=2, default=0)
            cur = models.ForeignKey(Currency)

    I would like to have something like:

        Item.objects.all().order_by(F('priceRT')*F('cur__rateToUSD'))

    But unfortunately it doesn't work; I also failed with annotate. How can I perform QuerySet ordering by the result of multiplying two of the model's fields?

  • Algorithm to split an array into N groups based on item index (should be something simple)

    - by serg
    I feel that it should be something very simple and obvious, but I have been stuck on this for the last half an hour and can't move on. All I need is to split an array of elements into N groups based on element index. For example, we have an array of 30 elements [e1,e2,...e30] that has to be divided into N=3 groups like this:

        group1: [e1, ..., e10]
        group2: [e11, ..., e20]
        group3: [e21, ..., e30]

    I came up with a nasty mess like this for N=3 (pseudo language, I left the multiplication by 0 and 1 just for clarification):

        for(i=0;i<array_size;i++) {
            if(i>=0*(array_size/3) && i<1*(array_size/3)) {
                print "group1";
            } else if(i>=1*(array_size/3) && i<2*(array_size/3)) {
                print "group2";
            } else if(i>=2*(array_size/3) && i<3*(array_size/3)) {
                print "group3";
            }
        }

    But what would be the proper general solution? Thanks.
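The chain of range checks in this question can be collapsed into a single integer expression. The following is a minimal Python sketch under the assumption that a zero-based index i in an array of length size should map to group i * n // size; the names group_of and split_into_groups are hypothetical.

```python
def group_of(i, size, n):
    """Zero-based group number for zero-based index i, with n groups."""
    return i * n // size


def split_into_groups(items, n):
    """Split items into n contiguous groups based on element index."""
    groups = [[] for _ in range(n)]
    for i, item in enumerate(items):
        groups[group_of(i, len(items), n)].append(item)
    return groups


elements = list(range(1, 31))          # stands in for e1 .. e30
groups = split_into_groups(elements, 3)
```

When the array length is not a multiple of n, this particular formula still assigns every index to a group, with group sizes differing by at most one; other conventions for the remainder are equally possible.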

  • Is there any valid reason radians are used as the inputs to trig functions in many modern languages?

    - by johnmortal
    Is there any pressing reason trig functions should use radian inputs in modern programming languages? As far as I know, radians are typically ugly to deal with except in three cases: (1) you want to compute an arc length and you know the angle of the arc, (2) you need to do symbolic calculus with trig functions, and (3) certain infinite series expansions look prettier if the input is in radians. None of these scenarios seems like a worthy justification for every programming language I am familiar with using radian inputs for Sin, Cos, Tangent, etc. The third one sounds good because it might mean one gets faster computations using radians (very slightly faster: the cost of one additional floating point multiplication), but I am dubious even of that, because most commonly the developer had to take an extra step to put the angle in radians in the first place. The other two are ridiculous justifications for all the added obscurity.

  • Why might different computers calculate different arithmetic results in VB.NET?

    - by Eyal
    I have some software written in VB.NET that performs a lot of calculations, mostly extracting JPEGs to bitmaps and running computations on the pixels, like convolutions and matrix multiplication. Different computers are giving me different results despite having identical inputs. What might be the reason?

    Edit: I can't provide the algorithm because it's proprietary, but I can provide all the relevant operations:

        ULong \ ULong (truncating division)
        Bitmap.Load("filename.bmp") (load a bitmap into memory)
        Bitmap.GetPixel(Integer, Integer) (get a pixel's brightness)
        Double + Double
        Double * Double
        Math.Sqrt(Double)
        Math.PI
        Math.Cos(Double)
        ULong - ULong
        ULong * ULong
        ULong << ULong
        List.OrderBy(Of Double)(Func)

    Hmm... Is it possible that OrderBy is using a non-stable QuickSort and that QuickSort is using a random pivot?

    Edit: Just tested, nope. The sort is stable.

  • How much difference you find between scrollview offset and contentview offset of table?

    - by neha
    Hi all, in my application I want to detect the end of the entire table with scrollview. Since there are no sections in my scrollview, I'm using noOfRows*rowHeight to reach the end. I'm using scrollview.contentOffset.y to detect the y offset, but this contentOffset doesn't match the multiplication result. I have 20 rows, each with a height of 250, so the product comes to 5000, but my scrollview.offset.y at the end of the last cell is only about 4650. What's this difference? Thanks in advance.

  • How can I copy from the browser and paste to vim without unicode problems

    - by dsummersl
    This happens to me all the time: I copy something from a rich text screen (usually a browser) and then paste it into vim. Usually it's a code block, and then when I go to compile or run it I get all kinds of bizarre errors. I scratch my head and spend half an hour trying to figure out what is wrong before I realize I've copied some non-ASCII characters: dashes, left and right quotes, long underscores, multiplication signs in place of x's, etc. So I ask you: how can I copy non-ASCII into my Vim session without error? Is there a paste mode that automatically 'down samples' Unicode to ASCII? Is there a quick/dirty search for non-ASCII characters in a file?

  • How do I use compiler intrinsic __fmul_?

    - by Eric Thoma
    I am writing a massively parallel GPU application. I have been optimizing it by hand. I got a 20% performance increase with __fdividef(x, y), and according to the CUDA C Programming Guide (section C.2.1), using similar functions for multiplication and addition is also beneficial. The function is stated as this: "__fmul_[rn,rz,ru,rd]". __fdividef(x,y) was not stated with the arguments in brackets. I was wondering, what are those brackets? If I run the simple code:

        int t = __fmul_(5,4);

    I get a compiler error about how __fmul_ is undefined. I have the CUDA runtime included, so I don't think it is a setup thing; rather it is something to do with those square brackets. How do I correctly use this function? Thank you.

  • Potential problems porting to different architectures

    - by Brendan Long
    I'm writing a Linux program that currently compiles and works fine on x86 and x86_64, and now I'm wondering if there's anything special I'll need to do to make it work on other architectures. What I've heard is that for cross-platform code I should:

      - Don't assume anything about the size of a pointer, int or size_t
      - Don't make assumptions about byte order (I don't do any bit shifting -- I assume gcc will optimize my power of two multiplication/division for me)
      - Don't use assembly blocks (obvious)
      - Make sure your libraries work (I'm using SQLite, libcurl and Boost, which all seem pretty cross-platform)

    Is there anything else I need to worry about? I'm not currently targeting any other architectures, but I expect to support ARM at some point, and I figure I might as well make it work on any architecture if I can. Also, regarding my second point about byte order, do I need to do anything special with text input? I read files with getline(), so it seems like that should be done automatically as well.

  • Division inaccurate in Javascript?

    - by Nate
    If I perform the following operation in Javascript:

        0.06120*400

    The result is 24.48. However, if I do this:

        24.48/400

    The result is:

        0.061200000000000004

    JSFiddle: http://jsfiddle.net/zcDH7/

    So it appears that Javascript rounds things differently when doing division and multiplication? Using my calculator, the operation 24.48/400 results in the correct answer of 0.0612. How should I deal with Javascript's inaccurate division? I can't simply round the number off, because I will be dealing with numbers of varying precision. Thanks for your advice.
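The effect described in this question comes from binary floating point rather than from division as such: neither 24.48 nor 0.0612 can be stored exactly as a double. The sketch below uses Python only for illustration (the question itself is about JavaScript, whose numbers are also IEEE 754 doubles). It contrasts the double result with exact decimal arithmetic and with rounding to a fixed number of significant digits; the choice of 12 significant digits is an arbitrary assumption.

```python
from decimal import Decimal
import math

# Double-precision division: the quotient is the nearest representable
# double to the true result, which is not exactly 0.0612.
quotient = 24.48 / 400

# Exact decimal arithmetic avoids the binary representation issue.
exact = Decimal("24.48") / Decimal("400")

# Rounding to a chosen number of significant digits (12 is an assumption)
# hides the trailing noise when the double is displayed.
rounded = float(f"{quotient:.12g}")

# Comparing with a tolerance instead of testing for exact equality.
close_enough = math.isclose(quotient, 0.0612)

print(repr(quotient), exact, rounded, close_enough)
```

Which of the three approaches fits depends on the application: exact decimals for money-like values, significant-digit rounding for display, and tolerance checks for comparisons.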

  • Code Golf: All +-*/ Combinations for 3 integers

    - by Flash84x
    Write a program that takes 3 integers separated by spaces, performs every possible combination of addition, subtraction, multiplication and division operations, and displays each result with the operation combination used. Example:

        $ ./solution 1 2 3

    results in the following output:

        1+2+3 = 6
        1-2-3 = -4
        1*2*3 = 6
        1/2/3 = 0 (integer answers only, round up at .5)
        1*2-3 = -1
        3*1+2 = 5
        etc...

    Order of operations rules apply. Assume there will be no parentheses used, i.e. (3-1)*2 = 4 is not a combination, although you could implement this for "extra credit". For results where a divide by 0 occurs, simply return NaN.

  • Matlab - Propagate unit vectors on to the edge of shape boundaries

    - by Graham
    Hi, I have a set of unit vectors which I want to propagate on to the edge of a shape boundary defined by a binary image. The shape boundary is defined by a 1px wide white edge. I also have the coordinates of these points stored in a 2 row by n column matrix. The shape forms a concave boundary with no holes within itself, made of around 2500 points. What would be the best method to do this? Is there some sort of ray tracing algorithm that could be used? Or would it be a case of taking the unit vector, multiplying it by a scalar, and testing after the multiplication whether the end point of the vector is outside the shape boundary, and then, when the end point is outside the shape, just finding the point of intersection? Thank you very much in advance for any help!

  • Solving Algorithm notation

    - by neednewname
    Use big-O notation to classify the traditional grade school algorithms for addition and multiplication. That is, if asked to add two numbers each having N digits, how many individual additions must be performed? If asked to multiply two N-digit numbers, how many individual multiplications are required?

    Suppose f is a function that returns the result of reversing the string of symbols given as its input, and g is a function that returns the concatenation of the two strings given as its input. If x is the string hrwa, what is returned by g(f(x),x)? Explain your answer - don't just provide the result!
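Both parts of this exercise can be checked mechanically. The Python sketch below is only an illustration under stated assumptions: f and g are implemented exactly as the exercise describes them, and count_digit_products is a hypothetical helper that counts the single-digit multiplications in the grade school method by pairing every digit of one number with every digit of the other.

```python
def f(s):
    """Reverse the string of symbols given as input."""
    return s[::-1]


def g(a, b):
    """Concatenate the two input strings."""
    return a + b


def count_digit_products(n):
    """Count single-digit multiplications for two n-digit numbers."""
    count = 0
    for _ in range(n):        # each digit of the second number...
        for _ in range(n):    # ...meets each digit of the first number
            count += 1
    return count


x = "hrwa"
result = g(f(x), x)           # f(x) is "awrh", so this is "awrh" + "hrwa"
products_for_4_digits = count_digit_products(4)
```

The nested loop makes the quadratic growth of grade school multiplication visible, whereas column-by-column addition touches each of the N digit positions once.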

  • How can I rewrite Verilog code to remove extra reg?

    - by EquinoX
    How can I rewrite the code below so that I don't need to have an extra reg mul? I just want to take the 32 bits of the resulting 32 * 32 bit multiplication and put them into Result:

        input signed[31:0] Reg1;
        input signed[31:0] Reg2;
        output[31:0] Result;
        reg signed[31:0] Result;
        reg[63:0] mul;
        mul = Reg1 * Reg2;
        Result = mul[31:0];

  • Efficiently draw a grid in Windows Forms

    - by Joel
    I'm writing an implementation of Conway's Game of Life in C#. This is the code I'm using to draw the grid; it's in my panel_Paint event. g is the graphics context.

        for (int y = 0; y < numOfCells * cellSize; y += cellSize)
        {
            for (int x = 0; x < numOfCells * cellSize; x += cellSize)
            {
                g.DrawLine(p, x, 0, x, y + numOfCells * cellSize);
                g.DrawLine(p, 0, x, y + size * drawnGrid, x);
            }
        }

    When I run my program, it is unresponsive until it finishes drawing the grid, which takes a few seconds at numOfCells = 100 & cellSize = 10. Removing all the multiplication makes it faster, but not by very much. Is there a better/more efficient way to draw my grid? Thanks

  • Multiplying numbers on horizontal, vertical, and diagonal lines

    - by untwisted
    I'm currently working on a Project Euler problem (www.projecteuler.net) for fun but have hit a stumbling block. One of the problems provides a 20x20 grid of numbers and asks for the greatest product of 4 numbers on a straight line. This line can be either horizontal, vertical, or diagonal. Using a procedural language I'd have no problem solving this, but part of my motivation for doing these problems in the first place is to gain more experience and learn more Haskell. As of right now I'm reading in the grid and converting it to a list of lists of ints, i.e. [[Int]]. This makes the horizontal multiplication trivial, and by transposing this grid the vertical also becomes trivial. The diagonal is what is giving me trouble. I've thought of a few ways where I could use explicit array slicing or indexing to get a solution, but it seems overly complicated and hacky. I believe there is probably an elegant, functional solution here, and I'd love to hear what others can come up with.

  • Special simple random number generator

    - by psihodelia
    How can I create a function which generates a random integer on every call? This number must be as random as possible (according to a uniform distribution). It is only allowed to use one static variable and at most 3 elementary steps, where each step consists of only one basic arithmetic operation of arity 1 or 2. Example:

        int myrandom(void){
            static int x;
            x = some_step1;
            x = some_step2;
            x = some_step3;
            return x;
        }

    Basic arithmetic operations are +, -, %, and, not, xor, or, left shift, right shift, multiplication and division. Of course, no rand(), random() or similar stuff is allowed.

  • iPhone SDK Zoom and refresh PDF with Quartz

    - by Ben
    Looking at the QuartzDemo sample application, I love the speed of the PDF rendering using Quartz alone (that is, without using UIWebView). However, when I zoom in on the PDF, it doesn't seem to become clearer like it does in the PDF view. Is there something that I can change to get the same effect when zooming in and out using multitouch, like manipulating the PDF transformation matrix or something? Thanks a bunch. --Ben

  • MATLAB: vectorized array creation from a list of start/end indices

    - by merv
    I have a two-column matrix M that contains the start/end indices of a bunch of intervals:

        startInd  EndInd
        1         3
        6         10
        12        12
        15        16

    How can I generate a vector of all the interval indices:

        v = [1 2 3 6 7 8 9 10 12 15 16];

    I'm doing the above using loops, but I'm wondering if there's a more elegant vectorized solution?

        v = [];
        for i=1:size(M,1)
            v = [v M(i,1):M(i,2)];
        end
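The expansion this question describes can be expressed without an explicit accumulation loop. The sketch below is written in Python rather than MATLAB, purely to illustrate the idea of expanding each inclusive interval and flattening the results; it is not a MATLAB-vectorized answer, and the name expand_intervals is hypothetical.

```python
from itertools import chain

# The start/end pairs from the question, with inclusive end indices.
M = [(1, 3), (6, 10), (12, 12), (15, 16)]


def expand_intervals(intervals):
    """Flatten inclusive (start, end) pairs into one list of indices."""
    return list(chain.from_iterable(range(start, end + 1)
                                    for start, end in intervals))


v = expand_intervals(M)
```

Note the end + 1: Python ranges exclude their upper bound, whereas the MATLAB colon operator in the question includes it.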

  • Need good RDLC examples/samples

    - by Sachin
    I am in the evaluation phase for a reporting tool, and I prefer RDLC for it. But I need some examples/samples available in the wild which can guide us on using RDLC off the shelf. I am looking for examples ranging from something as simple as a list of data to something as complex as a matrix, calculations, grouping, etc. This will give us a reference point if we get stuck somewhere.

  • Do any JS implementations currently support (or have support on the roadmap for) fast, vectorized op

    - by agnoster
    I'd like to do a bit of matrix/vector arithmetic in JavaScript, and was wondering if any browsers or other JS implementations actually have support for vectorized operations, for instance for quickly summing the entries of two Arrays (or summing, or whatever). Even if that currently doesn't mean it compiles down to vectorized operations, at least some language support would be nice for when it does get implemented - I'd take the existence of functions or syntax to support it as a step in the right direction. (Understandably, "vectorization javascript" searches are pretty much all about graphics and SVG.)

  • selecting the pixels from an image in opencv

    - by ajith
    Hi everyone, this is a refined version of my previous question http://stackoverflow.com/questions/2602628/computing-matting-laplacian-matrix-of-an-image. I actually want to do the following operation: the summation over all k | (i,j) ∈ wk of [(Ii-µk)*(Ij-µk)], where wk is a 3x3 window and µk is the mean of wk. I don't know how to select Ii and Ij separately from an image which is 2-dimensional [Iij]. Or does the equation mean something else? Please, someone help me.
