Search Results

Search found 1354 results on 55 pages for 'compute scalar'.

Page 45/55 | < Previous Page | 41 42 43 44 45 46 47 48 49 50 51 52  | Next Page >

  • fast similarity detection

    - by reinierpost
    I have a large collection of objects and I need to figure out the similarities between them. To be exact: given two objects I can compute their dissimilarity as a number, a metric - higher values mean less similarity and 0 means the objects have identical contents. The cost of computing this number is proportional to the size of the smaller object (each object has a given size). I need the ability to quickly find, given an object, the set of objects similar to it. To be exact: I need to produce a data structure that maps any object o to the set of objects no more dissimilar to o than d, for some dissimilarity value d, such that listing the objects in the set takes no more time than if they were in an array or linked list (and perhaps they actually are). Typically, the set will be very much smaller than the total number of objects, so it is really worthwhile to perform this computation. It's good enough if the data structure assumes a fixed d, but if it works for an arbitrary d, even better. Have you seen this problem before, or something similar to it? What is a good solution? To be exact: a straightforward solution involves computing the dissimilarities between all pairs of objects, but this is slow - O(n2) where n is the number of objects. Is there a general solution with lower complexity?

    Read the article

  • 3x3 Sobel operator and gradient features

    - by pithyless
    Reading a paper, I'm having difficulty understanding the algorithm described: Given a black and white digital image of a handwriting sample, cut out a single character to analyze. Since this can be any size, the algorithm needs to take this into account (if it will be easier, we can assume the size is 2^n x 2^m). Now, the description states given this image we will convert it to a 512-bit feature (a 512-bit hash) as follows: (192 bits) computes the gradient of the image by convolving it with a 3x3 Sobel operator. The direction of the gradient at every edge is quantized to 12 directions. (192 bits) The structural feature generator takes the gradient map and looks in a neighborhood for certain combinations of gradient values. (used to compute 8 distinct features that represent lines and corners in the image) (128 bits) Concavity generator uses an 8-point star operator to find coarse concavities in 4 directions, holes, and lagrge-scale strokes. The image feature maps are normalized with a 4x4 grid. I'm for now struggling with how to take an arbitrary image, split into 16 sections, and using a 3x3 Sobel operator to come up with 12 bits for each section. (But if you have some insight into the other parts, feel free to comment :)

    Read the article

  • Bitwise Interval Arithmetic

    - by KennyTM
    I've recently read an interesting thread on the D newsgroup, which basically asks, Given two signed integers a ∈ [amin, amax], b ∈ [bmin, bmax], what is the tightest interval of a | b? I'm think if interval arithmetics can be applied on general bitwise operators (assuming infinite bits). The bitwise-NOT and shifts are trivial since they just corresponds to -1 − x and 2n x. But bitwise-AND/OR are a lot trickier, due to the mix of bitwise and arithmetic properties. Is there a polynomial-time algorithm to compute the intervals of bitwise-AND/OR? Note: Assume all bitwise operations run in linear time (of number of bits), and test/set a bit is constant time. The brute-force algorithm runs in exponential time. Because ~(a | b) = ~a & ~b and a ^ b = (a | b) & ~(a & b), solving the bitwise-AND and -NOT problem implies bitwise-OR and -XOR are done. Although the content of that thread suggests min{a | b} = max(amin, bmin), it is not the tightest bound. Just consider [2, 3] | [8, 9] = [10, 11].)

    Read the article

  • Posting xml from classic asp to asp.net

    - by Chris Dunaway
    I apologize if this has been asked before. I searched and didn't find anything that matched my situation. Also bear in mind I am fairly new to asp/asp.net development. My current project is a relatively simple e-commerce site. The customer will connect to the site, select products, input shipping and billing information, payment information (credit card) and submit the order. The project is being split into two parts: The store front which includes displaying the items and taking the customer's shipping and billing information and the payment site which will collect the customers credit card, compute tax, and save the order into the company's system. The reason that the site was split up, was that our side (payment side) already has facilities for credit card handling and tax computation. There may also be some regulatory issues that the store front side does not want to deal with (which we already do). I'm working on the payment portion of the app and I am using asp.net. The store front side is being written in classic asp (not my decision). Each part will be hosted on different servers. The problem I am having is transferring the contents of the "shopping cart" to our app so that we can collect the cc info and submit the order. We had thought that the classic asp could somehow post an xml fragment which contains the billing/shipping info and the items selected. Our side would display a summary of the order, securely collect the credit card info, and submit the order to our system. But I have been unable to post or send the xml from a classic asp on one server, to our asp.net application on another. It all works just fine when I test on the same server. How can I post (or otherwise transfer) the shopping cart data from classic asp to asp.net across server boundaries and transfer control to the asp.net application? As I said, I am new to web development, so this is proving quite a challenge for me. Thanks

    Read the article

  • Static Data Structures on Embedded Devices (Android in particular)

    - by Mark
    I've started working on some Android applications and have a question regarding how people normally deal with situations where you have a static data set and have an application where that data is needed in memory as one of the standard java collections or as an array. In my current specific issue i have a spreadsheet with some pre-calculated data. It consists of ~100 rows and 3 columns. 1 Column is a string, 1 column is a float, 1 column is an integer. I need access to this data as an array in java. It seems like i could: 1) Encode in XML - This would be cpu intensive to decode in my experience. 2) build into SQLite database - seems like a lot of overhead for static access to data i only need array style access to in ram. 3) Build into binary blob and read in. (never done this in java, i miss void *) 4) Build a python script to take the CSV version of my data and spit out a java function that adds the values to my desired structure with hard coded values. 5) Store a string array via androids resource mechanism and compute the other 2 columns on application load. In my case the computation would require a lot of calls to Math.log, Math.pow and Math.floor which i'd rather not have to do for load time and battery usage reasons. I mostly work in low power embedded applications in C and as such #4 is what i'm used to doing in these situations. It just seems like it should be far easier to gain access to static data structures in java/android. Perhaps I'm just being too battery usage conscious and in my single case i imagine the answer is that it doesn't matter much, but if every application took that stance it could begin to matter. What approaches do people usually take in this situation? Anything I missed?

    Read the article

  • Two strange efficiency problems in Mathematica

    - by Jess Riedel
    FIRST PROBLEM I have timed how long it takes to compute the following statements (where V[x] is a time-intensive function call): Alice = Table[V[i],{i,1,300},{1000}]; Bob = Table[Table[V[i],{i,1,300}],{1000}]^tr; Chris_pre = Table[V[i],{i,1,300}]; Chris = Table[Chris_pre,{1000}]^tr; Alice, Bob, and Chris are identical matricies computed 3 slightly different ways. I find that Chris is computed 1000 times faster than Alice and Bob. It is not surprising that Alice is computed 1000 times slower because, naively, the function V must be called 1000 more times than when Chris is computed. But it is very surprising that Bob is so slow, since he is computed identically to Chris except that Chris stores the intermediate step Chris_pre. Why does Bob evaluate so slowly? SECOND PROBLEM Suppose I want to compile a function in Mathematica of the form f(x)=x+y where "y" is a constant fixed at compile time (but which I prefer not to directly replace in the code with its numerical because I want to be able to easily change it). If y's actual value is y=7.3, and I define f1=Compile[{x},x+y] f2=Compile[{x},x+7.3] then f1 runs 50% slower than f2. How do I make Mathematica replace "y" with "7.3" when f1 is compiled, so that f1 runs as fast as f2? Many thanks!

    Read the article

  • Required Working Precision for the BBP Algorithm?

    - by brainfsck
    Hello, I'm looking to compute the nth digit of Pi in a low-memory environment. As I don't have decimals available to me, this integer-only BBP algorithm in Python has been a great starting point. I only need to calculate one digit of Pi at a time. How can I determine the lowest I can set D, the "number of digits of working precision"? D=4 gives me many correct digits, but a few digits will be off by one. For example, computing digit 393 with precision of 4 gives me 0xafda, from which I extract the digit 0xa. However, the correct digit is 0xb. No matter how high I set D, it seems that testing a sufficient number of digits finds an one where the formula returns an incorrect value. I've tried upping the precision when the digit is "close" to another, e.g. 0x3fff or 0x1000, but cannot find any good definition of "close"; for instance, calculating at digit 9798 gives me 0xcde6 , which is not very close to 0xd000, but the correct digit is 0xd. Can anyone help me figure out how much working precision is needed to calculate a given digit using this algorithm? Thank you,

    Read the article

  • Execution time in nano seconds and related issues

    - by anup
    Hi All, I am using the following code to compute execution time in milli-secs. struct timespec tp; if (clock_gettime (CLOCK_REALTIME, &tp) == 0) return ((tp.tv_sec * 1000000000) + tp.tv_nsec); else return ; Can you please tell me whether this is correct? Let's name this function comptime_nano(). Now, I write the following code in main() to check execution times of following operations. unsigned long int a, b, s1, s3; a = (unsigned long int)(1) << 63; b = (unsigned long int)(1) << 63; btime = comptime_nano(); s1 = b >> 30; atime = comptime_nano(); printf ("Time =%ld for %lu\n", (atime - btime), s1); btime = comptime_nano(); s3 = a >> 1; atime = comptime_nano(); printf ("Time =%ld for %lu\n", (atime - btime), s3); To my surprise, the first operation takes about roughly 4 times more time than the second. Again, if I change the relative ordering of these operations, the respective timings change drastically. Please comment...

    Read the article

  • how to better (inambiguaously) use the terms CAPTCHA and various types of interactions?

    - by vgv8
    I am working on survey of state-of-the-art and trends of spam prevention techniques. I observe that non-intrusive, transparent to visitor spam prevention techniques (like context-based filtering or honey traps) are frequently called non-captcha. Is it correct understanding of term CAPTCHA which is "type of challenge-response [ 2 ]test used in computing to ensure that the response is not generated by a compute" [ 1 ] and challenge-response does not seem to imply obligatory human involvement. So, which understanding (definition) of term and classification I'd better to stick with? How would I better call CAPTCHA without direct human interaction in order to avoid ambiguity and confusion of terms understnding? How would I better (succinctly and unambiguously) coin the term for captchas requiring human interaction but without typing into textbox? How would I better (succinctly and unambiguously) coin the terms to mark the difference between human interaction with images (playing, drag&dropping, rearranging, clicking with images) vs. just recognizing them (and then typing into a textbox the answer without interaction with images)? PS. The problem is that recognition of a wiggled word in an image or typing the answer to question is also interaction and when I start to use the terms "interaction", "interactive", "captcha", "protection", "non-captcha", "non-interactive", "static", "dynamic", "visible", "hidden" the terms overlap ambiguously with which another (especailly because the definitions or their actual practice of usage are vague or contradictive). [ 1 ] http://en.wikipedia.org/wiki/CAPTCHA

    Read the article

  • How to detect hidden field tampering?

    - by Myron
    On a form of my web app, I've got a hidden field that I need to protect from tampering for security reasons. I'm trying to come up with a solution whereby I can detect if the value of the hidden field has been changed, and react appropriately (i.e. with a generic "Something went wrong, please try again" error message). The solution should be secure enough that brute force attacks are infeasible. I've got a basic solution that I think will work, but I'm not security expert and I may be totally missing something here. My idea is to render two hidden inputs: one named "important_value", containing the value I need to protect, and one named "important_value_hash" containing the SHA hash of the important value concatenated with a constant long random string (i.e. the same string will be used every time). When the form is submitted, the server will re-compute the SHA hash, and compare against the submitted value of important_value_hash. If they are not the same, the important_value has been tampered with. I could also concatenate additional values with the SHA's input string (maybe the user's IP address?), but I don't know if that really gains me anything. Will this be secure? Anyone have any insight into how it might be broken, and what could/should be done to improve it? Thanks!

    Read the article

  • PCA extended face recognition

    - by cMinor
    The state of the art says that we can use PCA to perform face recognition. like this, this or this I am working with a project that involves training a classifier to detect a person who is wearing glasess or hats or even a mustache. The purpose of doing this is to detect when a person that has robbed a bank, store, or have commeted some sort of crime(s) (we have their image in a database), enters a certain place ( historically we know these guys have robbed, so we should take care to avoid problems). We came first to have a distributed database with all images of criminals, then I thought to have a layer of them clasifying these criminals using accesories like hats, mustache or anything that hides their face etc... Then, to apply that knowledge to detect when a particular or a suspect person enters a comercial place. ( In practice when someone is going to rob not all the times they are using an accesorie...) What do you think about this idea of doing PCA to first detect principal components of the face and then the components of an accesory. I was thinking that maybe a probabilistic approach is better so we can compute the probability the criminal is the person that entered a place and call the respective authorities.

    Read the article

  • Finding subsets that can be completed to tuples without duplicates

    - by Jules
    We have a collection of sets A_1,..,A_n. The goal is to find new sets for each of the old sets. newA_i = {a_i in A_i such that there exist (a_1,..,a_n) in (A1,..,An) with no a_k = a_j for all k and j} So in words this says that we remove all the elements from A_i that can't be used to form a tuple (a_1,..a_n) from the sets (A_1,..,A_n) such that the tuple doesn't contain duplicates. My question is how to compute these new sets quickly. If you just implement this definition by generating all possible v's this will take exponential time. Do you know a better algorithm? Edit: here's an example. Take A_1 = {1,2,3,4} A_2 = {2}. Now the new sets look like this: newA_1 = {1,3,4} newA_2 = {2} The 2 has been removed from A_1 because if you choose it the tuple will always be (2,2) which is invalid because it contains duplicates. On the other hand 1,3,4 are valid because (1,2), (3,2) and (4,2) are valid tuples. Another example: A_1 = {1,2,3} A_2 = {1,4,5} A_3 = {2,4,5} A_4 = {1,2,3} A_5 = {1,2,3} Now the new sets are: newA_1 = {1,2,3} newA_2 = {4,5} newA_3 = {4,5} newA_4 = {1,2,3} newA_5 = {1,2,3} The 1 and 2 are removed from sets 2 and 3 because if you choose the 1 or 2 from these sets you'll only have 2 values left for sets 1, 4 and 5, so you will always have duplicates in tuples that look like (_,1,_,_,_) or like (_,_,2,_,_).

    Read the article

  • GMail appearing to ignore Reply-To.

    - by Samuurai
    I'm using a gmail account to send emails from my website. I'm using the same account to pick up emails which are generated by the contact facility on my site. I'm using the Reply-To field to attempt to make it easier to hit reply and easily get back to people. The message comes up with the 'from' address and ignores the 'reply-to' address. Here's my header: Return-Path: <[email protected]> Received: from svr1 (ec2-79-125-266-266.eu-west-1.compute.amazonaws.com [79.125.266.266]) by mx.google.com with ESMTPS id u14sm23273123gvf.17.2010.03.10.14.33.24 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 10 Mar 2010 14:33:25 -0800 (PST) Received: from localhost ([127.0.0.1] helo=www.rds.com) by aquacouture with esmtp (Exim 4.69) (envelope-from <[email protected]>) id 1NpUSx-0001dK-JM for [email protected]; Wed, 10 Mar 2010 22:33:23 +0000 User-Agent: CodeIgniter Date: Wed, 10 Mar 2010 22:33:23 +0000 From: "New Inquiry" <[email protected]> Reply-To: "Beren" <[email protected]> To: [email protected] Subject: =?utf-8?Q?Test?= X-Sender: [email protected] X-Mailer: CodeIgniter X-Priority: 3 (Normal) Message-ID: <[email protected]> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="B_ALT_4b981e3390ccd" This is a multi-part message in MIME format. Your email application may not support this format. --B_ALT_4b981e3390ccd Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit test --B_ALT_4b981e3390ccd Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable test --B_ALT_4b981e3390ccd--

    Read the article

  • Converting python collaborative filtering code to use Map Reduce

    - by Neil Kodner
    Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

    Read the article

  • Find max integer size that a floating point type can handle without loss of precision

    - by Checkers
    Double has range more than a 64-bit integer, but its precision is less dues to its representation (since double is 64-bit as well, it can't fit more actual values). So, when representing larger integers, you start to lose precision in the integer part. #include <boost/cstdint.hpp> #include <limits> template<typename T, typename TFloat> void maxint_to_double() { T i = std::numeric_limits<T>::max(); TFloat d = i; std::cout << std::fixed << i << std::endl << d << std::endl; } int main() { maxint_to_double<int, double>(); maxint_to_double<boost::intmax_t, double>(); maxint_to_double<int, float>(); return 0; } This prints: 2147483647 2147483647.000000 9223372036854775807 9223372036854775800.000000 2147483647 2147483648.000000 Note how max int can fit into a double without loss of precision and boost::intmax_t (64-bit in this case) cannot. float can't even hold an int. Now, the question: is there a way in C++ to check if the entire range of a given integer type can fit into a loating point type without loss of precision? Preferably, it would be a compile-time check that can be used in a static assertion, and would not involve enumerating the constants the compiler should know or can compute.

    Read the article

  • Small-o(n^2) implementation of Polynomial Multiplication

    - by AlanTuring
    I'm having a little trouble with this problem that is listed at the back of my book, i'm currently in the middle of test prep but i can't seem to locate anything regarding this in the book. Anyone got an idea? A real polynomial of degree n is a function of the form f(x)=a(n)x^n+?+a1x+a0, where an,…,a1,a0 are real numbers. In computational situations, such a polynomial is represented by a sequence of its coefficients (a0,a1,…,an). Assuming that any two real numbers can be added/multiplied in O(1) time, design an o(n^2)-time algorithm to compute, given two real polynomials f(x) and g(x) both of degree n, the product h(x)=f(x)g(x). Your algorithm should **not** be based on the Fast Fourier Transform (FFT) technique. Please note it needs to be small-o(n^2), which means it complexity must be sub-quadratic. The obvious solution that i have been finding is indeed the FFT, but of course i can't use that. There is another method that i have found called convolution, where if you take polynomial A to be a signal and polynomial B to be a filter. A passed through B yields a shifted signal that has been "smoothed" by A and the resultant is A*B. This is supposed to work in O(n log n) time. Of course i am completely unsure of implementation. If anyone has any ideas of how to achieve a small-o(n^2) implementation please do share, thanks.

    Read the article

  • How would you design a question/answer view (iPhone SDK)

    - by Aurélien Vallée
    I'm new to iPhone development, and I have a question on how to create a view for my application. The view should display a problem (using formatted/syntax highlighted text), and multiple possible answers. The user should be able to click on an answer to validate it. Currently, I am trying to use a UITableView embedding UIWebView as contentView. That allows me to display formatted text easily. The problem is that it is a real pain to compute and adjust the height of the cells. I have to preload the webview, call sizeToFit, get its height, and update the cell accordingly. This process should be done for the problem and the answers (as they are HTML formatted text too). It's such a pain that I am planning to switch to something else. I thought using only a big UIWebView and design everything in HTML. But I looked at some articles describing how to communicate between the HTML page and the ObjectiveC code. This seems to involve some awful tricks too... So... that's it, I don't really know what I should do. I guess some of you dealt with such things before, and would provide some greatly appreciated tips :)

    Read the article

  • Iterative Cartesian Product in Java

    - by akappa
    Hi, I want to compute the cartesian product of an arbitrary number of nonempty sets in Java. I've wrote that iterative code... public static <T> List<Set<T>> cartesianProduct(List<Set<T>> list) { List<Iterator<T>> iterators = new ArrayList<Iterator<T>>(list.size()); List<T> elements = new ArrayList<T>(list.size()); List<Set<T>> toRet = new ArrayList<Set<T>>(); for (int i = 0; i < list.size(); i++) { iterators.add(list.get(i).iterator()); elements.add(iterators.get(i).next()); } for (int j = 1; j >= 0;) { toRet.add(Sets.newHashSet(elements)); for (j = iterators.size()-1; j >= 0 && !iterators.get(j).hasNext(); j--) { iterators.set(j, list.get(j).iterator()); elements.set(j, iterators.get(j).next()); } elements.set(Math.abs(j), iterators.get(Math.abs(j)).next()); } return toRet; } ...but I found it rather inelegant. Someone has a better, still iterative solution? A solution that uses some wonderful functional-like approach? Otherwise... suggestion about how to improve it? Errors? Thanks :)

    Read the article

  • Neural Network with softmax activation

    - by Cambium
    This is more or less a research project for a course, and my understanding of NN is very/fairly limited, so please be patient :) ============== I am currently in the process of building a neural network that attempts to examine an input dataset and output the probability/likelihood of each classification (there are 5 different classifications). Naturally, the sum of all output nodes should add up to 1. Currently, I have two layers, and I set the hidden layer to contain 10 nodes. I came up with two different types of implementations 1) Logistic sigmoid for hidden layer activation, softmax for output activation 2) Softmax for both hidden layer and output activation I am using gradient descent to find local maximums in order to adjust the hidden nodes' weights and the output nodes' weights. I am certain in that I have this correct for sigmoid. I am less certain with softmax (or whether I can use gradient descent at all), after a bit of researching, I couldn't find the answer and decided to compute the derivative myself and obtained softmax'(x) = softmax(x) - softmax(x)^2 (this returns an column vector of size n). I have also looked into the MATLAB NN toolkit, the derivative of softmax provided by the toolkit returned a square matrix of size nxn, where the diagonal coincides with the softmax'(x) that I calculated by hand; and I am not sure how to interpret the output matrix. I ran each implementation with a learning rate of 0.001 and 1000 iterations of back propagation. However, my NN returns 0.2 (an even distribution) for all five output nodes, for any subset of the input dataset. My conclusions: o I am fairly certain that my gradient of descent is incorrectly done, but I have no idea how to fix this. o Perhaps I am not using enough hidden nodes o Perhaps I should increase the number of layers Any help would be greatly appreciated! The dataset I am working with can be found here (processed Cleveland): http://archive.ics.uci.edu/ml/datasets/Heart+Disease

    Read the article

  • How can I factor out repeated expressions in an SQL Query? Column aliases don't seem to be the ticke

    - by Weston C
    So, I've got a query that looks something like this: SELECT id, DATE_FORMAT(CONVERT_TZ(callTime,'+0:00','-7:00'),'%b %d %Y') as callDate, DATE_FORMAT(CONVERT_TZ(callTime,'+0:00','-7:00'),'%H:%i') as callTimeOfDay, SEC_TO_TIME(callLength) as callLength FROM cs_calldata WHERE customerCode='999999-abc-blahblahblah' AND CONVERT_TZ(callTime,'+0:00','-7:00') >= '2010-04-25' AND CONVERT_TZ(callTime,'+0:00','-7:00') <= '2010-05-25' If you're like me, you probably start thinking that maybe it would improve readability and possibly the performance of this query if I wasn't asking it to compute CONVERT_TZ(callTime,'+0:00','-7:00') four separate times. So I try to create a column alias for that expression and replace further occurances with that alias: SELECT id, CONVERT_TZ(callTime,'+0:00','-7:00') as callTimeZoned, DATE_FORMAT(callTimeZoned,'%b %d %Y') as callDate, DATE_FORMAT(callTimeZoned,'%H:%i') as callTimeOfDay, SEC_TO_TIME(callLength) as callLength FROM cs_calldata WHERE customerCode='5999999-abc-blahblahblah' AND callTimeZoned >= '2010-04-25' AND callTimeZoned <= '2010-05-25' This is when I learned, to quote the MySQL manual: Standard SQL disallows references to column aliases in a WHERE clause. This restriction is imposed because when the WHERE clause is evaluated, the column value may not yet have been determined. So, that approach would seem to be dead in the water. How is someone writing queries with recurring expressions like this supposed to deal with it?

    Read the article

  • Efficient way of calculating average difference of array elements from array average value

    - by Saysmaster
    Is there a way to calculate the average distance of array elements from array average value, by only "visiting" each array element once? (I search for an algorithm) Example: Array : [ 1 , 5 , 4 , 9 , 6 ] Average : ( 1 + 5 + 4 + 9 + 6 ) / 5 = 5 Distance Array : [|1-5|, |5-5|, |4-5|, |9-5|, |6-5|] = [4 , 0 , 1 , 4 , 1 ] Average Distance : ( 4 + 0 + 1 + 4 + 1 ) / 5 = 2 The simple algorithm needs 2 passes. 1st pass) Reads and accumulates values, then divides the result by array length to calculate average value of array elements. 2nd pass) Reads values, accumulates each one's distance from the previously calculated average value, and then divides the result by array length to find the average distance of the elements from the average value of the array. The two passes are identical. It is the classic algorithm of calculating the average of a set of values. The first one takes as input the elements of the array, the second one the distances of each element from the array's average value. Calculating the average can be modified to not accumulate the values, but caclulating the average "on the fly" as we sequentialy read the elements from the array. The formula is: Compute Running Average of Array's elements ------------------------------------------- RA[i] = E[i] {for i == 1} RA[i] = RA[i-1] - RA[i-1]/i + A[i]/i { for i > 1 } Where A[x] is the array's element at position x, RA[x] is the average of the array's elements between position 1 and x (running average). My question is: Is there a similar algorithm, to calculate "on the fly" (as we read the array's elements), the average distance of the elements from the array's mean value? The problem is that, as we read the array's elements, the final average value of the array is not known. Only the running average is known. So calculating differences from the running average will not yield the correct result. I suppose, if such algorithm exists, it probably should have the "ability" to compensate, in a way, on each new element read for the error calculated as far.

    Read the article

  • Quickest algorithm for finding sets with high intersection

    - by conradlee
    I have a large number of user IDs (integers), potentially millions. These users all belong to various groups (sets of integers), such that there are on the order of 10 million groups. To simplify my example and get to the essence of it, let's assume that all groups contain 20 user IDs (i.e., all integer sets have a cardinality of 100). I want to find all pairs of integer sets that have an intersection of 15 or greater. Should I compare every pair of sets? (If I keep a data structure that maps userIDs to set membership, this would not be necessary.) What is the quickest way to do this? That is, what should my underlying data structure be for representing the integer sets? Sorted sets, unsorted---can hashing somehow help? And what algorithm should I use to compute set intersection)? I prefer answers that relate C/C++ (especially STL), but also any more general, algorithmic insights are welcome. Update Also, note that I will be running this in parallel in a shared memory environment, so ideas that cleanly extend to a parallel solution are preferred.

    Read the article

  • What are practical guidelines for evaluating a language's "Turing Completeness"?

    - by AShelly
    I've read "what-is-turing-complete" and the wikipedia page, but I'm less interested in a formal proof than in the practical implications of being Turing Complete. What I'm actually trying to decide is if the toy language I've just designed could be used as a general-purpose language. I know I can prove it is if I can write a Turing machine with it. But I don't want to go through that exercise until I'm fairly certain of success. Is there a minimum set of features without which Turing Completeness is impossible? Is there a set of features which virtually guarantees completeness? (My guess is that conditional branching and a readable/writeable memory store will get me most of the way there) EDIT: I think I've gone off on a tangent by saying "Turing Complete". I'm trying to guess with reasonable confidence that a newly invented language with a certain feature set (or alternately, a VM with a certain instruction set) would be able to compute anything worth computing. I know proving you can building a Turing machine with it is one way, but not the only way. What I was hoping for was a set of guidelines like: "if it can do X,Y,and Z, it can probably do anything".

    Read the article

  • CUDA: When to use shared memory and when to rely on L1 caching?

    - by Roger Dahl
    After Compute Capability 2.0 (Fermi) was released, I've wondered if there are any use cases left for shared memory. That is, when is it better to use shared memory than just let L1 perform its magic in the background? Is shared memory simply there to let algorithms designed for CC < 2.0 run efficiently without modifications? To collaborate via shared memory, threads in a block write to shared memory and synchronize with __syncthreads(). Why not simply write to global memory (through L1), and synchronize with __threadfence_block()? The latter option should be easier to implement since it doesn't have to relate to two different locations of values, and it should be faster because there is no explicit copying from global to shared memory. Since the data gets cached in L1, threads don't have to wait for data to actually make it all the way out to global memory. With shared memory, one is guaranteed that a value that was put there remains there throughout the duration of the block. This is as opposed to values in L1, which get evicted if they are not used often enough. Are there any cases where it's better too cache such rarely used data in shared memory than to let the L1 manage them based on the usage pattern that the algorithm actually has?

    Read the article

  • How can I make this method more Scalalicious

    - by Neil Chambers
    I have a function that calculates the left and right node values for some collection of treeNodes given a simple node.id, node.parentId association. It's very simple and works well enough...but, well, I am wondering if there is a more idiomatic approach. Specifically is there a way to track the left/right values without using some externally tracked value but still keep the tasty recursion. /* * A tree node */ case class TreeNode(val id:String, val parentId: String){ var left: Int = 0 var right: Int = 0 } /* * a method to compute the left/right node values */ def walktree(node: TreeNode) = { /* * increment state for the inner function */ var c = 0 /* * A method to set the increment state */ def increment = { c+=1; c } // poo /* * the tasty inner method * treeNodes is a List[TreeNode] */ def walk(node: TreeNode): Unit = { node.left = increment /* * recurse on all direct descendants */ treeNodes filter( _.parentId == node.id) foreach (walk(_)) node.right = increment } walk(node) } walktree(someRootNode) Edit - The list of nodes is taken from a database. Pulling the nodes into a proper tree would take too much time. I am pulling a flat list into memory and all I have is an association via node id's as pertains to parents and children. Adding left/right node values allows me to get a snapshop of all children (and childrens children) with a single SQL query. The calculation needs to run very quickly in order to maintain data integrity should parent-child associations change (which they do very frequently). In addition to using the awesome Scala collections I've also boosted speed by using parallel processing for some pre/post filtering on the tree nodes. I wanted to find a more idiomatic way of tracking the left/right node values. After looking at the answers listed I have settled on this synthesised version: def walktree(node: TreeNode) = { def walk(node: TreeNode, counter: Int): Int = { node.left = counter node.right = treeNodes .filter( _.parentId == node.id) .foldLeft(counter+1) { (counter, curnode) => walk(curnode, counter) + 1 } node.right } walk(node,1) }

    Read the article

< Previous Page | 41 42 43 44 45 46 47 48 49 50 51 52  | Next Page >