Search Results

Search found 8262 results on 331 pages for 'optimization algorithm'.

Page 126/331 | < Previous Page | 122 123 124 125 126 127 128 129 130 131 132 133  | Next Page >

  • incremental way of counting quantiles for large set of data

    - by Gacek
    I need to count the quantiles for a large set of data. Let's assume we can get the data only through some portions (i.e. one row of a large matrix). To count the Q3 quantile one need to get all the portions of the data and store it somewhere, then sort it and count the quantile: List<double> allData = new List<double>(); foreach(var row in matrix) // this is only example. In fact the portions of data are not rows of some matrix { allData.AddRange(row); } allData.Sort(); double p = 0.75*allData.Count; int idQ3 = (int)Math.Ceiling(p) - 1; double Q3 = allData[idQ3]; Now, I would like to find a way of counting this without storing the data in some separate variable. The best solution would be to count some parameters od mid-results for first row and then adjust it step by step for next rows. Note: These datasets are really big (ca 5000 elements in each row) The Q3 can be estimated, it doesn't have to be an exact value. I call the portions of data "rows", but they can have different leghts! Usually it varies not so much (+/- few hundred samples) but it varies! This question is similar to this one: http://stackoverflow.com/questions/1058813/on-line-iterator-algorithms-for-estimating-statistical-median-mode-skewness But I need to count quantiles. ALso there are few articles in this topic, i.e.: http://web.cs.wpi.edu/~hofri/medsel.pdf http://portal.acm.org/citation.cfm?id=347195&dl But before I would try to implement these, I wanted to ask you if there are maybe any other, qucker ways of counting the 0.25/0.75 quantiles?

    Read the article

  • Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz c

    - by Microkernel
    Hi guys, I am implementing Naive Bayesian classifier for spam filtering. I have doubt on some calculation. Please clarify me what to do. Here is my question. In this method, you have to calculate P(S|W) - Probability that Message is spam given word W occurs in it. P(W|S) - Probability that word W occurs in a spam message. P(W|H) - Probability that word W occurs in a Ham message. So to calculate P(W|S), should I do (1) (Number of times W occuring in spam)/(total number of times W occurs in all the messages) OR (2) (Number of times word W occurs in Spam)/(Total number of words in the spam message) So, to calculate P(W|S), should I do (1) or (2)? (I thought it to be (2), but I am not sure, so plz clarify me) I am refering http://en.wikipedia.org/wiki/Bayesian_spam_filtering for the info by the way. I got to complete the implementation by this weekend :( Thanks and regards, MicroKernel :) @sth: Hmm... Shouldn't repeated occurrence of word 'W' increase a message's spam score? In the your approach it wouldn't, right?. Lets take a scenario and discuss... Lets say, we have 100 training messages, out of which 50 are spam and 50 are Ham. and say word_count of each message = 100. And lets say, in spam messages word W occurs 5 times in each message and word W occurs 1 time in Ham message. So total number of times W occuring in all the spam message = 5*50 = 250 times. And total number of times W occuring in all Ham messages = 1*50 = 50 times. Total occurance of W in all of the training messages = (250+50) = 300 times. So, in this scenario, how do u calculate P(W|S) and P(W|H) ? Naturally we should expect, P(W|S) P(W|H)??? right. Please share your thought...

    Read the article

  • how to develop a program to minimize errors in human transcription of hand written surveys

    - by Alex. S.
    I need to develop custom software to do surveys. Questions may be of multiple choice, or free text in a very few cases. I was asked to design a subsystem to check if there is any error in the manual data entry for the multiple choices part. We're trying to speed up the user data entry process and to minimize human input differences between digital forms and the original questionnaires. The surveys are filled with handwritten marks and text by human interviewers, so it's possible to find hard to read marks, or also the user could accidentally select a different value in some question, and we would like to avoid that. The software must include some automatic control to detect possible typing differences. Each answer of the multiple choice questions has the same probability of being selected. This question has two parts: The GUI. The most simple thing I have in mind is to implement the most usable design of the questions display: use of large and readable fonts and space generously the choices. Is there something else? For faster input, I would like to use drop down lists (favoring keyboard over mouse). Given the questions are grouped in sections, I would like to show the answers selected for the questions of that section, but this could slow down the process. Any other ideas? The error checking subsystem. What else can I do to minimize or to check human typos in the multiple choice questions? Is this a solvable problem? is there some statistical methodology to check values that were entered by the users are the same from the hand filled forms? For example, let's suppose the survey has 5 questions, and each has 4 options. Let's say I have n survey forms filled in paper by interviewers, and they're ready to be entered in the software, then how to minimize the accidental differences that can have the manual transcription of the n surveys, without having to double check everything in the 5 questions of the n surveys? My first suggestion is that at the end of the processing of all the hand filled forms, the software could choose some forms randomly to make a double check of the responses in a few instances, but on what criteria can I make this selection? This validation would be enough to cover everything in a significant way? The actual survey is nation level and it has 56 pages with over 200 questions in total, so it will be a lot of hand written pages by many people, and the intention is to reduce the likelihood of errors and to optimize speed in the data entry process. The surveys must filled in paper first, given the complications of taking laptops or handhelds with the interviewers.

    Read the article

  • Why does Java's hashCode() in String use 31 as a multiplier?

    - by jacobko
    In Java, the hash code for a String object is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. Why is 31 used as a multiplier? I understand that the multiplier should be a relatively large prime number. So why not 29, or 37, or even 97?

    Read the article

  • How can I group an array of rectangles into "Islands" of connected regions?

    - by Eric
    The problem I have an array of java.awt.Rectangles. For those who are not familiar with this class, the important piece of information is that they provide an .intersects(Rectangle b) function. I would like to write a function that takes this array of Rectangles, and breaks it up into groups of connected rectangles. Lets say for example, that these are my rectangles (constructor takes the arguments x, y, width,height): Rectangle[] rects = new Rectangle[] { new Rectangle(0, 0, 4, 2), //A new Rectangle(1, 1, 2, 4), //B new Rectangle(0, 4, 8, 2), //C new Rectangle(6, 0, 2, 2) //D } A quick drawing shows that A intersects B and B intersects C. D intersects nothing. A tediously drawn piece of ascii art does the job too: +-------+ +---+ ¦A+---+ ¦ ¦ D ¦ +-+---+-+ +---+ ¦ B ¦ +-+---+---------+ ¦ +---+ C ¦ +---------------+ Therefore, the output of my function should be: new Rectangle[][]{ new Rectangle[] {A,B,C}, new Rectangle[] {D} } The failed code This was my attempt at solving the problem: public List<Rectangle> getIntersections(ArrayList<Rectangle> list, Rectangle r) { List<Rectangle> intersections = new ArrayList<Rectangle>(); for(Rectangle rect : list) { if(r.intersects(rect)) { list.remove(rect); intersections.add(rect); intersections.addAll(getIntersections(list, rect)); } } return intersections; } public List<List<Rectangle>> mergeIntersectingRects(Rectangle... rectArray) { List<Rectangle> allRects = new ArrayList<Rectangle>(rectArray); List<List<Rectangle>> groups = new ArrayList<ArrayList<Rectangle>>(); for(Rectangle rect : allRects) { allRects.remove(rect); ArrayList<Rectangle> group = getIntersections(allRects, rect); group.add(rect); groups.add(group); } return groups; } Unfortunately, there seems to be an infinite recursion loop going on here. My uneducated guess would be that java does not like me doing this: for(Rectangle rect : allRects) { allRects.remove(rect); //... } Can anyone shed some light on the issue?

    Read the article

  • Converting to a column oriented array in Java

    - by halfwarp
    Although I have Java in the title, this could be for any OO language. I'd like to know a few new ideas to improve the performance of something I'm trying to do. I have a method that is constantly receiving an Object[] array. I need to split the Objects in this array through multiple arrays (List or something), so that I have an independent list for each column of all arrays the method receives. Example: List<List<Object>> column-oriented = new ArrayList<ArrayList<Object>>(); public void newObject(Object[] obj) { for(int i = 0; i < obj.length; i++) { column-oriented.get(i).add(obj[i]); } } Note: For simplicity I've omitted the initialization of objects and stuff. The code I've shown above is slow of course. I've already tried a few other things, but would like to hear some new ideas. How would you do this knowing it's very performance sensitive?

    Read the article

  • Is there a faster way to parse through a large file with regex quickly?

    - by Ray Eatmon
    Problem: Very very, large file I need to parse line by line to get 3 values from each line. Everything works but it takes a long time to parse through the whole file. Is it possible to do this within seconds? Typical time its taking is between 1 minute and 2 minutes. Example file size is 148,208KB I am using regex to parse through every line: Here is my c# code: private static void ReadTheLines(int max, Responder rp, string inputFile) { List<int> rate = new List<int>(); double counter = 1; try { using (var sr = new StreamReader(inputFile, Encoding.UTF8, true, 1024)) { string line; Console.WriteLine("Reading...."); while ((line = sr.ReadLine()) != null) { if (counter <= max) { counter++; rate = rp.GetRateLine(line); } else if(max == 0) { counter++; rate = rp.GetRateLine(line); } } rp.GetRate(rate); Console.ReadLine(); } } catch (Exception e) { Console.WriteLine("The file could not be read:"); Console.WriteLine(e.Message); } } Here is my regex: public List<int> GetRateLine(string justALine) { const string reg = @"^\d{1,}.+\[(.*)\s[\-]\d{1,}].+GET.*HTTP.*\d{3}[\s](\d{1,})[\s](\d{1,})$"; Match match = Regex.Match(justALine, reg, RegexOptions.IgnoreCase); // Here we check the Match instance. if (match.Success) { // Finally, we get the Group value and display it. string theRate = match.Groups[3].Value; Ratestorage.Add(Convert.ToInt32(theRate)); } else { Ratestorage.Add(0); } return Ratestorage; } Here is an example line to parse, usually around 200,000 lines: 10.10.10.10 - - [27/Nov/2002:16:46:20 -0500] "GET /solr/ HTTP/1.1" 200 4926 789

    Read the article

  • Data Structure for a particular problem??

    - by AGeek
    Hi, Which data structure can perform insertion, deletion and searching operation in O(1) time in the worst case. We may assume the set of elements are integers drawn from a finite set 1,2,...,n, and initialization can take O(n) time. I can only think of implementing a hash table. Implementing it with Trees will not give O(1) time complexity for any of the operation. Or is it possible?? Kindly share your views on this, or any other data structure apart from these.. Thanks..

    Read the article

  • How do I find all paths through a set of given nodes in a DAG?

    - by Hanno Fietz
    I have a list of items (blue nodes below) which are categorized by the users of my application. The categories themselves can be grouped and categorized themselves. The resulting structure can be represented as a Directed Acyclic Graph (DAG) where the items are sinks at the bottom of the graph's topology and the top categories are sources. Note that while some of the categories might be well defined, a lot is going to be user defined and might be very messy. Example: On that structure, I want to perform the following operations: find all items (sinks) below a particular node (all items in Europe) find all paths (if any) that pass through all of a set of n nodes (all items sent via SMTP from example.com) find all nodes that lie below all of a set of nodes (intersection: goyish brown foods) The first seems quite straightforward: start at the node, follow all possible paths to the bottom and collect the items there. However, is there a faster approach? Remembering the nodes I already passed through probably helps avoiding unnecessary repetition, but are there more optimizations? How do I go about the second one? It seems that the first step would be to determine the height of each node in the set, as to determine at which one(s) to start and then find all paths below that which include the rest of the set. But is this the best (or even a good) approach? The graph traversal algorithms listed at Wikipedia all seem to be concerned with either finding a particular node or the shortest or otherwise most effective route between two nodes. I think both is not what I want, or did I just fail to see how this applies to my problem? Where else should I read?

    Read the article

  • Find a repeated numbers out of 3 boxes

    - by james1
    I have 3 boxes, each box contain 10 piece of numbered paper (1 - 10) but there is a number the same in all 3 boxes eg: box1 has number 4 and box2 has number 4 and box3 also has number 4. How to find that repeated number in java with an efficient/fastest way possible?

    Read the article

  • Algorithm for dragging objects on a fixed grid

    - by FlyingStreudel
    Hello, I am working on a program for the mapping and playing of the popular tabletop game D&D :D Right now I am working on getting the basic functionality like dragging UI elements around, snapping to the grid and checking for collisions. Right now every object when released from the mouse immediately snaps to the nearest grid point. This causes an issue when something like a player object snaps to a grid point that has a wall -or other- adjacent. So essentially when the player is dropped they wind up with some of the wall covering them. This is fine and working as intended, however the problem is that now my collision detection is tripped whenever you try to move this player because its sitting underneath a wall and because of this you cant drag the player anymore. Here is the relevant code: void UIObj_MouseMove(object sender, MouseEventArgs e) { blocked = false; if (dragging) { foreach (UIElement o in ((Floor)Parent).Children) { if (o.GetType() != GetType() && o.GetType().BaseType == typeof(UIObj) && Math.Sqrt(Math.Pow(((UIObj)o).cX - cX, 2) + Math.Pow(((UIObj)o).cY - cY, 2)) < Math.Max(r.Height + ((UIObj)o).r.Height, r.Width + ((UIObj)o).r.Width)) { double Y = e.GetPosition((Floor)Parent).Y; double X = e.GetPosition((Floor)Parent).X; Geometry newRect = new RectangleGeometry(new Rect(Margin.Left + (X - prevX), Margin.Top + (Y - prevY), Margin.Right + (X - prevX), Margin.Bottom + (Y - prevY))); GeometryHitTestParameters ghtp = new GeometryHitTestParameters(newRect); VisualTreeHelper.HitTest(o, null, new HitTestResultCallback(MyHitTestResultCallback), ghtp); } } if (!blocked) { Margin = new Thickness(Margin.Left + (e.GetPosition((Floor)Parent).X - prevX), Margin.Top + (e.GetPosition((Floor)Parent).Y - prevY), Margin.Right + (e.GetPosition((Floor)Parent).X - prevX), Margin.Bottom + (e.GetPosition((Floor)Parent).Y - prevY)); InvalidateVisual(); } prevX = e.GetPosition((Floor)Parent).X; prevY = e.GetPosition((Floor)Parent).Y; cX = Margin.Left + r.Width / 2; cY = Margin.Top + r.Height / 2; } } internal virtual void SnapToGrid() { double xPos = Margin.Left; double yPos = Margin.Top; double xMarg = xPos % ((Floor)Parent).cellDim; double yMarg = yPos % ((Floor)Parent).cellDim; if (xMarg < ((Floor)Parent).cellDim / 2) { if (yMarg < ((Floor)Parent).cellDim / 2) { Margin = new Thickness(xPos - xMarg, yPos - yMarg, xPos - xMarg + r.Width, yPos - yMarg + r.Height); } else { Margin = new Thickness(xPos - xMarg, yPos - yMarg + ((Floor)Parent).cellDim, xPos - xMarg + r.Width, yPos - yMarg + ((Floor)Parent).cellDim + r.Height); } } else { if (yMarg < ((Floor)Parent).cellDim / 2) { Margin = new Thickness(xPos - xMarg + ((Floor)Parent).cellDim, yPos - yMarg, xPos - xMarg + ((Floor)Parent).cellDim + r.Width, yPos - yMarg + r.Height); } else { Margin = new Thickness(xPos - xMarg + ((Floor)Parent).cellDim, yPos - yMarg + ((Floor)Parent).cellDim, xPos - xMarg + ((Floor)Parent).cellDim + r.Width, yPos - yMarg + ((Floor)Parent).cellDim + r.Height); } } } Essentially I am looking for a simple way to modify the existing code to allow the movement of a UI element that has another one sitting on top of it. Thanks!

    Read the article

  • simple plot algorithm with autoscale

    - by adrin
    I need to implement a simple plotting component in C#(WPF to be more precise). What i have is a collection of data samples containing time (X axis) and a value (both double types). I have a drawing canvas of a fixed size (Width x Height) and a DrawLine method/function that can draw on it. The problem I am facing now is how do I draw the plot so that it is autoscaled? In other words how do I map the samples I have to actual pixels on my Width x Height canvas?

    Read the article

  • Lexographical sorting problem

    - by Shawn Mclean
    I'm doing a problem that says concatenate the words to generate the lexicographically lowest possible string. from a competition. Take for example this string: jibw ji jp bw jibw The actual output turns out to be: bw jibw jibw ji jp When I do sorting on this, I get: bw ji jibw jibw jp. Does this mean that this is not sorting? If it is sorting, does lexicographic sorting take into consideration pushing the shorter strings to the back or something? I've been doing some reading on lexigographical order and I dont see any point or scenarios on which this is used, do you have any?

    Read the article

  • Why is my producer-consumer blocking?

    - by User007
    My code is here: http://pastebin.com/Fi3h0E0P Here is the output 0 Should we take order today (y or n): y Enter order number: 100 More customers (y or n): n Stop serving customers right now. Passing orders to cooker: There are total of 1 order(s) 1 Roger, waiter. I am processing order #100 The goal is waiter must take orders and then give them to the cook. The waiter has to wait cook finishes all pizza, deliver the pizza, and then take new orders. I asked how P-V work in my previous post here. I don't think it has anything to do with \n consuming? I tried all kinds of combination of wait(), but none work. Where did I make a mistake? The main part is here: //Producer process if(pid > 0) { while(1) { printf("0"); P(emptyShelf); // waiter as P finds no items on shelf; P(mutex); // has permission to use the shelf waiter_as_producer(); V(mutex); // cooker now can use the shelf V(orderOnShelf); // cooker now can pickup orders wait(); printf("2"); P(pizzaOnShelf); P(mutex); waiter_as_consumer(); V(mutex); V(emptyShelf); printf("3 "); } } if(pid == 0) { while(1) { printf("1"); P(orderOnShelf); // make sure there is an order on shelf P(mutex); //permission to work cooker_as_consumer(); // take order and put pizza on shelf printf("return from cooker"); V(mutex); //release permission printf("just released perm"); V(pizzaOnShelf); // pizza is now on shelf printf("after"); wait(); printf("4"); } } So I imagine this is the execution path: enter waiter_as_producer, then go to child process (cooker), then transfer the control back to parent, finish waiter_as_consumer, switch back to child. The two waits switch back to parent (like I said I tried all possible wait() combination...).

    Read the article

  • question about LSD radix sort

    - by davit-datuashvili
    hello i have following code public class LSD{ public static int R=1<<8; public static int bytesword=4; public static void radixLSD(int a[],int l,int r){ int aux[]=new int[a.length]; for (int d=bytesword-1;d>=0;d--){ int i, j; int count[]=new int[R+1]; for ( j=0;j<R;j++) count[j]=0; for (i=l;i<=r;i++) count[digit(a[i],d)+1]++; for (j=1;j<R;j++) count[j]+=count[j-1]; for (i=l;i<=r;i++) aux[count[digit(a[i],d)]++]=a[i]; for (i=l;i<=r;i++) a[i]=aux[i-1]; } } public static void main(String[]args){ int a[]=new int[]{3,6,5,7,4,8,9}; radixLSD(a,0,a.length-1); for (int i=0;i<a.length;i++){ System.out.println(a[i]); } } public static int digit(int n,int d){ return (n>>d)&1; } } but it show me mistake java.lang.ArrayIndexOutOfBoundsException: -1 at LSD.radixLSD(LSD.java:19) at LSD.main(LSD.java:29) please help me

    Read the article

  • Where can I learn more about datastructure tricky questions?

    - by Sandbox
    I am relatively new to programming (around 1 year programming C#-winforms). Also, I come from a non CS background (no formal degree) Recently, while being interviewed for a job, I was asked about implementing a queue using a stack. I fumbled and wan't able to answer the question. After, the interview I could do it(had to spend some time). I have learnt (and think that I know it well) basic algorithms in datastructures using the book Data Structures: A Pseudocode Approach with C - Richard F. Gilberg (Author) . I want to know about sites/ books which have such questions along with answers. I think this will allow me to develop my CS specific problem solving skills. Any help is appreciated. BOUNTY: I am looking at some blog/website with datastructure and algorithms Q&A.

    Read the article

  • Find the shortest path in a graph which visits certain nodes.

    - by dmd
    I have a undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there's about a dozen labelled 'mustpass'. I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order). ( http://3e.org/local/maize-graph.png / http://3e.org/local/maize-graph.dot.txt is the graph in question - it represents a corn maze in Lancaster, PA)

    Read the article

  • Fast dot product for a very special case

    - by psihodelia
    Given a vector X of size L, where every scalar element of X is from a binary set {0,1}, it is to find a dot product z=dot(X,Y) if vector Y of size L consists of the integer-valued elements. I suggest, there must exist a very fast way to do it. Let's say we have L=4; X[L]={1, 0, 0, 1}; Y[L]={-4, 2, 1, 0} and we have to find z=X[0]*Y[0] + X[1]*Y[1] + X[2]*Y[2] + X[3]*Y[3] (which in this case will give us -4). It is obvious that X can be represented using binary digits, e.g. an integer type int32 for L=32. Then, all what we have to do is to find a dot product of this integer with an array of 32 integers. Do you have any idea or suggestions how to do it very fast?

    Read the article

  • Finding the maximum weight subsequence of an array of positive integers?

    - by BeeBand
    I'm tring to find the maximum weight subsequence of an array of positive integers - the catch is that no adjacent members are allowed in the final subsequence. The exact same question was asked here, and a recursive solution was given by MarkusQ. He provides an explanation, but can anyone help me understand how he has expanded the function? How does this solution take into consideration non-adjacent members?

    Read the article

  • How to output multicolumn html without "widows"?

    - by user314850
    I need to output to HTML a list of categorized links in exactly three columns of text. They must be displayed similar to columns in a newspaper or magazine. So, for example, if there are 20 lines total the first and second columns would contain 7 lines and the last column would contain 6. The list must be dynamic; it will be regularly changed. The tricky part is that the links are categorized with a title and this title cannot be a "widow". If you have a page layout background you'll know that this means the titles cannot be displayed at the bottom of the column -- they must have at least one link underneath them, otherwise they should bump to the next column (I know, technically it should be two lines if I were actually doing page layout, but in this case one is acceptable). I'm having a difficult time figuring out how to get this done. Here's an example of what I mean: Shopping Link 3 Link1 Link 1 Link 4 Link2 Link 2 Link 3 Link 3 Cars Link 1 Music Games Link 2 Link 1 Link 1 Link 2 News As you can see, the "News" title is at the bottom of the middle column, and so is a "widow". This is unacceptable. I could bump it to the next column, but that would create an unnecessarily large amount of white space at the bottom of the second column. What needs to happen instead is that the entire list needs to be re-balanced. I'm wondering if anyone has any tips for how to accomplish this, or perhaps source code or a plug in. Python is preferable, but any language is fine. I'm just trying to get the general concept down.

    Read the article

  • Position elements without overlap

    - by eWolf
    I have a number of rectangular elements that I want to position in a 2D space. I calculate an ideal position for each element. Now my problem is that many elements overlap as very often the ideal positions are concentrated in one region. I want to avoid overlap as much as possible (doesn't have to be perfect, though). How can I do this? I've heard physics simulations are suitable for this - is that correct? And can anyone provide an example/tutorial? By the way: I'm using XNA, if you know any .NET library that does exactly this job - tell me!

    Read the article

  • How can I test if a point lies within a 3d shape with its surface defined by a point cloud?

    - by Ben
    Hi I have a collection of points which describe the surface of a shape that should be roughly spherical, and I need a method with which to determine if any other given point lies within this shape. I've previously been approximating the shape as an exact sphere, but this has proven too inaccurate and I need a more accurate method. Simplicity and speed is favourable over complete accuracy, a good approximation will suffice. I've come across techniques for converting a point cloud to a 3d mesh, but most things I have found have been very complicated, and I am looking for something as simple as possible. Any ideas? Many thanks, Ben.

    Read the article

< Previous Page | 122 123 124 125 126 127 128 129 130 131 132 133  | Next Page >