Search Results

Search found 13604 results on 545 pages for 'programming agnostic'.

Page 89/545 | < Previous Page | 85 86 87 88 89 90 91 92 93 94 95 96  | Next Page >

  • translate by replacing words inside existing text

    - by Berry Tsakala
    What are common approaches for translating certain words (or expressions) inside a given text, when the text must be reconstructed (with punctuations and everythin.) ? The translation comes from a lookup table, and covers words, collocations, and emoticons like L33t, CUL8R, :-), etc. Simple string search-and-replace is not enough since it can replace part of longer words (cat dog ? caterpillar dogerpillar). Assume the following input: s = "dogbert, started a dilbert dilbertion proces cat-bert :-)" after translation, i should receive something like: result = "anna, started a george dilbertion process cat-bert smiley" I can't simply tokenize, since i loose punctuations and word positions. Regular expressions, works for normal words, but don't catch special expressions like the smiley :-) but it does . re.sub(r'\bword\b','translation',s) ==> translation re.sub(r'\b:-\)\b','smiley',s) ==> :-) for now i'm using the above mentioned regex, and simple replace for the non-alphanumeric words, but it's far from being bulletproof. (p.s. i'm using python)

    Read the article

  • Based on your development stack, which is easier for you and why? Debugging or logging?

    - by leeand00
    Please state if you are developing on the front end, back end, or if you are developing a mobile/desktop application. List your development stack Language, IDE, etc.. Unit Testing or no Unit Testing Be sure to include any AOP frameworks if used. Tell me if it is easier for you to use debugging or to using logging during development, and why you feel it is easier. I'm just trying to get a feel for why people choose debugging or logging based on their development stack.

    Read the article

  • What statistics can be maintained for a set of numerical data without iterating?

    - by Dan Tao
    Update Just for future reference, I'm going to list all of the statistics that I'm aware of that can be maintained in a rolling collection, recalculated as an O(1) operation on every addition/removal (this is really how I should've worded the question from the beginning): Obvious Count Sum Mean Max* Min* Median** Less Obvious Variance Standard Deviation Skewness Kurtosis Mode*** Weighted Average Weighted Moving Average**** OK, so to put it more accurately: these are not "all" of the statistics I'm aware of. They're just the ones that I can remember off the top of my head right now. *Can be recalculated in O(1) for additions only, or for additions and removals if the collection is sorted (but in this case, insertion is not O(1)). Removals potentially incur an O(n) recalculation for non-sorted collections. **Recalculated in O(1) for a sorted, indexed collection only. ***Requires a fairly complex data structure to recalculate in O(1). ****This can certainly be achieved in O(1) for additions and removals when the weights are assigned in a linearly descending fashion. In other scenarios, I'm not sure. Original Question Say I maintain a collection of numerical data -- let's say, just a bunch of numbers. For this data, there are loads of calculated values that might be of interest; one example would be the sum. To get the sum of all this data, I could... Option 1: Iterate through the collection, adding all the values: double sum = 0.0; for (int i = 0; i < values.Count; i++) sum += values[i]; Option 2: Maintain the sum, eliminating the need to ever iterate over the collection just to find the sum: void Add(double value) { values.Add(value); sum += value; } void Remove(double value) { values.Remove(value); sum -= value; } EDIT: To put this question in more relatable terms, let's compare the two options above to a (sort of) real-world situation: Suppose I start listing numbers out loud and ask you to keep them in your head. I start by saying, "11, 16, 13, 12." If you've just been remembering the numbers themselves and nothing more, and then I say, "What's the sum?", you'd have to think to yourself, "OK, what's 11 + 16 + 13 + 12?" before responding, "52." If, on the other hand, you had been keeping track of the sum yourself while I was listing the numbers (i.e., when I said, "11" you thought "11", when I said "16", you thought, "27," and so on), you could answer "52" right away. Then if I say, "OK, now forget the number 16," if you've been keeping track of the sum inside your head you can simply take 16 away from 52 and know that the new sum is 36, rather than taking 16 off the list and them summing up 11 + 13 + 12. So my question is, what other calculations, other than the obvious ones like sum and average, are like this? SECOND EDIT: As an arbitrary example of a statistic that (I'm almost certain) does require iteration -- and therefore cannot be maintained as simply as a sum or average -- consider if I asked you, "how many numbers in this collection are divisible by the min?" Let's say the numbers are 5, 15, 19, 20, 21, 25, and 30. The min of this set is 5, which divides into 5, 15, 20, 25, and 30 (but not 19 or 21), so the answer is 5. Now if I remove 5 from the collection and ask the same question, the answer is now 2, since only 15 and 30 are divisible by the new min of 15; but, as far as I can tell, you cannot know this without going through the collection again. So I think this gets to the heart of my question: if we can divide kinds of statistics into these categories, those that are maintainable (my own term, maybe there's a more official one somewhere) versus those that require iteration to compute any time a collection is changed, what are all the maintainable ones? What I am asking about is not strictly the same as an online algorithm (though I sincerely thank those of you who introduced me to that concept). An online algorithm can begin its work without having even seen all of the input data; the maintainable statistics I am seeking will certainly have seen all the data, they just don't need to reiterate through it over and over again whenever it changes.

    Read the article

  • Why is it useful to count the number of bits?

    - by Scorchin
    I've seen the numerous questions about counting the number of set bits in a insert type of input, but why is it useful? For those looking for algorithms about bit counting, look here: http://stackoverflow.com/questions/1517848/counting-common-bits-in-a-sequence-of-unsigned-longs http://stackoverflow.com/questions/472325/fastest-way-to-count-number-of-bit-transitions-in-an-unsigned-int http://stackoverflow.com/questions/109023/best-algorithm-to-count-the-number-of-set-bits-in-a-32-bit-integer

    Read the article

  • How to use a DHT for a social trading environment

    - by Lirik
    I'm trying to understand if a DHT can be used to solve a problem I'm working on: I have a trading environment where professional option traders can get an increase in their risk limit by requesting that fellow traders lend them some of their risk limit. The lending trader will can either search for traders with certain risk parameters which are part of every trader's profile, i.e. Greeks, or the lending trader can subscribe to requests from certain traders. I want this environment to be scalable and decentralized, but I don't know how traders can search for specific profile parameters when the data is contained in a DHT. Could anybody explain how this can be done?

    Read the article

  • Find all complete sub-graphs within a graph

    - by mvid
    Is there a known algorithm or method to find all complete sub-graphs within a graph? I have an undirected, unweighted graph and I need to find all subgraphs within it where each node in the subgraph is connected to each other node in the subgraph. Is there an existing algorithm for this?

    Read the article

  • When is performance gain significant enough to implement that optimization?

    - by Zwei steinen
    Hi, following the text book, I do measure performance whenever I try optimizing my code. Sometimes, however, the performance gain is rather small and I can't decisively decide whether I should implement that optimization. For example, when a fix shortens an average response time of 100ms to 90ms under some conditions, should I implement that fix? What if it shortens 200ms to 190ms? How many condition should I try before I can conclude that it will be beneficial overall? I guess it's not possible to give a straight forward answer to this, as it depends on too many things, but is there a good rule of thumb that I should follow? Are there any guideline/best-practices?

    Read the article

  • Have you been in cases where TDD increased development time?

    - by BillyONeal
    Hello everyone :) I was reading http://stackoverflow.com/questions/2512504/tdd-how-to-start-really-thinking-tdd and I noticed many of the answers indicate that tests + application should take less time than just writing the application. In my experience, this is not true. My problem though is that some 90% of the code I write has a TON of operating system calls. The time spent to actually mock these up takes much longer than just writing the code in the first place. Sometimes 4 or 5 times as long to write the test as to write the actual code. I'm curious if there are other developers in this kind of a scenario.

    Read the article

  • Easiest way to find the correct kademlia bucket

    - by Martin
    In the Kademlia protocol node IDs are 160 bit numbers. Nodes are stored in buckets, bucket 0 stores all the nodes which have the same ID as this node except for the very last bit, bucket 1 stores all the nodes which have the same ID as this node except for the last 2 bits, and so on for all 160 buckets. What's the fastest way to find which bucket I should put a new node into? I have my buckets simply stored in an array, and need a method like so: Bucket[] buckets; //array with 160 items public Bucket GetBucket(Int160 myId, Int160 otherId) { //some stuff goes here } The obvious approach is to work down from the most significant bit, comparing bit by bit until I find a difference, I'm hoping there is a better approach based around clever bit twiddling. Practical note: My Int160 is stored in a byte array with 20 items, solutions which work well with that kind of structure will be preferred.

    Read the article

  • Segmenting a double array of labels

    - by Ami
    The Problem: I have a large double array populated with various labels. Each element (cell) in the double array contains a set of labels and some elements in the double array may be empty. I need an algorithm to cluster elements in the double array into discrete segments. A segment is defined as a set of pixels that are adjacent within the double array and one label that all those pixels in the segment have in common. (Diagonal adjacency doesn't count and I'm not clustering empty cells). |-------|-------|------| | Jane | Joe | | | Jack | Jane | | |-------|-------|------| | Jane | Jane | | | | Joe | | |-------|-------|------| | | Jack | Jane | | | Joe | | |-------|-------|------| In the above arrangement of labels distributed over nine elements, the largest cluster is the “Jane” cluster occupying the four upper left cells. What I've Considered: I've considered iterating through every label of every cell in the double array and testing to see if the cell-label combination under inspection can be associated with a preexisting segment. If the element under inspection cannot be associated with a preexisting segment it becomes the first member of a new segment. If the label/cell combination can be associated with a preexisting segment it associates. Of course, to make this method reasonable I'd have to implement an elaborate hashing system. I'd have to keep track of all the cell-label combinations that stand adjacent to preexisting segments and are in the path of the incrementing indices that are iterating through the double array. This hash method would avoid having to iterate through every pixel in every preexisting segment to find an adjacency. Why I Don't Like it: As is, the above algorithm doesn't take into consideration the case where an element in the double array can be associated with two unique segments, one in the horizontal direction and one in the vertical direction. To handle these cases properly, I would need to implement a test for this specific case and then implement a method that will both associate the element under inspection with a segment and then concatenate the two adjacent identical segments. On the whole, this method and the intricate hashing system that it would require feels very inelegant. Additionally, I really only care about finding the large segments in the double array and I'm much more concerned with the speed of this algorithm than with the accuracy of the segmentation, so I'm looking for a better way. I assume there is some stochastic method for doing this that I haven't thought of. Any suggestions?

    Read the article

  • how to read an address in multiple formats like google maps

    - by ratan
    notice that on google maps you can input the address any way you like. as long as it is a valid address...google maps will read it. In some ruby book I had seen code snippet for something like this, but with phone numbers. Any ideas how this could be done for addresses? in language of your choice. EDIT: i dont care about a "valid" address. I just want to parse an address. so that 123 fake street, WA, 34223 would be an address and so will 123 fake street WA 34223

    Read the article

  • Map Reduce Frameworks/Infrastructure

    - by Johannes Rudolph
    Map Reduce is a pattern that seems to get a lot of traction lately and I start to see it manifest in one of my projects that is focused on an event processing pipeline (iPhone Accelerometer and GPS data). I needed to built a lot of infrastructure for this project, in fact it overweighs the logic code interacting with it by 2x. Some of the components I built where EventProcessors (with in- and output plus buffering, timing etc.), multiplexers and aggregators. This leads me to my question what the "common" required infrastrucutre for map reduce is. Since I am working with .Net a lot I can see map reduce infrastructure built into the Framework and language constructs. Functional languages support this paradigm per se. It seems every language can be used with map reduce, some have better support than others, others again are built around that concept (e.g. Go). And there are Frameworks like Apache Hadoop to support map reduce.

    Read the article

  • Cross-platform and language (de)serialization

    - by fwgx
    I'm looking for a way to serialize a bunch of C++ structs in the most convenient way so that the serialization is portable across C++ and Java (at a minimum) and across 32bit/64bit, big/little endian platforms. The structures to be serialized just contain data, i.e. they're pure data objects with no state or behavior. The idea being that we serialize the structs into an octet blob that we can store in a database "generically" and be read out later on. Thus avoiding changing the database whenever a struct changes and also avoiding assigning each data member to a field - i.e. we only want one table to hold everything "generically" as a binary blob. This should make less work for developers and require less changes when structures change. I've looked at boost.serialize but don't think there's a way to enable compatibility with Java. And likewise for inheriting Serializable in Java. If there is a way to do it by starting with an IDL file that would be best as we already have IDL files that describe the structures. Cheers in advance!

    Read the article

  • How/when to hire new programmers, and how to integrate them?

    - by Shaul
    Hiring new programmers, especially in a small company, can often present a Catch-22 situation. We have too much work to do, so we need to hire new programmers. But we can't hire new programmers now, because they will need mentoring and several months of learning curve in your industry/product/environment before they're useful, and none of the programmers has time to be a mentor to a new programmer, because they're all completely swamped with the current work load. That may be a slightly frivolous way of describing the situation, but nevertheless, it's difficult for a small company on a tight budget to justify hiring someone who is not only going to be unproductive for a long time, but will also take away from the performance of the current programmers. How have you dealt with this kind of situation? When is the best time to hire someone? What are the best tasks to assign to a new team member so that they can learn their way around your code base and start getting their hands dirty as quickly as possible? How do you get the new guy useful without bogging your existing programmers down in too much mentoring? Any comments & suggestions you have are much appreciated!

    Read the article

  • Linear feedback shift register?

    - by Mattia Gobbi
    Lately I bumped repeatedly into the concept of LFSR, that I find quite interesting because of its links with different fields and also fascinating in itself. It took me some effort to understand, the final help was this really good page, much better than the (at first) cryptic wikipedia entry. So I wanted to write some small code for a program that worked like a LFSR. To be more precise, that somehow showed how a LFSR works. Here's the cleanest thing I could come up with after some lenghtier attempts (Python): def lfsr(seed, taps): sr, xor = seed, 0 while 1: for t in taps: xor += int(sr[t-1]) if xor%2 == 0.0: xor = 0 else: xor = 1 print xor sr, xor = str(xor) + sr[:-1], 0 print sr if sr == seed: break lfsr('11001001', (8,7,6,1)) #example I named "xor" the output of the XOR function, not very correct. However, this is just meant to show how it circles through its possible states, in fact you noticed the register is represented by a string. Not much logical coherence. This can be easily turned into a nice toy you can watch for hours (at least I could :-) def lfsr(seed, taps): import time sr, xor = seed, 0 while 1: for t in taps: xor += int(sr[t-1]) if xor%2 == 0.0: xor = 0 else: xor = 1 print xor print time.sleep(0.75) sr, xor = str(xor) + sr[:-1], 0 print sr print time.sleep(0.75) Then it struck me, what use is this in writing software? I heard it can generate random numbers; is it true? how? So, it would be nice if someone could: explain how to use such a device in software development come up with some code, to support the point above or just like mine to show different ways to do it, in any language Also, as theres not much didactic stuff around about this piece of logic and digital circuitry, it would be nice if this could be a place for noobies (like me) to get a better understanding of this thing, or better, to understand what it is and how it can be useful when writing software. Should have made it a community wiki? That said, if someone feels like golfing... you're welcome.

    Read the article

  • Algorithm possible amounts (over)paid for a specific price, based on denominations

    - by Wrikken
    In a current project, people can order goods delivered to their door and choose 'pay on delivery' as a payment option. To make sure the delivery guy has enough change customers are asked to input the amount they will pay (e.g. delivery is 48,13, they will pay with 60,- (3*20,-)). Now, if it were up to me I'd make it a free field, but apparantly higher-ups have decided is should be a selection based on available denominations, without giving amounts that would result in a set of denominations which could be smaller. Example: denominations = [1,2,5,10,20,50] price = 78.12 possibilities: 79 (multitude of options), 80 (e.g. 4*20) 90 (e.g. 50+2*20) 100 (2*50) It's international, so the denominations could change, and the algorithm should be based on that list. The closest I have come which seems to work is this: for all denominations in reversed order (large=>small) add ceil(price/denomination) * denomination to possibles baseprice = floor(price/denomination) * denomination; for all smaller denominations as subdenomination in reversed order add baseprice + (ceil((price - baseprice) / subdenomination) * subdenomination) to possibles end for end for remove doubles sort Is seems to work, but this has emerged after wildly trying all kinds of compact algorithms, and I cannot defend why it works, which could lead to some edge-case / new countries getting wrong options, and it does generate some serious amounts of doubles. As this is probably not a new problem, and Google et al. could not provide me with an answer save for loads of pages calculating how to make exact change, I thought I'd ask SO: have you solved this problem before? Which algorithm? Any proof it will always work?

    Read the article

  • Non-Relational DBMS Design Resources

    - by Matt Luongo
    Hey guys, As a personal project, I'm looking to build a rudimentary DBMS. I've read the relevant sections in Elmasri & Navathe (5ed), but could use a more focused text. The rub is that I want to play with novel non-relational data models. While a lot of E&N was great- indexing implementation details in particular- the more advanced DBMS implementation was only targeted to a relational model. I could also use something a bit more practical and detail-oriented, with real-world recommendations. I'd like to defer staring at DBMS source for a while if I can. Any ideas?

    Read the article

  • Find location using only distance and bearing?

    - by pinnacler
    Triangulation works by checking your angle to three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and it's to my right at 90 degrees." Repeat 2 more times for different targets and angles. Trilateration works by checking your distance from three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and I'm 100 meters away from that." Repeat 2 more times for different targets and ranges. But both of those methods rely on knowing WHAT you're looking at. Say you're in a forest and you can't differentiate between trees, but you know where key trees are. These trees have been hand picked as "landmarks." You have a robot moving through that forest slowly. Do you know of any ways to determine location based solely off of angle and range, exploiting geometry between landmarks? Note, you will see other trees as well, so you won't know which trees are key trees. Ignore the fact that a target may be occluded. Our pre-algorithm takes care of that. 1) If this exists, what's it called? I can't find anything. 2) What do you think the odds are of having two identical location 'hits?' I imagine it's fairly rare. 3) If there are two identical location 'hits,' how can I determine my exact location after I move the robot next. (I assume the chances of having 2 occurrences of EXACT angles in a row, after I reposition the robot, would be statistically impossible, barring a forest growing in rows like corn). Would I just calculate the position again and hope for the best? Or would I somehow incorporate my previous position estimate into my next guess? If this exists, I'd like to read about it, and if not, develop it as a side project. I just don't have time to reinvent the wheel right now, nor have the time to implement this from scratch. So if it doesn't exist, I'll have to figure out another way to localize the robot since that's not the aim of this research, if it does, lets hope it's semi-easy.

    Read the article

  • Does code in the constructor add to code in subclass constructors?

    - by Jeremy Rudd
    Does code in the constructor add to code in subclass constructors? Or does the subclass's constructor override the superclass? Given this example superclass constructor: class Car{ function Car(){ trace("CAR") } } ...and this subclass constructor: class FordCar extends Car{ function FordCar(){ trace("FORD") } } When an instance of FordCar is created, will this trace "Car" and "Ford" ??

    Read the article

  • CodeGolf: Find the Unique Paths

    - by st0le
    Here's a pretty simple idea, in this pastebin I've posted some pair of numbers. These represent Nodes of a directed graph. The input to stdin will be of the form, (they'll be numbers, i'll be using an example here) c d q r a b b c d e p q so x y means x is connected to y (not viceversa) There are 2 paths in that example. a->b->c->d->e and p->q->r. You need to print all the unique paths from that graph The output should be of the format a->b->c->d->e p->q->r Notes You can assume the numbers are chosen such that one path doesn't intersect the other (one node belongs to one path) The pairs are in random order. They are more than 1 paths, they can be of different lengths. All numbers are less than 1000. If you need more details, please leave a comment. I'll amend as required. Shameless-Plug For those who enjoy Codegolf, please Commit at Area51 for its very own site:) (for those who don't enjoy it, please support it as well, so we'll stay out of your way...)

    Read the article

  • What is a Windows scripting language that: does not rely on .NET and offers the most OOP support and

    - by jJack
    What is a Windows scripting language that: does not rely on .NET and offers the most OOP support and has simplest deployment? It doesn't necessarily need to be a scripting language; It can be in the form of a compiled executable, however it needs to be self contained--only ONE file, no DLL's and it cannot be declared to "include" other files. I cannot rely on the user having any .NET installed and it needs to be able to run on Windows 7 64 bit. By "most OOP support", I basically mean anything that has better OOP support than VBScript. A little context: Everything I have done thus far is in VBScript and writes a bunch of data into an .html file, which in the end is to be viewed by Internet Explorer. It also zips up a bunch of directories and files. It heavily relies on accessing the registry, file-system, and WMI (I can probably do without accessing WMI though, as long as I have good registry access). I can bring myself to code in any language so long as it meets me ridonkulous requirements stated above. I look forward to some good answers from those more experienced than I.

    Read the article

  • How to restrain one's self from the overwhelming urge to rewrite everything?

    - by Scott Saad
    Setup Have you ever had the experience of going into a piece of code to make a seemingly simple change and then realizing that you've just stepped into a wasteland that deserves some serious attention? This usually gets followed up with an official FREAK OUT moment, where the overwhelming feeling of rewriting everything in sight starts to creep up. It's important to note that this bad code does not necessarily come from others as it may indeed be something we've written or contributed to in the past. Problem It's obvious that there is some serious code rot, horrible architecture, etc. that needs to be dealt with. The real problem, as it relates to this question, is that it's not the right time to rewrite the code. There could be many reasons for this: Currently in the middle of a release cycle, therefore any changes should be minimal. It's 2:00 AM in the morning, and the brain is starting to shut down. It could have seemingly adverse affects on the schedule. The rabbit hole could go much deeper than our eyes are able to see at this time. etc... Question So how should we balance the duty of continuously improving the code, while also being a responsible developer? How do we refrain from contributing to the broken window theory, while also being aware of actions and the potential recklessness they may cause? Update Great answers! For the most part, there seems to be two schools of thought: Don't resist the urge as it's a good one to have. Don't give in to the temptation as it will burn you to the ground. It would be interesting to know if more people feel any balance exists.

    Read the article

< Previous Page | 85 86 87 88 89 90 91 92 93 94 95 96  | Next Page >