probability theory - Page 2

Calculating spam probability in python

- by Hobhouse

I am building a website in python/django and want to predict wether a user submission is valid or wether it is spam. Users have an accept rate on their submissions, like this website has. Users can moderate other users' submissions; and these moderations are later metamoderated by an admin. Given this: user A with an submission accept rate of 60% submits something. user B moderates A's post as a valid submission. However, his moderations are often wrong, and his moderations' accept rate is a mere 30%. user C moderates A's post as spam. User C is usually right. His moderations' accept rate is 80%. How can I predict the chance of A's post being spam?

Read the article

Calculating spam probability

- by Hobhouse

I am building a website in python/django and want to predict wether a user submission is valid or wether it is spam. Users have an accept rate on their submissions, like this website has. Users can moderate other users' submissions; and these moderations are later metamoderated by an admin. Given this: user A with an submission accept rate of 60% submits something. user B moderates A's post as a valid submission. However, his moderations are often wrong, and his moderations' accept rate is a mere 30%. user C moderates A's post as spam. User C is usually right. His moderations' accept rate is 80%. How can I predict the chance of A's post being spam?

Read the article

Why does NUnit ignore datapoints when using generics in a theory

- by The Chairman

I'm trying to make use of the TheoryAttribute, introduced in NUnit 2.5. Everything works fine as long as the arguments are of a defined type: [Datapoint] public double[,] Array2X2 = new double[,] { { 1, 0 }, { 0, 1 } }; [Theory] public void TestForArbitraryArray(double[,] array) { // ... } It does not work, when I use generics: [Datapoint] public double[,] Array2X2 = new double[,] { { 1, 0 }, { 0, 1 } }; [Theory] public void TestForArbitraryArray<T>(T[,] array) { // ... } NUnit gives a warning saying No arguments were provided. Why is that?

Read the article

Is there a good example of the difference between practice and theory?

- by a_person

There has been a lot of posters advising that the best way to retain knowledge is to apply it practically. After ignoring said advice for several years in a futile attempt to accumulate enough theoretical knowledge to be prepared for every possible case scenario, the process which lead me to assembling a library that's easily worth ~6K, I finally get it. I would like to share my story in the hopes that others will avoid taking the same route that was taken by me. I've selected graphical format (photos with caption to be exact) as my media. Help me with your ideas, maybe a fragment of code, or other imagery that would convey a message of the inherent difference between practice and theory.

Read the article

Is it possible to test a theory?

- by user363295

We are a group of students who are working on a theory in software engineering (talking about the theory takes a lot of time so I just skip that). Implementing the theory is impossible, due to technical limitations, but it can be proven on a paper logically. We've been pushed to do a testing on it, so it can be proved that way too (although we bleieve that's not possible!), now: Basically, is it possible to test something like this? If it is, what type of testing should we use? I heard,its possible to handout a brief about it to some experts and asking about their opinion (not sure if that's true), is that a testing method? if it is, what does it called? and how exactly can be done?

Read the article

Graph theory in python

- by Dan

I was wondering how people deal with graph theory in python? How is a graph stored? Are there libraries for this? For example how would I input a graph and then find its Chromatic polynomial? Or its girth? Or the number of unique spanning trees? How about problems that involve edge weight like salesman problems? I don't need all of these answered, I'm just looking for a method or tool set that will be able to help me approach solve problems like this. Thanks, Dan

Read the article

What is the relaxation condition in graph theory

- by windopal

Hi, I'm trying to understand the main concepts of graph theory and the algorithms within it. Most algorithms seem to contain a "Relaxation Condition" I'm unsure about what this is. Could some one explain it to me please. An example of this is dijkstras algorithm, here is the pseudo-code. 1 function Dijkstra(Graph, source): 2 for each vertex v in Graph: // Initializations 3 dist[v] := infinity // Unknown distance function from source to v 4 previous[v] := undefined // Previous node in optimal path from source 5 dist[source] := 0 // Distance from source to source 6 Q := the set of all nodes in Graph // All nodes in the graph are unoptimized - thus are in Q 7 while Q is not empty: // The main loop 8 u := vertex in Q with smallest dist[] 9 if dist[u] = infinity: 10 break // all remaining vertices are inaccessible from source 11 remove u from Q 12 for each neighbor v of u: // where v has not yet been removed from Q. 13 alt := dist[u] + dist_between(u, v) 14 if alt < dist[v]: // Relax (u,v,a) 15 dist[v] := alt 16 previous[v] := u 17 return dist[] Thanks

Read the article

Set Theory and .NET

- by MasterMax1313

Recently I came across a situation where set theory and set math fit what I was doing to the letter (granted there was an easier way to accomplish what I needed - i.e. LINQ - but I didn't think of that at the time). However I didn't know of any generic set libraries. Granted IEnumerables provide some set operations (Union, etc.), but nothing like Intersection or set comparison. Can anyone point out something that fits here? Something that implements set math using a generic type?

Read the article

Is there a theory for "transactional" sequences of failing and no-fail actions?

- by Ross Bencina

My question is about writing transaction-like functions that execute sequences of actions, some of which may fail. It is related to the general C++ principle "destructors can't throw," no-fail property, and maybe also with multi-phase transactions or exception safety. However, I'm thinking about it in language-neutral terms. My concern is with correctly designing error handling in C++ functions that must be reliable. I would like to know what the concepts below are called so that I can learn more about them. I'm sorry that I can't ask the question more directly. Since I don't know this area I have provided an example to explain my question. The question is at the end. Here goes: Consider a sequence of steps or actions executed sequentially, where actions belong to one of two classes: those that always succeed, and those that may fail. In the examples below: S stands for an action that always succeeds (called "no-fail" in some settings). F stands for an action that may fail (for example, it might fail to allocate memory or do I/O that could fail). Consider a sequences of actions (executed sequentially from left to right): S->S->S->S Since each action in the sequence above succeeds, the whole sequence succeeds. On the other hand, the following sequence may fail because the last action may fail: S->S->S->F So, claim: a sequence has the no-fail (S) property if and only if all of its actions are no-fail. Now, I'm interested in action sequences that form "atomic transactions", with "failure atomicity," i.e. where either the whole sequence completes successfully, or there is no effect. I.e. if some action fails, the earlier ones must be rolled back. This requires that any successfully executed actions prior to a failing action must always be able to be rolled back. Consider the sequence: S->S->S->F S<-S<-S In the example above, the first row is the forward path of the transaction, and the second row are inverse actions (executed from right to left) that can be used to roll back if the final top row actions fails. It seems to me that for a transaction to support failure atomicity, the following invariant must hold: Claim: To support failure atomicity (either completion or complete roll-back on failure) all actions preceding the latest failable (F) action on the forward path (marked * in the example below) must have no-fail (S) inverses. The following is an example of a sequence that supports failure atomicity: * S->F->F->F S<-S<-S Further, if we want the transaction to be able to attempt cancellation mid-way through, but still guarantee either full completion or full rollback then we need the following property: Claim: To support failure atomicity and cancellation mid-way through execution, in the face of errors in the inverse (cancellation) path, all actions following the earliest failable (F) inverse on the reverse path (marked *) must be no-fail (S). F->F->F->S->S S<-S<-F<-F * I believe that these two conditions guarantee that an abortable/cancelable transaction will never get "stuck". My questions are: What is the study and theory of these properties called? are my claims correct? and what else is there to know? UPDATE 1: Updated terminology: what I previously called "robustness" is called atomicity in the database literature. UPDATE 2: Added explicit reference to failure atomicity, which seems to be a thing.

Read the article

Computationally simple Pseudo-Gaussian Distribution with varying mean and standard deviation?

- by mstksg

This picture from wikipedia has a nice example of the sort of functions I'd ideally like to generate http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg Right now I'm using the Irwin-Hall Distribution, which is more or less a Polynomial approximation of the Gaussian distribution...basically, you use uniform random number generator and iterate it x times, and take the average. The more iterations, the more like a Gaussian Distribution it is. It's pretty nice; however I'd like to be able to have one where I can vary the mean. For example, let's say I wanted a number between the range 0 and 10, but around 7. Like, the mean (if I repeated this function multiple times) would turn out to be 7, but the actual range is 0-10. Is there one I should look up, or should I work on doing some fancy maths with standard Gaussian Distributions?

Read the article

Formula to calculate probability of unrecoverable read error during RAID rebuild

- by OlafM

I need to compare the reliability of different RAID systems with either consumer or enterprise drives. The formula to have the probability of success of a rebuild, ignoring mechanical problems, is simple: error_probability = 1 - (1-per_bit_error_rate)^bit_read and with 3 TB drives I get 38% probability to experience an URE (unrecoverable read error) for a 2+1 disks RAID5 (4.7% for enterprise drives) 21% for a RAID1 (2.4% for enterprise drives) 51% probability of error during recovery for the 3+1 RAID5 often used by users of SOHO products like Synologys. Most people don't know about this. Calculating the error for single disk tolerance is easy, my question concerns systems tolerant to multiple disks failures (RAID6/Z2, RAIDZ3 and RAID1 with multiple disks). If only the first disk is used for rebuild and the second one is read again from the beginning in case or an URE, then the error probability is the one calculated above squared (14.5% for consumer RAID5 2+1, 4.5% for consumer RAID1 1+2). However, I suppose (at least in ZFS that has full checksums!) that the second parity/available disk is read only where needed, meaning that only few sectors are needed: how many UREs can possibly happen in the first disk? not many, otherwise the error probability for single-disk tolerance systems would skyrocket even more than I calculated. If I'm correct, a second parity disk would practically lower the risk to extremely low values. Am I correct?

Read the article

Color Theory: How to convert Munsell HVC to RGB/HSB/HSL

- by Ian Boyd

I'm looking at at document that describes the standard colors used in dentistry to describe the color of a tooth. They quote hue, value, chroma values, and indicate they are from the 1905 Munsell description of color: The system of colour notation developed by A. H. Munsell in 1905 identifies colour in terms of three attributes: HUE, VALUE (Brightness) and CHROMA (saturation) [15] HUE (H): Munsell defined hue as the quality by which we distinguish one colour from another. He selected five principle colours: red, yellow, green, blue, and purple; and five intermediate colours: yellow-red, green-yellow, blue-green, purple-blue, and red-purple. These were placed around a colour circle at equal points and the colours in between these points are a mixture of the two, in favour of the nearer point/colour (see Fig 1.). VALUE (V): This notation indicates the lightness or darkness of a colour in relation to a neutral grey scale, which extends from absolute black (value symbol 0) to absolute white (value symbol 10). This is essentially how ‘bright’ the colour is. CHROMA (C): This indicates the degree of divergence of a given hue from a neutral grey of the same value. The scale of chroma extends from 0 for a neutral grey to 10, 12, 14 or farther, depending upon the strength (saturation) of the sample to be evaluated. There are various systems for categorising colour, the Vita system is most commonly used in Dentistry. This uses the letters A, B, C and D to notate the hue (colour) of the tooth. The chroma and value are both indicated by a value from 1 to 4. A1 being lighter than A4, but A4 being more saturated than A1. If placed in order of value, i.e. brightness, the order from brightest to darkest would be: A1, B1, B2, A2, A3, D2, C1, B3, D3, D4, A3.5, B4, C2, A4, C3, C4 The exact values of Hue, Value and Chroma for each of the shades is shown below (16) So my question is, can anyone convert Munsell HVC into RGB, HSB or HSL? Hue Value (Brightness) Chroma(Saturation) === ================== ================== 4.5 7.80 1.7 2.4 7.45 2.6 1.3 7.40 2.9 1.6 7.05 3.2 1.6 6.70 3.1 5.1 7.75 1.6 4.3 7.50 2.2 2.3 7.25 3.2 2.4 7.00 3.2 4.3 7.30 1.6 2.8 6.90 2.3 2.6 6.70 2.3 1.6 6.30 2.9 3.0 7.35 1.8 1.8 7.10 2.3 3.7 7.05 2.4 They say that Value(Brightness) varies from 0..10, which is fine. So i take 7.05 to mean 70.5%. But what is Hue measured in? i'm used to hue being measured in degrees (0..360). But the values i see would all be red - when they should be more yellow, or brown. Finally, it says that Choma/Saturation can range from 0..10 ...or even higher - which makes it sound like an arbitrary scale. So can anyone convert Munsell HVC to HSB or HSL, or better yet, RGB?

Read the article

Theory of formal languages - Automaton

- by dader51

Hi everybody ! I'm wondering about formal languages. I have a kind of parser : It reads à xml-like serialized tree structure and turn it into a multidimmensionnal array. I figured out that i need at least three variables to achieve the job : $tree = array(); // a new array $pTree = array(&$tree); // a new array which the first element points to $tree; $deep = 0; plus the one containing the sentence splitted into words. My point is on the similarities between the algorithm deing used and the differents kinds of automatons ( state machines turing machines stack ... ). The $words variable is the "tape" of the automaton, the test/conditions of the algorithm are transitions, $deep is the state and $tree is the output. I cannont figure what is $pTree. So the question is : which is the automaton I implictly use here, and to which formal languages family does it fit ? And what's about recursion ?

Read the article

Theory: "Lexical Encoding"

- by _ande_turner_

I am using the term "Lexical Encoding" for my lack of a better one. A Word is arguably the fundamental unit of communication as opposed to a Letter. Unicode tries to assign a numeric value to each Letter of all known Alphabets. What is a Letter to one language, is a Glyph to another. Unicode 5.1 assigns more than 100,000 values to these Glyphs currently. Out of the approximately 180,000 Words being used in Modern English, it is said that with a vocabulary of about 2,000 Words, you should be able to converse in general terms. A "Lexical Encoding" would encode each Word not each Letter, and encapsulate them within a Sentence. // An simplified example of a "Lexical Encoding" String sentence = "How are you today?"; int[] sentence = { 93, 22, 14, 330, QUERY }; In this example each Token in the String was encoded as an Integer. The Encoding Scheme here simply assigned an int value based on generalised statistical ranking of word usage, and assigned a constant to the question mark. Ultimately, a Word has both a Spelling & Meaning though. Any "Lexical Encoding" would preserve the meaning and intent of the Sentence as a whole, and not be language specific. An English sentence would be encoded into "...language-neutral atomic elements of meaning ..." which could then be reconstituted into any language with a structured Syntactic Form and Grammatical Structure. What are other examples of "Lexical Encoding" techniques? If you were interested in where the word-usage statistics come from : http://www.wordcount.org

Read the article

Set theory problem - transform one set to another

- by mwnorman

Given 2 sets A & B, is there an algorithm that will generate the insert/update/delete operations required to transform set A into set B?

Read the article

How to create reproducible probability in map generation?

- by nickbadal

So for my game, I'm using perlin noise to generate regions of my map (water/land, forest/grass) but I'd also like to create some probability based generation too. For instance: if(nextInt(10) > 2 && tile.adjacentTo(Type.WATER)) tile.setType(Type.SAND); This works fine, and is even reproduceable (based on a common seed) if the nextInt() calls are always in the same order. The issue is that in my game, the world is generated on demand, based on the player's location. This means, that if I explore the map differently, and the chunks of the map are generated in a different order, the randomness is no longer consistent. How can I get this sort of randomness to be consistent, independent of call order? Thanks in advance :)

Read the article

A Grand Unified Theory of AI

A new approach unites two prevailing but often opposed strains in the history of AI research Artificial intelligence - Physics - Alternative - Quantum Mechanics - Quantum Field Theory

Read the article

The theory of evolution applied to software

- by Michel Grootjans

I recently realized the many parallels you can draw between the theory of evolution and evolving software. Evolution is not the proverbial million monkeys typing on a million typewriters, where one of them comes up with the complete works of Shakespeare. We would have noticed by now, since the proverbial monkeys are now blogging on the Internet ;-) One of the main ideas of the theory of evolution is the balance between random mutations and natural selection. Random mutations happen all the time: millions of mutations over millions of years. Most of them are totally useless. Some of them are beneficial to the evolved species. Natural selection favors the beneficially mutated species. Less beneficial mutations die off. The mutated rabbit doesn't have to be faster than the fox. It just has to be faster than the other rabbits. Theory of evolution Evolving software Random mutations happen all the time. Most of these mutations are so bad, the new species dies off, or cannot reproduce. Developers write new code all the time. New ideas come up during the act of writing software. The really bad ones don't get past the stage of idea. The bad ones don't get committed to source control. Natural selection favors the beneficial mutated species Good ideas and new code gets discussed in group during informal peer review. Less than good code gets refactored. Enhanced code makes it more readable, maintainable... A good set of traits makes the species superior to others. It becomes widespread A good design tends to make it easier to add new features, easier to understand the current implementations, easier to optimize for performance...thus superior. The best designs get carried over from project to project. They appear in blogs, articles and books about principles, patterns and practices. Of course the act of writing software is deliberate. This can hardly be called random mutations. Though it sometimes might seem that code evolves through a will of its own ;-) Does this mean that evolving software (evolution) is better than a big design up front (creationism)? Not necessarily. It's a false idea to think that a project starts from scratch and everything evolves from there. Everyone carries his experience of what works and what doesn't. Up front design is necessary, but is best kept simple and minimal, just enough to get you started. Let the good experiences and ideas help to drive the process, whether they come from you or from others, from past experience or from the most junior developer on your team. Once again, balance is the keyword. Balance design up front with evolution on a daily basis. How do you know what balance is right? Through your own experience of what worked and what didn't (here's evolution again). Notes: The evolution of software can quickly degenerate without discipline. TDD is a discipline that leaves little to chance on that part. Write your test to describe the new behavior. Write just enough code to make it behave as specified. Refactor to evolve the code to a higher standard. The responsibility of good design rests continuously on each developers' shoulders. Promiscuous pair programming helps quickly spreading the design to the whole team.

Read the article

What's the term to describe this combination?

- by developer.cyrus

There are 4 items: 1, 2, 3, and 4. If we just allow the following combinations, what should we call them? I forgot it. Is it called nCr? 1 2 3 4 1 2 3 1 2 4 2 3 4 1 2 1 3 1 4 2 3 2 4 3 4 1 2 3 4

Read the article

Resources on concepts/theory behind GUI development?

- by ShrimpCrackers

I was wondering if there were any resources that explain concepts/theory behind GUI development. I don't mean a resource that explains how to use a GUI library, but rather how to create your own widgets. For example a resource that explains different methods on how to implement scrollable listboxes. I ask because I have an idea for a game tool where I would like to create my own widgets and let users drag and drop them onto some kind of form. How do GUI libraries usually draw widgets? I'm not sure if reskinning widgets from a GUI library fits my needs, since widget behavior needs to be dynamic based on user interaction.

Read the article

In terms of loss of volume or corruption, is failure probability of an Amazon EBS volume 'x', indepe

- by Tony Morgan

In terms of loss of volume or corruption, is failure probability of an Amazon EBS* volume 'x', independent of the failure of another volume 'y'. Amazon states[1] AFR** of between 0.1%-0.5%, lets say 0.5%, 0.005. To restate the question is the AFR composed of two EBSs mirrored actually 0.005*0.005 = 0.000025? To be clear I'm not interested in high availability here, just very high durability. *EBS = elastic block storage (amazons persistant disks) **AFR = annual failure rate. [1] http://aws.amazon.com/ebs/

Read the article

How to know the exactly probability of a multi-class prediction using libsvm?

- by lgnlgn

Using svm_predict_probability to estimate the probabilty for a multi-class. I need to know what the order of the returned. It does not seem to be in increasing or decreasing order of the class labels, or the array of labels returned by svm_get_labels. So how do I know which class is coresponding to a probability? Many thanks

Read the article

probability of trouble-free upgrade

- by intuited

One of the problems with recommending Ubuntu to potential future users, especially those not particularly given to technical endeavours, is that there is a chance that upgrades will break their machine, and they'll have to pay or otherwise coerce some knowledgeable person into fixing them. In my limited experience of running successive versions of Ubuntu since 8-something on a couple of different laptops, this chance is quite high. I'm not sure if I'm just unlucky with the hardware that I'm using, or if it's a result of the higher-than-average number of packages I have installed, or if upgrades are just typically problematic. So I'd like to know the likelihood, for a casual user, of doing a release upgrade, for example from 10.04 to 10.10, without experiencing any regression bugs. Obviously this is dependent on the hardware that people are running. Canonical seems to be making some efforts towards collecting data on this, for example with the "I am affected by this bug" checkbox on their issue tracker, and with the laptop compatibility reports, but I've not seen anything comprehensive. I'm hoping for an objective reference here, for example a study carried out by relatively unbiased individuals. However, anecdotal evidence is probably useful too.

Read the article

Book Review (Book 10) - The Information: A History, a Theory, a Flood

- by BuckWoody

This is a continuation of the books I challenged myself to read to help my career - one a month, for year. You can read my first book review here, and the entire list is here. The book I chose for March 2012 was: The Information: A History, a Theory, a Flood by James Gleick. I was traveling at the end of last month so I’m a bit late posting this review here. Why I chose this book: My personal belief about computing is this: All computing technology is simply re-arranging data. We take data in, we manipulate it, and we send it back out. That’s computing. I had heard from some folks about this book and it’s treatment of data. I heard that it dealt with the basics of data - and the semantics of data, information and so on. It also deals with the earliest forms of history of information, which fascinates me. It’s similar I was told, to GEB which a favorite book of mine as well, so that was a bonus. Some folks I talked to liked it, some didn’t - so I thought I would check it out. What I learned: I liked the book. It was longer than I thought - took quite a while to read, even though I tend to read quickly. This is the kind of book you take your time with. It does in fact deal with the earliest forms of human interaction and the basics of data. I learned, for instance, that the genesis of the binary communication system is based in the invention of telegraph (far-writing) codes, and that the earliest forms of communication were expensive. In fact, many ciphers were invented not to hide military secrets, but to compress information. A sort of early “lol-speak” to keep the cost of transmitting data low! I think the comparison with GEB is a bit over-reaching. GEB is far more specific, fanciful and so on. In fact, this book felt more like something fro Richard Dawkins, and tended to wander around the subject quite a bit. I imagine the author doing his research and writing each chapter as a book that followed on from the last one. This is what possibly bothered those who tended not to like it, I think. Towards the middle of the book, I think the author tended to be a bit too fragmented even for me. He began to delve into memes, biology and more - I think he might have been better off breaking that off into another work. The existentialism just seemed jarring. All in all, I liked the book. I recommend it to any technical professional, specifically ones involved with data technology in specific. And isn’t that all of us? :)

Read the article

reservoir sampling problem: correctness of proof

- by eSKay

This MSDN article proves the correctness of Reservoir Sampling algorithm as follows: Base case is trivial. For the k+1st case, the probability a given element i with position <= k is in R is s/k. The probability i is replaced is the probability k+1st element is chosen multiplied by i being chosen to be replaced, which is: s/(k+1) * 1/s = 1/(k+1), and prob that i is not replaced is k/k+1. So any given element's probability of lasting after k+1 rounds is: (chosen in k steps, and not removed in k steps) = s/k * k/(k+1), which is s/(k+1). So, when k+1 = n, any element is present with probability s/n. about step 3: What are the k+1 rounds mentioned? What is chosen in k steps, and not removed in k steps? Why are we only calculating this probability for elements that were already in R after the first s steps?

Search Results

Search found 1745 results on 70 pages for 'probability theory'.

Page 2/70 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Hobhouse

- by Hobhouse

- by The Chairman

- by a_person

- by user363295

- by Dan

- by windopal

- by MasterMax1313

- by Ross Bencina

- by mstksg

- by OlafM

- by Ian Boyd

- by dader51

- by _ande_turner_

- by mwnorman

- by nickbadal

- by Michel Grootjans

- by developer.cyrus

- by ShrimpCrackers

- by Tony Morgan

- by lgnlgn

- by intuited

- by BuckWoody

- by eSKay

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >