Search Results

Search found 2683 results on 108 pages for 'statistical analysis'.

Page 18/108 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >

Big Oh notation does not mention constant value

- by user883561

I am a programmer and have just started reading Algorithms. I am not completely convinced with the notations namely Bog Oh, Big Omega and Big Theta. The reason is by definition of Big Oh, it states that there should be a function g(x) such that it is always greater than or equal to f(x). Or f(x) <= c.n for all values of n n0. My doubt is the why dont we mention the constant value in the definition? For example. lets say a function 6n+4, we denote it as O(n). but its not true that the definition holds good for all constant value. this holds good only when c = 10 and n = 1. For lesser values of c than 6, the value of n0 increases. So why we do not mention the constant value as a part of the definition.

Read the article
How to discriminate from two nodes with identical frequencies in a Huffman's tree?

- by Omega

Still on my quest to compress/decompress files with a Java implementation of Huffman's coding (http://en.wikipedia.org/wiki/Huffman_coding) for a school assignment. From the Wikipedia page, I quote: Create a leaf node for each symbol and add it to the priority queue. While there is more than one node in the queue: Remove the two nodes of highest priority (lowest probability) from the queue Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. Add the new node to the queue. The remaining node is the root node and the tree is complete. Now, emphasis: Remove the two nodes of highest priority (lowest probability) from the queue Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. So I have to take two nodes with the lowest frequency. What if there are multiple nodes with the same low frequency? How do I discriminate which one to use? The reason I ask this is because Wikipedia has this image: And I wanted to see if my Huffman's tree was the same. I created a file with the following content: aaaaeeee nnttmmiihhssfffouxprl And this was the result: Doesn't look so bad. But there clearly are some differences when multiple nodes have the same frequency. My questions are the following: What is Wikipedia's image doing to discriminate the nodes with the same frequency? Is my tree wrong? (Is Wikipedia's image method the one and only answer?) I guess there is one specific and strict way to do this, because for our school assignment, files that have been compressed by my program should be able to be decompressed by other classmate's programs - so there must be a "standard" or "unique" way to do it. But I'm a bit lost with that. My code is rather straightforward. It literally just follows Wikipedia's listed steps. The way my code extracts the two nodes with the lowest frequency from the queue is to iterate all nodes and if the current node has a lower frequency than any of the two "smallest" known nodes so far, then it replaces the highest one. Just like that.

Read the article
Optimize Many-to-Many with SUMMARIZE and Other Techniques

- by Marco Russo (SQLBI)

We are still in the early days of DAX and even if I have been using it since 2 years ago, there is still a lot to learn on that. One of the topics that historically interests me (and many of the readers here, probably) is the many-to-many relationships between dimensions in a dimensional data model. When I and Alberto wrote the The Many to Many Revolution 2.0 we discovered the SUMMARIZE based pattern very late in the whitepaper writing. It is very important for performance optimization and it should be always used. In the last month, Gerhard Brueckl also presented an approach based on cross table filtering behavior that simplify the syntax involved, even if it’s harder to explain how it works internally. I published a short article titled Optimize Many-to-Many Calculation in DAX with SUMMARIZE and Cross Table Filtering on SQLBI website just to provide a quick reference to the three patterns available. A further study is still required to compare performance between SUMMARIZE and Cross Table Filtering patterns. Up to now, I haven’t observed big differences between them, even if their execution plans might be not identical and this suggest me that depending on other conditions you might favor one over the other.

Read the article
Why create a Huffman tree per character instead of a Node?

- by Omega

For a school assignment we're supposed to make a Java implementation of a compressor/decompresser using Huffman's algorithm. I've been reading a bit about it, specially this C++ tutorial: http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html In my program, I've been thinking about having Nodes that have the following properties: Total Frequency Character (if a leaf) Right child (if any) Left child (if any) Parent (if any) So when building the Huffman tree, it is just a matter of linking a node to others, etc. However, I'm a bit confused with the following quote (emphasis mine): First, every letter starts off as part of its own tree and the trees are ordered by the frequency of the letters in the original string. Then the two least-frequently used letters are combined into a single tree, and the frequency of that tree is set to be the combined frequency of the two trees that it links together. My question: why should I create a tree per letter, instead of just a node per letter and then do the linking later? I have not begun coding, I'm just studying the algorithm first, so I guess I'm missing an important detail. What is it?

Read the article
Algorithm to measure how "diffused" 5,000 pennies are in an economy?

- by makerofthings7

Please allow me to use this example/metaphor to describe an algorithm I need. Objects There are 5 thousand pennies. There are 50 cups. There is a tracking history (Passport "stamp" etc) that is associated with each penny as it moves between cups. Definition I'll define a "highly diffused" penny as one that passes through many cups. A "poorly diffused" penny is one that either passes back and forth between 2 cups Question How can I objectively measure the diffusion of a penny as: The number of moves the penny has gone through The number of cups the penny has been in A unit of time (day, week, month) Why am I doing this? I want to detect if a cup is hoarding pennies. Resistance from bad actors Since hoarding is bad, the "bad cup" may simply solicit a partner and simply move pennies between each other. This will reduce the amount of time a coin isn't in transit, and would skew hoarding detection. A solution might be to detect if a cup (or set of cups) are common "partners" with each other, though I'm not sure how to think though this problem. Broad applicability Any assistance would be helpful, since I would think that this algorithm is common to Economics The study of migration patterns of animals, citizens of a country Other natural occurring phenomena ... and probably exists as a term or concept I'm unfamiliar with.

Read the article
Which stages of the requirements analysis process in mobile requirements engineering are the most challenging ones?

- by user363295

I'm doing a research on formulating a requirements analysis model as a stage of requirements engineering for mobile-application development by considering the limitations and the needs of it ( agility and etc.. .), what I'm trying to figure out is that which parts of this process (requirements analysis for mobile development) are the most challenging ones ( so i can focus more on) , and if there is any stage that u think I need to include or exclude (exp. some may think a quality plan may or may not be necessary and etc.) to make it more clear below is the list of few of the areas in which I can focus on ( by the way your suggestions can be anything out of the below list.) -Requirements specification -Prototyping -Requirements Prioritization -Focusing on quality functions

Read the article
How to avoid or minimise use of check/conditional statement in my scenario?

- by Muneeb Nasir

I have scenario, where I got stream and I need to check for some value. If I got any new value I have to store it in any of data structure. It seems very easy, I can place conditional statement if-else or can use contain method of set/map to check either received is new or not. But the problem is checking will effect my application performance, in stream I will receive hundreds for value in second, if I start checking each and every value I received then for sure it effect performance. Anybody can suggest me any mechanism or algorithm to solve my issue, either by bypassing checks or at least minimize them?

Read the article
Community Events in Köln (October) and Copenhagen November #ssas #tabular #powerpivot

- by Marco Russo (SQLBI)

Short update about community events in Europe where I will speak.On October 11 I will present DAX in Action in Köln - all details in the PASS local chapter here: http://www.sqlpass.de/Regionen/Deutschland/K%C3%B6lnBonnD%C3%BCsseldorf.aspxI will be speaking at a community event in Copenhagen on November 21, 2012. The session will be Excel 2013 PowerPivot in Action and details about time and location are available here: http://msbip.dk/events/30/msbip-mode-nr-9/I will be in Köln and Copenhagen to teach the SSAS Tabular Workshop. The workshop in Köln is the first in Germany and I look forward to meet new BI developers there.Copenhagen is the second edition after another we delivered this spring. It is a convenient location also for people coming from Malmoe and Göteborg in Sweden. Last event in Copenhagen were conflicting with a large event in Sweden, maybe this time I'll meet more people coming from the other side of the Øresund Bridge!Many other dates and location are available on the SSAS Tabular Workshop website.

Read the article
Approach for packing 2D shapes while minimizing total enclosing area

- by Dennis

Not sure on my tags for this question, but in short .... I need to solve a problem of packing industrial parts into crates while minimizing total containing area. These parts are motors, or pumps, or custom-made components, and they have quite unusual shapes. For some, it may be possible to assume that a part === rectangular cuboid, but some are not so simple, i.e. they assume a shape more of that of a hammer or letter T. With those, (assuming 2D shape), by alternating direction of top & bottom, one can pack more objects into the same space, than if all tops were in the same direction. Crude example below with letter "T"-shaped parts: ***** xxxxx ***** x ***** *** ooo * x vs * x vs * x vs * x o * x * xxxxx * x * x o xxxxx xxx Right now we are solving the problem by something like this: using CAD software, make actual models of how things fit in crate boxes make estimates of actual crate dimensions & write them into Excel file (1) is crazy amount of work and as the result we have just a limited amount of possible entries in (2), the Excel file. The good things is that programming this is relatively easy. Given a combination of products to go into crates, we do a lookup, and if entry exists in the Excel (or Database), we bring it out. If it doesn't, we say "sorry, no data!". I don't necessarily want to go full force on making up some crazy algorithm that given geometrical part description can align, rotate, and figure out best part packing into a crate, given its shape, but maybe I do.. Question Well, here is my question: assuming that I can represent my parts as 2D (to be determined how), and that some parts look like letter T, and some parts look like rectangles, which algorithm can I use to give me a good estimate on the dimensions of the encompassing area, while ensuring that the parts are packed in a minimal possible area, to minimize crating/shipping costs? Are there approximation algorithms? Seeing how this can get complex, is there an existing library I could use? My thought / Approach My naive approach would be to define a way to describe position of parts, and place the first part, compute total enclosing area & dimensions. Then place 2nd part in 0 degree orientation, repeat, place it at 180 degree orientation, repeat (for my case I don't think 90 degree rotations will be meaningful due to long lengths of parts). Proceed using brute force "tacking on" other parts to the enclosing area until all parts are processed. I may have to shift some parts a tad (see 3rd pictorial example above with letters T). This adds a layer of 2D complexity rather than 1D. I am not sure how to approach this. One idea I have is genetic algorithms, but I think those will take up too much processing power and time. I will need to look out for shape collisions, as well as adding extra padding space, since we are talking about real parts with irregularities rather than perfect imaginary blocks. I'm afraid this can get geometrically messy fairly fast, and I'd rather keep things simple, if I can. But what if the best (practical) solution is to pack things into different crate boxes rather than just one? This can get a bit more tricky. There is human element involved as well, i.e. like parts can go into same box and are thus a constraint to be considered. Some parts that are not the same are sometimes grouped together for shipping and can be considered as a common grouped item. Sometimes customers want things shipped their way, which adds human element to constraints. so there will have to be some customization.

Read the article
Is there a correlation between complexity and reachability?

- by Saladin Akara

I've been studying cyclomatic complexity (McCabe) and reachability of software at uni recently. Today my lecturer said that there's no correlation between the two metrics, but is this really the case? I'd think there would definitely be some correlation, as less complex programs (from the scant few we've looked at) seem to have 'better' results in terms of reachability. Does anyone know of any attempt to look at the two metrics together, and if not, what would be a good place to find data on both complexity and reachability for a large(ish) number of programs? (As clarification, this isn't a homework question. Also, if I've put this in the wrong place, let me know.)

Read the article
How to model the components of a non Information System?

- by Adel C Kod

So I am working on a project that's related to the Kernel code(specifically related to the TCP/IP stack of the kernel). I need to build some models to describe the functionality and components of my system. Initially I thought about Class Diagram, it can describe the general architecture of my system but it doesn't make sense since my code is VERY structured(written in standard C). I also thought about DFDs, they'd describe the processes of my system, and how the data is flowing. But they contain something which doesn't really fit in; data-storages. I have no databases here(at all). For the functionality, other team members suggested using Activity and Sequence diagrams, which is kinda okay with me, but what about the system components? So basically my question is; I want to describe the components of my system; what do you suggest as a meaningful diagram to follow? (Again, the project is a research low-level systems-oriented project with almost no user-interface at all)

Read the article
Technique to Solve Hard Programming logic

- by Paresh Mayani

I have heard about many techniques which are used by developer/software manager to solve hard programming logic or to create flow of an application and this flow will be implemented by developers to create an actual application. Some of the technique which i know, are: Flowchart Screen-Layout Data Flow Diagram E-R Diagram Algorithm of every programs I'd like to know about two facts: (1) Are there any techniques other than this ? (2) Which one is the most suitable to solve hard programming logic and process of application creation?

Read the article
Sources of requirements? [closed]

- by user970696

I was reading a book about SW engineering the other day and it went like: Sources of both functional and non-functional requirements are: law (for specific cases) business and user requirements etc. //what else then? So the question is, what other sources of requirements there are when an analyst is gathering the information? Lets consider a desktop app for mobile operator. As for the comment, I do not think this is a broad question as the books usually mention 1-2 sources. I would like to know more, if anyone can help.

Read the article
Saving all hits to a web app

- by bevanb

Are there standard approaches to persisting data for every hit that a web app receives? This would be for analytics purposes (as a better alternative to log mining down the road). Seems like Redis would be a must. Is it advisable to also use a different DB server for that table, or would Redis be enough to mitigate the impact on the main DB? Also, how common is this practice? Seems like a no brainer for businesses who want to better understand their users, but I haven't read much about it.

Read the article
Clustering Strings on the basis of Common Substrings

- by pk188

I have around 10000+ strings and have to identify and group all the strings which looks similar(I base the similarity on the number of common words between any two give strings). The more number of common words, more similar the strings would be. For instance: How to make another layer from an existing layer Unable to edit data on the network drive Existing layers in the desktop Assistance with network drive In this case, the strings 1 and 3 are similar with common words Existing, Layer and 2 and 4 are similar with common words Network Drive(eliminating stop word) The steps I'm following are: Iterate through the data set Do a row by row comparison Find the common words between the strings Form a cluster where number of common words is greater than or equal to 2(eliminating stop words) If number of common words<2, put the string in a new cluster. Assign the rows either to the existing clusters or form a new one depending upon the common words Continue until all the strings are processed I am implementing the project in C#, and have got till step 3. However, I'm not sure how to proceed with the clustering. I have researched a lot about string clustering but could not find any solution that fits my problem. Your inputs would be highly appreciated.

Read the article
how to avoid or minimise use of check/conditional statement?

- by Muneeb Nasir

I have scenario, where i got stream and i need to check for some value. if i got my any new value i have to store it in any of data structure. well it seems very easy, i can place conditional statement if-else or can use contain method of set/map to check either received is new or not. but the problem is checking will effect my application performance, in stream i'll receive hundreds for value in second, if i start checking each and every value i received than for sure it effect performance. Any body can suggest me any mechanism or algorithm that solve my issue. either by bypassing checks or atleast minimize them?

Read the article
How should I compress a file with multiple bytes that are the same with Huffman coding?

- by Omega

On my great quest for compressing/decompressing files with a Java implementation of Huffman coding (http://en.wikipedia.org/wiki/Huffman_coding) for a school assignment, I am now at the point of building a list of prefix codes. Such codes are used when decompressing a file. Basically, the code is made of zeroes and ones, that are used to follow a path in a Huffman tree (left or right) for, ultimately, finding a byte. In this Wikipedia image, to reach the character m the prefix code would be 0111 The idea is that when you compress the file, you will basically convert all the bytes of the file into prefix codes instead (they tend to be smaller than 8 bits, so there's some gain). So every time the character m appears in a file (which in binary is actually 1101101), it will be replaced by 0111 (if we used the tree above). Therefore, 1101101110110111011011101101 becomes 0111011101110111 in the compressed file. I'm okay with that. But what if the following happens: In the file to be compressed there exists only one unique byte, say 1101101. There are 1000 of such byte. Technically, the prefix code of such byte would be... none, because there is no path to follow, right? I mean, there is only one unique byte anyway, so the tree has just one node. Therefore, if the prefix code is none, I would not be able to write the prefix code in the compressed file, because, well, there is nothing to write. Which brings this problem: how would I compress/decompress such file if it is impossible to write a prefix code when compressing? (using Huffman coding, due to the school assignment's rules) This tutorial seems to explain a bit better about prefix codes: http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html but doesn't seem to address this issue either.

Read the article
How do I debug a cluster running Microsoft server 2003?

- by alcor

I'm sole developer of a complex critical software system, written in Visual C++ 2005. It's deployed on a classical Microsoft cluster scenario (active/passive), that has Windows Server 2003 R2. If a server A goes down, the other one (B) starts and take the ownership of its duties. You have to know that: Both servers have the same Microsoft patches/fixes, same hardware, same everything. Both servers use the same memory storage (a RAID-6 through fiber channel). This software has a main module that launches the peripheral modules. if a peripheral module crashes, the main module restarts it. When I switch the application in one of the two servers (let's say the B server) two of the peripheral modules of the main applications just started to crash apparently without reason about 2 seconds after the start of the peripheral module. What could I do to analyze/inspect/resolve this weird situation?

Read the article
I need advice on how to debug a cluster

- by alcor

I'm the only developer of a complex critical software system, written in Visual C++ 2005. It's deployed on a classical Microsoft cluster scenario (active/passive), that has Windows Server 2003 R2. If a server A goes down, the other one (B) starts and take the ownership of its duties. You have to know that: both servers have the same Microsoft patches/fixes, same hardware, same everything. both servers use the same memory storage (a RAID-6 through fiber channel). this software has a main module who launch the peripheral modules. if a peripheral module crashes, the main module restarts it. When I switch the application in one of the two servers (let's say the B server) two of the peripheral modules of the main applications just started to crash apparently without reason about 2 seconds after the start of the peripheral module. What could I do to analyze/inspect/resolve this weird situation?

Read the article
Internships and certification in IT Business Analyst? [closed]

- by light

I'm new in this field, I have almost no experience. But I know here many people who do it every day and have many years of experience behind! And I hope they can help me! Generally, I'm interested to know the following: certificates to fully understand technology and to show to employers maybe you know good places to have an internship (country doesn't matter) EDIT #1: question was changed to be more specific. EDIT #2: thanks, I think the question should be closed, because the question depends from my need, what type of work I want to have).

Read the article
Where’s my MD.050?

- by Dave Burke

A question that I’m sometimes asked is “where’s my MD.050 in OUM?” For those not familiar with an MD.050, it serves the purpose of being a Functional Design Document (FDD) in one of Oracle’s legacy Methods. Functional Design Documents have existed for many years with their primary purpose being to describe the functional aspects of one or more components of an IT system, typically, a Custom Extension of some sort. So why don’t we have a direct replacement for the MD.050/FDD in OUM? In simple terms, the disadvantage of the MD.050/FDD approach is that it tends to lead practitioners into “Design mode” too early in the process. Whereas OUM encourages more emphasis on gathering, and describing the functional requirements of a system ahead of the formal Analysis and Design process. So that just means more work up front for the Business Analyst or Functional Consultants right? Well no…..the design of a solution, particularly when it involves a complex custom extension, does not necessarily take longer just because you put more thought into the functional requirements. In fact, one could argue the complete opposite, in that by putting more emphasis on clearly understanding the nuances of functionality requirements early in the process, then the overall time and cost incurred during the Analysis to Design process should be less. In short, as your understanding of requirements matures over time, it is far easier (and more cost effective) to update a document or a diagram, than to change lines of code. So how does that translate into Tasks and Work Products in OUM? Let us assume you have reached a point on a project where a Custom Extension is needed. One of the first things you should consider doing is creating a Use Case, and remember, a Use Case could be as simple as a few lines of text reflecting a “User Story”, or it could be what Cockburn1 describes a “fully dressed Use Case”. It is worth mentioned at this point the highly scalable nature of OUM in the sense that “documents” should not be produced just because that is the way we have always done things. Some projects may well be predicated upon a base of electronic documents, whilst other projects may take a much more Agile approach to describing functional requirements; through “User Stories” perhaps. In any event, it is quite common for a Custom Extension to involve the creation of several “components”, i.e. some new screens, an interface, a report etc. Therefore several Use Cases might be required, which in turn can then be assembled into a Use Case Package. Once you have the Use Cases attributed to an appropriate (fit-for-purpose) level of detail, and assembled into a Package, you can now create an Analysis Model for the Package. An Analysis Model is conceptual in nature, and depending on the solution being developing, would involve the creation of one or more diagrams (i.e. Sequence Diagrams, Collaboration Diagrams etc.) which collectively describe the Data, Behavior and Use Interface requirements of the solution. If required, the various elements of the Analysis Model may be indexed via an Analysis Specification. For Custom Extension projects that follow a pure Object Orientated approach, then the Analysis Model will naturally support the development of the Design Model without any further artifacts. However, for projects that are transitioning to this approach, then the various elements of the Analysis Model may be represented within the Analysis Specification. If we now return to the original question of “Where’s my MD.050”. The full answer would be: Capture the functional requirements within a Use Case Group related Use Cases into a Package Create an Analysis Model for each Package Consider creating an Analysis Specification (AN.100) as a index to each Analysis Model artifact An alternative answer for a relatively simple Custom Extension would be: Capture the functional requirements within a Use Case Optionally, group related Use Cases into a Package Create an Analysis Specification (AN.100) for each package 1 Cockburn, A, 2000, Writing Effective Use Case, Addison-Wesley Professional; Edition 1

Read the article
Detecting an online poker cheat

- by Tom Gullen

It recently emerged on a large poker site that some players were possibly able to see all opponents cards as they played through exploiting a security vulnerability that was discovered. A naïve cheater would win at an incredibly fast rate, and these cheats are caught very quickly usually, and if not caught quickly they are easy to detect through a quick scan through their hand histories. The more difficult problem occurs when the cheater exhibits intelligence, bluffing in spots they are bound to be called in, calling river bets with the worst hands, the basic premise is that they lose pots on purpose to disguise their ability to see other players cards, and they win at a reasonably realistic rate. Given: A data set of millions of verified and complete information hand histories Theoretical unlimited computer power Assume the game No Limit Hold'em, although suggestions on Omaha or limit poker may be beneficial How could we reasonably accurately classify these cheaters? The original 2+2 thread appeals for ideas, and I thought that the SO community might have some useful suggestions. It's an interesting problem also because it is current, and has real application in bettering the world if someone finds a creative solution, as there is a good chance genuine players will have funds refunded to them when identified cheaters are discovered.

Read the article
Resharper 5.0 slows down VS 2008 after license is added. Especially ASPX pages

- by Tawani

I just purchased a license to ReSharper 5.0 and it has been almost impossible to edit ASPX pages. I read somewhere that I can disable code analysis on a single page with the following command "Ctrl+Shift+Alt+8". But it also hasn't worked. Is there a way to disable ReSharper Code Analysis on all ASPX pages?

Read the article
What is the main point to focus while analysing a big development project?

- by Starx

I am currently developing a portal website, which is quite vast? My boss want me to publish a analysis report of this project. But so far I made is just few line? So I am just looking for some expert point out of what I should consider in analysis stage.

Read the article
Static code Anaylysis tool for C and C++ [closed]

- by Vaibhav

Possible Duplicate: What open source C++ static analysis tools are available? I am Looking for a static code analysis tool for C and C++ on a Unix platform. Can you please recommend one for me, that you have used your self or know about.

Read the article

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >