Search Results

Search found 4291 results on 172 pages for 'cluster analysis'.

Page 33/172 | < Previous Page | 29 30 31 32 33 34 35 36 37 38 39 40  | Next Page >

  • Distinct Count of Customers in a SCD Type 2 in #DAX

    - by Marco Russo (SQLBI)
    If you have a Slowly Changing Dimension (SCD) Type 2 for your customer and you want to calculate the number of distinct customers that bought a product, you cannot use the simple formula: Customers := DISTINCTCOUNT( FactTable[Customer Id] ) because it would return the number of distinct versions of customers. What you really want is the number of distinct application keys of the customers, which could be lower than the number returned by the previous formula. Assuming that a Customer Code column in the Customers dimension contains the application key, you should use the following DAX formula: Customers := COUNTROWS( SUMMARIZE( FactTable, Customers[Customer Code] ) ). Be careful: only the version above is really fast, because it is solved by the xVelocity (formerly known as VertiPaq) engine. Other formulas involving nested calculations might be more complex and move computation to the formula engine, resulting in slower queries. This is absolutely an interesting pattern and I have to say it’s a killer feature. Try to do the same in Multidimensional…
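
    The difference between counting versions and counting application keys can be illustrated outside DAX. The following is a minimal Python sketch, not the DAX engine's behavior; the sample rows, the `customers` mapping and the `fact_customer_ids` list are made-up illustration data.

```python
# Hypothetical sample data: three SCD Type 2 versions describing two real customers.
# Keys are surrogate Customer Ids, values are the application key (Customer Code).
customers = {
    1: "C001",  # customer C001, first version
    2: "C001",  # customer C001, second version (for example after an address change)
    3: "C002",
}

# Fact rows reference the surrogate key of the version that was current at sale time.
fact_customer_ids = [1, 2, 2, 3]

# Counting surrogate keys counts versions, not customers.
distinct_versions = len(set(fact_customer_ids))

# Counting the application keys reached through the dimension counts real customers.
distinct_customers = len({customers[cid] for cid in fact_customer_ids})

print(distinct_versions, distinct_customers)
```

    With this sample data the first count is 3 and the second is 2, which is the gap the article describes.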

    Read the article

  • What is the difference between Static code analysis and code review?

    - by Xander
    I just wanted to know the difference between static code analysis and code review. How are these two done? What tools are available today for code review and static analysis of PHP? I would also like to know about good code review tools for any language. Thanks in advance. Xander Cage. Note: I am asking this because I was not able to understand the difference. Please, I expect some answers other than "I am Mr.Geek and you asked an irrelevant bla bla..... this is closed". I know this sounds mean. But I am sorry.

    Read the article

  • How would I go about measuring the impact an article has on the internet?

    - by Jimbo Mombasa
    For an application of mine, I analyze the sentiment of articles, using NLTK, to display sentiment trends. But right now all articles weigh the same amount. This does not show a very accurate picture because some articles have a higher impact on the internet than others. For example, a blog post from some unknown blog should not weigh the same amount as an article from the New York Times. How can I determine their impact?

    Read the article

  • My Last "Catch-Up" Post for 2010 Content

    - by KKline
    I did a lot of writing in 2010. Unfortunately, I didn't do a good job of keeping all of that writing equally distributed throughout all of the channels where I'm active. So here are a few more posts from my blog, put on-line during the months of November and December 2010, that I didn't get posted here on SQLBlog.com: 1. It's Time to Upgrade! So many of my customers and many of you, dear readers, are still on SQL Server 2005. Join Kevin Kline, SQL Server MVP and SQL Server Technology Strategist...(read more)

    Read the article

  • Storing and analyzing rock climbing difficulty

    - by Zonedabone
    I'm working on a WordPress plugin to manage rock climbing data, and I need to think of a way to store rock climbing grades from all of the different systems in a unified way. There are many different systems, all of which have some numerical system. A comparison of all the systems: http://en.wikipedia.org/wiki/Grade_(climbing)#Comparison_tables Is there some unified way that I can store and analyze these, or do I just need to assign numbers to them all and call it a day? My current plan is to save the score type and then assign each score a numerical value, which I can then use to compare and graph them.
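
    The plan described above (save the grade system, then map each grade to a number) could look roughly like the following Python sketch. The `SCORES` values are placeholders chosen for illustration, not an authoritative conversion; real values would have to come from a comparison table such as the one linked in the question. The names `Grade` and `SCORES` are assumptions.

```python
from dataclasses import dataclass

# Placeholder scores for illustration only. Each (system, label) pair maps to a
# position on one shared numeric scale so that grades can be compared and graphed.
SCORES = {
    ("yds", "5.9"): 10,
    ("yds", "5.10a"): 11,
    ("french", "5c"): 10,
    ("french", "6a"): 11,
}


@dataclass(frozen=True)
class Grade:
    system: str  # which grading system the label belongs to
    label: str   # the grade exactly as the climber entered it

    @property
    def score(self) -> int:
        # The original label is kept; the number is only used for comparisons.
        return SCORES[(self.system, self.label)]


climbs = [Grade("french", "6a"), Grade("yds", "5.9")]
hardest = max(climbs, key=lambda g: g.score)
print(hardest)
```

    Storing the original system and label alongside the score means the numeric table can be revised later without losing what was entered.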

    Read the article

  • New videos available #dax #ssas #powerpivot

    - by Marco Russo (SQLBI)
    The collaboration Alberto and I started with Project Botticelli is starting to produce content. At this point we have three videos available: DAX in Action shows the power of DAX in PowerPivot, solving common patterns that are not so easy or fast to solve in other languages; DAX: Calculated Columns vs. Measures shows the difference between calculated columns and measures in DAX; Introduction to DAX has content corresponding to the title! The first two videos are freely available; the third one is longer and visible only to subscribers. The goal for this series of videos is to reach advanced Excel users and BI developers who are new to DAX. If we had to categorize this content, it’s a sort of level 200 session at a conference. I don’t expect readers of this blog to watch these videos (if not for the sake of curiosity!) but if you have to explain this subject to anyone else and you have other priorities… well, you can add this post to the list of resources you provide for studying the subject!

    Read the article

  • Seeking a free Lint for C which programmers will *want* to use

    - by Mawg
    When I try to persuade others to Lint their code I always get excuses - too difficult to set up, too difficult to understand, false positives, etc (most of which translates to too lazy, too stupid or too afraid of new things). Is there any way that I can make Linting easier? We code in C using Netbeans. Can I incorporate Splint into Netbeans? I did find a Splint GUI which was quite good, but there was no way to lint a directory tree. Any ideas? Thanks in advance

    Read the article

  • Big Oh notation does not mention constant value

    - by user883561
    I am a programmer and have just started reading about algorithms. I am not completely convinced by the notations, namely Big Oh, Big Omega and Big Theta. The reason is that, by the definition of Big Oh, there should be a function g(n) such that c·g(n) is always greater than or equal to f(n); that is, f(n) <= c·n for all values of n >= n0. My doubt is: why don't we mention the constant value in the definition? For example, take the function 6n+4, which we denote as O(n). But it is not true that the definition holds for every constant value: it holds when c = 10 and n0 = 1, and for values of c closer to 6 the value of n0 increases. So why do we not mention the constant value as part of the definition?
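
    The role of the constant can be checked by hand for f(n) = 6n + 4. In the usual textbook definition both c and n0 are existentially quantified, so one working pair is enough; the worked lines below are a sketch of that reading, using g(n) = n.

```latex
f(n) \in O(g(n)) \iff \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0

\begin{align*}
6n + 4 \le 10n &\iff 4 \le 4n \iff n \ge 1 && (c = 10,\ n_0 = 1) \\
6n + 4 \le 7n  &\iff 4 \le n            && (c = 7,\ n_0 = 4) \\
6n + 4 \le 6n  &\iff 4 \le 0            && (\text{never true, so } c = 6 \text{ does not work})
\end{align*}
```

    Any c greater than 6 works with a large enough n0, which is why the notation records only the growth rate and leaves the particular constant out.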

    Read the article

  • How to discriminate from two nodes with identical frequencies in a Huffman's tree?

    - by Omega
    Still on my quest to compress/decompress files with a Java implementation of Huffman's coding (http://en.wikipedia.org/wiki/Huffman_coding) for a school assignment. From the Wikipedia page, I quote: (1) Create a leaf node for each symbol and add it to the priority queue. (2) While there is more than one node in the queue: remove the two nodes of highest priority (lowest probability) from the queue; create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities; add the new node to the queue. (3) The remaining node is the root node and the tree is complete. Now, emphasis: remove the two nodes of highest priority (lowest probability) from the queue, and create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. So I have to take the two nodes with the lowest frequency. What if there are multiple nodes with the same low frequency? How do I decide which one to use? The reason I ask is that Wikipedia has an image of a Huffman tree, and I wanted to see if my Huffman tree was the same. I created a file with the following content: aaaaeeee nnttmmiihhssfffouxprl and built a tree from it. The result doesn't look so bad, but there clearly are some differences when multiple nodes have the same frequency. My questions are the following: What is Wikipedia's image doing to discriminate between nodes with the same frequency? Is my tree wrong? (Is the method behind Wikipedia's image the one and only answer?) I guess there is one specific and strict way to do this, because for our school assignment, files that have been compressed by my program should be decompressible by other classmates' programs, so there must be a "standard" or "unique" way to do it. But I'm a bit lost with that. My code is rather straightforward. It literally just follows Wikipedia's listed steps. The way my code extracts the two nodes with the lowest frequency from the queue is to iterate over all nodes, and if the current node has a lower frequency than either of the two "smallest" known nodes so far, it replaces the higher one. Just like that.
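
    Huffman's algorithm itself does not prescribe how ties are broken, so different valid trees with the same total encoded length are possible. One way to make the result reproducible is to fix a tie-break rule explicitly. The Python sketch below (Python rather than the Java of the assignment) uses an insertion counter as the tie-break; that rule and the name `huffman_codes` are assumptions for illustration, not what Wikipedia's image does.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(data):
    """Build prefix codes with a deterministic tie-break on equal frequencies."""
    if not data:
        return {}
    order = count()  # insertion counter: the assumed tie-break rule
    # Heap entries are (frequency, tie_break, symbol, left, right).
    # Symbols are sorted first so the initial numbering does not depend on input order.
    heap = [(freq, next(order), sym, None, None)
            for sym, freq in sorted(Counter(data).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        left = heapq.heappop(heap)   # lowest frequency, oldest entry first on ties
        right = heapq.heappop(heap)
        heapq.heappush(heap, (left[0] + right[0], next(order), None, left, right))

    codes = {}

    def walk(node, prefix):
        _, _, sym, left, right = node
        if sym is not None:
            codes[sym] = prefix or "0"  # a lone leaf still gets a 1-bit code
        else:
            walk(left, prefix + "0")
            walk(right, prefix + "1")

    walk(heap[0], "")
    return codes


codes = huffman_codes("aaaaeeee nnttmmiihhssfffouxprl")
print(codes)
```

    For files to be decodable by another program, the two programs either need to share such a rule exactly or the compressed file needs to carry the code table (or the frequencies) with it.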

    Read the article

  • Optimize Many-to-Many with SUMMARIZE and Other Techniques

    - by Marco Russo (SQLBI)
    We are still in the early days of DAX, and even though I have been using it for two years, there is still a lot to learn about it. One of the topics that has historically interested me (and probably many of the readers here) is many-to-many relationships between dimensions in a dimensional data model. When Alberto and I wrote The Many to Many Revolution 2.0, we discovered the SUMMARIZE-based pattern very late in the writing of the whitepaper. It is very important for performance optimization and it should always be used. In the last month, Gerhard Brueckl also presented an approach based on cross table filtering behavior that simplifies the syntax involved, even if it’s harder to explain how it works internally. I published a short article titled Optimize Many-to-Many Calculation in DAX with SUMMARIZE and Cross Table Filtering on the SQLBI website just to provide a quick reference to the three patterns available. Further study is still required to compare performance between the SUMMARIZE and Cross Table Filtering patterns. Up to now, I haven’t observed big differences between them, even if their execution plans might not be identical, which suggests to me that, depending on other conditions, you might favor one over the other.

    Read the article

  • Algorithm to measure how "diffused" 5,000 pennies are in an economy?

    - by makerofthings7
    Please allow me to use this example/metaphor to describe an algorithm I need. Objects: There are 5 thousand pennies. There are 50 cups. There is a tracking history (passport "stamp" etc.) that is associated with each penny as it moves between cups. Definition: I'll define a "highly diffused" penny as one that passes through many cups. A "poorly diffused" penny is one that either passes back and forth between 2 cups Question: How can I objectively measure the diffusion of a penny in terms of: the number of moves the penny has gone through; the number of cups the penny has been in; a unit of time (day, week, month)? Why am I doing this? I want to detect if a cup is hoarding pennies. Resistance from bad actors: Since hoarding is bad, the "bad cup" may simply solicit a partner and move pennies back and forth with it. This will reduce the amount of time a coin isn't in transit, and would skew hoarding detection. A solution might be to detect whether a cup (or set of cups) are common "partners" with each other, though I'm not sure how to think through this problem. Broad applicability: Any assistance would be helpful, since I would think that this algorithm is common to economics, the study of migration patterns of animals or of citizens of a country, and other naturally occurring phenomena ... and probably exists as a term or concept I'm unfamiliar with.
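
    As a starting point, the first two quantities in the question can be computed directly from a penny's tracking history. The Python sketch below also adds the Shannon entropy of the visit distribution as one possible single-number summary; using entropy here is a suggestion, not an established term for this exact problem, and the `diffusion` function and its history format (an ordered list of cup names) are assumptions. The time dimension is left out.

```python
import math
from collections import Counter


def diffusion(history):
    """Summarize how widely one penny has travelled.

    `history` is the ordered list of cups the penny has been in,
    for example ["A", "B", "A"]; this format is an assumption.
    """
    total = len(history)
    visits = Counter(history)
    # Entropy is 0 for a penny that never leaves one cup and grows as visits
    # spread evenly over more cups.
    entropy = 0.0
    for n in visits.values():
        p = n / total
        entropy -= p * math.log2(p)
    return {
        "moves": max(total - 1, 0),
        "distinct_cups": len(visits),
        "entropy_bits": entropy,
    }


ping_pong = diffusion(["A", "B", "A", "B", "A", "B"])
wide = diffusion(["A", "B", "C", "D"])
print(ping_pong, wide)
```

    A penny bounced between two partner cups accumulates many moves but stays at two distinct cups and about one bit of entropy, so comparing moves against the other two numbers is one way to flag the "partner" pattern described above.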

    Read the article

  • Why create a Huffman tree per character instead of a Node?

    - by Omega
    For a school assignment we're supposed to make a Java implementation of a compressor/decompressor using Huffman's algorithm. I've been reading a bit about it, especially this C++ tutorial: http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html In my program, I've been thinking about having Nodes that have the following properties: total frequency, character (if a leaf), right child (if any), left child (if any), parent (if any). So when building the Huffman tree, it is just a matter of linking a node to others, etc. However, I'm a bit confused by the following quote (emphasis mine): First, every letter starts off as part of its own tree and the trees are ordered by the frequency of the letters in the original string. Then the two least-frequently used letters are combined into a single tree, and the frequency of that tree is set to be the combined frequency of the two trees that it links together. My question: why should I create a tree per letter, instead of just a node per letter and then do the linking later? I have not begun coding, I'm just studying the algorithm first, so I guess I'm missing an important detail. What is it?

    Read the article

  • Which stages of the requirements analysis process in mobile requirements engineering are the most challenging ones?

    - by user363295
    I'm doing research on formulating a requirements analysis model as a stage of requirements engineering for mobile application development, considering its limitations and needs (agility, etc.). What I'm trying to figure out is which parts of this process (requirements analysis for mobile development) are the most challenging, so I can focus more on them, and whether there is any stage that you think I need to include or exclude (e.g. some may think a quality plan may or may not be necessary). To make it clearer, below is a list of a few of the areas I can focus on (your suggestions can be anything outside this list): requirements specification; prototyping; requirements prioritization; focusing on quality functions.

    Read the article

  • How to avoid or minimise use of check/conditional statement in my scenario?

    - by Muneeb Nasir
    I have a scenario where I receive a stream and need to check it for some values. If I get a new value, I have to store it in some data structure. It seems very easy: I can use an if-else conditional statement, or the contains method of a set/map, to check whether the received value is new or not. But the problem is that checking will affect my application's performance. In the stream I will receive hundreds of values per second, and if I start checking each and every value I receive, it will surely affect performance. Can anybody suggest a mechanism or algorithm to solve my issue, either by bypassing the checks or at least minimizing them?
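
    For context on the cost involved: a membership check against a hash-based set takes constant time on average, so one check per incoming value is normally cheap at hundreds of values per second. A minimal Python sketch of the set-based approach mentioned in the question follows; the generator name `new_values` is an assumption.

```python
def new_values(stream):
    """Yield each value the first time it appears in `stream`."""
    seen = set()
    for value in stream:
        if value not in seen:  # average O(1) hash lookup
            seen.add(value)
            yield value


print(list(new_values([3, 1, 3, 2, 1])))
```

    Measuring this on real data before optimizing further is usually worthwhile, since the check is often not the bottleneck.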

    Read the article

  • Community Events in Köln (October) and Copenhagen November #ssas #tabular #powerpivot

    - by Marco Russo (SQLBI)
    Short update about community events in Europe where I will speak. On October 11 I will present DAX in Action in Köln; all details are on the PASS local chapter page here: http://www.sqlpass.de/Regionen/Deutschland/K%C3%B6lnBonnD%C3%BCsseldorf.aspx I will be speaking at a community event in Copenhagen on November 21, 2012. The session will be Excel 2013 PowerPivot in Action, and details about time and location are available here: http://msbip.dk/events/30/msbip-mode-nr-9/ I will be in Köln and Copenhagen to teach the SSAS Tabular Workshop. The workshop in Köln is the first in Germany and I look forward to meeting new BI developers there. Copenhagen is the second edition, after the one we delivered this spring. It is also a convenient location for people coming from Malmoe and Göteborg in Sweden. The last event in Copenhagen conflicted with a large event in Sweden; maybe this time I'll meet more people coming from the other side of the Øresund Bridge! Many other dates and locations are available on the SSAS Tabular Workshop website.

    Read the article

  • Approach for packing 2D shapes while minimizing total enclosing area

    - by Dennis
    Not sure on my tags for this question, but in short .... I need to solve a problem of packing industrial parts into crates while minimizing total containing area. These parts are motors, or pumps, or custom-made components, and they have quite unusual shapes. For some, it may be possible to assume that a part === rectangular cuboid, but some are not so simple, i.e. their shape is more like that of a hammer or a letter T. With those (assuming a 2D shape), by alternating the direction of top and bottom, one can pack more objects into the same space than if all tops were in the same direction. Crude example with letter "T"-shaped parts: [ASCII diagram comparing several arrangements of "T"-shaped parts; layout lost in extraction]. Right now we are solving the problem by something like this: (1) using CAD software, make actual models of how things fit in crate boxes; (2) make estimates of actual crate dimensions and write them into an Excel file. Step (1) is a crazy amount of work, and as a result we have just a limited number of possible entries in (2), the Excel file. The good thing is that programming this is relatively easy. Given a combination of products to go into crates, we do a lookup, and if an entry exists in the Excel file (or database), we bring it out. If it doesn't, we say "sorry, no data!". I don't necessarily want to go full force on making up some crazy algorithm that, given a geometrical part description, can align, rotate, and figure out the best packing of parts into a crate, but maybe I do.. Question: assuming that I can represent my parts as 2D (to be determined how), and that some parts look like the letter T and some parts look like rectangles, which algorithm can I use to give me a good estimate of the dimensions of the encompassing area, while ensuring that the parts are packed in the smallest possible area, to minimize crating/shipping costs? Are there approximation algorithms? Seeing how this can get complex, is there an existing library I could use?
    My thought / approach: My naive approach would be to define a way to describe the position of parts, place the first part, and compute the total enclosing area and dimensions. Then place the 2nd part in 0 degree orientation, repeat, place it at 180 degree orientation, repeat (for my case I don't think 90 degree rotations will be meaningful due to the long lengths of the parts). Proceed using brute force, "tacking on" other parts to the enclosing area until all parts are processed. I may have to shift some parts a tad (see the 3rd pictorial example above with letters T). This adds a layer of 2D complexity rather than 1D. I am not sure how to approach this. One idea I have is genetic algorithms, but I think those will take up too much processing power and time. I will need to look out for shape collisions, as well as adding extra padding space, since we are talking about real parts with irregularities rather than perfect imaginary blocks. I'm afraid this can get geometrically messy fairly fast, and I'd rather keep things simple if I can. But what if the best (practical) solution is to pack things into different crate boxes rather than just one? This can get a bit more tricky. There is a human element involved as well, i.e. like parts can go into the same box, which is a constraint to be considered. Some parts that are not the same are sometimes grouped together for shipping and can be considered a common grouped item. Sometimes customers want things shipped their way, which adds a human element to the constraints, so there will have to be some customization.
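
    The naive approach described above (try each part at 0 and 180 degrees, keep the placement with the smallest enclosing rectangle) can be sketched on a coarse grid. The Python below is only a greedy illustration under strong assumptions: parts are sets of unit grid cells anchored at the origin, there is no padding, no 90 degree rotation, and no guarantee of an optimal result. The names `rotate180`, `bbox_area`, `pack` and the `search` limit are all hypothetical.

```python
from itertools import product


def rotate180(cells):
    """Rotate a set of (x, y) cells by 180 degrees, keeping it anchored at the origin."""
    max_x = max(x for x, _ in cells)
    max_y = max(y for _, y in cells)
    return frozenset((max_x - x, max_y - y) for x, y in cells)


def bbox_area(cells):
    """Area of the axis-aligned rectangle enclosing all cells."""
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)


def pack(parts, search=12):
    """Greedily place parts one by one, minimizing the enclosing rectangle each time."""
    occupied = set()
    for part in parts:
        best = None  # (area, placed_cells) of the best collision-free placement so far
        for shape in (part, rotate180(part)):
            for dx, dy in product(range(search), repeat=2):
                placed = {(x + dx, y + dy) for x, y in shape}
                if placed & occupied:
                    continue  # collision with an already placed part
                area = bbox_area(occupied | placed)
                if best is None or area < best[0]:
                    best = (area, placed)
        if best is None:
            raise ValueError("no free position inside the search window")
        occupied |= best[1]
    return occupied


# A small "T": a bar three cells wide with a two-cell stem.
T = frozenset({(0, 0), (1, 0), (2, 0), (1, 1), (1, 2)})
packed = pack([T, T])
print(bbox_area(packed))
```

    For two of these T shapes the sketch interlocks the second part upside down and ends with an enclosing area of 15 cells, compared with 18 for two T shapes side by side in the same orientation. Real parts would need a finer grid or polygon geometry, plus the padding and grouping constraints mentioned in the question.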

    Read the article

  • Technique to Solve Hard Programming logic

    - by Paresh Mayani
    I have heard about many techniques that developers and software managers use to solve hard programming logic or to design the flow of an application, which developers then implement to create the actual application. Some of the techniques I know are: flowchart, screen layout, data flow diagram, E-R diagram, and an algorithm for every program. I'd like to know two things: (1) Are there any techniques other than these? (2) Which one is the most suitable for solving hard programming logic and for the process of creating an application?

    Read the article

  • Is there a correlation between complexity and reachability?

    - by Saladin Akara
    I've been studying cyclomatic complexity (McCabe) and reachability of software at uni recently. Today my lecturer said that there's no correlation between the two metrics, but is this really the case? I'd think there would definitely be some correlation, as less complex programs (from the scant few we've looked at) seem to have 'better' results in terms of reachability. Does anyone know of any attempt to look at the two metrics together, and if not, what would be a good place to find data on both complexity and reachability for a large(ish) number of programs? (As clarification, this isn't a homework question. Also, if I've put this in the wrong place, let me know.)

    Read the article

  • Sources of requirements? [closed]

    - by user970696
    I was reading a book about SW engineering the other day and it went like: Sources of both functional and non-functional requirements are: law (for specific cases), business and user requirements, etc. //what else then? So the question is, what other sources of requirements are there when an analyst is gathering the information? Let's consider a desktop app for a mobile operator. As for the comment, I do not think this is a broad question, as the books usually mention 1-2 sources. I would like to know more, if anyone can help.

    Read the article

  • How to model the components of a non Information System?

    - by Adel C Kod
    So I am working on a project that's related to kernel code (specifically the TCP/IP stack of the kernel). I need to build some models to describe the functionality and components of my system. Initially I thought about a Class Diagram; it can describe the general architecture of my system, but it doesn't make sense since my code is VERY structured (written in standard C). I also thought about DFDs; they'd describe the processes of my system and how the data is flowing. But they contain something which doesn't really fit in: data stores. I have no databases here (at all). For the functionality, other team members suggested using Activity and Sequence diagrams, which is kinda okay with me, but what about the system components? So basically my question is: I want to describe the components of my system; what do you suggest as a meaningful diagram to follow? (Again, the project is a low-level, systems-oriented research project with almost no user interface at all.)

    Read the article

  • Saving all hits to a web app

    - by bevanb
    Are there standard approaches to persisting data for every hit that a web app receives? This would be for analytics purposes (as a better alternative to log mining down the road). Seems like Redis would be a must. Is it advisable to also use a different DB server for that table, or would Redis be enough to mitigate the impact on the main DB? Also, how common is this practice? Seems like a no brainer for businesses who want to better understand their users, but I haven't read much about it.

    Read the article

  • how to avoid or minimise use of check/conditional statement?

    - by Muneeb Nasir
    I have a scenario where I receive a stream and need to check it for some values. If I get a new value, I have to store it in some data structure. Well, it seems very easy: I can use an if-else conditional statement, or the contains method of a set/map, to check whether the received value is new or not. But the problem is that checking will affect my application's performance. In the stream I'll receive hundreds of values per second, and if I start checking each and every value I receive, it will surely affect performance. Can anybody suggest a mechanism or algorithm that solves my issue, either by bypassing the checks or at least minimizing them?

    Read the article

  • How should I compress a file with multiple bytes that are the same with Huffman coding?

    - by Omega
    On my great quest for compressing/decompressing files with a Java implementation of Huffman coding (http://en.wikipedia.org/wiki/Huffman_coding) for a school assignment, I am now at the point of building a list of prefix codes. Such codes are used when decompressing a file. Basically, the code is made of zeroes and ones that are used to follow a path in a Huffman tree (left or right) to ultimately find a byte. In the tree shown in Wikipedia's image, the prefix code to reach the character m would be 0111. The idea is that when you compress the file, you basically convert all the bytes of the file into prefix codes instead (they tend to be smaller than 8 bits, so there's some gain). So every time the character m appears in a file (which in binary is actually 1101101), it will be replaced by 0111 (if we used the tree above). Therefore, 1101101110110111011011101101 becomes 0111011101110111 in the compressed file. I'm okay with that. But what if the following happens: in the file to be compressed there exists only one unique byte, say 1101101, and there are 1000 such bytes. Technically, the prefix code of such a byte would be... none, because there is no path to follow, right? I mean, there is only one unique byte anyway, so the tree has just one node. Therefore, if the prefix code is none, I would not be able to write the prefix code in the compressed file, because, well, there is nothing to write. Which brings up this problem: how would I compress/decompress such a file if it is impossible to write a prefix code when compressing? (Using Huffman coding, due to the school assignment's rules.) This tutorial seems to explain prefix codes a bit better: http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html but doesn't seem to address this issue either.
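
    One common way around the single-symbol case is to treat it specially and give the lone symbol a one-bit code, so that each occurrence still writes something. The Python sketch below shows only that degenerate case; the convention of using "0", and the helper names `single_symbol_table`, `encode` and `decode`, are assumptions rather than part of any standard. Another option, not shown, is to store the symbol and its repeat count in the file header and write no code bits at all.

```python
from collections import Counter


def single_symbol_table(data):
    """Return a code table for input that contains exactly one distinct byte value."""
    freqs = Counter(data)
    if len(freqs) != 1:
        raise ValueError("this sketch only covers the one-symbol case")
    (symbol,) = freqs
    return {symbol: "0"}  # assumed convention: a lone leaf gets the 1-bit code "0"


def encode(data, table):
    """Replace every byte with its code, returned as a string of '0'/'1' characters."""
    return "".join(table[b] for b in data)


def decode(bits, table):
    """Walk the bit string and emit a byte each time a complete code is read."""
    reverse = {code: sym for sym, code in table.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return bytes(out)


data = bytes([0b1101101]) * 1000  # the byte from the question, repeated 1000 times
table = single_symbol_table(data)
bits = encode(data, table)
print(len(bits), decode(bits, table) == data)
```

    With this convention the 1000 bytes become 1000 bits, and the decoder needs either the original length or the bit count from the header to know where the padding in the final byte begins.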

    Read the article

  • Internships and certification in IT Business Analyst? [closed]

    - by light
    I'm new to this field and have almost no experience. But I know that many people here do this every day and have many years of experience behind them, and I hope they can help me! Generally, I'm interested in the following: certificates that help to fully understand the technology and that can be shown to employers; good places to have an internship (country doesn't matter). EDIT #1: question was changed to be more specific. EDIT #2: thanks, I think the question should be closed, because the answer depends on my needs (what type of work I want to have).

    Read the article

  • Where’s my MD.050?

    - by Dave Burke
    A question that I’m sometimes asked is “where’s my MD.050 in OUM?” For those not familiar with an MD.050, it serves the purpose of being a Functional Design Document (FDD) in one of Oracle’s legacy Methods. Functional Design Documents have existed for many years, with their primary purpose being to describe the functional aspects of one or more components of an IT system, typically a Custom Extension of some sort. So why don’t we have a direct replacement for the MD.050/FDD in OUM? In simple terms, the disadvantage of the MD.050/FDD approach is that it tends to lead practitioners into “Design mode” too early in the process, whereas OUM encourages more emphasis on gathering and describing the functional requirements of a system ahead of the formal Analysis and Design process. So that just means more work up front for the Business Analyst or Functional Consultants, right? Well no… the design of a solution, particularly when it involves a complex custom extension, does not necessarily take longer just because you put more thought into the functional requirements. In fact, one could argue the complete opposite: by putting more emphasis on clearly understanding the nuances of functional requirements early in the process, the overall time and cost incurred during the Analysis to Design process should be less. In short, as your understanding of requirements matures over time, it is far easier (and more cost effective) to update a document or a diagram than to change lines of code. So how does that translate into Tasks and Work Products in OUM? Let us assume you have reached a point on a project where a Custom Extension is needed. One of the first things you should consider doing is creating a Use Case, and remember, a Use Case could be as simple as a few lines of text reflecting a “User Story”, or it could be what Cockburn [1] describes as a “fully dressed Use Case”.
    It is worth mentioning at this point the highly scalable nature of OUM, in the sense that “documents” should not be produced just because that is the way we have always done things. Some projects may well be predicated upon a base of electronic documents, whilst other projects may take a much more Agile approach to describing functional requirements, through “User Stories” perhaps. In any event, it is quite common for a Custom Extension to involve the creation of several “components”, i.e. some new screens, an interface, a report etc. Therefore several Use Cases might be required, which in turn can then be assembled into a Use Case Package. Once you have the Use Cases attributed to an appropriate (fit-for-purpose) level of detail, and assembled into a Package, you can create an Analysis Model for the Package. An Analysis Model is conceptual in nature and, depending on the solution being developed, would involve the creation of one or more diagrams (i.e. Sequence Diagrams, Collaboration Diagrams etc.) which collectively describe the Data, Behavior and User Interface requirements of the solution. If required, the various elements of the Analysis Model may be indexed via an Analysis Specification. For Custom Extension projects that follow a pure Object Orientated approach, the Analysis Model will naturally support the development of the Design Model without any further artifacts. However, for projects that are transitioning to this approach, the various elements of the Analysis Model may be represented within the Analysis Specification.
    If we now return to the original question of “Where’s my MD.050”, the full answer would be: (1) capture the functional requirements within a Use Case; (2) group related Use Cases into a Package; (3) create an Analysis Model for each Package; (4) consider creating an Analysis Specification (AN.100) as an index to each Analysis Model artifact. An alternative answer for a relatively simple Custom Extension would be: (1) capture the functional requirements within a Use Case; (2) optionally, group related Use Cases into a Package; (3) create an Analysis Specification (AN.100) for each Package. [1] Cockburn, A., 2000, Writing Effective Use Cases, Addison-Wesley Professional; Edition 1

    Read the article
