Search Results

Search found 2683 results on 108 pages for 'statistical analysis'.

Page 35 of 108

  • The Workshop Technique Handbook

    - by llowitz
    The OUM method pack contains a plethora of information, but if you browse through the activities and tasks contained in OUM, you will see very few references to workshops.  Consequently, I am often asked whether OUM supports a workshop-type approach.  In general, OUM does not prescribe the manner in which tasks should be conducted, as many factors, such as culture, availability of resources, and the potential travel cost of attendees, can influence whether a workshop is appropriate in a given situation.  Although workshops are not typically called out in OUM, OUM encourages the project manager to group the OUM tasks in a way that makes sense for the project. OUM considers a workshop to be a technique that can be applied to any OUM task or group of tasks.  If a workshop is conducted, it is important to identify the OUM tasks that are executed during the workshop.  For example, a “Requirements Gathering Workshop” is quite likely to include Gathering Business Requirements, Gathering Solution Requirements, and perhaps Specifying Key Structure Definitions. Not only is a workshop approach to conducting the OUM tasks perfectly acceptable, OUM provides in-depth guidance on how to maximize the value of your workshops.  I strongly encourage you to read the “Workshop Techniques Handbook” included in the OUM Manage Focus Area under Method Resources. The Workshop Techniques Handbook provides valuable information on a variety of workshop approaches and discusses the circumstances in which each type of workshop is most effective.  Furthermore, it provides detailed information on how to structure a workshop and tips on facilitating it. You will find guidance on some popular workshop techniques such as brainstorming, setting objectives, and prioritizing, as well as more specialized techniques such as Value Chain Analysis, SWOT analysis, the Delphi Technique, and much more. Workshops can and should be applied to any type of OUM project, whether that project falls within the Envision, Manage, or Implement focus areas.  If you typically employ workshops to gather information, walk through a business process, develop a roadmap, or validate your understanding with the client, by all means continue utilizing them to conduct the OUM tasks during your project, but first take the time to review the Workshop Techniques Handbook to refresh your knowledge and hone your skills. 

    Read the article

  • Code Measuring and Metrics Tools?

    - by David
    I'm in the process of setting up a build server for personal projects. This server will handle all the normal CI stuff, including running large suites of tests (unit, integration, automated UI). While I'm working out the kinks for including code coverage output with MSTest, it occurs to me that there may be lots of tools out there which give me additional metrics other than just code coverage. FxCop comes to mind as an example, though I'm sure there are others. Anything that can generate useful reportable data and metrics would be good, whether it's class dependency charts (looking for Law of Demeter violations, for example), analyses of the uses of classes/functions (looking for a function that isn't used in the system other than in the tests, for example), and so on. I'm not sure of the right way to formulate the question, since polling questions or "What's your favorite code analysis tool?" questions aren't very good. But I'm essentially just looking for recommendations on what metrics to gather and the tools that can gather them. The eventual vision for something like this is to have the CI server run a bunch of automated tests and analysis tools and track performance metrics over time. Imagine a dashboard full of graphs plotting these metrics over time. The lines should all stay relatively at equilibrium, and if one starts to stray toward the negative then it's an early indication of problems with the code. In the age-old struggle to quantify code quality with management, this sounds like a potentially helpful means of doing just that.

    Read the article

  • Prognostications for the Future of BI

    - by jacqueline.coolidge(at)oracle.com
    Dashboard Insight has published viewpoints on the future of BI from several vendors' perspectives, including ours, in Business Intelligence Predictions for 2011. We offered: In 2011, businesses will demand more from BI.  With intense competitive and economic pressures, it's not enough to be interesting.  BI must be actionable and enable people to respond smarter and faster to the opportunities and challenges of the day.  Most companies rely on BI to help them understand what's going on in their business.  Many are ready to make the leap from "What's going on?" to "What are we going to do about it?" Seamless integration from reporting to what-if analysis and scenario modeling helps businesses decide the right course of action.  The integration of BI with SOA and BPEL will deliver the true payoff for BI by enabling companies to initiate business processes directly from their analysis, turning insight into action for a more agile and competitive business.  And, I must admit, it's tough to argue with the trends identified by other vendors:
    • Enabling true self-service and engaging a larger community of users
    • Accelerating the adoption of BI on mobile devices
    • Embracing more advanced analytics such as data/text mining and location intelligence
    • Price/performance breakthroughs
    It's preaching to the choir.  I look forward to hearing the voices of some customers who are pushing the envelope and will post those stories as I capture them.

    Read the article

  • What are the boundaries of the product owner in scrum?

    - by Saeed Neamati
    In another question, I asked why I feel scrum turns active developers into passive developers, and it seems that the overall problem is not scrumy (related to scrum), but rather related to a bad implementation of scrum. So, here I have some questions about the scope of the responsibilities of the PO (product owner) and the limits he/she shouldn't pass. Should the PO interfere in UI design when there are designers at work on the scrum team? (An example of this which has happened to us is replacing checkboxes with a drop-down list with two items, namely yes and no; or making some boxes larger, or left-aligning some content instead of centering it on the page, or stuff like that.) If so, to what extent? Colors? Layout? Should the PO interfere in coding design and architecture? This hasn't happened to us yet, but I'm really curious about the boundaries. For example, does the PO have the right to change the platform (moving from ASP.NET MVC to PHP, or something like that), or to choose the number of servers (tier architecture), etc.? Should the PO interfere in validation mechanisms? For example, "this field should be required," or "we don't need to get this piece of information from the user." Sometimes analysts and designers confirm that something can be handled behind the scenes, like extracting the user profile info from another source instead of asking for it in the UI. How granular could/should the PO get in the analysis and design? For example, a user story might be: "As a customer, I'd like to be able to buy new domains online." However, the scrum team can implement this user story as a wizard of five steps, or as one single page. To what level should the PO monitor, govern, or supervise the technical analysis, design, and implementation? I'm asking these questions to judge whether our implementation is right or wrong.

    Read the article

  • github team workflow - to fork or not?

    - by aporat
    We're a small team of web developers currently using subversion, but soon we're making the switch to github. I'm looking at different types of github workflows, and we're not sure if the whole forking concept in github for each developer is such a good idea for us. If we use forks, I understand each developer will have his own private remote & local repositories. I'm worried it will make pushing changesets hard and overly complex. Also, my biggest concern is that it will force each developer to have 2 remotes: origin (which is the remote fork) and upstream (which is used to "sync" changes from the main repository). Not sure if it's such an easy way to do things. This is similar to the workflow explained here: https://github.com/usm-data-analysis/usm-data-analysis.github.com/wiki/Git-workflow If we don't use forks, we can probably get by fine using a central repo, creating a branch for each task we're working on, and merging them into the development branch on the same repository. It means we won't be able to restrict merging of branches, and it might be a little messy to have many branches on the central repository. Any suggestions from teams who have tried both workflows?
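
    For reference, here is a minimal sketch of the two-remote setup the fork workflow implies (repository URLs are placeholders):

        # clone your private fork; git names this remote 'origin' automatically
        git clone git@github.com:developer/project.git
        cd project
        # add the shared team repository as a second remote, 'upstream'
        git remote add upstream git@github.com:team/project.git
        # sync: pull the team's changes into your local copy
        git fetch upstream
        git merge upstream/master
        # publish your work back to your fork, then open a pull request
        git push origin master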

    Read the article

  • Stakeholder Management in OUM

    - by user719921
    Where is Stakeholder Management in OUM?  Stakeholder Management typically falls into the purview of the Project Manager, which means much of the associated guidance is found in the OUM Manage Focus Area (a.k.a. Manage).  There is no process in Manage named Stakeholder Management, but this “touch point” can be found in a variety of other processes, including Bid Transition (BT), Communication Management (CMM) and Organizational Change Management (OCHM).
    • Stakeholder management starts in the Bid Transition process with Stakeholder Analysis.
    • This Stakeholder Analysis is used to build the Project Team Communication Plan in the Communication Management process.
    • Stakeholder management should be executed during the Execution and Control phase.  For example, as issues are resolved, the project manager should take the action item to follow up with the affected stakeholders to ensure they are aware that the issue has been resolved.
    • The broader topic of stakeholder management is also addressed very thoroughly in the Organizational Change Management process in the Implement Focus Area, which is a touch point to the Organizational Change Management process in Manage.
    Check it out and let me know your thoughts!

    Read the article

  • Mocking concrete class - Not recommended

    - by Mik378
    I've just read an excerpt of the "Growing Object-Oriented Software" book which explains some reasons why mocking concrete classes is not recommended. Here is some sample code of a unit test for the MusicCentre class:

        public class MusicCentreTest {
            @Test
            public void startsCdPlayerAtTimeRequested() {
                final MutableTime scheduledTime = new MutableTime();
                CdPlayer player = new CdPlayer() {
                    @Override
                    public void scheduleToStartAt(Time startTime) {
                        scheduledTime.set(startTime);
                    }
                };
                MusicCentre centre = new MusicCentre(player);
                centre.startMediaAt(LATER);
                assertEquals(LATER, scheduledTime.get());
            }
        }

    And his first explanation: "The problem with this approach is that it leaves the relationship between the objects implicit. I hope we've made clear by now that the intention of Test-Driven Development with Mock Objects is to discover relationships between objects. If I subclass, there's nothing in the domain code to make such a relationship visible, just methods on an object. This makes it harder to see if the service that supports this relationship might be relevant elsewhere and I'll have to do the analysis again next time I work with the class." I can't figure out exactly what he means when he says: "This makes it harder to see if the service that supports this relationship might be relevant elsewhere and I'll have to do the analysis again next time I work with the class." I understand that the service corresponds to MusicCentre's method called startMediaAt. What does he mean by "elsewhere"? The complete excerpt is here: http://www.mockobjects.com/2007/04/test-smell-mocking-concrete-classes.html
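
    For context, the alternative the book's authors advocate is to name the relationship with an explicit interface rather than subclassing the concrete class. A minimal sketch (the interface name is illustrative, not from the book; Time is the same type used in the excerpt):

        // The role MusicCentre actually depends on, made explicit:
        public interface ScheduledDevice {
            void scheduleToStartAt(Time startTime);
        }

        // The concrete class declares that it plays this role...
        public class CdPlayer implements ScheduledDevice {
            @Override
            public void scheduleToStartAt(Time startTime) {
                // start the hardware at startTime
            }
        }

        // ...and MusicCentre depends on the role, not the class, so the
        // relationship is visible anywhere ScheduledDevice appears.
        public class MusicCentre {
            private final ScheduledDevice device;

            public MusicCentre(ScheduledDevice device) {
                this.device = device;
            }

            public void startMediaAt(Time time) {
                device.scheduleToStartAt(time);
            }
        }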

    Read the article

  • What are the hard and fast rules for Cache Control?

    - by Metalshark
    Confession: sites I maintain have different rules for Cache-Control, mostly based on the default configuration of the server, followed up with recommendations from the Page Speed & YSlow Firefox plug-ins and the Network Resources view in Google's Speed Tracer. Cache-Control is set to private/public depending on what they say to do, ETag/Last-Modified headers are only tinkered with if YSlow suggests there is something wrong, and Vary: Accept-Encoding seems necessary when manually gzipping files for Amazon CloudFront. When reading through the material on the different options and what they do, there seems to be conflicting information, rules for broken proxies, and cargo-cult configurations. The official information provided by the analysis tools mentioned above is quite inaccessible, as it deals with each topic individually instead of as a unified strategy (so there is no cross-referencing of techniques). For example, it seems to make no sense that the speed analysis tools rate a site with ETags the same as a site without them if they are meant to help with caching. What are the hard and fast rules for a platform-agnostic Cache-Control strategy? EDIT: A link to Jeff Atwood's article explains caching in superb depth. For the record, though, here are the hard and fast rules:
    • If the file is compressed using GZIP, etc., use "Cache-Control: private", as a proxy may return the compressed version to a client that does not support it (the browser cache will hold files marked this way, though). Also remember to include "Vary: Accept-Encoding" to say that the content is compressible.
    • Use Last-Modified in conjunction with ETag - belt-and-braces usage provides both validators; since ETag is based on file contents instead of modification time alone, using both covers all bases. NOTE: AOL's PageTest has a carte blanche approach against ETags for some reason.
    • If you are using Apache on more than one server to host the same content, then remove the implicitly declared inode from ETags by excluding it from the FileETag directive (i.e. "FileETag MTime Size"), unless you are genuinely using the same live filesystem.
    • Use "Cache-Control: public" wherever you can - this means that proxy servers (and the browser cache) will return your content even if the rest of the page needs HTTP authentication, etc.
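
    In ASP.NET, those rules map onto the standard HttpCachePolicy API roughly as follows. This is a sketch inside a page or handler, not a definitive policy - the public/private choice still depends on whether the response is compressed and per-user, and filePath is an assumed variable:

        // Cacheable by proxies and browsers; use HttpCacheability.Private
        // instead when serving gzipped content through broken proxies.
        Response.Cache.SetCacheability(HttpCacheability.Public);
        // Tell caches that the representation varies by encoding.
        Response.AppendHeader("Vary", "Accept-Encoding");
        // Provide both validators: modification time and a content-based ETag.
        Response.Cache.SetLastModified(System.IO.File.GetLastWriteTimeUtc(filePath));
        Response.AddFileDependency(filePath);
        Response.Cache.SetETagFromFileDependencies();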

    Read the article

  • E-Bus ATG Advisor Webcast program - June Edition

    - by cwarticki
    E-Business Suite Applications Technology Group (ATG) Advisor Webcast Program – June 2014
    Thursday June 12, 2014 at 18:00 UK / 10:00 PST / 11:00 MST / 13:00 EST: EBS Patch Wizard Overview
    • What is the Oracle EBS Patch Wizard tool?
    • Benefits of the Patch Wizard utility
    • Patch Wizard interface
    • Patch impact analysis
    Details & Registration: Note 1672371.1 (direct registration link)
    Thursday June 26, 2014 at 18:00 UK / 10:00 PST / 11:00 MST / 13:00 EST: EBS Proactive Analyzers Overview
    • What are Oracle Support Analyzers?
    • How to identify the Analyzers available
    • Short Analyzers overview
    • Patch impact analysis
    • How to use an Analyzer
    Details & Registration: Note 1672363.1 (direct registration link)
    If you have any questions about the schedules, or if you have a suggestion for an Advisor Webcast to be planned in the future, please send an e-mail to Ruediger Ziegler.

    Read the article

  • OSB, Service Callouts and OQL

    - by Sabha
    Oracle Fusion Middleware customers use Oracle Service Bus (OSB) for virtualizing service endpoints and implementing stateless service orchestrations. Behind the performance and speed of OSB, there are a couple of key design implementations that can affect application performance and behavior under heavy load. One of the heavily used features in OSB is the Service Callout pipeline action, for message enrichment and for invoking multiple services as part of one single orchestration. Overuse of this feature, without understanding its internal implementation, can lead to serious problems. This series will delve into OSB internals, the problem associated with usage of Service Callout under high loads, diagnosing it via thread dump and heap dump analysis using tools like ThreadLogic and OQL (Object Query Language), and resolving it. The first section in the series will mainly cover the threading model used internally by OSB for implementing Route vs. Service Callout actions. The second section of the "OSB, Service Callouts and OQL" blog posting will delve into thread dump analysis of an OSB server, detecting threading issues relating to Service Callout, and using heap dumps and OQL to identify the related Proxy and Business services involved. The final section of the series will focus on the corrective action to avoid Service Callout related OSB server hangs. Before we dive into the solution, we need to briefly discuss Work Managers in WLS. Please refer to the blog posting for more details.

    Read the article

  • C# class architecture for REST services

    - by user15370
    Hi. I am integrating with a set of REST services exposed by our partner. The unit of integration is at the project level, meaning that for each project created on our partner's side of the fence they will expose a unique set of REST services. To be more clear, assume there are two projects - project1 and project2. The REST services available to access the project data would then be: /project1/search/getstuff?etc... /project1/analysis/getstuff?etc... /project1/cluster/getstuff?etc... /project2/search/getstuff?etc... /project2/analysis/getstuff?etc... /project2/cluster/getstuff?etc... My task is to wrap these services in a C# class to be used by our app developer. I want to make it simple for the app developer and am thinking of providing something like the following class:

        class ProjectClient
        {
            SearchClient _searchclient;
            AnalysisClient _analysisclient;
            ClusterClient _clusterclient;

            string Project { get; set; }

            ProjectClient(string _project)
            {
                Project = _project;
            }
        }

    SearchClient, AnalysisClient and ClusterClient are my classes to support the respective services shown above. The problem with this approach is that ProjectClient will need to provide public methods for each of the APIs exposed by SearchClient, etc.:

        public void SearchGetStuff()
        {
            _searchclient.getStuff();
        }

    Any suggestions how I can architect this better?
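
    One common way to avoid the pass-through methods is to expose the sub-clients as read-only properties, so the wrapper only manages construction and project scoping. A sketch (the sub-client constructors taking the project name are an assumption):

        class ProjectClient
        {
            public string Project { get; private set; }
            public SearchClient Search { get; private set; }
            public AnalysisClient Analysis { get; private set; }
            public ClusterClient Cluster { get; private set; }

            public ProjectClient(string project)
            {
                Project = project;
                // each sub-client is assumed to build its base URL from the project
                Search = new SearchClient(project);
                Analysis = new AnalysisClient(project);
                Cluster = new ClusterClient(project);
            }
        }

        // usage: client.Search.getStuff(...) rather than client.SearchGetStuff(...)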

    Read the article

  • WebMatrix "The Site Has Stopped" Fix

    - by Tarun Arora
    I just got started with Azure Web Sites by creating a website from the WordPress template. Next, I installed WebMatrix so that I could run the website locally. Every time I tried to run my website from WebMatrix I hit the message “The following site has stopped ‘xxx’”. Step 00 – Analysis: It took a bit of time to figure out that WebMatrix makes use of IIS Express, but it was easy to figure out that IIS Express was not showing up in the system tray when I started WebMatrix. This was a good indication that IIS Express was having trouble starting up. So, I opened a CMD prompt and tried to run IISExpress.exe; this resulted in an error message. I then ran IISExpress.exe /trace:Error, which gave a more detailed reason for the failure. Step 1 – Fixing “The following site has stopped ‘xxx’”: Further analysis revealed that the IIS Express config files had been corrupted. So, I navigated to C:\Users\<UserName>\Documents\IISExpress\config and deleted the files applicationhost.config, aspnet.config and redirection.config (please take a backup of these files before deleting them). Back in CMD, I ran IISExpress /trace:Error again; this time IIS Express successfully started and parked itself in the system tray. I opened up WebMatrix and clicked Run, and the default site successfully loaded in the browser without any failures. Step 2 – Download the WordPress Azure Web Site using WebMatrix: Because the config files ‘applicationhost.config’, ‘aspnet.config’ and ‘redirection.config’ were deleted, I lost the settings of my Azure-based WordPress site that I had downloaded to run from WebMatrix. This was simple to sort out… Open up WebMatrix and go to the Remote tab, click on Download, export the PublishSettings file from the Azure Management Portal, and upload it on the pop-up you get when you click Download. Now you should have your Azure WordPress website all set up & running from WebMatrix. Enjoy!
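
    Condensed, the repair looks like this at the command prompt (the IIS Express install path shown is the default and may differ on your machine; back up the config files before deleting them):

        rem get a detailed startup error instead of a silent failure
        "C:\Program Files (x86)\IIS Express\iisexpress.exe" /trace:error
        rem remove the corrupted configuration; IIS Express recreates defaults
        cd /d "%USERPROFILE%\Documents\IISExpress\config"
        del applicationhost.config aspnet.config redirection.config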

    Read the article

  • A Dozen USB Chargers Analyzed; Or: Beware the Knockoffs

    - by Jason Fitzpatrick
    When it comes to buying a USB charger, one is just as good as another, so you might as well buy the cheapest one, right? This interesting and detailed analysis of name-brand, off-brand, and counterfeit chargers will have you rethinking that stance. Ken Shirriff gathered up a dozen USB chargers, including official Apple chargers and counterfeit Apple chargers, as well as offerings from Monoprice, Belkin, Motorola, and other companies. After putting them all through a battery of tests, he gave them overall rankings based on nine different categories, including power stability, power quality, and efficiency. The takeaway from his research? Quality varied widely between brands, but when sticking with big companies like Apple or HP the chargers were all safe. The counterfeit chargers (like the $2 Apple iPad charger knock-off he tested) proved to be outright dangerous: several actually melted or caught fire in the course of the project. Hit up the link below for his detailed analysis, including power output readings for the dozen chargers. A Dozen USB Chargers in the Lab [via O'Reilly Radar]

    Read the article

  • Smart Phones Shockingly Energy Efficient; Lead to Decreased Household Power Consumption

    - by Jason Fitzpatrick
    Given how much time our smartphones and tablets spend plugged in, topping off their battery reserves, it's easy to assume they're sucking down a lot of power. Analysis shows the lilliputian but powerful devices are surprisingly efficient and may be decreasing our overall power consumption. Courtesy of energy-centric blog Outlier, we're treated to a look at the power-sipping habits of popular smart phones and mobile devices. The simple takeaway? They use shockingly little electricity over the course of the year: you can charge your new iPhone for a year of regular usage for under a buck. The more complex analysis? The proliferation of tiny and energy-efficient devices is displacing heavier energy consumers (large televisions, desktop computers, etc.) and driving a more efficient gadget-to-consumption ratio in many households. Hit up the link below to read the full post. How Much Does It Take to Charge an iPhone [via Mashable]

    Read the article

  • Spend Analytics on a Grand Scale

    - by jacqueline.coolidge(at)oracle.com
    The Wall St. Journal reports in Billions in Bloat Uncovered in Beltway that the Government Accountability Office (GAO) has released a massive study of several programs and agencies that cost U.S. taxpayers billions of dollars each year.  The report could help save $100 to $200 billion by identifying duplicate spending and ineffective programs that can be consolidated or eliminated. Now, that is spend analytics on a massive scale! It remains to be seen how actionable that information will be.  Certainly, there have been studies before that identify wasteful spending.  But it's a great case of the power of business intelligence and spend analytics.  Many companies do find significant savings when they implement spend and procurement analytics. What makes for an excellent spend analysis? It should be:
    • Objective, providing visibility across programs and/or divisions
    • A cross-functional analysis that links financial with performance metrics
    • Prescriptive and actionable
    Spend and procurement analytics have been HOT during the economic downturn! I expect 2011 will see many more companies get serious about spend analytics, and I would love to hear from companies who are willing to share their experience.

    Read the article

  • How to get full query string parameters not UrlDecoded

    - by developerit
    Introduction: While developing Developer IT's website, we came across a problem when users search for keywords containing special characters, like the plus '+' character. We found it while looking for C++ in our search engine; the request parameter output in ASP.NET was "c ". I found it strange that it removed the '++' and replaced it with a space… Analysis: After a bit of Googling and Reflection, it turns out that ASP.NET calls UrlDecode on each parameter retrieved by the Request("item") method. The Request.Params property is affected by this too, since it mashes all QueryString, Forms and other collections into a single one. Workaround: Finally, I solved the puzzle using the Request.RawUrl property and parsing it with the same RegEx I use in my URL rewriter. RawUrl is not affected by any decoding; as its name says, it's raw. Published on http://www.developerit.com/
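
    The post doesn't include its RegEx, but a minimal sketch of the same idea - pulling a value straight out of Request.RawUrl so that '+' and percent-escapes survive - could look like this (the helper name and parsing details are mine, not the article's):

        // Returns the raw, un-decoded value of a query-string key, or null.
        private static string GetRawQueryValue(HttpRequest request, string key)
        {
            int mark = request.RawUrl.IndexOf('?');
            if (mark < 0) return null;

            foreach (string pair in request.RawUrl.Substring(mark + 1).Split('&'))
            {
                string[] kv = pair.Split(new[] { '=' }, 2);
                if (kv.Length == 2 && kv[0] == key)
                    return kv[1]; // '+' and %xx are left exactly as sent
            }
            return null;
        }

    Calling GetRawQueryValue(Request, "q") for /search?q=c++ yields "c++" instead of the decoded "c  " that Request.QueryString["q"] produces.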

    Read the article

  • Your thoughts on Best Practices for Scientific Computing?

    - by John Smith
    A recent paper by Wilson et al (2014) pointed out 24 best practices for scientific programming. It's worth a look. I would like to hear opinions about these points from experienced programmers in scientific data analysis. Do you think this advice is helpful and practical, or is it good only in an ideal world? Wilson G, Aruliah DA, Brown CT, Chue Hong NP, Davis M, Guy RT, Haddock SHD, Huff KD, Mitchell IM, Plumbley MD, Waugh B, White EP, Wilson P (2014) Best Practices for Scientific Computing. PLoS Biol 12:e1001745. http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001745 Box 1. Summary of Best Practices:
    1. Write programs for people, not computers. (a) A program should not require its readers to hold more than a handful of facts in memory at once. (b) Make names consistent, distinctive, and meaningful. (c) Make code style and formatting consistent.
    2. Let the computer do the work. (a) Make the computer repeat tasks. (b) Save recent commands in a file for re-use. (c) Use a build tool to automate workflows.
    3. Make incremental changes. (a) Work in small steps with frequent feedback and course correction. (b) Use a version control system. (c) Put everything that has been created manually in version control.
    4. Don't repeat yourself (or others). (a) Every piece of data must have a single authoritative representation in the system. (b) Modularize code rather than copying and pasting. (c) Re-use code instead of rewriting it.
    5. Plan for mistakes. (a) Add assertions to programs to check their operation. (b) Use an off-the-shelf unit testing library. (c) Turn bugs into test cases. (d) Use a symbolic debugger.
    6. Optimize software only after it works correctly. (a) Use a profiler to identify bottlenecks. (b) Write code in the highest-level language possible.
    7. Document design and purpose, not mechanics. (a) Document interfaces and reasons, not implementations. (b) Refactor code in preference to explaining how it works. (c) Embed the documentation for a piece of software in that software.
    8. Collaborate. (a) Use pre-merge code reviews. (b) Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems. (c) Use an issue tracking tool.
    I'm relatively new to serious programming for scientific data analysis. When I tried to write code for pilot analyses of some of my data last year, I encountered a tremendous number of bugs, both in my code and in my data. Bugs and errors had been around me all the time, but this time it was somewhat overwhelming. I managed to crunch the numbers at last, but I thought I couldn't put up with this mess any longer. Some action had to be taken. Without a sophisticated guide like the article above, I started to adopt a "defensive style" of programming. A book titled "The Art of Readable Code" helped me a lot. I deployed meticulous input validations or assertions for every function, renamed a lot of variables and functions for better readability, and extracted many subroutines as reusable functions. Recently, I introduced Git and SourceTree for version control. At the moment, because my co-workers are much more reluctant about these issues, the collaboration practices (8a, b, c) have not been introduced. Actually, as the authors admitted, because all of these practices take some amount of time and effort to introduce, it may be generally hard to persuade reluctant collaborators to comply with them. I think I'm asking your opinions because I still suffer from many bugs despite all my effort on many of these practices. Bug fixing may be, or should be, faster than before, but I couldn't really measure the improvement. Moreover, much of my time has been invested in defence, meaning that I haven't actually done much data analysis (offence) these days. Where is the point I should stop at in terms of productivity? I've already deployed: 1a,b,c, 2a, 3a,b,c, 4b,c, 5a,d, 6a,b, 7a,b. I'm about to have a go at: 5b,c. Not yet: 2b,c, 4a, 7c, 8a,b,c. (I could not really see the advantage of using GNU make (2c) for my purpose. Could anyone tell me how it helps my work with MATLAB?)
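
    On 2c, here is a minimal sketch of how GNU make can drive a MATLAB workflow (script and file names are hypothetical; on releases before R2019a replace -batch with -nodisplay -r "script; exit"). Running 'make results.csv' re-runs only the steps whose inputs changed:

        # recipe lines must start with a tab character
        results.csv: analyze.m cleaned.mat
        	matlab -batch "analyze"

        cleaned.mat: clean_data.m raw_data.csv
        	matlab -batch "clean_data"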

    Read the article

  • Introducing Microsoft SQL Server 2008 R2 - Business Intelligence Samples

    - by smisner
    On April 14, 2010, Microsoft Press (blog | twitter) released my latest book, co-authored with Ross Mistry (twitter), as a free ebook download - Introducing Microsoft SQL Server 2008 R2. As the title implies, this ebook is an introduction to the latest SQL Server release. Although you'll find a comprehensive review of the product's features in this book, you will not find the step-by-step details that are typical in my other books. For those readers who are interested in a more interactive learning experience, I have created two sample files for download: the IntroSQLServer2008R2Samples project and the Sales Analysis workbook. Here's a recap of the business intelligence chapters and the samples I used to generate the screenshots, by chapter: Chapter 6: Scalable Data Warehousing covers a new edition of SQL Server, Parallel Data Warehouse. Understandably, Microsoft did not ship me the software and hardware to set up my own Parallel Data Warehouse environment for testing purposes, and consequently you won't see any screenshots in this chapter. I received a lot of information and a lot of help from the product team during the development of this chapter to ensure its technical accuracy. Chapter 7: Master Data Services is a new component in SQL Server. After you install Master Data Services (MDS), which is a separate installation from SQL Server although it's found on the same media, you can install sample models to explore (which is what I did to create screenshots for the book). To do this, you deploy the packages found at \Program Files\Microsoft SQL Server\Master Data Services\Samples\Packages. You will first need to use the Configuration Manager (in the Microsoft SQL Server 2008 R2\Master Data Services program group) to create a database and a Web application for MDS. Then, when you launch the application, you'll see a Getting Started page which has a Deploy Sample Data link that you can use to deploy any of the sample packages. Chapter 8: Complex Event Processing is an introduction to another new component, StreamInsight. This topic was way too large to cover in depth in a single chapter, so I focused on information such as architecture, development models, and an overview of the key sections of code you'll need to develop for your own applications. StreamInsight is an engine that operates on data in flight and as such has no user interface that I could include in the book as screenshots. The November CTP version of SQL Server 2008 R2 included code samples as part of the installation, but these are not the official samples that will eventually be available on Codeplex. At the time of this writing, the samples are not yet published. Chapter 9: Reporting Services Enhancements provides an overview of all the changes to Reporting Services in SQL Server 2008 R2, and there are many! In previous posts, I shared more details than you'll find in the book about the new functions (Lookup, MultiLookup, and LookupSet), properties for page numbering, and the new global variable RenderFormat. I will confess that I didn't use actual data in the book for my discussion of the Lookup functions, but I did create real reports for the blog posts and will upload those separately. For the other screenshots and examples in the book, I have created the IntroSQLServer2008R2Samples project for you to download. To preview these reports in Business Intelligence Development Studio, you must have the AdventureWorksDW2008R2 database installed, and you must download and install SQL Server 2008 R2. For the map report, you must execute the PopulationData.sql script that I included in the samples file to add a table to the AdventureWorksDW2008R2 database. The IntroSQLServer2008R2Samples project includes the following files:
    • 01_AggregateOfAggregates.rdl, to illustrate the use of embedded aggregate functions
    • 02_RenderFormatAndPaging.rdl, to illustrate the use of page break properties (Disabled, ResetPageNumber), the PageName property, and the RenderFormat global variable
    • 03_DataSynchronization.rdl, to illustrate the use of the DomainScope property
    • 04_TextboxOrientation.rdl, to illustrate the use of the WritingMode property
    • 05_DataBar.rdl
    • 06_Sparklines.rdl
    • 07_Indicators.rdl
    • 08_Map.rdl, to illustrate a simple analytical map that uses color to show population counts by state
    • PopulationData.sql, to provide the data necessary for the map report
    Chapter 10: Self-Service Analysis with PowerPivot introduces two new components of the Microsoft BI stack, PowerPivot for Excel and PowerPivot for SharePoint, which you can learn more about at the PowerPivot site. To produce the screenshots for this chapter, I created the Sales Analysis workbook, which you can download (although you must have Excel 2010 and the PowerPivot for Excel add-in installed to explore it fully). It's a rather simple workbook, because space in the book did not permit a complete exploration of all the wonderful things you can do with PowerPivot. I used a tutorial that was available with the CTP version as a basis for the report, so it might look familiar if you've already started learning about PowerPivot. In future posts, I'll continue exploring the new features in greater detail. If there are any special requests, please let me know!

    Read the article

  • SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

    - by Pinal Dave
    Performance tuning is a subject that everyone wants to master. In the beginning, everybody is at a novice level and spends lots of time learning how to master the art of performance tuning. However, as we progress, tuning the system keeps getting more difficult. I understood early in my career that there should be no need for ego in the technology field. There are always better solutions and better ideas out there, and we should not resist them. Instead of resisting change and the new wave, I personally adopt it. Here is a similar example: as I progress toward the master level of performance tuning, I find it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about the tools, I am often able to come up with better solutions the next time I face a similar situation. A few days ago I received a query that the user wanted to tune further to get the maximum performance out of it. I have re-written a similar query with the help of the AdventureWorks sample database:

        SELECT *
        FROM HumanResources.Employee e
        INNER JOIN HumanResources.EmployeeDepartmentHistory edh
            ON e.BusinessEntityID = edh.BusinessEntityID
        INNER JOIN HumanResources.Shift s
            ON edh.ShiftID = s.ShiftID;

    The user had a query similar to the one above that was used in a very critical report, and he wanted to get the best out of it. When I looked at the query, here were my initial thoughts: select only the columns the application actually needs, and look at the query pattern and data workload to find the optimal index for it. Before I could offer solutions, I was told by the user that they need all the columns from all the tables, and that creating indexes was not allowed in their system. He could only re-write queries or use hints to further tune this query. Now I was in a constraint box: I believe * is not a great idea, but if they want all the columns, we can't do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my applications, but there are cases when hints work out magically and give optimal solutions. Finally, I decided to use Embarcadero's DB Optimizer. It is a fantastic tool and very helpful when it comes to performance tuning. I have previously explained how it works over here. First, open DB Optimizer and open a Tuning Job from File >> New >> Tuning Job. Once you open a DB Optimizer Tuning Job, follow the steps indicated in the following diagram. Essentially, we will take our original script and paste it into Step 1: New SQL Text; right after that we will enable Step 2 for generating various cases, Step 3 for detailed analysis, and Step 4 for executing each generated case. Finally, we will click on Analysis in Step 5, which generates the detailed analysis report in the results pane. The detail pane looks like this: it generates various cases of T-SQL based on the original query, applies the available hints, generates various execution plans for the query, and displays them in the result. You can clearly see that the original query had a cost of 0.0841 and about 607 logical page reads, whereas the various options following it have different execution costs and logical reads. There are a few cases with higher logical reads and a few cases with very low logical reads. Note that the row immediately after the original query has Merge_Join_Query in its description, with the lowest execution cost value of 0.044 and the lowest logical reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double-click on it. Here is the query:

        SELECT *
        FROM HumanResources.Employee e
        INNER JOIN HumanResources.EmployeeDepartmentHistory edh
            ON e.BusinessEntityID = edh.BusinessEntityID
        INNER JOIN HumanResources.Shift s
            ON edh.ShiftID = s.ShiftID
        OPTION (MERGE JOIN)

    Notice that the query above has an additional hint of Merge Join. With the help of this Merge Join query hint, the query is now performing much better than before. The entire process takes less than 60 seconds. Please note that the Merge Join hint was optimal for this query, but the same hint will not necessarily be helpful in all queries. Additionally, if the workload or data pattern changes, the Merge Join query hint may no longer be the optimal join. In that case, we will have to redo the entire exercise. This is the reason I do not like to use hints in my queries and discourage all of my users from using them. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly impossible to test out every query hint and index option to figure out the most optimal solution. Sometimes we need to depend on efficiency tools like DB Optimizer to guide the way, and then select the best option from the suggestions provided. Let me know what you think of this article, as well as your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Code Metrics: Number of IL Instructions

    - by DigiMortal
    In my previous posting about code metrics I introduced how to measure LoC (Lines of Code) in .NET applications. Now let's take a step further and look at how to measure compiled code. This way we can get some picture of what the compiler produces. In this posting I will introduce a code metric called number of IL instructions. NB! The number of IL instructions is not something you can use to measure the productivity of your team. If you want a better idea about the context of this metric and LoC, then please read my first posting about LoC. What are IL instructions? When code written in some .NET Framework language is compiled, the compiler produces assemblies that contain byte code. These assemblies are executed later by the Common Language Runtime (CLR), the code execution engine of the .NET Framework. The byte code is called Intermediate Language (IL) - a more general language than C# and VB.NET, for example. You can use the ILDasm tool to convert assemblies to IL assembler so you can read them. As IL instructions are the building blocks of all .NET Framework binary code, these instructions are smaller and highly general - we don't want a very rich low-level language, because it would execute more slowly than a more general one. For every method or property call in a .NET Framework language there is a corresponding set of IL instructions. There is no 1:1 relationship between a line in a high-level language and a line in IL assembler; there are more IL instructions than lines of C# code, for example. How many instructions are there? I have no general answer, because it really depends on your code. Here you can see some metrics from my current community project, developed on SharePoint Server 2007. On average I have about 7 IL instructions per line of code. This is not a metric you should target; it is just an illustrative example so you can see the difference between the number of lines and the number of IL instructions. Why should I measure the number of IL instructions? Just take a look at the chart above. The compiler does something that you cannot see - it compiles your code to IL. This is not an intuitive process, because you usually cannot say exactly what the end result will be. You know it at a general level, but you don't know it exactly. Therefore we can expect some surprises, and that's why we should measure the number of IL instructions. For example, you may find a better solution for some method in your source code. It looks nice, it works nicely, and everything seems to be okay. But on a server under load, your fix may be way slower than the previous code. Although you minimized the number of lines of code, you ended up increasing the number of IL instructions. How to measure the number of IL instructions? My choice is NDepend, because Visual Studio is not able to measure this metric. The steps are easy: open your NDepend project, or create a new one and add all your application assemblies to the project (you can also add a Visual Studio solution). Run the project analysis and wait until it is done. You can see the overall stats in the global summary window. This is the same window I used to read the LoC and number-of-IL-instructions metrics for my chart. Meanwhile I made some changes to my code (enabled advanced caching for the events and event registrations module) and then ran the code analysis again to get results for this section of this posting. NDepend is also able to tell you exactly which parts of your code have a problematically high number of IL instructions. The code quality section of the CQL Query Explorer shows you how many problems there are with members in the analyzed code. If you click on the line Methods too big (NbILInstructions), you can see all the problematic class members in the CQL Explorer, shown in the image on the right. In my case I have 10 methods that are too big, and two of them have a horrible number of IL instructions - just take a look at the first two methods in this TOP 10. Also note the query box: NDepend has an easy, SQL-like query language for querying code analysis results. You can modify these queries if you like, and you can also define your own if the default set is not enough for you. What is a good result? As you can see from the query window, the number of IL instructions per member should be at most 200. Of course, as always, the fewer instructions you have, the better-performing code you have. I don't mean small differences here, but big ones. For example, take a look at the first method in the warnings list: the number of IL instructions it has is huge. And believe me - this method looks awful. Conclusion: the number of IL instructions is a useful metric when optimizing your code. For analyzing code at a general level, to find overly long methods, you can use the LoC metric, because it is more intuitive and you can therefore handle the situation more easily. You can also use NDepend as a code metrics tool, because it has a lot of metrics to offer.
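
    To make the expansion from C# to IL concrete, here is a tiny sketch (my own example, not from the posting) of a method and the IL a release build roughly compiles it to; debug builds emit extra instructions, so exact counts vary:

        static int Add(int a, int b)
        {
            return a + b;
        }

        // Roughly corresponds to four IL instructions:
        //   ldarg.0    // push argument a onto the evaluation stack
        //   ldarg.1    // push argument b
        //   add        // pop both values, push the sum
        //   ret        // return the value on the stack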

    Read the article

  • Podcast interview with Michael Kane

    - by mhornick
    In this podcast interview with Michael Kane, Data Scientist and Associate Researcher at Yale University, Michael discusses the R statistical programming language, computational challenges associated with big data, and two data analysis projects he conducted: one on the stock market "flash crash" of May 6, 2010, and one on the tracking of transportation routes of the H5N1 bird flu. Michael also worked with Oracle on Oracle R Enterprise, a component of the Advanced Analytics option to Oracle Database Enterprise Edition. In the closing segment of the interview, Michael comments on the relationship between the data analyst and the database administrator, and how Oracle R Enterprise provides secure data management, transparent access to data, and improved performance to facilitate this relationship. Listen now...

    Read the article

  • Arithmetic Coding Questions

    - by Xophmeister
    I have been reading up on arithmetic coding and, while I understand how it works, all the guides and instructions I've read start with something like: Set up your intervals based upon the frequency of symbols in your data; i.e., more likely symbols get proportionally larger intervals. My main query is, once I have encoded my data, presumably I also need to include this statistical model with the encoding, otherwise the compressed data can't be decoded. Is that correct? I don't see this mentioned anywhere -- the most I've seen is that you need to include the number of iterations (i.e., encoded symbols) -- but unless I'm missing something, this also seems necessary to me. If this is true, that will obviously add an overhead to the final output. At what point does this outweigh the benefits of compression (e.g., say if I'm trying to compress just a few thousand bits)? Will the choice of symbol size also make a significant difference (e.g., if I'm looking at 2-bit words, rather than full octets/whatever)?
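
    A rough back-of-the-envelope way to see where the break-even point sits (all numbers illustrative):

        alphabet of 4 symbols, frequencies stored as 32-bit counts
        model overhead                      = 4 x 32 = 128 bits
        fixed-width encoding                = 2.0 bits/symbol
        entropy of the skewed distribution  = 1.5 bits/symbol
        saving                              = 0.5 bits/symbol
        break-even                          = 128 / 0.5 = 256 symbols

    Below a few hundred symbols, the model header dominates; for inputs of a few thousand bits it already pays off. A larger symbol size grows the frequency table (and therefore the overhead) exponentially in the word width, while capturing more context per symbol.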

    Read the article

  • The Lord of the Rings Project Charts Middle Earth by the Numbers

    - by Jason Fitzpatrick
    How many characters from the Lord of the Rings series can you name? 923? That's the number of entries in the LOTR Project - a collection of data that links family trees, timelines, and statistical curiosities about Middle Earth. In addition to family trees and the above chart mapping out the shift in lifespans over the ages of Middle Earth, you'll find charts mapping out age distributions, the race and gender composition of Middle Earth, populations, the time and distance traveled by the Hobbits in pursuit of their quest, and more. The site is a veritable almanac of trivia about the Lord of the Rings and related books and media. Hit up the link below to explore the facts and figures of Middle Earth. LOTR Project [via Flowing Data]

    Read the article

  • SQL SERVER – Number-Crunching with SQL Server – Exceed the Functionality of Excel

    - by Pinal Dave
    Imagine this: your users have developed an Excel spreadsheet that extracts data from your SQL Server database, manipulates that data through the use of Excel formulas and, possibly, some VBA code, which is then used to calculate P&L, hedging requirements, or even risk numbers. Management comes to you and tells you that they need to get rid of the spreadsheet and that the results of the spreadsheet calculations need to be persisted in the database. SQL Server has a very small set of functions for analyzing data. Excel has hundreds of functions for analyzing data, with many of them focused on specific financial and statistical calculations. Is it even remotely possible to use SQL Server to replace the complex calculations being done in a spreadsheet? Westclintech has developed a library of functions that match or exceed the functionality of Excel's functions and contains many functions that are not available in Excel. Their XLeratorDB library contains over 700 functions that can be incorporated into T-SQL statements. XLeratorDB takes advantage of the SQL CLR architecture introduced in SQL Server 2005. SQL CLR permits managed code to be compiled into the database and run alongside built-in SQL Server functions like COUNT or SUM. The Westclintech developers have taken advantage of this architecture to bring robust analytical functions to the database. In our hypothetical spreadsheet, let's assume that our users are using the YIELD function and that the data are extracted from a table in our database called BONDS. Here's what the spreadsheet might look like. We go to column G and see that it contains the following formula. Obviously, SQL Server does not offer a native YIELD function. However, with XLeratorDB we can replicate this calculation in SQL Server with the following statement:

        SELECT *,
            wct.YIELD(CAST(GETDATE() AS date),Maturity,Rate,Price,100,Frequency,Basis) AS YIELD
        FROM BONDS

    This produces the following result. This illustrates one of the best features of XLeratorDB: it is easy to use. Since I knew that the spreadsheet was using the YIELD function, I could use the same function with the same calling structure to do the calculation in SQL Server. I didn't need to know anything at all about the mechanics of calculating the yield on a bond. It was pretty close to cut and paste. In fact, that's one way to construct the SQL: just copy the function call from the cell in the spreadsheet, paste it into SSMS, and change the cell references to column names. I built the SQL for this query by starting with this:

        SELECT *
        ,YIELD(TODAY(),B2,C2,D2,100,E2,F2)
        FROM BONDS

    I then changed the cell references to column names:

        SELECT *
        --,YIELD(TODAY(),B2,C2,D2,100,E2,F2)
        ,YIELD(TODAY(),Maturity,Rate,Price,100,Frequency,Basis)
        FROM BONDS

    Finally, I replaced the TODAY() function with GETDATE() and added the schema name to the function name:

        SELECT *
        --,YIELD(TODAY(),B2,C2,D2,100,E2,F2)
        --,YIELD(TODAY(),Maturity,Rate,Price,100,Frequency,Basis)
        ,wct.YIELD(GETDATE(),Maturity,Rate,Price,100,Frequency,Basis)
        FROM BONDS

    I was then able to execute the statement, returning the results seen above. The XLeratorDB libraries are heavy on financial, statistical, and mathematical functions. Where there is an analog to an Excel function, the XLeratorDB function uses the same naming conventions and calling structure as the Excel function, but there are also hundreds of additional functions for SQL Server that are not found in Excel. You can find the functions by opening Object Explorer in SQL Server Management Studio (SSMS) and expanding the Programmability folder under the database where the functions have been installed. The Functions folder expands to show four sub-folders: Table-valued Functions, Scalar-valued Functions, Aggregate Functions, and System Functions. You can expand any of the first three folders to see the XLeratorDB functions. Since wct.YIELD is a scalar function, we open the Scalar-valued Functions folder, scroll down to the wct.YIELD function, and click the plus sign (+) to display the input parameters. The functions are also IntelliSense-enabled, with the input parameters displayed directly in the query tab. The Westclintech website contains documentation for all the functions, including examples that can be copied directly into a query window and executed. There are also more than one hundred articles on the site which go into more detail about how some of the functions work and demonstrate some of the extensive business processes that can be done in SQL Server using XLeratorDB functions and some T-SQL. XLeratorDB is organized into libraries: finance, statistics, math, strings, engineering, and financial options. There is also a windowing library for SQL Server 2005, 2008, and 2012 which provides functions for calculating things like running and moving averages (which were introduced in SQL Server 2012), FIFO inventory calculations, financial ratios, and more, without having to use triangular joins. To get started, you can download the XLeratorDB 15-day free trial from the Westclintech web site. It is a fully-functioning, unrestricted version of the software. If you need more than 15 days to evaluate the software, you can simply download another 15-day free trial. XLeratorDB is an easy and cost-effective way to start adding sophisticated data analysis to your SQL Server database without having to know anything more than T-SQL. Get XLeratorDB Today and Now! Reference: Pinal Dave (http://blog.sqlauthority.com). Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL. Tagged: Excel

    Read the article
