There are many different types of analyses, each with its own pros and cons.

Relational reports have a predefined structure that end users cannot change, which makes them simple to use. Reports can use real-time data, or snapshots of data that show the state of a report at specific points in time. One of the drawbacks is that report authoring is limited to IT pros and advanced users, and any kind of dynamic restructuring is very limited. If a report uses real-time data, it has a negative impact on the performance of the source system. Processing of the reports might also be slow because the data comes from relational database management systems, which are not optimized solely for reporting.

If you create a semantic model of your
data, your end users can create ad-hoc report structures. However, the development is more complex because a developer is needed to create these semantic models.

For OLAP, you typically use specialized database management systems. You get lightning-fast analyses. End users can use rich and thin clients to interactively change the structure of the report, typically graphically. However, the development of an OLAP system is often quite complex. It involves the preparation and maintenance of an enterprise data warehouse and OLAP cubes. To exploit the possibility of real-time restructuring of reports, the users must be both active and educated. The data is usually stale, as it is loaded into data warehouses and OLAP cubes with a scheduled process.

With
data mining, you do not select a structure in advance; the algorithm searches for the structure. As a result, data mining can give you the most valuable results because you can discover patterns you did not expect. A data mining model structure is limited only by the attributes that you use to train the model. One of the drawbacks is that a lot of knowledge is needed for a successful data mining project. End users have to understand the results. Subject matter experts and IT professionals need to understand the business problem thoroughly. The development might sometimes be even more complex than the development of OLAP cubes.

Each type of analysis has its own place in an enterprise system. SQL Server has tools for all kinds of analyses. However,
data mining is the most advanced way of analyzing the data; this is the “I” in BI. In order to get the most out of it, you need to learn quite a lot. In this blog post, I am gathering together resources for learning, including forthcoming events.

Books

Multiple authors: SQL Server MVP Deep Dives – I wrote an introductory
data mining chapter there.
Erik Veerman, Teo Lachev, and Dejan Sarka: MCTS Self-Paced Training Kit (Exam 70-448): Microsoft SQL Server 2008 - Business Intelligence Development and Maintenance – you can find a good overview of a complete BI solution, including data mining, in this book.
Jamie MacLennan, ZhaoHui Tang, and Bogdan Crivat: Data Mining with Microsoft SQL Server 2008 – you can’t miss this book if you want to mine your data with SQL Server tools.
Michael Berry and Gordon Linoff: Mastering Data Mining: The Art and Science of Customer Relationship Management – data mining from both business and technical perspectives.
Dorian Pyle: Data Preparation for Data Mining – an in-depth book about data preparation.
Thomas and Ronald Wonnacott: Introductory Statistics – if you thought that you could get away without statistics, then you are not serious about data mining.
Jiawei Han and Micheline Kamber: Data Mining: Concepts and Techniques – an in-depth explanation of the most popular data mining algorithms.
Michael Berry and Gordon Linoff: Data Mining Techniques – another book that explains data mining algorithms, more from a business perspective.
Paolo Giudici: Applied Data Mining – a very mathematical book, only if you enjoy statistics and mathematics in general.

Forthcoming presentations

I am presenting two
data mining sessions at the PASS Summit in Charlotte, NC:
Wednesday, October 16th, 2013: Fraud Detection: Notes from the Field – I am showing how to use data mining for a specific business problem. The presentation is based on real-life projects.
Friday, October 18th: Excel 2013 Advanced Analytics – I am focusing on the Excel Data Mining Add-ins and how to use them together with Power Pivot and other add-ins. This is the most you can get out of Excel.

Sinergija 2013, Belgrade, Serbia:
Tuesday, October 22nd: Excel 2013 Analytics to the Max – another presentation focusing on the most advanced analytics you can get in Excel.

SQL Rally Amsterdam, Netherlands:
Thursday, November 7th: Advanced Analytics in Excel 2013 – once again, I am presenting data mining in Excel.

Why three different titles for the same presentation? I don’t know; I guess I forgot the name I had proposed each time, right after I sent the proposal.

Courses
Data Mining with SQL Server 2012 – I wrote a 3-day course for SolidQ. If you are interested in this course, which I could also deliver as a shorter seminar, you can contact your closest SolidQ subsidiary or, of course, me directly at [email protected] or [email protected]. This course could also complement the existing courseware portfolios of training providers, who are welcome to contact me as well.

OK, now you know: no more excuses. Start learning data mining and get the most out of your data.