Search Results

Search found 2683 results on 108 pages for 'statistical analysis'.

Page 30/108 | < Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37 | Next Page >

SQL SERVER – Microsoft SQL Server Migration Assistant V6.0 Released

- by Pinal Dave

Every company makes a different decision about the database when they start, but as they move forward they mature and make the decision which is based on their experience and best interest of the organization. Similarly, quite a many organizations make different decisions on database, like Sybase, MySQL, Oracle or Access and as time passes by they learn that now they want to move to a different platform. Microsoft makes it easy for SQL Server professional by releasing various Migration Assistant tools. Last week, Microsoft released Microsoft SQL Server Migration Assistant v6.0. Here are different tools released earlier last week to migrate various product to SQL Server. Microsoft SQL Server Migration Assistant v6.0 for Sybase SQL Server Migration Assistant (SSMA) is a free supported tool from Microsoft that simplifies database migration process from Sybase Adaptive Server Enterprise (ASE) to SQL Server and Azure SQL DB. SSMA automates all aspects of migration including migration assessment analysis, schema and SQL statement conversion, data migration as well as migration testing. Microsoft SQL Server Migration Assistant v6.0 for MySQL SQL Server Migration Assistant (SSMA) is a free supported tool from Microsoft that simplifies database migration process from MySQL to SQL Server and Azure SQL DB. SSMA automates all aspects of migration including migration assessment analysis, schema and SQL statement conversion, data migration as well as migration testing. Microsoft SQL Server Migration Assistant v6.0 for Oracle SQL Server Migration Assistant (SSMA) is a free supported tool from Microsoft that simplifies database migration process from Oracle to SQL Server and Azure SQL DB. SSMA automates all aspects of migration including migration assessment analysis, schema and SQL statement conversion, data migration as well as migration testing. Microsoft SQL Server Migration Assistant v6.0 for Access SQL Server Migration Assistant (SSMA) is a free supported tool from Microsoft that simplifies database migration process from Access to SQL Server. SSMA for Access automates conversion of Microsoft Access database objects to SQL Server database objects, loads the objects into SQL Server and Azure SQL DB, and then migrates data from Microsoft Access to SQL Server and Azure SQL DB. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SQL Migration

Read the article
Take Control of Workflow with Workflow Analyzer!

- by user793553

Take Control of Workflow with Workflow Analyzer! Immediate Analysis and Output of your EBS Workflow Environment The EBS Workflow Analyzer is a script that reviews the current Workflow Footprint, analyzes the configurations, environment, providing feedback, and recommendations on Best Practices and areas of concern. Go to Doc ID 1369938.1 for more details and script download with a short overview video on it. Proactive Benefits: Immediate Analysis and Output of Workflow Environment Identifies Aged Records Identifies Workflow Errors & Volumes Identifies looping Workflow items and stuck activities Identifies Workflow System Setup and configurations Identifies and Recommends Workflow Best Practices Easy To Add Tool for regular Workflow Maintenance Execute Analysis anytime to compare trending from past outputs The Workflow Analyzer presents key details in an easy to review graphical manner. See the examples below. Workflow Runtime Data Table Gauge The Workflow Runtime Data Table Gauge will show critical (red), bad (yellow) and good (green) depending on the number of workflow items (WF_ITEMS). Workflow Error Notifications Pie Chart A pie chart shows the workflow error notification types. Workflow Runtime Table Footprint Bar Chart A pie chart shows the workflow error notification types and a bar chart shows the workflow runtime table footprint. The analyzer also gives detailed listings of setups and configurations. As an example the workflow services are listed along with their status for review: The analyzer draws attention to key details with yellow and red boxes highlighting areas of review: You can extend on any query by reviewing the SQL Script and then running it on your own or making modifications for your own needs: Find more details in these notes: Doc ID 1369938.1 Workflow Analyzer script for E-Business Suite Worklfow Monitoring and Maintenance Doc ID 1425053.1 How to run EBS Workflow Analyzer Tool as a Concurrent Request Or visit the My Oracle Support EBS - Core Workflow Community

Read the article
Linux to Solaris @ Morgan Stanley

- by mgerdts

I came across this blog entry and the accompanying presentation by Robert Milkoski about his experience switching from Linux to Oracle Solaris 11 for a distributed OpenAFS file serving environment at Morgan Stanley. If you are an IT manager, the presentation will show you: Running Solaris with a support contract can cost less than running Linux (even without a support contract) because of technical advantages of Solaris. IT departments can benefit from hiring computer scientists into Systems Programmer or similar roles. Their computer science background should be nurtured so that they can continue to deliver value (savings and opportunity) to the business as technology advances. If you are a sysadmin, developer, or somewhere in between, the presentation will show you: A presentation that explains your technical analysis can be very influential. Learning and using the non-default options of an OS can make all the difference as to whether one OS is better suited than another. For example, see the graphs on slides 3 - 5. The ZFS default is to not use compression. When trying to convince those that hold the purse strings that your technical direction should be taken, the financial impact can be the part that closes the deal. See slides 6, 9, and 10. Sometimes reducing rack space requirements can be the biggest impact because it may stave off or completely eliminate the need for facilities growth. DTrace can be used to shine light on performance problems that may be suspected but not diagnosed. It is quite likely that these problems have existed in OpenAFS for a decade or more. DTrace made diagnosis possible. DTrace can be used to create performance analysis tools without modifying the source of software that is under analysis. See slides 29 - 32. Microstate accounting, visible in the prstat output on slide 37 can be used to quickly draw focus to problem areas that affect CPU saturation. Note that prstat without -m gives a time-decayed moving average that is not nearly as useful. Instruction level probes (slides 33 - 34) are a super-easy way to identify which part of a function is hot.

Read the article
ASP.NET website deployment [on hold]

- by Rei Brazilva

I am getting my hands wet with ASP and I have been following the tutorials. I deployed the site and in Azure and it worked great. Today I started actually designing the site. And when I published, it looks as if it doesn't read any of the files I just updated, added, and modified. It works on my localhost, but not in the Azure. I thought when you publish, everything goes up, including the new files. I don't have enough reputation to add a picture so, you'll forgive me. SO, basically, how do I get my entire site uploaded? In case anyone does stop by, I was able to pull this out just recently: CA0058 Error Running Code Analysis CA0058 : The referenced assembly 'DotNetOpenAuth.AspNet, Version=4.0.0.0, Culture=neutral, PublicKeyToken=2780ccd10d57b246' could not be found. This assembly is required for analysis and was referenced by: C:\Users\lotusms\Desktop\LOTUS MARKETING\ASP.NET\WebsiteManager\WebsiteManager\bin\WebsiteManager.dll, C:\Users\lotusms\Desktop\LOTUS MARKETING\ASP.NET\WebsiteManager\packages\Microsoft.AspNet.WebPages.OAuth.2.0.20710.0\lib\net40\Microsoft.Web.WebPages.OAuth.dll. [Errors and Warnings] (Global) CA0001 Error Running Code Analysis CA0001 : The following error was encountered while reading module 'Microsoft.Web.WebPages.OAuth': Assembly reference cannot be resolved: DotNetOpenAuth.AspNet, Version=4.0.0.0, Culture=neutral, PublicKeyToken=2780ccd10d57b246. [Errors and Warnings] (Global) Could this have something to do with the problem?

Read the article
JUnit Testing in Multithread Application

- by e2bady

This is a problem me and my team faces in almost all of the projects. Testing certain parts of the application with JUnit is not easy and you need to start early and to stick to it, but that's not the question I'm asking. The actual problem is that with n-Threads, locking, possible exceptions within the threads and shared objects the task of testing is not as simple as testing the class, but testing them under endless possible situations within threading. To be more precise, let me tell you about the design of one of our applications: When a user makes a request several threads are started that each analyse a part of the data to complete the analysis, these threads run a certain time depending on the size of the chunk of data (which are endless and of uncertain quality) to analyse, or they may fail if the data was insufficient/lacking quality. After each completed its analysis they call upon a handler which decides after each thread terminates if the collected analysis-data is sufficient to deliver an answer to the request. All of these analysers share certain parts of the applications (some parts because the instances are very big and only a certain number can be loaded into memory and those instances are reusable, some parts because they have a standing connection, where connecting takes time, ex.gr. sql connections) so locking is very common (done with reentrant-locks). While the applications runs very efficient and fast, it's not very easy to test it under real-world conditions. What we do right now is test each class and it's predefined conditions, but there are no automated tests for interlocking and synchronization, which in my opionion is not very good for quality insurances. Given this example how would you handle testing the threading, interlocking and synchronization?

Read the article
C++ Numerical Recipes – A New Adventure!

- by JoshReuben

I am about to embark on a great journey – over the next 6 weeks I plan to read through C++ Numerical Recipes 3rd edition http://amzn.to/YtdpkS I'll be reading this with an eye to C++ AMP, thinking about implementing the suitable subset (non-recursive, additive, commutative) to run on the GPU. APIs supporting HPC, GPGPU or MapReduce are all useful – providing you have the ability to choose the correct algorithm to leverage on them. I really think this is the most fascinating area of programming – a lot more exciting than LOB CRUD !!! When you think about it , everything is a function – we categorize & we extrapolate. As abstractions get higher & less leaky, sooner or later information systems programming will become a non-programmer task – you will be using WYSIWYG designers to build: GUIs MVVM service mapping & virtualization workflows ORM Entity relations In the data source SharePoint / LightSwitch are not there yet, but every iteration gets closer. For information workers, managed code is a race to the bottom. As MS futures are a bit shaky right now, the provider agnostic nature & higher barriers of entry of both C++ & Numerical Analysis seem like a rational choice to me. Its also fascinating – stepping outside the box. This is not the first time I've delved into numerical analysis. 6 months ago I read Numerical methods with Applications, which can be found for free online: http://nm.mathforcollege.com/ 2 years ago I learned the .NET Extreme Optimization library www.extremeoptimization.com – not bad 2.5 years ago I read Schaums Numerical Analysis book http://amzn.to/V5yuLI - not an easy read, as topics jump back & forth across chapters: 3 years ago I read Practical Numerical Methods with C# http://amzn.to/V5yCL9 (which is a toy learning language for this kind of stuff) I also read through AI a Modern Approach 3rd edition END to END http://amzn.to/V5yQSp - this took me a few years but was the most rewarding experience. I'll post progress updates – see you on the other side !

Read the article
HTML Tidy in NetBeans IDE (Part 2)

- by Geertjan

This is what I was aiming for in the previous blog entry: What you can see above (especially if you click to enlarge it) is that I have HTML Tidy integrated into the NetBeans analyzer functionality, which is pluggable from 7.2 onwards. Well, if you set an implementation dependency on "Static Analysis Core", since it's not an official API yet. Also, the scopes of the analyzer functionality are not pluggable. That means you can 'only' set the analyzer's scope to one or more projects, one or more packages, or one or more files. Not one or more folders, which means you can't have a bunch off HTML files in a folder that you access via the Favorites window and then run the analyzer on that folder (or on multiple folders). Thus, to try out my new code, I had to put some HTML files into a package inside a Java application. Then I chose that package as the scope of the analyzer. Then I ran all the analyzers (i.e., standard NetBeans Java hints, FindBugs, as well as my HTML Tidy extension) on that package. The screenshot above is the result. Here's all the code for the above, which is a port of the Action code from the previous blog entry into a new Analyzer implementation: import java.io.IOException; import java.io.PrintWriter; import java.io.StringWriter; import java.util.ArrayList; import java.util.Collections; import java.util.List; import javax.swing.JComponent; import javax.swing.text.Document; import org.netbeans.api.fileinfo.NonRecursiveFolder; import org.netbeans.modules.analysis.spi.Analyzer; import org.netbeans.modules.analysis.spi.Analyzer.AnalyzerFactory; import org.netbeans.modules.analysis.spi.Analyzer.Context; import org.netbeans.modules.analysis.spi.Analyzer.CustomizerProvider; import org.netbeans.modules.analysis.spi.Analyzer.WarningDescription; import org.netbeans.spi.editor.hints.ErrorDescription; import org.netbeans.spi.editor.hints.ErrorDescriptionFactory; import org.netbeans.spi.editor.hints.Severity; import org.openide.cookies.EditorCookie; import org.openide.filesystems.FileObject; import org.openide.loaders.DataObject; import org.openide.util.Exceptions; import org.openide.util.lookup.ServiceProvider; import org.w3c.tidy.Tidy; public class TidyAnalyzer implements Analyzer { private final Context ctx; private TidyAnalyzer(Context cntxt) { this.ctx = cntxt; } @Override public Iterable<? extends ErrorDescription> analyze() { List<ErrorDescription> result = new ArrayList<ErrorDescription>(); for (NonRecursiveFolder sr : ctx.getScope().getFolders()) { FileObject folder = sr.getFolder(); for (FileObject fo : folder.getChildren()) { for (ErrorDescription ed : doRunHTMLTidy(fo)) { if (fo.getMIMEType().equals("text/html")) { result.add(ed); } } } } return result; } private List<ErrorDescription> doRunHTMLTidy(FileObject sr) { final List<ErrorDescription> result = new ArrayList<ErrorDescription>(); Tidy tidy = new Tidy(); StringWriter stringWriter = new StringWriter(); PrintWriter errorWriter = new PrintWriter(stringWriter); tidy.setErrout(errorWriter); try { Document doc = DataObject.find(sr).getLookup().lookup(EditorCookie.class).openDocument(); tidy.parse(sr.getInputStream(), System.out); String[] split = stringWriter.toString().split("\n"); for (String string : split) { //Bit of ugly string parsing coming up: if (string.startsWith("line")) { final int end = string.indexOf(" c"); int lineNumber = Integer.parseInt(string.substring(0, end).replace("line ", "")); string = string.substring(string.indexOf(": ")).replace(":", ""); result.add(ErrorDescriptionFactory.createErrorDescription( Severity.WARNING, string, doc, lineNumber)); } } } catch (IOException ex) { Exceptions.printStackTrace(ex); } return result; } @Override public boolean cancel() { return true; } @ServiceProvider(service = AnalyzerFactory.class) public static final class MyAnalyzerFactory extends AnalyzerFactory { public MyAnalyzerFactory() { super("htmltidy", "HTML Tidy", "org/jtidy/format_misc.gif"); } public Iterable<? extends WarningDescription> getWarnings() { return Collections.EMPTY_LIST; } @Override public <D, C extends JComponent> CustomizerProvider<D, C> getCustomizerProvider() { return null; } @Override public Analyzer createAnalyzer(Context cntxt) { return new TidyAnalyzer(cntxt); } } } The above only works on packages, not on projects and not on individual files.

Read the article
Project Showcase: SaaS Web Apps Hits a Home Run with New SCMS Database

- by Webgui

We love seeing projects from start to finish, and we’re happy to share the latest example with you. Who: SaaS Web Apps – they use Software as a Service to create web applications that look and feel like desktop applications. What: SaaS Web Apps needed to build a Sports Contract Management System (SCMS) for one of its customers, Premier Stinson Sports. Why: The SCMS database is used for collecting, analyzing and recording college coach and athletic directors’ employment and contract data. The Challenge: Premier Stinson Sports works with a number of partners, each with its own needs and unique requirements. For example, USA Today uses the system to provide cutting edge news analysis while The National Sports Law Institute of Marquette University Law School uses it to for the latest sports contract data and student analysis. In addition, the system needed to be secure due to the sensitivity of the data; it was essential that the user security and permissions be easily configurable. As always, performance was a key factor, especially with the intense reporting and analytical capabilities for this project. Because of this, most of the processing had to be done on a dedicated server but the project called for the richness and responsiveness of a desktop application. The Solution: To execute the project, SaaS Web Apps used APS.Net-based Visual WebGui from Gizmox, combined with SQL Server 2008 and SQL Reporting Services. This combination resulted in a quick deployment for SaaS Web Apps’ customers. The Result: The completed project gave each partner the scalability and availability of a web application with the performance and security of a desktop application. As an example, USA Today pulls data from this database to give readers the latest sports stats – Salary analysis of 2010 Football Bowl Subdivision Coaches. And here’s a screenshot of the database itself. Great work, SaaS Web Apps!

Read the article
Personal Project - Next practical language/tech to learn

- by Paul Nathan

I'm working on a personal project doing some finance analysis. It's a totally new field for me, and I'm really having fun with it so far, plus working in the high-level language arena is a great break from my embedded systems daytime work. I have a MySQL backend on a non-local server with a pile of stock data. My task now is to do some analysis of the stocks and produce something approximating a useful result. There are a couple technical difficulties. (1) I have a lot of records. To be precise, I believe I'm near 100K records right now, and this number grows by 6.1K each weekday. I need to create a way to rummage through these fields and do data analysis - based on a given computation, go look at this other set. Fine and dandy, nothing too outre. But this means I could really use a straightforward API for talking to MySQL. (2) Ideally, it runs on OS X 10.4.11. No Windows/Linux machine at home. (3) I can use PHP, C++, Perl, etc. I even have an R installation. I'm pretty flexible with stuff, so long as it runs on OS X. (Lots of options here, pick water, H20, or dihydrogen monoxide ;-) ) (4)Lack of hassle. While I like clever and fun ways of doing things, I'm trying to get some analysis done, not spend ten hours doing installation work and scratching my head figuring out a theoretical syntax question needed to spout out "hello world". What's the question? I'd like to dig into something different than my usual PHP/C++/C toolset. I'm looking for recommendations for languages/technologies that will assist me and meet the above requirements. In particular, I've heard a lot of buzz about F# and Python on SO. I've used CLISP for small problems before, and kinda liked it. I'm seeking opinions about those in particular. edit:since I rent the DB server and have a limited amount of CPU time online, I'm trying to do the analysis on a local machine.

Read the article
SQL SERVER – List of All the Samples Database Available to Download for FREE

- by Pinal Dave

It is pretty much very common to have a sample database for any database product. Different companies keep on improving their product and keep on coming up with innovation in their product. To demonstrate the capability of their new enhancements they need the sample database. Microsoft have various sample database available for free download for their SQL Server Product. I have collected them here in a single blog post. Download an AdventureWorks Database The AdventureWorks OLTP database supports standard online transaction processing scenarios for a fictitious bicycle manufacturer (Adventure Works Cycles). Scenarios include Manufacturing, Sales, Purchasing, Product Management, Contact Management, and Human Resources. Coconut Dal Coconut Dal is a lightweight data access layer, for use in projects where the Entity Framework cannot be used or Microsoft’s Enterprise Library Data Block is unsuitable. Anyone who is handwriting ADO.NET should use a library instead and Coconut Dal might be the answer. DataBooster – Extension to ADO.NET Data Provider The dbParallel DataBooster library is a high-performance extension to ADO.NET Data Provider, includes two aspects: 1) A slimmed down API encapsulation which simplified the most common data access operations (DbConnection -> DbCommand -> DbParameter -> DbDataReader) into a single class DbAccess, to help application with a clean DAL, avoid over-packing and redundant-copy of data transfer. 2) A booster for writing mass data onto database. Base on a rational utilization of database concurrency and a effective utilization of network bandwidth. Tabular AMO 2012 The sample is made of two project parts. The first part is a library of functions to manage tabular models -AMO2Tabular V2-. The second part is a sample to build a tabular model -AdventureWorks Tabular AMO 2012- using the AMO2Tabular library; the created model is similar to the ‘AdventureWorks Tabular Model 2012. SQL Server Analysis Services Product Samples SQL Server Analysis Services provides, a unified and integrated view of all your business data as the foundation for all of your traditional reporting, online analytical processing (OLAP) analysis, Key Performance Indicator (KPI) scorecards, and data mining. Analysis Services Samples for SQL Server 2008 R2 This release is dedicated to the samples that ship for Microsoft SQL Server 2008R2. For many of these samples you will also need to download the AdventureWorks family of databases. SQL Server Reporting Services Product Samples This project contains Reporting Services samples released with Microsoft SQL Server product. These samples are in the following five categories: Application Samples, Extension Samples, Model Samples, Report Samples, and Script Samples. If you are interested in contributing Reporting Services samples, please let us know by posting in the developers’ forum. Reporting Services Samples for SQL Server 2008 R2 This release is dedicated to the samples that ship for Microsoft SQL Server 2008 R2 PCU1. For many of these samples you will also need to download the AdventureWorks family of databases. SQL Server Integration Services Product Samples This project contains Integration Services samples released with Microsoft SQL Server product. These samples are in the following two categories: Package Samples and Programming Samples. If you are interested in contributing Integration Services samples, please let us know by posting in the developers’ forum. Integration Services Samples for SQL Server 2008 R2 This release is dedicated to the samples that ship for Microsoft SQL Server 2008R2. For many of these samples you will also need to download the AdventureWorks family of databases. Windows Azure SQL Reporting Admin Sample The SQLReportingAdmin sample for Windows Azure SQL Reporting demonstrates the usage of SQL Reporting APIs, and manages (add/update/delete) permissions of SQL Reporting users. Windows Azure SQL Reporting ReportViewer-SOAP API usage sample These sample projects demonstrate how to embed a Microsoft ReportViewer control that points to reports hosted on SQL Reporting report servers and how to use SQL Reporting SOAP APIs in your Windows Azure Web application. Enterprise Library 5.0 – Integration Pack for Windows Azure This NuGet package contains a zip file with the source code for the Enterprise Library Integration Pack for Windows Azure. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Sample Database

Read the article
My Take on Hadoop World 2011

- by Jean-Pierre Dijcks

I’m sure some of you have read pieces about Hadoop World and I did see some headlines which were somewhat, shall we say, interesting? I thought the keynote by Larry Feinsmith of JP Morgan Chase & Co was one of the highlights of the conference for me. The reason was very simple, he addressed some real use cases outside of internet and ad platforms. The following are my notes, since the keynote was recorded I presume you can go and look at Hadoopworld.com at some point… On the use cases that were mentioned: ETL – how can I do complex data transformation at scale Doing Basel III liquidity analysis Private banking – transaction filtering to feed [relational] data marts Common Data Platform – a place to keep data that is (or will be) valuable some day, to someone, somewhere 360 Degree view of customers – become pro-active and look at events across lines of business. For example make sure the mortgage folks know about direct deposits being stopped into an account and ensure the bank is pro-active to service the customer Treasury and Security – Global Payment Hub [I think this is really consolidation of data to cross reference activity across business and geographies] Data Mining Bypass data engineering [I interpret this as running a lot of a large data set rather than on samples] Fraud prevention – work on event triggers, say a number of failed log-ins to the website. When they occur grab web logs, firewall logs and rules and start to figure out who is trying to log in. Is this me, who forget his password, or is it someone in some other country trying to guess passwords Trade quality analysis – do a batch analysis or all trades done and run them through an analysis or comparison pipeline One of the key requests – if you can say it like that – was for vendors and entrepreneurs to make sure that new tools work with existing tools. JPMC has a large footprint of BI Tools and Big Data reporting and tools should work with those tools, rather than be separate. Security and Entitlement – how to protect data within a large cluster from unwanted snooping was another topic that came up. I thought his Elephant ears graph was interesting (couldn’t actually read the points on it, but the concept certainly made some sense) and it was interesting – when asked to show hands – how the audience did not (!) think that RDBMS and Hadoop technology would overlap completely within a few years. Another interesting session was the session from Disney discussing how Disney is building a DaaS (Data as a Service) platform and how Hadoop processing capabilities are mixed with Database technologies. I thought this one of the best sessions I have seen in a long time. It discussed real use case, where problems existed, how they were solved and how Disney planned some of it. The planning focused on three things/phases: Determine the Strategy – Design a platform and evangelize this within the organization Focus on the people – Hire key people, grow and train the staff (and do not overload what you have with new things on top of their day-to-day job), leverage a partner with experience Work on Execution of the strategy – Implement the platform Hadoop next to the other technologies and work toward the DaaS platform This kind of fitted with some of the Linked-In comments, best summarized in “Think Platform – Think Hadoop”. In other words [my interpretation], step back and engineer a platform (like DaaS in the Disney example), then layer the rest of the solutions on top of this platform. One general observation, I got the impression that we have knowledge gaps left and right. On the one hand are people looking for more information and details on the Hadoop tools and languages. On the other I got the impression that the capabilities of today’s relational databases are underestimated. Mostly in terms of data volumes and parallel processing capabilities or things like commodity hardware scale-out models. All in all I liked this conference, it was great to chat with a wide range of people on Oracle big data, on big data, on use cases and all sorts of other stuff. Just hope they get a set of bigger rooms next time… and yes, I hope I’m going to be back next year!

Read the article
Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL analytical mash-ups deliver real-time WOW! for big data

- by KLaker

One of the overlooked capabilities of SQL as an analysis engine, because we all just take it for granted, is that you can mix and match analytical features to create some amazing mash-ups. As we move into the exciting world of big data these mash-ups can really deliver those "wow, I never knew that" moments. While Java is an incredibly flexible and powerful framework for managing big data there are some significant challenges in using Java and MapReduce to drive your analysis to create these "wow" discoveries. One of these "wow" moments was demonstrated at this year's OpenWorld during Andy Mendelsohn's general keynote session. Here is the scenario - we are looking for fraudulent activities in our big data stream and in this case we identifying potentially fraudulent activities by looking for specific patterns. We using geospatial tagging of each transaction so we can create a real-time fraud-map for our business users. Where we start to move towards a "wow" moment is to extend this basic use of spatial and pattern matching, as shown in the above dashboard screen, to incorporate spatial analytics within the SQL pattern matching clause. This will allow us to compute the distance between transactions. Apologies for the quality of this screenshot….hopefully below you see where we have extended our SQL pattern matching clause to use location of each transaction and to calculate the distance between each transaction: This allows us to compare the time of the last transaction with the time of the current transaction and see if the distance between the two points is possible given the time frame. Obviously if I buy something in Florida from my favourite bike store (may be a new carbon saddle for my Trek) and then 5 minutes later the system sees my credit card details being used in Arizona there is high probability that this transaction in Arizona is actually fraudulent (I am fast on my Trek but not that fast!) and we can flag this up in real-time on our dashboard: In this post I have used the term "real-time" a couple of times and this is an important point and one of the key reasons why SQL really is the only language to use if you want to analyse big data. One of the most important questions that comes up in every big data project is: how do we do analysis? Many enlightened customers are now realising that using Java-MapReduce to deliver analysis does not result in "wow" moments. These "wow" moments only come with SQL because it is offers a much richer environment, it is simpler to use and it is faster - which makes it possible to deliver real-time "Wow!". Below is a slide from Andy's session showing the results of a comparison of Java-MapReduce vs. SQL pattern matching to deliver our "wow" moment during our live demo. You can watch our analytical mash-up "Wow" demo that compares the power of 12c SQL pattern matching + spatial analytics vs. Java-MapReduce here: You can get more information about SQL Pattern Matching on our SQL Analytics home page on OTN, see here http://www.oracle.com/technetwork/database/bi-datawarehousing/sql-analytics-index-1984365.html. You can get more information about our spatial analytics here: http://www.oracle.com/technetwork/database-options/spatialandgraph/overview/index.html If you would like to watch the full Database 12c OOW presentation see here: http://medianetwork.oracle.com/video/player/2686974264001

Read the article
Exalytics and Oracle Business Intelligence Enterprise Edition (OBIEE) Partner Workshop

- by mseika

Workshop Description Oracle Fusion Middleware 11g is the #1 application infrastructure foundation. It enables enterprises to create and run agile and intelligent business applications and maximize IT efficiency by exploiting modern hardware and software architectures. Oracle Exalytics Business Intelligence Machine is the world’s first engineered system specifically designed to deliver high performance analysis, modeling and planning. Built using industry-standard hardware, market-leading business intelligence software and in-memory database technology, Oracle Exalytics is an optimized system that delivers unmatched speed, visualizations and scalability for Business Intelligence and Enterprise Performance Management applications. This FREE hands-on, partner workshop highlights both the hardware and software components that are engineered to work together to deliver Oracle Exalytics - an optimized version of the industry-leading Oracle TimesTen In-Memory Database with analytic extensions, a highly scalable Oracle server designed specifically for in-memory business intelligence, and Oracle’s proven Business Intelligence Foundation with enhanced visualization capabilities and performance optimizations. This workshop will provide hands-on experience with Oracle's latest engineered system. Topics covered will include TimesTen In-Memory Database and the new Summary Advisor for Exalytics, the technical details (including mobile features) of the latest release of visualization enhancements for OBI-EE, and technical updates on Essbase. After taking this course, you will be well prepared to architect, build, demo, and implement an end-to-end Exalytics solution. You will also be able to extend your current analytical and enterprise performance management application implementations with numerous Oracle technologies specifically enhanced to take advantage of the compute capacity and in-memory capabilities of Oracle Exalytics.If you are a BI or Data Warehouse Architect, developer or consultant, you don’t want to miss this 3-day workshop. Register Now! Presentations Exalytics Architectural Overview Upgrade and Lifecycle Management Times Ten for Exalytics Summary Advisor Utility Essbase and EPM System on Exalytics Dashboard and Analysis Interactions OBIEE 11.1.1.6 Features and Advanced Topics Lab OutlineThe labs showcase Oracle Exalytics core components and functionality and provide expertise of Oracle Business Intelligence 11.1.1.6 new features and updates from prior releases. The hands-on activities are based on an Oracle VirtualBox image with software and training samples pre-installed. Lab Environment Setup Creating and Working with Oracle TimesTen In-Memory Database Running Summary Advisor Utility Working with Exalytics Visualization Features – Dashboard and Analysis Interactions Audience Oracle Partners BI and EPM Application Developers and Implementers System Integrators and Solution Consultants Data Warehouse Developers Enterprise Architects Prerequisites Experience and understanding of OBIEE 11g is required Previous attendance of Oracle Business Intelligence Foundation Suite Workshop or BIEE 11gIntroduction Workshop is highly recommended Good understanding of data warehousing and data modeling for reporting and analysis purpose Strong experience with database technologies preferred Equipment RequirementsThis workshop requires attendees to provide their own laptops for this class.Attendee laptops must meet the following minimum hardware/software requirements: Hardware Minimum 8GB RAM 60 GB free space (includes staging) USB 2.0 port (at least one available) It is strongly recommended that you bring a mouse. You will be working in a development environment and using the mouse heavily. Software One of the following operating systems: 64-bit Windows host/laptop OS 64-bit host/laptop OS with a Windows VM (XP, Server, or Win 7, BIC2g, etc.) Internet Explorer 7.x/8.x or Firefox 3.5.x WINRAR or 7ziputility to unzip workshop files: Download-able from http://www.win-rar.com/download.html Download-able from http://www.7zip.com/ Oracle VirtualBox 4.0.2 or higher Downloadable from http://www.virtualbox.org/wiki/Downloads CPU virtualization mode needs to be enabled. We will provide guidance on the day of the workshop. Attendees will be given a VirtualBox image containing a pre-installed Oracle Exalytics environment. Schedule This workshop is 3 days. - Times vary by country!9:00am: Sign-in and technical setup 9:30am: Workshop starts 5:00pm: Workshop ends Oracle Exalytics and Business Intelligence (OBIEE) Workshop December 11-13, 2012: Oracle BVP, Birmingham, UK Register Here. Questions? Send email to: [email protected] Oracle Platform Technologies Enablement Services

Read the article
Oracle Spatial and Graph – A year in review

- by Mandy Ho

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} What a great year for Oracle Spatial! Or shall I now say, Oracle Spatial and Graph, with our official name change this summer. There were so many exciting events and updates we had this year, and this blog will review and link to some of the events you may have missed over the year. We kicked off 2012 with our webinar: Situational Analysis at OnStar with Oracle Spatial and Graph. We collaborated with OnStar’s Emergency Strategy and Outreach expert, Jeff Joyner ,on how Onstar uses Google Earth Visualization, NAVTEQ data and Oracle Database to deliver fast, accurate emergency services to its customers. In the next webinar in our 2012 series, Oracle partner TARGUSinfo showcased how to build a robust, scalable and secure customer relationship management systems – with built-in mapping and spatial analysis, and deployed in the cloud. This is a very cool system using all Oracle technologies including Oracle Database and Fusion Middleware MapViewer. Attendees learned how to gather market insight, score prospects and customers and perform location analysis. The replay is available here. Our final webinar of the year focused on using Oracle Business Intelligence tools, along with Oracle Spatial and Graph to perform location-aware predictive analysis. Watch the webcast here: In June, we joined up with the Location Intelligence conference in Washington, DC, and had a very successful 2012 Oracle Spatial User Conference. Customers and partners from the US, as well as from EMEA and Asia, flew in to share experiences and ideas, and get technical updates from Oracle experts. Users were excited to hear about spatial-Exadata performance, and advances in MapViewer and BI. Peter Doolan of Oracle Public Sector kicked off the event with a great keynote, and US Census, NOAA, and Ordnance Survey Great Britain were just a few of the presenters. Presentation archive here. We recognized some of the most exceptional partners and customers for their contributions to advancing mainstream solutions using geospatial technologies. Planning for 2013’s conference has already started. Please contribute your papers for consideration here. http://www.locationintelligence.net/ We also launched a new Oracle PartnerNetwork Spatial Specialization – to enable partners to get validated in the marketplace for their expertise in taking solutions to market. Individuals can also get individual certifications. Learn more here. Oracle Open World was not to disappoint, with news regarding our next Oracle Spatial and Graph release, as well as the announcement of our new Oracle Spatial and Graph SIG board! Join the SIG today. One more exciting event as we look to 2013. Spatial and location technologies have a dedicated track at the January BIWA SIG Summit – on January 9-10 in Redwood Shores, CA. View the agenda and register here: www.biwasummit.org. We thank you for all your support during the year of 2012 and look towards an even more exciting 2013! Wishing you and your family a prosperous New Year and Happy Holidays!

Read the article
Form, function and complexity in rule processing

- by Charles Young

Tim Bass posted on ‘Orwellian Event Processing’. I was involved in a heated exchange in the comments, and he has more recently published a post entitled ‘Disadvantages of Rule-Based Systems (Part 1)’. Whatever the rights and wrongs of our exchange, it clearly failed to generate any agreement or understanding of our different positions. I don't particularly want to promote further argument of that kind, but I do want to take the opportunity of offering a different perspective on rule-processing and an explanation of my comments. For me, the ‘red rag’ lay in Tim’s claim that “...rules alone are highly inefficient for most classes of (not simple) problems” and a later paragraph that appears to equate the simplicity of form (‘IF-THEN-ELSE’) with simplicity of function. It is not the first time Tim has expressed these views and not the first time I have responded to his assertions. Indeed, Tim has a long history of commenting on the subject of complex event processing (CEP) and, less often, rule processing in ‘robust’ terms, often asserting that very many other people’s opinions on this subject are mistaken. In turn, I am of the opinion that, certainly in terms of rule processing, which is an area in which I have a specific interest and knowledge, he is often mistaken. There is no simple answer to the fundamental question ‘what is a rule?’ We use the word in a very fluid fashion in English. Likewise, the term ‘rule processing’, as used widely in IT, is equally difficult to define simplistically. The best way to envisage the term is as a ‘centre of gravity’ within a wider domain. That domain contains many other ‘centres of gravity’, including CEP, statistical analytics, neural networks, natural language processing and so much more. Whole communities tend to gravitate towards and build themselves around some of these centres. The term 'rule processing' is associated with many different technology types, various software products, different architectural patterns, the functional capability of many applications and services, etc. There is considerable variation amongst these different technologies, techniques and products. Very broadly, a common theme is their ability to manage certain types of processing and problem solving through declarative, or semi-declarative, statements of propositional logic bound to action-based consequences. It is generally important to be able to decouple these statements from other parts of an overall system or architecture so that they can be managed and deployed independently. As a centre of gravity, ‘rule processing’ is no island. It exists in the context of a domain of discourse that is, itself, highly interconnected and continuous. Rule processing does not, for example, exist in splendid isolation to natural language processing. On the contrary, an on-going theme of rule processing is to find better ways to express rules in natural language and map these to executable forms. Rule processing does not exist in splendid isolation to CEP. On the contrary, an event processing agent can reasonably be considered as a rule engine (a theme in ‘Power of Events’ by David Luckham). Rule processing does not live in splendid isolation to statistical approaches such as Bayesian analytics. On the contrary, rule processing and statistical analytics are highly synergistic. Rule processing does not even live in splendid isolation to neural networks. For example, significant research has centred on finding ways to translate trained nets into explicit rule sets in order to support forms of validation and facilitate insight into the knowledge stored in those nets. What about simplicity of form? Many rule processing technologies do indeed use a very simple form (‘If...Then’, ‘When...Do’, etc.) However, it is a fundamental mistake to equate simplicity of form with simplicity of function. It is absolutely mistaken to suggest that simplicity of form is a barrier to the efficient handling of complexity. There are countless real-world examples which serve to disprove that notion. Indeed, simplicity of form is often the key to handling complexity. Does rule processing offer a ‘one size fits all’. No, of course not. No serious commentator suggests it does. Does the design and management of large knowledge bases, expressed as rules, become difficult? Yes, it can do, but that is true of any large knowledge base, regardless of the form in which knowledge is expressed. The measure of complexity is not a function of rule set size or rule form. It tends to be correlated more strongly with the size of the ‘problem space’ (‘search space’) which is something quite different. Analysis of the problem space and the algorithms we use to search through that space are, of course, the very things we use to derive objective measures of the complexity of a given problem. This is basic computer science and common practice. Sailing a Dreadnaught through the sea of information technology and lobbing shells at some of the islands we encounter along the way does no one any good. Building bridges and causeways between islands so that the inhabitants can collaborate in open discourse offers hope of real progress.

Read the article
SQL Server Express 2008 R2 Installation error at Windows 7

- by Shai Sherman

Hello, I created install script that will install SQL Server 2008 R2 on windows XP SP3, windows vista and windows 7. One of the command that i used in the installation is for silent installation of SQL Server 2008 R2. When i install it on windows XP everything works just fine but when i try to install it on Windows 7 i get an error. What am I doing wrong? Here is the command line that i use: "Setup.exe /ConfigurationFile=Mysetup.ini" Mysetup.ini file: -------------------------------------Start of ini file --------------------------------- ;SQL SERVER 2008 R2 Configuration File ;Version 1.0, 5 May 2010 ; [SQLSERVER2008] ; Specify the Instance ID for the SQL Server features you have specified. SQL Server directory structure, registry structure, and service names will reflect the instance ID of the SQL Server instance. INSTANCEID="MSSQLSERVER" ; Specifies a Setup work flow, like INSTALL, UNINSTALL, or UPGRADE. This is a required parameter. ACTION="Install" ; Specifies features to install, uninstall, or upgrade. The list of top-level features include SQL, AS, RS, IS, and Tools. The SQL feature will install the database engine, replication, and full-text. The Tools feature will install Management Tools, Books online, Business Intelligence Development Studio, and other shared components. FEATURES=SQLENGINE ; Displays the command line parameters usage HELP="False" ; Specifies that the detailed Setup log should be piped to the console. INDICATEPROGRESS="False" ; Setup will not display any user interface. QUIET="False" ; Setup will display progress only without any user interaction. QUIETSIMPLE="True" ; Specifies that Setup should install into WOW64. This command line argument is not supported on an IA64 or a 32-bit system. ;X86="False" ; Specifies the path to the installation media folder where setup.exe is located. ;MEDIASOURCE="z:\" ; Detailed help for command line argument ENU has not been defined yet. ENU="True" ; Parameter that controls the user interface behavior. Valid values are Normal for the full UI, and AutoAdvance for a simplied UI. ; UIMODE="Normal" ; Specify if errors can be reported to Microsoft to improve future SQL Server releases. Specify 1 or True to enable and 0 or False to disable this feature. ERRORREPORTING="False" ; Specify the root installation directory for native shared components. ;INSTALLSHAREDDIR="D:\Program Files\Microsoft SQL Server" ; Specify the root installation directory for the WOW64 shared components. ;INSTALLSHAREDWOWDIR="D:\Program Files (x86)\Microsoft SQL Server" ; Specify the installation directory. ;INSTANCEDIR="D:\Program Files\Microsoft SQL Server" ; Specify that SQL Server feature usage data can be collected and sent to Microsoft. Specify 1 or True to enable and 0 or False to disable this feature. SQMREPORTING="False" ; Specify a default or named instance. MSSQLSERVER is the default instance for non-Express editions and SQLExpress for Express editions. This parameter is required when installing the SQL Server Database Engine (SQL), Analysis Services (AS), or Reporting Services (RS). INSTANCENAME="SQLEXPRESS" SECURITYMODE=SQL SAPWD=SystemAdmin ; Agent account name AGTSVCACCOUNT="NT AUTHORITY\NETWORK SERVICE" ; Auto-start service after installation. AGTSVCSTARTUPTYPE="Manual" ; Startup type for Integration Services. ;ISSVCSTARTUPTYPE="Automatic" ; Account for Integration Services: Domain\User or system account. ;ISSVCACCOUNT="NT AUTHORITY\NetworkService" ; Controls the service startup type setting after the service has been created. ;ASSVCSTARTUPTYPE="Automatic" ; The collation to be used by Analysis Services. ;ASCOLLATION="Latin1_General_CI_AS" ; The location for the Analysis Services data files. ;ASDATADIR="Data" ; The location for the Analysis Services log files. ;ASLOGDIR="Log" ; The location for the Analysis Services backup files. ;ASBACKUPDIR="Backup" ; The location for the Analysis Services temporary files. ;ASTEMPDIR="Temp" ; The location for the Analysis Services configuration files. ;ASCONFIGDIR="Config" ; Specifies whether or not the MSOLAP provider is allowed to run in process. ;ASPROVIDERMSOLAP="1" ; A port number used to connect to the SharePoint Central Administration web application. ;FARMADMINPORT="0" ; Startup type for the SQL Server service. SQLSVCSTARTUPTYPE="Automatic" ; Level to enable FILESTREAM feature at (0, 1, 2 or 3). FILESTREAMLEVEL="0" ; Set to "1" to enable RANU for SQL Server Express. ENABLERANU="1" ; Specifies a Windows collation or an SQL collation to use for the Database Engine. SQLCOLLATION="SQL_Latin1_General_CP1_CI_AS" ; Account for SQL Server service: Domain\User or system account. SQLSVCACCOUNT="NT Authority\System" ; Default directory for the Database Engine user databases. ;SQLUSERDBDIR="K:\Microsoft SQL Server\MSSQL\Data" ; Default directory for the Database Engine user database logs. ;SQLUSERDBLOGDIR="L:\Microsoft SQL Server\MSSQL\Data\Logs" ; Directory for Database Engine TempDB files. ;SQLTEMPDBDIR="T:\Microsoft SQL Server\MSSQL\Data" ; Directory for the Database Engine TempDB log files. ;SQLTEMPDBLOGDIR="T:\Microsoft SQL Server\MSSQL\Data\Logs" ; Provision current user as a Database Engine system administrator for SQL Server 2008 R2 Express. ADDCURRENTUSERASSQLADMIN="True" ; Specify 0 to disable or 1 to enable the TCP/IP protocol. TCPENABLED="1" ; Specify 0 to disable or 1 to enable the Named Pipes protocol. NPENABLED="0" ; Startup type for Browser Service. BROWSERSVCSTARTUPTYPE="Automatic" ; Specifies how the startup mode of the report server NT service. When ; Manual - Service startup is manual mode (default). ; Automatic - Service startup is automatic mode. ; Disabled - Service is disabled ;RSSVCSTARTUPTYPE="Automatic" ; Specifies which mode report server is installed in. ; Default value: “FilesOnly” ;RSINSTALLMODE="FilesOnlyMode" ; Accept SQL Server 2008 R2 license terms IACCEPTSQLSERVERLICENSETERMS="TRUE" ;setup.exe /CONFIGURATIONFILE=Mysetup.ini /INDICATEPROGRESS --------------------------- End of ini file ------------------------------------- And i get this error: 2010-08-31 18:05:53 Slp: Error result: -2068119551 2010-08-31 18:05:53 Slp: Result facility code: 1211 2010-08-31 18:05:53 Slp: Result error code: 1 2010-08-31 18:05:53 Slp: Sco: Attempting to create base registry key HKEY_LOCAL_MACHINE, machine 2010-08-31 18:05:53 Slp: Sco: Attempting to open registry subkey 2010-08-31 18:05:53 Slp: Sco: Attempting to open registry subkey Software\Microsoft\PCHealth\ErrorReporting\DW\Installed 2010-08-31 18:05:53 Slp: Sco: Attempting to get registry value DW0200 2010-08-31 18:05:53 Slp: Submitted 1 of 1 failures to the Watson data repository What the meaning of this? What do i need to do to fix that problem? Here is the Summary file: Overall summary: Final result: SQL Server installation failed. To continue, investigate the reason for the failure, correct the problem, uninstall SQL Server, and then rerun SQL Server Setup. Exit code (Decimal): -2068119551 Exit facility code: 1211 Exit error code: 1 Exit message: SQL Server installation failed. To continue, investigate the reason for the failure, correct the problem, uninstall SQL Server, and then rerun SQL Server Setup. Start time: 2010-08-31 18:03:44 End time: 2010-08-31 18:05:51 Requested action: Install Log with failure: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100831_180236\Detail.txt Exception help link: http%3a%2f%2fgo.microsoft.com%2ffwlink%3fLinkId%3d20476%26ProdName%3dMicrosoft%2bSQL%2bServer%26EvtSrc%3dsetup.rll%26EvtID%3d50000%26ProdVer%3d10.50.1600.1%26EvtType%3d0x6121810A%400xC24842DB Machine Properties: Machine name: NVR Machine processor count: 2 OS version: Windows 7 OS service pack: OS region: United States OS language: English (United States) OS architecture: x86 Process architecture: 32 Bit OS clustered: No Product features discovered: Product Instance Instance ID Feature Language Edition Version Clustered Package properties: Description: SQL Server Database Services 2008 R2 ProductName: SQL Server 2008 R2 Type: RTM Version: 10 SPLevel: 0 Installation location: C:\Disk1\setupsql\x86\setup\ Installation edition: EXPRESS User Input Settings: ACTION: Install ADDCURRENTUSERASSQLADMIN: True AGTSVCACCOUNT: NT AUTHORITY\NETWORK SERVICE AGTSVCPASSWORD: * AGTSVCSTARTUPTYPE: Disabled ASBACKUPDIR: Backup ASCOLLATION: Latin1_General_CI_AS ASCONFIGDIR: Config ASDATADIR: Data ASDOMAINGROUP: ASLOGDIR: Log ASPROVIDERMSOLAP: 1 ASSVCACCOUNT: ASSVCPASSWORD: * ASSVCSTARTUPTYPE: Automatic ASSYSADMINACCOUNTS: ASTEMPDIR: Temp BROWSERSVCSTARTUPTYPE: Automatic CONFIGURATIONFILE: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100831_180236\ConfigurationFile.ini CUSOURCE: ENABLERANU: True ENU: True ERRORREPORTING: False FARMACCOUNT: FARMADMINPORT: 0 FARMPASSWORD: * FEATURES: SQLENGINE FILESTREAMLEVEL: 0 FILESTREAMSHARENAME: FTSVCACCOUNT: FTSVCPASSWORD: * HELP: False IACCEPTSQLSERVERLICENSETERMS: True INDICATEPROGRESS: False INSTALLSHAREDDIR: C:\Program Files\Microsoft SQL Server\ INSTALLSHAREDWOWDIR: C:\Program Files\Microsoft SQL Server\ INSTALLSQLDATADIR: INSTANCEDIR: C:\Program Files\Microsoft SQL Server\ INSTANCEID: MSSQLSERVER INSTANCENAME: SQLEXPRESS ISSVCACCOUNT: NT AUTHORITY\NetworkService ISSVCPASSWORD: * ISSVCSTARTUPTYPE: Automatic NPENABLED: 0 PASSPHRASE: * PCUSOURCE: PID: * QUIET: False QUIETSIMPLE: True ROLE: AllFeatures_WithDefaults RSINSTALLMODE: FilesOnlyMode RSSVCACCOUNT: NT AUTHORITY\NETWORK SERVICE RSSVCPASSWORD: * RSSVCSTARTUPTYPE: Automatic SAPWD: * SECURITYMODE: SQL SQLBACKUPDIR: SQLCOLLATION: SQL_Latin1_General_CP1_CI_AS SQLSVCACCOUNT: NT Authority\System SQLSVCPASSWORD: * SQLSVCSTARTUPTYPE: Automatic SQLSYSADMINACCOUNTS: SQLTEMPDBDIR: SQLTEMPDBLOGDIR: SQLUSERDBDIR: SQLUSERDBLOGDIR: SQMREPORTING: False TCPENABLED: 1 UIMODE: AutoAdvance X86: False Configuration file: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100831_180236\ConfigurationFile.ini Detailed results: Feature: Database Engine Services Status: Failed: see logs for details MSI status: Passed Configuration status: Failed: see details below Configuration error code: 0x0A2FBD17@1211@1 Configuration error description: The process cannot access the file because it is being used by another process. Configuration log: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100831_180236\Detail.txt Rules with failures: Global rules: Scenario specific rules: Rules report file: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100831_180236\SystemConfigurationCheck_Report.htm What should I do and why does this problem occur? Thanks , Shai.

Read the article
Testing Workflows – Test-After

- by Timothy Klenke

Originally posted on: http://geekswithblogs.net/TimothyK/archive/2014/05/30/testing-workflows-ndash-test-after.aspxIn this post I’m going to outline a few common methods that can be used to increase the coverage of of your test suite. This won’t be yet another post on why you should be doing testing; there are plenty of those types of posts already out there. Assuming you know you should be testing, then comes the problem of how do I actual fit that into my day job. When the opportunity to automate testing comes do you take it, or do you even recognize it? There are a lot of ways (workflows) to go about creating automated tests, just like there are many workflows to writing a program. When writing a program you can do it from a top-down approach where you write the main skeleton of the algorithm and call out to dummy stub functions, or a bottom-up approach where the low level functionality is fully implement before it is quickly wired together at the end. Both approaches are perfectly valid under certain contexts. Each approach you are skilled at applying is another tool in your tool belt. The more vectors of attack you have on a problem – the better. So here is a short, incomplete list of some of the workflows that can be applied to increasing the amount of automation in your testing and level of quality in general. Think of each workflow as an opportunity that is available for you to take. Test workflows basically fall into 2 categories: test first or test after. Test first is the best approach. However, this post isn’t about the one and only best approach. I want to focus more on the lesser known, less ideal approaches that still provide an opportunity for adding tests. In this post I’ll enumerate some test-after workflows. In my next post I’ll cover test-first. Bug Reporting When someone calls you up or forwards you a email with a vague description of a bug its usually standard procedure to create or verify a reproduction plan for the bug via manual testing and log that in a bug tracking system. This can be problematic. Often reproduction plans when written down might skip a step that seemed obvious to the tester at the time or they might be missing some crucial environment setting. Instead of data entry into a bug tracking system, try opening up the test project and adding a failing unit test to prove the bug. The test project guarantees that all aspects of the environment are setup properly and no steps are missing. The language in the test project is much more precise than the English that goes into a bug tracking system. This workflow can easily be extended for Enhancement Requests as well as Bug Reporting. Exploratory Testing Exploratory testing comes in when you aren’t sure how the system will behave in a new scenario. The scenario wasn’t planned for in the initial system requirements and there isn’t an existing test for it. By definition the system behaviour is “undefined”. So write a new unit test to define that behaviour. Add assertions to the tests to confirm your assumptions. The new test becomes part of the living system specification that is kept up to date with the test suite. Examples This workflow is especially good when developing APIs. When you are finally done your production API then comes the job of writing documentation on how to consume the API. Good documentation will also include code examples. Don’t let these code examples merely exist in some accompanying manual; implement them in a test suite. Example tests and documentation do not have to be created after the production API is complete. It is best to write the example code (tests) as you go just before the production code. Smoke Tests Every system has a typical use case. This represents the basic, core functionality of the system. If this fails after an upgrade the end users will be hosed and they will be scratching their heads as to how it could be possible that an update got released with this core functionality broken. The tests for this core functionality are referred to as “smoke tests”. It is a good idea to have them automated and run with each build in order to avoid extreme embarrassment and angry customers. Coverage Analysis Code coverage analysis is a tool that reports how much of the production code base is exercised by the test suite. In Visual Studio this can be found under the Test main menu item. The tool will report a total number for the code coverage, which can be anywhere between 0 and 100%. Coverage Analysis shouldn’t be used strictly for numbers reporting. Companies shouldn’t set minimum coverage targets that mandate that all projects must have at least 80% or 100% test coverage. These arbitrary requirements just invite gaming of the coverage analysis, which makes the numbers useless. The analysis tool will break down the coverage by the various classes and methods in projects. Instead of focusing on the total number, drill down into this view and see which classes have high or low coverage. It you are surprised by a low number on a class this is an opportunity to add tests. When drilling through the classes there will be generally two types of reaction to a surprising low test coverage number. The first reaction type is a recognition that there is low hanging fruit to be picked. There may be some classes or methods that aren’t being tested, which could easy be. The other reaction type is “OMG”. This were you find a critical piece of code that isn’t under test. In both cases, go and add the missing tests. Test Refactoring The general theme of this post up to this point has been how to add more and more tests to a test suite. I’ll step back from that a bit and remind that every line of code is a liability. Each line of code has to be read and maintained, which costs money. This is true regardless whether the code is production code or test code. Remember that the primary goal of the test suite is that it be easy to read so that people can easily determine the specifications of the system. Make sure that adding more and more tests doesn’t interfere with this primary goal. Perform code reviews on the test suite as often as on production code. Hold the test code up to the same high readability standards as the production code. If the tests are hard to read then change them. Look to remove duplication. Duplicate setup code between two or more test methods that can be moved to a shared function. Entire test methods can be removed if it is found that the scenario it tests is covered by other tests. Its OK to delete a test that isn’t pulling its own weight anymore. Remember to only start refactoring when all the test are green. Don’t refactor the tests and the production code at the same time. An automated test suite can be thought of as a double entry book keeping system. The unchanging, passing production code serves as the tests for the test suite while refactoring the tests. As with all refactoring, it is best to fit this into your regular work rather than asking for time later to get it done. Fit this into the standard red-green-refactor cycle. The refactor step no only applies to production code but also the tests, but not at the same time. Perhaps the cycle should be called red-green-refactor production-refactor tests (not quite as catchy). That about covers most of the test-after workflows I can think of. In my next post I’ll get into test-first workflows.

Read the article
Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

- by Tarun Arora

Welcome back once again, in Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies, in Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. In part 3 I’ll be discussing Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, Asp.net Profiler and some closing thoughts. Test Results – I see some creepy worms! In Part 2 we put together a web performance test and a load test, lets run the test to see load test to see how the Web site responds to the load simulation. While the load test is running you will be able to see close to real time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways: Monitor a running load test - A condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples. After the load test run is completed - The test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view, this provides key results in a format that is compact and easy to read. You can also print the load test summary, this is generated after the test has completed or been stopped. Analyse the load test results of a previously run load test – We’ll see this in the section where i discuss comparison between two test runs. The performance counters can be plotted on the graphs. You also have the option to highlight a selected part of the test and view details, drill down to the user activity chart where you can hover over to see more details of the test run. Generate Report => Test Run Comparisons The level of reports you can generate using the Load Test Analyser is astonishing. You have the option to create excel reports and conduct side by side analysis of two test results or to track trend analysis. The tools also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK and open the IntelliTrace summary for the test agent that was used in the load test. Compare results This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation, I would highly recommend exploring the wealth of knowledge available on MSDN. Leaving Thoughts While load testing the application with an excessive load for a longer duration of time, i managed to bring the IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that the IIS had run out of threads as all the threads were busy processing existing request, one easy way of fixing this is by increasing the default number of allocated threads, but this might escalate the problem. The better suggestion is to try and drill down to the actual root cause of the problem. When ever the garbage collection runs it stops processing any pages so all requests that come in during that period are queued up, but realistically the garbage collection completes in fraction of a a second. To understand this better lets look at the .net heap, it is divided into large heap and small heap, anything greater than 85kB in size will be allocated to the Large object heap, the Large object heap is non compacting and remember large objects are expensive to move around, so if you are allocating something in the large object heap, make sure that you really need it! The small object heap on the other hand is divided into generations, so all objects that are supposed to be short-lived are suppose to live in Gen-0 and the long living objects eventually move to Gen-2 as garbage collection goes through. As you can see in the picture below all < 85 KB size objects are first assigned to Gen-0, when Gen-0 fills up and a new object comes in and finds Gen-0 full, the garbage collection process is started, the process checks for all the dead objects and assigns them as the valid candidate for deletion to free up memory and promotes all the remaining objects in Gen-0 to Gen-1. So in the future when ever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen – 0 again, all of Gen – 1 dead objects are drenched and rest are moved to Gen-2 and Gen-0 objects are moved to Gen-1 to free up Gen-0, but by this time your Garbage collection process has started to take much more time than it usually takes. Now as I mentioned earlier when garbage collection is being run all page requests that come in during that period are queued up. Does this explain why possibly page requests are getting queued up, apart from this it could also be the case that you are waiting for a long running database process to complete. Lets explore the heap a bit more… What is really a case of crisis is when the objects are living long enough to make it to Gen-2 and then dying, this is definitely a high cost operation. But sometimes you need objects in memory, for example when you cache data you hold on to the objects because you need to use them right across the user session, which is acceptable. But if you wanted to see what extreme caching can do to your server then write a simple application that chucks in a lot of data in cache, run a load test over it for about 10-15 minutes, forcing a lot of data in memory causing the heap to run out of memory. If you get to such a state where you start running out of memory the IIS as a mode of recovery restarts the worker process. It is great way to free up all your memory in the heap but this would clear the cache. The problem with this is if the customer had 10 items in their shopping basket and that data was stored in the application cache, the user basket will now be empty forcing them either to get frustrated and go to a competitor website or if the customer is really patient, give it another try! How can you address this, well two ways of addressing this; 1. Workaround – A x86 bit processor only allows a maximum of 4GB of RAM, this means the machine effectively has around 3.4 GB of RAM available, the OS needs about 1.5 GB of RAM to run efficiently, the IIS and .net framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because Team builds by default build your application in ‘Compile as any mode’ it means the application is build such that it will run in x86 bit mode if run on a x86 bit processor and run in a x64 bit mode if run on a x64 but processor. The problem with this is not all applications are really x64 bit compatible specially if you are using com objects or external libraries. So, as a quick win if you compiled your application in x86 bit mode by changing the compile as any selection to compile as x86 in the team build, you will be able to run your application on a x64 bit machine in x86 bit mode (WOW – By running Windows on Windows) and what that means is, you could use 8GB+ worth of RAM, if you take away everything else your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB you have either build a software for NASA or there is something fundamentally wrong in your application. 2. Solution – Now that you have put a workaround in place the IIS will not restart the worker process that regularly, which means you can take a breather and start working to get to the root cause of this memory leak. But this begs a question “How do I Identify possible memory leaks in my application?” Well i won’t say that there is one single tool that can tell you where the memory leak is, but trust me, ‘Performance Profiling’ is a great start point, it definitely gets you started in the right direction, let’s have a look at how. Performance Wizard - Start the Performance Wizard and select Instrumentation, this lets you measure function call counts and timings. Before running the performance session right click the performance session settings and chose properties from the context menu to bring up the Performance session properties page and as shown in the screen shot below, check the check boxes in the group ‘.NET memory profiling collection’ namely ‘Collect .NET object allocation information’ and ‘Also collect the .NET Object lifetime information’. Now if you fire off the profiling session on your pages you will notice that the results allows you to view ‘Object Lifetime’ which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, Large heap, etc. Another great feature about the profile is that if your application has > 5% cases where objects die right after making to the Gen-2 storage a threshold alert is generated to alert you. Since you have the option to also view the most expensive methods and by capturing the IntelliTrace data you can drill in to narrow down to the line of code that is the root cause of the problem. Well now that we have seen how crucial memory management is and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with the best of breed tools in the product. Caching One of the main ways to improve performance is Caching. Which basically means you tell the web server that instead of going to the database for each request you keep the data in the webserver and when the user asks for it you serve it from the webserver itself. BUT that can have consequences! Let’s look at some code, trust me caching code is not very intuitive, I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine, first time i get the data from the database and second time data is served from the cache, significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding the lock as you can see in the snippet below. So, as long as a user comes in and finds that the cache is empty, the user locks and starts to get the cache no more concurrency issues. But lets say you are processing 10 requests per second, by the time i have locked the operation to get the results from the database, 9 other users came in and found that the cache key is null so after i have come out and populated the cache they will still go in to get the results again. The application will still be faster because the next set of 10 users and so on would continue to get data from the cache. BUT if we added another null check after locking to build the cache and before actual call to the db then the 9 users who follow me would not make the extra trip to the database at all and that would really increase the performance, but didn’t i say that the code won’t be very intuitive, may be you should leave a comment you don’t want another developer to come in and think what a fresher why is he checking for the cache key null twice !!! The downside of caching is, you are storing the data outside of the database and the data could be wrong because the updates applied to the database would make the data cached at the web server out of sync. So, how do you invalidate the cache? Well if you only had one way of updating the data lets say only one entry point to the data update you can write some logic to say that every time new data is entered set the cache object to null. But this approach will not work as soon as you have several ways of feeding data to the system or your system is scaled out across a farm of web servers. The perfect solution to this is Micro Caching which means you cache the query for a set time duration and invalidate the cache after that set duration. The advantage is every time the user queries for that data with in the time span for which you have cached the results there are no calls made to the database and the data is served right from the server which makes the response immensely quick. Now figuring out the appropriate time span for which you micro cache the query results really depends on the application. Lets say your website gets 10 requests per second, if you retain the cache results for even 1 minute you will have immense performance gains. You would reduce 90% hits to the database for searching. Ever wondered why when you go to e-bookers.com or xpedia.com or yatra.com to book a flight and you click on the book button because the fare seems too exciting and you get an error message telling you that the fare is not valid any more. Yes, exactly => That is a cache failure! These travel sites or price compare engines are not going to hit the database every time you hit the compare button instead the results will be served from the cache, because the query results are micro cached, its a perfect trade-off, by micro caching the results the site gains 100% performance benefits but every once in a while annoys a customer because the fare has expired. But the trade off works in the favour of these sites as they are still able to process up to 30+ page requests per second which means cater to the site traffic by may be losing 1 customer every once in a while to a competitor who is also using a similar caching technique what are the odds that the user will not come back to their site sooner or later? Recap Resources Below are some Key resource you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug Web Performance Tests. Some community test extensions and plug ins available on Codeplex might also be of interest to you. The Road Ahead Thank you for taking the time out and reading this blog post, you may also want to read Part I and Part II if you haven’t so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. Next ‘Load Testing in the cloud’, I’ll be working on exploring the possibilities of running Test controller/Agents in the Cloud. See you on the other side! Thank You! Share this post : CodeProject

Read the article
CodePlex Daily Summary for Tuesday, November 01, 2011

CodePlex Daily Summary for Tuesday, November 01, 2011Popular ReleasesAgFx Windows Phone Application Data Caching Framework: AgFx November Update: This update is primarily a bug fix release, which provides a number of stability and performance enhancements to AgFx. Available as a NuGet package here. Release Notes Fix IsoStore exceptions during shutdown (Changelist 81729) Thread safety and sequencings issues (Changelists 78682,79252, 80707) Fix landscape layout issue in auth sample Fix async keyword collision Supporting overlapping/multiple notifications for Load requestsiTuner - The iTunes Companion: iTuner 1.4.4322: Added German (unverified, apologies if incorrect) Properly source invariant resources with correct resIDs Replaced obsolete lyric providers with working providers Fix Pseudolater to correctly morph every third char Fix null reference in CatalogBaseDevpad: 4.6: Whats new for Devpad 4.6: New Recent Files New Run in Safari Minor Bug Fix's, improvements and speed upsWindows Workflow Foundation on Codeplex: Microsoft.Activities v1.8.8: Microsoft.Activities Overview How do I install Microsoft.Activities? Updates in this release9318Technical Analysis Engine for .NET: Technical Analysis Engine 1.25: What's new in the 1.25 release (2011-11-01): - Added Williams %R indicator - Added Moving Average Envelopes indicatorBF3Rcon.NET: BF3Rcon 3.0: This release is targeted for RCON documentation based on R3. Everything should be beta stable, but it's alpha because I haven't been able to fully test it. When a stable release is ready, a proper changelog will be kept. Important Edit: There's one method that will keep this from working in Mono. GeneratePasswordHash uses void HashAlgorithm.Dispose(), which isn't in Mono. This will have to be changed to Clear() in the next release. If anyone needs a Mono version of this immediately, you can...BoxWorld: BoxWorld_2011.10.30: BoxWorld - 8.0.1110.30 This is the initial release of BoxWorld. I'd recommend downloading the installer as it contains the compiled code and everything all nicely contained. By default, you end up with this directory structure: C:\Program Files\ViperWorks\BoxWorld C:\Program Files\ViperWorks\BoxWorld\Data C:\Program Files\ViperWorks\BoxWorld\Interface C:\Program Files\ViperWorks\BoxWorld\Source In the root you have the compiled EXE files, one for the main release, one for the LITE release ...VidCoder: 1.2.1: Fixed a couple regressions: video encoder was blank in queue and crashes with the High Profile preset when opening the Settings window. Fixed problem with auto-update introduced in 1.2.0. If you have 1.2.0 you will need to update manually to get this.AssaultCube Reloaded: Release 2.3: THE RELEASE YOU'VE ALL BEEN WAITING FOR! IT CAN NOW BE CONSIDERED STABLE Linux has Debian 64-bit precompiled binaries, but you can compile your own as it also contains the source. If you are using Mac or other operating systems, download the Linux package. The server pack is ready for both Windows and Linux, but you might need to compile your own for linux (source included) If you are using Windows and require the source code, download the source package!Koober: Koober - The Ebook Creator 0.2: The official release of Koober as Open source. Koober is a ebook creator for Windows, and Koob Reader is the reader.A Microblog API (SINA weibo.com open API in C#, .Net???????API): AMicroblogAPI v1.0: AMicroBlogAPI is a C# implementation, a strong-typed deep encapsulation, a easy-to-use .net wrapper of SINA microblog API v1.0. App developers no longer need to parse various HTTP responses -- all responses are parsed into strong types; No longer need to construt the request query strings -- just simply gives the values of parameters; No longer need to implement the OAuth -- a single call could logs the user on. In this release (AMicroblogAPI v1.0), all basic data APIs are implemented. For ...patterns & practices: Enterprise Library Contrib: Enterprise Library Contrib - 5.0 (Oct 2011): This release of Enterprise Library Contrib is based on the Microsoft patterns & practices Enterprise Library 5.0 core and contains the following: Common extensionsTypeConfigurationElement<T> - A Polymorphic Configuration Element without having to be part of a PolymorphicConfigurationElementCollection. AnonymousConfigurationElement - A Configuration element that can be uniquely identified without having to define its name explicitly. Data Access Application Block extensionsMySql Provider - ...Network Monitor Open Source Parsers: Network Monitor Parsers 3.4.2748: The Network Monitor Parsers packages contain parsers for more than 400 network protocols, including RFC based public protocols and protocols for Microsoft products defined in the Microsoft Open Specifications for Windows and SQL Server. NetworkMonitor_Parsers.msi is the base parser package which defines parsers for commonly used public protocols and protocols for Microsoft Windows. In this release, NetowrkMonitor_Parsers.msi continues to improve quality and fix bugs. It has included the fo...Duckworth Lewis Professional Edition Calculator: DLcalc 3.0: DLcalc 3.0 can perform Duckworth/Lewis Professional Edition calculations 100% accurately. It also produces over-by-over and ball-by-ball PAR score tables.Media Companion: MC 3.420b Weekly: Ensure .NET 4.0 Full Framework is installed. (Available from http://www.microsoft.com/download/en/details.aspx?id=17718) Ensure the NFO ID fix is applied when transitioning from versions prior to 3.416b. (Details here) Movies Fixed: Fanart and poster scraping issues TV Shows (Re)Added: Rebuild single show Fixed: Issue when shows are moved from original location Ability to handle " for actor nicknames Crash when episode name contains "<" (does not scrape yet) Clears fanart when switch...patterns & practices - Unity: Unity 3.0 for .NET4.5 Preview: The Unity 3.0.1026.0 Preview enables Unity to work on .NET 4.5 with both the WinRT and desktop profiles. The major changes include: Unity projects updated to target .NET 4.5. Dynamic build plans modified to use compiled lambda expressions instead of Reflection.Emit Converting reflection to use the new TypeInfo for reflection. Projects updated to work with the Microsoft Visual Studio 2011 Preview Notes/Known Issues: The Microsoft.Practices.Unity.UnityServiceLocator class cannot be use...Managed Extensibility Framework: MEF 2 Preview 4: Detailed information on this release is available on the BCL team blog.AcDown????? - Anime&Comic Downloader: AcDown????? v3.6: ?? ● AcDown??????????、??????，??????????????????????，???????Acfun、Bilibili、???、???、???、Tucao.cc、SF???、?????80????，???????????、?????????。 ● AcDown???????????????????????????，???，???????????????????。 ● AcDown???????C#??，????.NET Framework 2.0??。?????"Acfun?????"。 ????32??64? Windows XP/Vista/7 ????????????? ??：????????Windows XP???,?????????.NET Framework 2.0???(x86)?.NET Framework 2.0???(x64)，?????"?????????"??? ??????????????，??????????： ??"AcDown?????"????????? ?? v3.6?? ??“????”...MVC Controls Toolkit: Mvc Controls Toolkit 1.5.0: Added: The new Client Blocks feaure of Views A new "move" js method for the TreeViews The NewHtmlCreated js event to the DataGrid Improved the ChoiceList structure that now allows also the selection list of a dropdown to be chosen with a lambda expression Improved the AcceptViewHintAttribute controller filter. Now a client can specify not only the name of a View or Partial View it prefers, but also to receive just the rough data in Json format. Now the the SMinimum and SMaximum par...Free SharePoint Master Pages: Buried Alive (Halloween) Theme: Release Notes *Created for Halloween, you will find theme file, custom css file and images. *Created by Al Roome @AlstarRoome Features: Custom styling for web part Custom background *Screenshot https://s3.amazonaws.com/kkhipple/post/sharepoint-showcase-halloween.pngNew ProjectsarlTH: "Airport Link Thailand" Application for Windows Phone 7.1Audio Tags Editor For Excel: The purpose of the project is to create a tool to edit tags of audio files through the excel application. The program is an Excel add-in.AW Load Tester: AW Load Tester is a simple and easy to use website load tester that very closely mimics a regular users browsing session on your website. Many popular load tester only download HTML. AW Load Tester downloads all the related items on a page simultaneously. AzMan - Be Good Do Right !: The AzMan is a collection of reusable software components designed to assist software developers with common enterprise development challenges. http://bbs.fengshupo.com/BestWayWP7UI: The User interface of best wat wp7 applicationCheckout Subsystem Optimizer: Checkout Subsystem OptimizerDBScripter - Library for scripting SQL Server database objects: This project is library that allows users to script SQL Server database objects. Library uses dynamic management views for extracting data about databases objects. This library could be used in various situations. The most interesting areas are comparing database objects and generation database documentation. For both cases examples have been prepared.Dependency Checker: Dependency Checker is a utility for verifying that a set of prerequisites is present on a machine. The set of dependencies is defined in a configuration file. For more information, see http://dependencychecker.codeplex.com/documentation DibTer: DibTer is a brand new CMS for ASP.NET Mainly features are NHibernate MVC JQuery HTML5Dynamics CRM Entity Based Stress Tool: Entity Based Stress Tool, is command line program that creates configurable server agents in order to stress a defined entity at a defined Message/Stage. This tool allows CRM Application Stressing, Plugin Stressing, and enlaborate an execution report.Edinger: Edinger is a simple text editor.FreeboxHDVideoPlayer: FreeboxHDVideoPlayer makes it easier for free.fr french ISP users to play and manage videos stored on the HDD of the Freebox HD. FreeboxHDVideoPlayer simplifie, pour les utilisateurs de free.fr, la lecture et la gestion des vidéos stockées sur le disque du boitier Freebox HD.guglex - google docs xtraz: my studying project for asp.net mvc 3. now displaying spreadsheet rows in card viewIP-Camera MJPEG HTTP Web-Server Emulator: Turns web-camera into IP-camera with mjpeg streaming and access it over network. Usage of i-Lids (Imagery Library for Intelligent Detection Systems) and other data for surveillance systems development. Sends WMV, folder with images, web-camera data over the wire.LIM: LIM is .net software for archive manage.Luminji.CorWeb: luminji's company web.Map Generator: Map Generator for your Civ/Col/Roguelike games in VB.NET.Microformat Parsers for .NET: Microformat's Parsers for .NET helps you to collect information you run into on the web, that is stored by means of microformats. It's written in C# 3.0.Mini Rx: Mini Rx provides LINQ like queries over events for C# and F# using extension methods, somewhat like the Reactive Extensions (Rx) but in a lightweight open source package with one non-strong named assembly.MVC Music Store by NNT: Demo Music Store Follow tut from Microsoft. Team Explorer VS 2010 .NET 4PHP MySQL Web Site Creator: PHP MySQL Web Site Creator helps you create a basic template for your web site using a UI form. I creates all PHP, CSS, HTML basic pages to connect to MySQL database.PQLRD: Paco KLk!Project Web And Application: Project MobileStore in ASP use ASP.NET MVC 3RoboZip: RoboZip combines MS Robocopy and Zip-functionality. Create a job and RoboZip will zip all files (from a list of filetypes) within a time span in the past. The zip file name can contain date and time values of the time period it covers. It's developed in C#. The zip component is the DotNetZip Library from http://dotnetzip.codeplex.com. The logging is done using Apache log4net from http://logging.apache.org/log4net. The idea here is not to only present the code, but also to document how the de...Scriptster Gearbox: Scriptster Gearbox is a full C# scripting system based on Scriptster (also on CodePlex). It is aimed at Visual Studio developers and power users who need to automate the sorts of things that DOS is generally good at, but which using C# can make more powerful and more familiar.Technical Analysis Engine for .NET: Technical Analysis Engine is a .NET based library for technical analysis. It is fast, free, easy to use and provides functionality for financial market analysis.ValidationDSL: A Fluent validation engine with ease to use and reusable around the multi-tenant applicationsWebFormsUtilities: WebFormsUtilities is a collection of tools that assist in using MVC/MVP functionality to webforms in an unobtrusive manner. This differs from a hybrid MVC/WebForms page in that you can use methods like UpdateModel() or TryValidateModel() in a WebForms Page class.Wen - WPF 3D Navigator: Control the camera and the lights of a WPF 3D application without any code modification.WindStudios: WindStudiosWrite My Expect Statement: Have Rhino Mocks automatically generate your expect() and return() statements for you.zion xtraz: some basic helpful classes for use in every project

Read the article
Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article
REST api design to retrieve summary information

- by Jacques René Mesrine

I have a scenario in which I have REST API which manages a Resource which we will call Group. A Group is similar in concept to a discussion forum in Google Groups. Now I have two GET access method which I believe needs separate representations. The 1st GET access method retrieves the minimal amount of information about a Group. Given a *group_id* it should return a minimal amount of information like { group_id: "5t7yu8i9io0op", group_name: "Android Developers", is_moderated: true, number_of_users: 34, new_messages: 5, icon: "http://boo.com/pic.png" } The 2nd GET access method retrives summary information which are more statistical in nature like: { group_id: "5t7yu8i9io0op", top_ranking_users: { [ { user: "george", posts: 789, rank: 1 }, { user: "joel", posts: 560, rank: 2 } ...] }, popular_topics: { [ ... ] } } I want to separate these data access methods and I'm currently planning on this design: GET /group/:group_id/ GET /group/:group_id/stat Only the latter will return the statistical information about the group. What do you think about this ?

Read the article
Design pattern for cost calculator app?

- by Anders Svensson

Hi, I have a problem that I’ve tried to get help for before, but I wasn’t able to solve it then, so I’m trying to simplify the problem now to see if I can get some more concrete help with this because it is driving me crazy… Basically, I have a working (more complex) version of this application, which is a project cost calculator. But because I am at the same time trying to learn to design my applications better, I would like some input on how I could improve this design. Basically the main thing I want is input on the conditionals that (here) appear repeated in two places. The suggestions I got before was to use the strategy pattern or factory pattern. I also know about the Martin Fowler book with the suggestion to Refactor conditional with polymorphism. I understand that principle in his simpler example. But how can I do either of these things here (if any would be suitable)? The way I see it, the calculation is dependent on a couple of conditions: 1. What kind of service is it, writing or analysis? 2. Is the project small, medium or large? (Please note that there may be other parameters as well, equally different, such as “are the products new or previously existing?” So such parameters should be possible to add, but I tried to keep the example simple with only two parameters to be able to get concrete help) So refactoring with polymorphism would imply creating a number of subclasses, which I already have for the first condition (type of service), and should I really create more subclasses for the second condition as well (size)? What would that become, AnalysisSmall, AnalysisMedium, AnalysisLarge, WritingSmall, etc…??? No, I know that’s not good, I just don’t see how to work with that pattern anyway else? I see the same problem basically for the suggestions of using the strategy pattern (and the factory pattern as I see it would just be a helper to achieve the polymorphism above). So please, if anyone has concrete suggestions as to how to design these classes the best way I would be really grateful! Please also consider whether I have chosen the objects correctly too, or if they need to be redesigned. (Responses like "you should consider the factory pattern" will obviously not be helpful... I've already been down that road and I'm stumped at precisely how in this case) Regards, Anders The code (very simplified, don’t mind the fact that I’m using strings instead of enums, not using a config file for data etc, that will be done as necessary in the real application once I get the hang of these design problems): public abstract class Service { protected Dictionary<string, int> _hours; protected const int SMALL = 2; protected const int MEDIUM = 8; public int NumberOfProducts { get; set; } public abstract int GetHours(); } public class Writing : Service { public Writing(int numberOfProducts) { NumberOfProducts = numberOfProducts; _hours = new Dictionary<string, int> { { "small", 125 }, { "medium", 100 }, { "large", 60 } }; } public override int GetHours() { if (NumberOfProducts <= SMALL) return _hours["small"] * NumberOfProducts; if (NumberOfProducts <= MEDIUM) return (_hours["small"] * SMALL) + (_hours["medium"] * (NumberOfProducts - SMALL)); return (_hours["small"] * SMALL) + (_hours["medium"] * (MEDIUM - SMALL)) + (_hours["large"] * (NumberOfProducts - MEDIUM)); } } public class Analysis : Service { public Analysis(int numberOfProducts) { NumberOfProducts = numberOfProducts; _hours = new Dictionary<string, int> { { "small", 56 }, { "medium", 104 }, { "large", 200 } }; } public override int GetHours() { if (NumberOfProducts <= SMALL) return _hours["small"]; if (NumberOfProducts <= MEDIUM) return _hours["medium"]; return _hours["large"]; } } public partial class Form1 : Form { public Form1() { InitializeComponent(); List<int> quantities = new List<int>(); for (int i = 0; i < 100; i++) { quantities.Add(i); } comboBoxNumberOfProducts.DataSource = quantities; } private void comboBoxNumberOfProducts_SelectedIndexChanged(object sender, EventArgs e) { Service writing = new Writing((int) comboBoxNumberOfProducts.SelectedItem); Service analysis = new Analysis((int) comboBoxNumberOfProducts.SelectedItem); labelWriterHours.Text = writing.GetHours().ToString(); labelAnalysisHours.Text = analysis.GetHours().ToString(); } }

Read the article
Open an Application .exe window inside another application

- by jpnavarini

I have an application in WPF running. I would like that, when a button is clicked inside this application, another application opens, with its window maximized. However, I don't want my first application to stop and wait. I want both to be open and running independently. When the button is clicked again, in case the application is minimized, the application is maximized. In case it is not, it is open again. How is it possible using C#? I have tried the following: Process process = Process.GetProcesses().FirstOrDefault(f => f.ProcessName.Contains("Analysis")); ShowWindow((process ?? Process.Start("..\\..\\..\\MS Analysis\\bin\\Debug\\Chemtech.RT.MS.Analysis.exe")).MainWindowHandle.ToInt32(), SW_MAXIMIZE); But the window does not open, even though the process does start.

Read the article
Is there a way of getting a file name and inserting into Matlab script?

- by torr

In a folder, I have both my .m file that contains the script and an imaging .dcm file that needs to be analyzed. Folder structure: Folder1/analysis.m Folder1/meas_dynamic_123.dcm My script (analysis.m) begins as follows: target =''; <== here should go the full path to the file + filename example: /Volumes/Data/Folder1/meas_dynamic_123.m txt = dir(target); // etc So I'm wondering if there is a way of when running analysis.m it will: automatically search the folder it's in, grab the full path + filename of file containing string dynamic in the name, insert its full path + name into target variable continue running the script Does anyone have any pointers on how to achieve this? Using ffpath?

Read the article

Search Results

Search found 2683 results on 108 pages for 'statistical analysis'.

Page 30/108 | < Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37 | Next Page >

- by Pinal Dave

- by user793553

- by mgerdts

- by Rei Brazilva

- by e2bady

- by JoshReuben

- by Geertjan

- by Webgui

- by Paul Nathan

- by Pinal Dave

- by Jean-Pierre Dijcks

- by Pinal Dave

- by KLaker

- by mseika

- by Mandy Ho

- by Charles Young

- by Shai Sherman

- by Timothy Klenke

- by Tarun Arora

- by user12620111

- by Jacques René Mesrine

- by Anders Svensson

- by jpnavarini

- by torr

< Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37 | Next Page >