offtopic but important - Page 304

Product Development Investment: A Measure of Vendor Performance

- by Jim Mcglothlin

The relationship between a large, complex organization and its key suppliers of information technology is normally more than just "strategic". Expectations about the duration of the relationship typically exceed 20 years. Enterprise applications and technology infrastructure are not expected to be changed out like petunias. So how would you rate the due diligence processes as performed in Higher Education when selecting critical, transformational information technology? My observation: I see a lot of effort put into elaborate demonstration of basic software functionality. I see a lot of attention paid to the cost element of technology acquisition, including the contracted cost of implementation consulting services. But the factor that receives only cursory analysis and due diligence is long-term performance--the ability of a vendor to grow, expand, and develop, and bring its customers along with it. So what should you look for in a long-term IT supplier? Oracle has a public track record for product development. The annual investment has been on a run rate of almost $3 Billion organic product development. Oracle's well-publicized acquisitions and mergers have been supplemental to its R&D. This is important for Higher Education. Another meaningful way to evaluate a company is to look at the tangible track record of enhancement. Consider the Oracle-PeopleSoft enterprise business platform since acquired by Oracle 6 years ago: Product or Technology Enhancement Customer or User Impact Service Oriented Architecture (SOA) 300+ new web services delivered in versions 9.0 & 9.1 provide flexibility, so that customers can integrate PeopleSoft with other applications. Campus Solutions has added Admissions and Constituent Web Services. Constituent Relationship Management PeopleSoft CRM 9.1 for Higher Education introduced new process flows for student recruiting and retention to support "Student Success" initiatives. A 360 view of the constituent is now delivered, and the concept of a single-stop Student Services Center is now in CRM 9.1 with tight integration to PeopleSoft Campus Solutions. Human Capital Management Contract Pay for Education, with flexibility for configuration and calculation, has been extended in HCM 9.1. New chartfield integration among Project Costing - Time & Labor - Payroll to serve the labor distribution requirements for Grants / Sponsored Research. Talent Management PeopleSoft 9.0 and 9.1 feature an integrated talent management approach centered on definitions in "Profile Manager", with all new usability improvements. Internal and external candidate pools, and the entire recruitment process, are driven by delivered configurable selection and on-boarding processes. Interview scheduling, and online job offers are newly delivered processes. Performance Management PeopleSoft HCM ePerformance 9.1 will include significant new functionality designed to help organizations more effectively align business objectives with employee goals. Using an Organization Chart view, your business goals can flow down to become tangible objectives per employee. Succession Planning / Workforce Development New in HCM 9.0, enhanced in 9.1, is a planning capability for regular or unusual (major organizational change) succession of internal or external candidates. PeopleSoft supports employee-based career planning, which ultimately increases the integrity of the succession planning process (identify their career needs, plans, preferences, and interests). Dashboards / Oracle Business Intelligence Application Suite Oracle Human Resources Analytics provides the workforce information foundation that integrates data from HR functional areas and Finance. Oracle Human Resources Analytics delivers 9 dashboards and over 200 reports. Provide your HR professionals and front-line managers the tools to analyze workforce staffing, retention, productivity, to better source high-quality applicants, and to reduce absence costs. Multi-year Planning and Commitment Control External funding sources, especially Grants, require a multi-year encumbrance business process. PeopleSoft HCM 9.1 adds multi-year funding and commitment control, including budget checking. The newly designed Real Time Budget Checking will provide the customer with an updated snapshot of their budget and encumbrances at any given time. Position Budgeting with Hyperion Hyperion Planning world-class products now include delivered integration to PeopleSoft HCM. Position Budgeting is available in the new Public Sector Planning module of Hyperion. Web 2.0 features for the latest in usability PeopleSoft 9.1 features a contemporary internet user experience: Partial-page refreshing Drag and drop pagelets New menu structure Navigation pagelets Modal popup message windows Favorites & recently used links Type-ahead Drag and drop grid columns, pop-out grids Portal Workspaces Enterprise 2.0 for your collaborative web communities, using new content management, along with Wikis, blogs, and discussion forums in PeopleSoft Portal 9.1. PeopleTools enhanced by Oracle Fusion Middleware Standards-based tools have been added to the PeopleTools application infrastructure: BI (XML) Publisher, Java tools. Certified for use with PeopleSoft: Oracle Business Intelligence (OBIEE), Oracle Enterprise Manager, Oracle Weblogic Server, Oracle SOA Suite. Hosting for PeopleSoft applications A solid new deployment option: Oracle On Demand remote hosting center for high scalability, security, and continuity of operations. Business Process Outsourcing (BPO) for HCM / Payroll functions Partnership with AT&T provides hosting of HR/Payroll application along with payroll business process operations, and subscription-based service fees (SaaS). AT&T BPO full service includes pay sheet processing, bank and 3rd party file transfer, payroll tax handling, etc. Continuous Delivery Model Feature Packs provide faster time-to-benefit; new features become available in PeopleSoft 9.1 (or Campus Solutions 9.0) without need to perform upgrade. Golden person data model across all campus applications Oracle Higher Education Constituent Hub provides synchronization and data governance of person data across any application, e.g. HR/ Payroll, Student Information System, Housing, Emergency Contact, LMS, CRM. Oracle's aggressive enhancement plans within the "Applications Unlimited" program continue, as new functionality is under development for a new version of a PeopleSoft release planned for 2012. Meanwhile, new capabilities are planned on an annual basis in Feature Packs. PeopleSoft just delivered the HCM 2010 Feature Pack and another is planned for 2011. In February we plan to have over 100 customers from our Customer Advisory Boards at our PeopleSoft Development Center in California to review designs for all of these releases. For those of you near New York City The investment and progressive development story described above is the subject of an Oracle road show event on February 9, 2011. Charting Your Course with Oracle Applications is a global event series designed to help business and IT executives assess the impact of new inflection points on their business and applications roadmap: changing workforces, shifting customer and constituent bases, and increased volatility. Learn how innovations ranging from new deployment models like cloud computing to the introduction of social applications and smart devices are delivering results across all areas of business and industry. THIS DOCUMENT IS FOR INFORMATIONAL PURPOSES ONLY AND MAY NOT BE INCORPORATED INTO A CONTRACT OR AGREEMENT.

Read the article

CodePlex Daily Summary for Saturday, October 22, 2011

CodePlex Daily Summary for Saturday, October 22, 2011Popular ReleasesWatchersNET CKEditor™ Provider for DotNetNuke®: CKEditor Provider 1.12.17: Changes Added FilePath Length Check when Uploading Files. Fixed Issue #6550 Fixed Issue #6536 Fixed Issue #6525 Fixed Issue #6500 Fixed Issue #6401 Fixed Issue #6490DotNet.Framework.Common: DotNet.Framework.Common 4.0: ??????????，????????????XML Explorer: XML Explorer 4.0.5: Changes in 4.0.5: Added 'Copy Attribute XPath to Address Bar' feature. Added methods for decoding node text and value from Base64 encoded strings, and copying them to the clipboard. Added 'ChildNodeDefinitions' to the options, which allows for easier navigation of parent-child and ID-IDREF relationships. Discovery happens on-demand, as nodes are expanded and child nodes are added. Nodes can now have 'virtual' child nodes, defined by an xpath to select an identifier (usually relative to ...Media Companion: MC 3.419b Weekly: A couple of minor bug fixes, but the important fix in this release is to tackle the extremely long load times for users with large TV collections (issue #130). A note has been provided by developer Playos: "One final note, you will have to suffer one final long load and then it should be fixed... alternatively you can delete the TvCache.xml and rebuild your library... The fix was to include the file extension so it doesn't have to look for the video file (checking to see if a file exists is a...CODE Framework: 4.0.11021.0: This build adds a lot of our WPF components, including our MVVC and MVC components as well as a "Metro" and "Battleship" style.Manejo de tags - PHP sobre apache: tagqrphp: Primera version: Para que funcione el programa se debe primero obtener un id para desarrollo del tag eso lo entrega Microsoft registrandose en el sitio. http://tag.microsoft.com - En tagm.php que llama a la libreria Microsoft Tag PHP Library (Codigo que sirve para trabajar con PHP y Tag) - Llenamos los datos del formulario y ejecutamos para obtener el codigo tag de microsoft el cual apunte a la url que le indicamos en el formulario - Libreria MStag.php (tiene mucha explicaciÃ³n del funciona...GridLibre para Visual FoxPro: GridLibre para Visual FoxPro v3.5: GridLibre Para Visual FoxPro: esta herramienta ayudara a los usuarios y programadores en los manejos de los datos, como Filtrar, multiseleccion y el autoformato a las columnas como la asignacion del controlsource.Self-Tracking Entity Generator for WPF and Silverlight: Self-Tracking Entity Generator v 0.9.9: Self-Tracking Entity Generator v 0.9.9 for Entity Framework 4.0Umbraco CMS: Umbraco 5.0 CMS Alpha 3: Umbraco 5 Alpha 3Umbraco 5 (aka Jupiter) will be the next version of everyone's favourite, friendly ASP.NET CMS that already powers over 100,000 websites worldwide. Try out the Alpha of v5 today! If you're new to Umbraco and would like to get a low-down on our popular and easy-to-learn approach to content management, check out our intro video. What's Alpha 3?This is our third Alpha release. It's intended for developers looking to become familiar with the codebase & architecture, or for thos...Vkontakte WP: Vkontakte: source codeWay2Sms Applications for Android, Desktop/Laptop & Java enabled phones: Way2SMS Desktop App v2.0: 1. Fixed issue with sending messages due to changes to Way2Sms site 2. Updated the character limit to 160 from 140GART - Geo Augmented Reality Toolkit: 1.0.1: About Release 1.0.1 Release 1.0.1 is a service release that addresses several issues and improves performance. As always, check the Documentation tab for instructions on how to get started. If you don't have the Windows Phone SDK yet, grab it here. Breaking Change Please note: There is a breaking change in this release. As noted below, the WorldCalculationMode property of ARItem has been replaced by a user-definable function. ARItem is now automatically wired up with a function that perform...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.32: Fix for issue #16710 - string literals in "constant literal operations" which contain ASP.NET substitutions should not be considered "constant." Move the JS1284 error (Misplaced Function Declaration) so it only fires when in strict mode. I got a couple complaints that people didn't like that error popping up in their existing code when they could verify that the location of that function, although not strict JS, still functions as expected cross-browser.Naked Objects: Naked Objects Release 4.0.110.0: Corresponds to the packaged version 4.0.110.0 available via NuGet. Please note that the easiest way to install and run the Naked Objects Framework is via the NuGet package manager: just search the Official NuGet Package Source for 'nakedobjects'. It is only necessary to download the source code (from here) if you wish to modify or re-build the framework yourself. If you do wish to re-build the framework, consul the file HowToBuild.txt in the release. Documentation Please note that after ...myCollections: Version 1.5: New in this version : Added edit type for selected elements Added clean for selected elements Added Amazon Italia Added Amazon China Added TVDB Italia Added TVDB China Added Turkish language You can now manually add artist Added Order by Rating Improved Add by Media Improved Artist Detail Upgrade Sqlite engine View, Zoom, Grouping, Filter are now saved by category Added group by Artist Added CubeCover View BugFixingFacebook C# SDK: 5.3: This is a BETA release which adds new features and bug fixes to v5.2.1. removed dependency from Code Contracts enabled Task Parallel Support in .NET 4.0+ added support for early preview for .NET 4.5 added additional method overloads for .NET 4.5 to support IProgress<T> for upload progress added new CS-WinForms-AsyncAwait.sln sample demonstrating the use of async/await, upload progress report using IProgress<T> and cancellation support Query/QueryAsync methods uses graph api instead...IronPython: 2.7.1 RC: This is the first release candidate of IronPython 2.7.1. Like IronPython 54498, this release requires .NET 4 or Silverlight 4. This release will replace any existing IronPython installation. If there are no showstopping issues, this will be the only release candidate for 2.7.1, so please speak up if you run into any roadblocks. The highlights of 2.7.1 are: Updated the standard library to match CPython 2.7.2. Add the ast, csv, and unicodedata modules. Fixed several bugs. IronPython To...Rawr: Rawr 4.2.6: This is the Downloadable WPF version of Rawr!For web-based version see http://elitistjerks.com/rawr.php You can find the version notes at: http://rawr.codeplex.com/wikipage?title=VersionNotes Rawr AddonWe now have a Rawr Official Addon for in-game exporting and importing of character data hosted on Curse. The Addon does not perform calculations like Rawr, it simply shows your exported Rawr data in wow tooltips and lets you export your character to Rawr (including bag and bank items) like Char...Home Access Plus+: v7.5: Change Log: New Booking System (out of Beta) New Help Desk (out of Beta) New My Files (Developer Preview) Token now saved into Cookie so the system doesn't TIMEOUT as much File Changes: ~/bin/hap.ad.dll ~/bin/hap.web.dll ~/bin/hap.data.dll ~/bin/hap.web.configuration.dll ~/bookingsystem/admin/default.aspx ~/bookingsystem/default.aspx REMOVED ~/bookingsystem/bookingpopup.ascx REMOVED ~/bookingsystem/daylist.ascx REMOVED ~/bookingsystem/new.aspx ~/helpdesk/default.aspx ...Visual Micro - Arduino for Visual Studio: Arduino for Visual Studio 2008 and 2010: Arduino for Visual Studio 2010 has been extended to support Visual Studio 2008. The same functionality and configuration exists between the two versions. The 2010 addin runs .NET4 and the 2008 addin runs .NET3.5, both are installed using a single msi and both share the same configuration settings. The only known issue in 2008 is that the button and menu icons are missing. Please logon to the visual micro forum and let us know if things are working or not. Read more about this Visual Studio ...New Projects#foo Core: Core functionality extensions of .NET used by all HashFoo projects.#foo Nhib: #foo NHibernate extensions.Aagust G: Hello all ! Its a free JQuery Image Slider....ACP Log Analyzer: ACP Log Analyzer provides a quick and easy mechanism for generating a report on your ACP-based astronomical observing activities. Developed in Microsoft Visual Studio 2010 using C#, the .NET Framework version 4 and WPF.BlobShare Sample: TBDCompletedWorkflowCleanUp: This tool once executed on a list delete all completed workflow instancesCRM 2011 Visual Ribbon Editor: Visual Ribbon Editor is a tool for Microsoft Dynamics CRM 2011 that lets you edit CRM ribbons. This ribbon editor shows a preview of the CRM ribbon as you are editing it and allows you to add ribbon buttons and groups without needing to fully understand the ribbon XML schema.GearMerge: Organizes Movies and TV Series files from one Hard Drive to another. I created it for myself to update external drives with movies and TV shows from my collection.Generic Object Storage Helper for WinRT: ObjectStorageHelper<T> is a Generic class that simplifies storage of data in WinRT applications.Government Sanctioned Espionage RPG: Government Sanctioned is a modern SRD-like espionage game server. Visit http://wiki.government-sanctioned.us:8040 for game design and play information or homepage http://www.government-sanctioned.us Government Sanctioned is an online, text-based espionage RPG (similar to a MUD/MOO) that takes place against the backdrop of a highly-secretive U.S. Government agency whose stated goals don't always match the dirty work the agents tend to find themselves in. - over 15 starting profession...GridLibre para Visual FoxPro: GridLibre Para Visual FoxPro: esta herramienta ayudara a los usuarios y programadores en los manejos de los datos, como Filtrar, multiseleccion y el autoformato a las columnas como la asignacion del controlsource.HTML5 Video Web Part for SharePoint 2010: A web part for SharePoint 2010 that enable users playing video into the page using the Ribbon bar.Jogo do Galo: JOGO DO GALO REGRAS •O tabuleiro é a matriz de três linhas em três colunas. •Dois jogadores escolhem três peças cada um. •Os jogadores jogam alternadamente, uma peça de cada vez, num espaço que esteja vazio. •O objectivo é conseguir três peças iguais em linha, quer horizontal, vKarol sie uczy silverlajta: on sie naprawde tego uczy!Manejo de tags - PHP sobre apache: Hago uso de la libreria Microsoft Tag PHP Library para que pueda funcionar la aplicación sobre Apache finalmente puede crear tag de micrsosoft desde el formulario creado. Modem based SMS Gateway: It is an easy to use modem based SMS server that provide easier solutions for SMS marketing or SMS based services. It is highly programmable and the easy to use API interface makes SMS integration very easy. Embedded SMS processor provides customized solution to many of your needs even without building any custom software.Mund Publishing Feture: Mund Publishing FeatureMyTFSTest: TestNHS HL7 CDA Document Developer: A project to demonstrate how templated HL7 CDA documents can be created by using a common API. The API is designed to be used in .NET applications. A number of examples are provided using C#OpenShell: OpenShell is an open source implementation of the Windows PowerShell engine. It is to make integrating PowerShell into standalone console programs simple.Powershell Script to Copy Lists in a Site Collection in MOSS 2007 and SPS 2010: Hi, This is a powershell script file that copies a list within the same site collection. This works in Sharepoint 2007 and Sharepoint 2010 as well. THis will flash the messages before taking the input values. This will in this way provide the clear ideas about the values to beSharePoint Desktop: SharePoint Desktop is a explorer-like webpart that makes it possible to drag and drop items (documents and folders), copy and paste items and explore all SharePoint document libraries in 1 place.SQL floating point compare function: Comparison of floating point values in SQL Server not always gives the expected result. With this function, comparison is only done on the first 15 significant digits. Since SQL Server only garantees a precision of 15 digits for float datatypes, this is expected to be secure.Stock Analyzer: It is a stock management software. It's main job is to store market realtime data on a database to be able to analyse it latter and create automatic systems that operate directly on the stock exchange market. It will have different modules to do this task: - Realtime data capture. - Realtime data analysis - Historic analysis. - Order execution. - Strategy test. - Strategy execution. It's developed in C# and with WPF.VB_Skype: VB_Skype utilizza la libreria Skype4COM per integrare i servizi Skype con un'applicazione Visual Basic (Windows Forms). L'applicazione comprende un progetto di controllo personalizzato che costituisce il "wrapper" verso la libreria Skype4COM e un progetto con una demo di utilizzo. Progetto che dovrebbe essere utilizzato nella mia sessione, come uno degli speaker della conferenza "WPC 2011" che si terrà ad Assago (MI) nei giorni 22-23-24 Novembre 2011. La mia sessione è in agenda per il 24...Word Template Generator: Custom Template Generator for Microsoft Word, developed in C#?????OA??: ?????OA??

Read the article

SPARC T4-4 Beats 8-CPU IBM POWER7 on TPC-H @3000GB Benchmark

- by Brian

Oracle's SPARC T4-4 server delivered a world record TPC-H @3000GB benchmark result for systems with four processors. This result beats eight processor results from IBM (POWER7) and HP (x86). The SPARC T4-4 server also delivered better performance per core than these eight processor systems from IBM and HP. Comparisons below are based upon system to system comparisons, highlighting Oracle's complete software and hardware solution. This database world record result used Oracle's Sun Storage 2540-M2 arrays (rotating disk) connected to a SPARC T4-4 server running Oracle Solaris 11 and Oracle Database 11g Release 2 demonstrating the power of Oracle's integrated hardware and software solution. The SPARC T4-4 server based configuration achieved a TPC-H scale factor 3000 world record for four processor systems of 205,792 QphH@3000GB with price/performance of $4.10/QphH@3000GB. The SPARC T4-4 server with four SPARC T4 processors (total of 32 cores) is 7% faster than the IBM Power 780 server with eight POWER7 processors (total of 32 cores) on the TPC-H @3000GB benchmark. The SPARC T4-4 server is 36% better in price performance compared to the IBM Power 780 server on the TPC-H @3000GB Benchmark. The SPARC T4-4 server is 29% faster than the IBM Power 780 for data loading. The SPARC T4-4 server is up to 3.4 times faster than the IBM Power 780 server for the Refresh Function. The SPARC T4-4 server with four SPARC T4 processors is 27% faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark. The SPARC T4-4 server is 52% faster than the HP ProLiant DL980 G7 server for data loading. The SPARC T4-4 server is up to 3.2 times faster than the HP ProLiant DL980 G7 for the Refresh Function. The SPARC T4-4 server achieved a peak IO rate from the Oracle database of 17 GB/sec. This rate was independent of the storage used, as demonstrated by the TPC-H @3000TB benchmark which used twelve Sun Storage 2540-M2 arrays (rotating disk) and the TPC-H @1000TB benchmark which used four Sun Storage F5100 Flash Array devices (flash storage). [*] The SPARC T4-4 server showed linear scaling from TPC-H @1000GB to TPC-H @3000GB. This demonstrates that the SPARC T4-4 server can handle the increasingly larger databases required of DSS systems. [*] The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase. [*] The TPC believes that comparisons of results published with different scale factors are misleading and discourages such comparisons. Performance Landscape The table lists the leading TPC-H @3000GB results for non-clustered systems. TPC-H @3000GB, Non-Clustered Systems System Processor P/C/T – Memory Composite(QphH) $/perf($/QphH) Power(QppH) Throughput(QthH) Database Available SPARC Enterprise M9000 3.0 GHz SPARC64 VII+ 64/256/256 – 1024 GB 386,478.3 $18.19 316,835.8 471,428.6 Oracle 11g R2 09/22/11 SPARC T4-4 3.0 GHz SPARC T4 4/32/256 – 1024 GB 205,792.0 $4.10 190,325.1 222,515.9 Oracle 11g R2 05/31/12 SPARC Enterprise M9000 2.88 GHz SPARC64 VII 32/128/256 – 512 GB 198,907.5 $15.27 182,350.7 216,967.7 Oracle 11g R2 12/09/10 IBM Power 780 4.1 GHz POWER7 8/32/128 – 1024 GB 192,001.1 $6.37 210,368.4 175,237.4 Sybase 15.4 11/30/11 HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 8/64/128 – 512 GB 162,601.7 $2.68 185,297.7 142,685.6 SQL Server 2008 10/13/10 P/C/T = Processors, Cores, Threads QphH = the Composite Metric (bigger is better) $/QphH = the Price/Performance metric in USD (smaller is better) QppH = the Power Numerical Quantity QthH = the Throughput Numerical Quantity The following table lists data load times and refresh function times during the power run. TPC-H @3000GB, Non-Clustered Systems Database Load & Database Refresh System Processor Data Loading(h:m:s) T4Advan RF1(sec) T4Advan RF2(sec) T4Advan SPARC T4-4 3.0 GHz SPARC T4 04:08:29 1.0x 67.1 1.0x 39.5 1.0x IBM Power 780 4.1 GHz POWER7 05:51:50 1.5x 147.3 2.2x 133.2 3.4x HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 08:35:17 2.1x 173.0 2.6x 126.3 3.2x Data Loading = database load time RF1 = power test first refresh transaction RF2 = power test second refresh transaction T4 Advan = the ratio of time to T4 time Complete benchmark results found at the TPC benchmark website http://www.tpc.org. Configuration Summary and Results Hardware Configuration: SPARC T4-4 server 4 x SPARC T4 3.0 GHz processors (total of 32 cores, 128 threads) 1024 GB memory 8 x internal SAS (8 x 300 GB) disk drives External Storage: 12 x Sun Storage 2540-M2 array storage, each with 12 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache Software Configuration: Oracle Solaris 11 11/11 Oracle Database 11g Release 2 Enterprise Edition Audited Results: Database Size: 3000 GB (Scale Factor 3000) TPC-H Composite: 205,792.0 QphH@3000GB Price/performance: $4.10/QphH@3000GB Available: 05/31/2012 Total 3 year Cost: $843,656 TPC-H Power: 190,325.1 TPC-H Throughput: 222,515.9 Database Load Time: 4:08:29 Benchmark Description The TPC-H benchmark is a performance benchmark established by the Transaction Processing Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC. TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system. The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor. Key Points and Best Practices Twelve Sun Storage 2540-M2 arrays were used for the benchmark. Each Sun Storage 2540-M2 array contains 12 15K RPM drives and is connected to a single dual port 8Gb FC HBA using 2 ports. Each Sun Storage 2540-M2 array showed 1.5 GB/sec for sequential read operations and showed linear scaling, achieving 18 GB/sec with twelve Sun Storage 2540-M2 arrays. These were stand alone IO tests. The peak IO rate measured from the Oracle database was 17 GB/sec. Oracle Solaris 11 11/11 required very little system tuning. Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric in which to compare systems. The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle Database parallel processes. Six Sun Storage 2540-M2 arrays were mirrored to another six Sun Storage 2540-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays. The TPC-H Refresh Function (RF) simulates periodical refresh portion of Data Warehouse by adding new sales and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are a key for this function and the SPARC T4-4 server outperformed both the IBM POWER7 server and HP ProLiant DL980 G7 server. (See the RF columns above.) See Also Transaction Processing Performance Council (TPC) Home Page Ideas International Benchmark Page SPARC T4-4 Server oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Sun Storage 2540-M2 Array oracle.com OTN Disclosure Statement TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 205,792.0 QphH@3000GB, $4.10/QphH@3000GB, available 5/31/12, 4 processors, 32 cores, 256 threads; IBM Power 780 QphH@3000GB, 192,001.1 QphH@3000GB, $6.37/QphH@3000GB, available 11/30/11, 8 processors, 32 cores, 128 threads; HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB available 10/13/10, 8 processors, 64 cores, 128 threads.

Read the article

Coming to OpenWorld? A must attend session…

- by Ruma Sanyal

Normal 0 false false false EN-US X-NONE X-NONE NTT Docomo, Inc. is the predominant mobile phone operator in Japan. The name is officially an abbreviation of the phrase, "do communications over the mobile network", and is also from a compound word dokomo, meaning "everywhere" in Japanese. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} One of the most important of NTT Docomo’s systems is ALADIN, which is a nationwide operating system shared with its eight regional subsidiaries. ALADIN has five primary functions: customer management, phone number management, information processing and storage, sales information management, and credit investigation. To enhance cost efficiency and help ensure stable operation of ALADIN, NTT Docomo has employed Oracle WebLogic Server as a new application platform. Further information on this can be found here. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Last year at OpenWorld, NTT Docomo was honored as an Innovation Award Winner for: · Implementing real time sales and contract management system enabling all services requested by customers for immediate activations before customer leaves the Docomo store · A robust disaster recovery strategy, room to grow the business, and ability to move custom Java development to a platform with built in standards - WebLogic · Better performance, better reliability, better stability, and smooth migration v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Normal 0 false false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Meet This Year's Most Impressive Innovators! This year we continue to honor customers for their most innovative and cutting-edge solutions using Oracle Fusion Middleware. Join us in celebrating award recipients’ great achievements and commitment to innovation. Oracle Fusion Middleware: Meet This Year's Most Impressive Innovators Session ID: CON7029 Tuesday September 30, 2014 @ 5-5:45 pm (PST) Yerba Buena Center for the Arts YBCA Theater (next to Moscone North) 700 Howard St., San Francisco, CA, 94103 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

Read the article

SQL Server Split() Function

- by HighAltitudeCoder

Title goes here Ever wanted a dbo.Split() function, but not had the time to debug it completely? Let me guess - you are probably working on a stored procedure with 50 or more parameters; two or three of them are parameters of differing types, while the other 47 or so all of the same type (id1, id2, id3, id4, id5...). Worse, you've found several other similar stored procedures with the ONLY DIFFERENCE being the number of like parameters taped to the end of the parameter list. If this is the situation you find yourself in now, you may be wondering, "why am I working with three different copies of what is basically the same stored procedure, and why am I having to maintain changes in three different places? Can't I have one stored procedure that accomplishes the job of all three? My answer to you: YES! Here is the Split() function I've created. /****************************************************************************** Split.sql ******************************************************************************/ /****************************************************************************** Split a delimited string into sub-components and return them as a table. Parameter 1: Input string which is to be split into parts. Parameter 2: Delimiter which determines the split points in input string. Works with space or spaces as delimiter. Split() is apostrophe-safe. SYNTAX: SELECT * FROM Split('Dvorak,Debussy,Chopin,Holst', ',') SELECT * FROM Split('Denver|Seattle|San Diego|New York', '|') SELECT * FROM Split('Denver is the super-awesomest city of them all.', ' ') ******************************************************************************/ USE AdventureWorks GO IF EXISTS (SELECT * FROM sysobjects WHERE xtype = 'TF' AND name = 'Split' ) BEGIN DROP FUNCTION Split END GO CREATE FUNCTION Split ( @InputString VARCHAR(8000), @Delimiter VARCHAR(50) ) RETURNS @Items TABLE ( Item VARCHAR(8000) ) AS BEGIN IF @Delimiter = ' ' BEGIN SET @Delimiter = ',' SET @InputString = REPLACE(@InputString, ' ', @Delimiter) END IF (@Delimiter IS NULL OR @Delimiter = '') SET @Delimiter = ',' --INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic --INSERT INTO @Items VALUES (@InputString) -- Diagnostic DECLARE @Item VARCHAR(8000) DECLARE @ItemList VARCHAR(8000) DECLARE @DelimIndex INT SET @ItemList = @InputString SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) WHILE (@DelimIndex != 0) BEGIN SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex) INSERT INTO @Items VALUES (@Item) -- Set @ItemList = @ItemList minus one less item SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+1, LEN(@ItemList)-@DelimIndex) SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) END -- End WHILE IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString BEGIN SET @Item = @ItemList INSERT INTO @Items VALUES (@Item) END -- No delimiters were encountered in @InputString, so just return @InputString ELSE INSERT INTO @Items VALUES (@InputString) RETURN END -- End Function GO ---- Set Permissions --GRANT SELECT ON Split TO UserRole1 --GRANT SELECT ON Split TO UserRole2 --GO The syntax is basically as follows: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C AND TABLE2.Id IN (SELECT * FROM Split(@IdList, ',')) @IdList is a parameter passed into the stored procedure, and the comma (',') is the delimiter you have chosen to split the parameter list on. You can also use it like this: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C HAVING COUNT(SELECT * FROM Split(@IdList, ',') Similarly, it can be used in other aggregate functions at run-time: SELECT MIN(SELECT * FROM Split(@IdList, ','), <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C GROUP BY <fields> Now that I've (hopefully effectively) explained the benefits to using this function and implementing it in one or more of your database objects, let me warn you of a caveat that you are likely to encounter. You may have a team member who waits until the right moment to ask you a pointed question: "Doesn't this function just do the same thing as using the IN function? Why didn't you just use that instead? In other words, why bother with this function?" What's happening is, one or more team members has failed to understand the reason for implementing this kind of function in the first place. (Note: this is THE MOST IMPORTANT ASPECT OF THIS POST). Allow me to outline a few pros to implementing this function, so you may effectively parry this question. Touche. 1) Code consolidation. You don't have to maintain what is basically the same code and logic, but with varying numbers of the same parameter in several SQL objects. I'm not going to go into the cons related to using this function, because the afore mentioned team member is probably more than adept at pointing these out. Remember, the real positive contribution is ou are decreasing the liklihood that your team fails to update all (x) duplicate copies of what are basically the same stored procedure, and so on... This is the classic downside to duplicate code. It is a virus, and you should kill it. You might be better off rejecting your team member's question, and responding with your own: "Would you rather maintain the same logic in multiple different stored procedures, and hope that the team doesn't forget to always update all of them at the same time?". In his head, he might be thinking "yes, I would like to maintain several different copies of the same stored procedure", although you probably will not get such a direct response. 2) Added flexibility - you can use the Split function elsewhere, and for splitting your data in different ways. Plus, you can use any kind of delimiter you wish. How can you know today the ways in which you might want to examine your data tomorrow? Segue to my next point. 3) Because the function takes a delimiter parameter, you can split the data in any number of ways. This greatly increases the utility of such a function and enables your team to work with the data in a variety of different ways in the future. You can split on a single char, symbol, word, or group of words. You can split on spaces. (The list goes on... test it out). Finally, you can dynamically define the behavior of a stored procedure (or other SQL object) at run time, through the use of this function. Rather than have several objects that accomplish almost the same thing, why not have only one instead?

Read the article

Solaris X86 AESNI OpenSSL Engine

- by danx

Solaris X86 AESNI OpenSSL Engine Cryptography is a major component of secure e-commerce. Since cryptography is compute intensive and adds a significant load to applications, such as SSL web servers (https), crypto performance is an important factor. Providing accelerated crypto hardware greatly helps these applications and will help lead to a wider adoption of cryptography, and lower cost, in e-commerce and other applications. The Intel Westmere microprocessor has six new instructions to acclerate AES encryption. They are called "AESNI" for "AES New Instructions". These are unprivileged instructions, so no "root", other elevated access, or context switch is required to execute these instructions. These instructions are used in a new built-in OpenSSL 1.0 engine available in Solaris 11, the aesni engine. Previous Work Previously, AESNI instructions were introduced into the Solaris x86 kernel and libraries. That is, the "aes" kernel module (used by IPsec and other kernel modules) and the Solaris pkcs11 library (for user applications). These are available in Solaris 10 10/09 (update 8) and above, and Solaris 11. The work here is to add the aesni engine to OpenSSL. X86 AESNI Instructions Intel's Xeon 5600 is one of the processors that support AESNI. This processor is used in the Sun Fire X4170 M2 As mentioned above, six new instructions acclerate AES encryption in processor silicon. The new instructions are: aesenc performs one round of AES encryption. One encryption round is composed of these steps: substitute bytes, shift rows, mix columns, and xor the round key. aesenclast performs the final encryption round, which is the same as above, except omitting the mix columns (which is only needed for the next encryption round). aesdec performs one round of AES decryption aesdeclast performs the final AES decryption round aeskeygenassist Helps expand the user-provided key into a "key schedule" of keys, one per round aesimc performs an "inverse mixed columns" operation to convert the encryption key schedule into a decryption key schedule pclmulqdq Not a AESNI instruction, but performs "carryless multiply" operations to acclerate AES GCM mode. Since the AESNI instructions are implemented in hardware, they take a constant number of cycles and are not vulnerable to side-channel timing attacks that attempt to discern some bits of data from the time taken to encrypt or decrypt the data. Solaris x86 and OpenSSL Software Optimizations Having X86 AESNI hardware crypto instructions is all well and good, but how do we access it? The software is available with Solaris 11 and is used automatically if you are running Solaris x86 on a AESNI-capable processor. AESNI is used internally in the kernel through kernel crypto modules and is available in user space through the PKCS#11 library. For OpenSSL on Solaris 11, AESNI crypto is available directly with a new built-in OpenSSL 1.0 engine, called the "aesni engine." This is in lieu of the extra overhead of going through the Solaris OpenSSL pkcs11 engine, which accesses Solaris crypto and digest operations. Instead, AESNI assembly is included directly in the new aesni engine. Instead of including the aesni engine in a separate library in /lib/openssl/engines/, the aesni engine is "built-in", meaning it is included directly in OpenSSL's libcrypto.so.1.0.0 library. This reduces overhead and the need to manually specify the aesni engine. Since the engine is built-in (that is, in libcrypto.so.1.0.0), the openssl -engine command line flag or API call is not needed to access the engine—the aesni engine is used automatically on AESNI hardware. Ciphers and Digests supported by OpenSSL aesni engine The Openssl aesni engine auto-detects if it's running on AESNI hardware and uses AESNI encryption instructions for these ciphers: AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CFB128, AES-192-CFB128, AES-256-CFB128, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-OFB, AES-192-OFB, and AES-256-OFB. Implementation of the OpenSSL aesni engine The AESNI assembly language routines are not a part of the regular Openssl 1.0.0 release. AESNI is a part of the "HEAD" ("development" or "unstable") branch of OpenSSL, for future release. But AESNI is also available as a separate patch provided by Intel to the OpenSSL project for OpenSSL 1.0.0. A minimal amount of "glue" code in the aesni engine works between the OpenSSL libcrypto.so.1.0.0 library and the assembly functions. The aesni engine code is separate from the base OpenSSL code and requires patching only a few source files to use it. That means OpenSSL can be more easily updated to future versions without losing the performance from the built-in aesni engine. OpenSSL aesni engine Performance Here's some graphs of aesni engine performance I measured by running openssl speed -evp $algorithm where $algorithm is aes-128-cbc, aes-192-cbc, and aes-256-cbc. These are using the 64-bit version of openssl on the same AESNI hardware, a Sun Fire X4170 M2 with a Intel Xeon E5620 @2.40GHz, running Solaris 11 FCS. "Before" is openssl without the aesni engine and "after" is openssl with the aesni engine. The numbers are MBytes/second. OpenSSL aesni engine performance on Sun Fire X4170 M2 (Xeon E5620 @2.40GHz) (Higher is better; "before"=OpenSSL on AESNI without AESNI engine software, "after"=OpenSSL AESNI engine) As you can see the speedup is dramatic for all 3 key lengths and for data sizes from 16 bytes to 8 Kbytes—AESNI is about 7.5-8x faster over hand-coded amd64 assembly (without aesni instructions). Verifying the OpenSSL aesni engine is present The easiest way to determine if you are running the aesni engine is to type "openssl engine" on the command line. No configuration, API, or command line options are needed to use the OpenSSL aesni engine. If you are running on Intel AESNI hardware with Solaris 11 FCS, you'll see this output indicating you are using the aesni engine: intel-westmere $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support If you are running on Intel without AESNI hardware you'll see this output indicating the hardware can't support the aesni engine: intel-nehalem $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support For Solaris on SPARC or older Solaris OpenSSL software, you won't see any aesni engine line at all. Third-party OpenSSL software (built yourself or from outside Oracle) will not have the aesni engine either. Solaris 11 FCS comes with OpenSSL version 1.0.0e. The output of typing "openssl version" should be "OpenSSL 1.0.0e 6 Sep 2011". 64- and 32-bit OpenSSL OpenSSL comes in both 32- and 64-bit binaries. 64-bit executable is now the default, at /usr/bin/openssl, and OpenSSL 64-bit libraries at /lib/amd64/libcrypto.so.1.0.0 and libssl.so.1.0.0 The 32-bit executable is at /usr/bin/i86/openssl and the libraries are at /lib/libcrytpo.so.1.0.0 and libssl.so.1.0.0. Availability The OpenSSL AESNI engine is available in Solaris 11 x86 for both the 64- and 32-bit versions of OpenSSL. It is not available with Solaris 10. You must have a processor that supports AESNI instructions, otherwise OpenSSL will fallback to the older, slower AES implementation without AESNI. Processors that support AESNI include most Westmere and Sandy Bridge class processor architectures. Some low-end processors (such as for mobile/laptop platforms) do not support AESNI. The easiest way to determine if the processor supports AESNI is with the isainfo -v command—look for "amd64" and "aes" in the output: $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu Conclusion The Solaris 11 OpenSSL aesni engine provides easy access to powerful Intel AESNI hardware cryptography, in addition to Solaris userland PKCS#11 libraries and Solaris crypto kernel modules.

Read the article

Enabling Service Availability in WCF Services

- by cibrax

It is very important for the enterprise to know which services are operational at any given point. There are many factors that can affect the availability of the services, some of them are external like a database not responding or any dependant service not working. However, in some cases, you only want to know whether a service is up or down, so a simple heart-beat mechanism with “Ping” messages would do the trick. Unfortunately, WCF does not provide a built-in mechanism to support this functionality, and you probably don’t to implement a “Ping” operation in any service that you have out there. For solving this in a generic way, there is a WCF extensibility point that comes to help us, the “Operation Invokers”. In a nutshell, an operation invoker is the class responsible invoking the service method with a set of parameters and generate the output parameters with the return value. What I am going to do here is to implement a custom operation invoker that intercepts any call to the service, and detects whether a “Ping” header was attached to the message. If the “Ping” header is detected, the operation invoker returns a new header to tell the client that the service is alive, and the real operation execution is omitted. In that way, we have a simple heart beat mechanism based on the messages that include a "Ping” header, so the client application can determine at any point whether the service is up or down. My operation invoker wraps the default implementation attached by default to any operation by WCF. internal class PingOperationInvoker : IOperationInvoker { IOperationInvoker innerInvoker; object[] outputs = null; object returnValue = null; public const string PingHeaderName = "Ping"; public const string PingHeaderNamespace = "http://tellago.serviceModel"; public PingOperationInvoker(IOperationInvoker innerInvoker, OperationDescription description) { this.innerInvoker = innerInvoker; outputs = description.SyncMethod.GetParameters() .Where(p => p.IsOut) .Select(p => DefaultForType(p.ParameterType)).ToArray(); var returnValue = DefaultForType(description.SyncMethod.ReturnType); } private static object DefaultForType(Type targetType) { return targetType.IsValueType ? Activator.CreateInstance(targetType) : null; } public object Invoke(object instance, object[] inputs, out object[] outputs) { object returnValue; if (Invoke(out returnValue, out outputs)) { return returnValue; } else { return this.innerInvoker.Invoke(instance, inputs, out outputs); } } private bool Invoke(out object returnValue, out object[] outputs) { object untypedProperty = null; if (OperationContext.Current .IncomingMessageProperties.TryGetValue(HttpRequestMessageProperty.Name, out untypedProperty)) { var httpRequestProperty = untypedProperty as HttpRequestMessageProperty; if (httpRequestProperty != null) { if (httpRequestProperty.Headers[PingHeaderName] != null) { outputs = this.outputs; if (OperationContext.Current .IncomingMessageProperties.TryGetValue(HttpRequestMessageProperty.Name, out untypedProperty)) { var httpResponseProperty = untypedProperty as HttpResponseMessageProperty; httpResponseProperty.Headers.Add(PingHeaderName, "Ok"); } returnValue = this.returnValue; return true; } } } var headers = OperationContext.Current.IncomingMessageHeaders; if (headers.FindHeader(PingHeaderName, PingHeaderNamespace) > -1) { outputs = this.outputs; MessageHeader<string> header = new MessageHeader<string>("Ok"); var untyped = header.GetUntypedHeader(PingHeaderName, PingHeaderNamespace); OperationContext.Current.OutgoingMessageHeaders.Add(untyped); returnValue = this.returnValue; return true; } returnValue = null; outputs = null; return false; } } The implementation above looks for the “Ping” header either in the Http Request or the Soap message. The next step is to implement a behavior for attaching this operation invoker to the services we want to monitor. [AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, AllowMultiple = false, Inherited = true)] public class PingBehavior : Attribute, IServiceBehavior, IOperationBehavior { public void AddBindingParameters(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase, Collection<ServiceEndpoint> endpoints, BindingParameterCollection bindingParameters) { } public void ApplyDispatchBehavior(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase) { } public void Validate(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase) { foreach (var endpoint in serviceDescription.Endpoints) { foreach (var operation in endpoint.Contract.Operations) { if (operation.Behaviors.Find<PingBehavior>() == null) operation.Behaviors.Add(this); } } } public void AddBindingParameters(OperationDescription operationDescription, BindingParameterCollection bindingParameters) { } public void ApplyClientBehavior(OperationDescription operationDescription, ClientOperation clientOperation) { } public void ApplyDispatchBehavior(OperationDescription operationDescription, DispatchOperation dispatchOperation) { dispatchOperation.Invoker = new PingOperationInvoker(dispatchOperation.Invoker, operationDescription); } public void Validate(OperationDescription operationDescription) { } } As an operation invoker can only be added in an “operation behavior”, a trick I learned in the past is that you can implement a service behavior as well and use the “Validate” method to inject it in all the operations, so the final configuration is much easier and cleaner. You only need to decorate the service with a simple attribute to enable the “Ping” functionality. [PingBehavior] public class HelloWorldService : IHelloWorld { public string Hello(string name) { return "Hello " + name; } } On the other hand, the client application needs to send a dummy message with a “Ping” header to detect whether the service is available or not. In order to simplify this task, I created a extension method in the WCF client channel to do this work. public static class ClientChannelExtensions { const string PingNamespace = "http://tellago.serviceModel"; const string PingName = "Ping"; public static bool IsAvailable<TChannel>(this IClientChannel channel, Action<TChannel> operation) { try { using (OperationContextScope scope = new OperationContextScope(channel)) { MessageHeader<string> header = new MessageHeader<string>(PingName); var untyped = header.GetUntypedHeader(PingName, PingNamespace); OperationContext.Current.OutgoingMessageHeaders.Add(untyped); try { operation((TChannel)channel); var headers = OperationContext.Current.IncomingMessageHeaders; if (headers.Any(h => h.Name == PingName && h.Namespace == PingNamespace)) { return true; } else { return false; } } catch (CommunicationException) { return false; } } } catch (Exception) { return false; } } } This extension method basically adds a “Ping” header to the request message, executes the operation passed as argument (Action<TChannel> operation), and looks for the corresponding “Ping” header in the response to see the results. The client application can use this extension with a single line of code, var client = new ServiceReference.HelloWorldClient(); var isAvailable = client.InnerChannel.IsAvailable<IHelloWorld>((c) => c.Hello(null)); The “isAvailable” variable will tell the client application whether the service is available or not. You can download the complete implementation from this location.

Read the article

General monitoring for SQL Server Analysis Services using Performance Monitor

- by Testas

A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs. File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor, a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used…… Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above 25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file

Read the article

Documenting C# Library using GhostDoc and SandCastle

- by sreejukg

Documentation is an essential part of any IT project, especially when you are creating reusable components that will be used by other developers (such as class libraries). Without documentation re-using a class library is almost impossible. Just think of coding .net applications without MSDN documentation (Ooops I can’t think of it). Normally developers, who know the bits and pieces of their classes, see this as a boring work to write details again to generate the documentation. Also the amount of work to make this and manage it changes made the process of manual creation of Documentation impossible or tedious. So what is the effective solution? Let me divide this into two steps 1. Generate comments for your code while you are writing the code. 2. Create documentation file using these comments. Now I am going to examine these processes. Step 1: Generate XML Comments automatically Most of the developers write comments for their code. The best thing is that the comments will be entered during the development process. Additionally comments give a good reference to the code, make your code more manageable/readable. Later these comments can be converted into documentation, along with your source code by identifying properties and methods I found an add-in for visual studio, GhostDoc that automatically generates XML documentation comments for C#. The add-in is available in Visual Studio Gallery at MSDN. You can download this from the url http://visualstudiogallery.msdn.microsoft.com/en-us/46A20578-F0D5-4B1E-B55D-F001A6345748. I downloaded the free version from the above url. The free version suits my requirement. There is a professional version (you need to pay some $ for this) available that gives you some more features. I found the free version itself suits my requirements. The installation process is straight forward. A couple of clicks will do the work for you. The best thing with GhostDoc is that it supports multiple versions of visual studio such as 2005, 2008 and 2010. After Installing GhostDoc, when you start Visual studio, the GhostDoc configuration dialog will appear. The first screen asks you to assign a hot key, pressing this hotkey will enter the comment to your code file with the necessary structure required by GhostDoc. Click Assign to go to the next step where you configure the rules for generating the documentation from the code file. Click Create to start creating the rules. Click finish button to close this wizard. Now you performed the necessary configuration required by GhostDoc. Now In Visual Studio tools menu you can find the GhostDoc that gives you some options. Now let us examine how GhostDoc generate comments for a method. I have write the below code in my code behind file. public Char GetChar(string str, int pos) { return str[pos]; } Now I need to generate the comments for this function. Select the function and enter the hot key assigned during the configuration. GhostDoc will generate the comments as follows. /// <summary> /// Gets the char. /// </summary> /// <param name="str">The STR.</param> /// <param name="pos">The pos.</param> /// <returns></returns> public Char GetChar(string str, int pos) { return str[pos]; } So this is a very handy tool that helps developers writing comments easily. You can generate the xml documentation file separately while compiling the project. This will be done by the C# compiler. You can enable the xml documentation creation option (checkbox) under Project properties -> Build tab. Now when you compile, the xml file will created under the bin folder. Step 2: Generate the documentation from the XML file Now you have generated the xml file documentation. Sandcastle is the tool from Microsoft that generates MSDN style documentation from the compiler produced XML file. The project is available in codeplex http://sandcastle.codeplex.com/. Download and install Sandcastle to your computer. Sandcastle is a command line tool that doesn’t have a rich GUI. If you want to automate the documentation generation, definitely you will be using the command line tools. Since I want to generate the documentation from the xml file generated in the previous step, I was expecting a GUI where I can see the options. There is a GUI available for Sandcastle called Sandcastle Help File Builder. See the link to the project in codeplex. http://www.codeplex.com/wikipage?ProjectName=SHFB. You need to install Sandcastle and then the Sandcastle Help file builder. From here I assume that you have installed both sandcastle and Sandcastle help file builder successfully. Once you installed the help file builder, it will be available in your all programs list. Click on the Sandcastle Help File Builder GUI, will launch application. First you need to create a project. Click on File -> New project The New project dialog will appear. Choose a folder to store your project file and give a name for your documentation project. Click the save button. Now you will see your project properties. Now from the Project explorer, right click on the Documentation Sources, Click on the Add Documentation Source link. A documentation source is a file such as an assembly or a Visual Studio solution or project from which information will be extracted to produce API documentation. From the Add Documentation source dialog, I have selected the XML file generated by my project. Once you add the xml file to the project, you will see the dll file automatically added by the help file builder. Now click on the build button. Now the application will generate the help file. The Build window gives to the result of each steps. Once the process completed successfully, you will have the following output in the build window. Now navigate to your Help Project (I have selected the folder My Documents\Documentation), inside help folder, you can find the chm file. Open the chm file will give you MSDN like documentation. Documentation is an important part of development life cycle. Sandcastle with GhostDoc make this process easier so that developers can implement the documentation in the projects with simple to use steps.

Read the article

Master Data Management and Cloud Computing

- by david.butler(at)oracle.com

Cloud Computing is all the rage these days. There are many reasons why this is so. But like its predecessor, Service Oriented Architecture, it can fall on hard times if the underlying data is left unmanaged. Master Data Management is the perfect Cloud companion. It can materially increase the chances for successful Cloud initiatives. In this blog, I'll review the nature of the Cloud and show how MDM fits in. Here's the National Institute of Standards and Technology Cloud definition: • Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction. Cloud architectures have three main layers: applications or Software as a Service (SaaS), Platforms as a Service (PaaS), and Infrastructure as a Service (IaaS). SaaS generally refers to applications that are delivered to end-users over the Internet. Oracle CRM On Demand is an example of a SaaS application. Today there are hundreds of SaaS providers covering a wide variety of applications including Salesforce.com, Workday, and Netsuite. Oracle MDM applications are located in this layer of Oracle's On Demand enterprise Cloud platform. We call it Master Data as a Service (MDaaS). PaaS generally refers to an application deployment platform delivered as a service. They are often built on a grid computing architecture and include database and middleware. Oracle Fusion Middleware is in this category and includes the SOA and Data Integration products used to connect SaaS applications including MDM. Finally, IaaS generally refers to computing hardware (servers, storage and network) delivered as a service. This typically includes the associated software as well: operating systems, virtualization, clustering, etc. Cloud Computing benefits are compelling for a large number of organizations. These include significant cost savings, increased flexibility, and fast deployments. Cost advantages include paying for just what you use. This is especially critical for organizations with variable or seasonal usage. Companies don't have to invest to support peak computing periods. Costs are also more predictable and controllable. Increased agility includes access to the latest technology and experts without making significant up front investments. While Cloud Computing is certainly very alluring with a clear value proposition, it is not without its challenges. An IDC survey of 244 IT executives/CIOs and their line-of-business (LOB) colleagues identified a number of issues: Security - 74% identified security as an issue involving data privacy and resource access control. Integration - 61% found that it is hard to integrate Cloud Apps with in-house applications. Operational Costs - 50% are worried that On Demand will actually cost more given the impact of poor data quality on the rest of the enterprise. Compliance - 49% felt that compliance with required regulatory, legal and general industry requirements (such as PCI, HIPAA and Sarbanes-Oxley) would be a major issue. When control is lost, the ability of a provider to directly manage how and where data is deployed, used and destroyed is negatively impacted. There are others, but I singled out these four top issues because Master Data Management, properly incorporated into a Cloud Computing infrastructure, can significantly ameliorate all of these problems. Cloud Computing can literally rain raw data across the enterprise. According to fellow blogger, Mike Ferguson, "the fracturing of data caused by the adoption of cloud computing raises the importance of MDM in keeping disparate data synchronized." David Linthicum, CTO Blue Mountain Labs blogs that "the lack of MDM will become more of an issue as cloud computing rises. We're moving from complex federated on-premise systems, to complex federated on-premise and cloud-delivered systems." Left unmanaged, non-standard, inconsistent, ungoverned data with questionable quality can pollute analytical systems, increase operational costs, and reduce the ROI in Cloud and On-Premise applications. As cloud computing becomes more relevant, and more data, applications, services, and processes are moved out to cloud computing platforms, the need for MDM becomes ever more important. Oracle's MDM suite is designed to deal with all four of the above Cloud issues listed in the IDC survey. Security - MDM manages all master data attribute privacy and resource access control issues. Integration - MDM pre-integrates Cloud Apps with each other and with On Premise applications at the data level. Operational Costs - MDM significantly reduces operational costs by increasing data quality, thereby improving enterprise business processes efficiency. Compliance - MDM, with its built in Data Governance capabilities, insures that the data is governed according to organizational standards. This facilitates rapid and accurate reporting for compliance purposes. Oracle MDM creates governed high quality master data. A unified cleansed and standardized data view is produced. The Oracle Customer Hub creates a single view of the customer. The Oracle Product Hub creates high quality product data designed to support all go-to-market processes. Oracle Supplier Hub dramatically reduces the chances of 'supplier exceptions'. Oracle Site Hub masters locations. And Oracle Hyperion Data Relationship Management masters financial reference data and manages enterprise hierarchies across operational areas from ERP to EPM and CRM to SCM. Oracle Fusion Middleware connects Cloud and On Premise applications to MDM Hubs and brings high quality master data to your enterprise business processes. An independent analyst once said "Poor data quality is like dirt on the windshield. You may be able to drive for a long time with slowly degrading vision, but at some point, you either have to stop and clear the windshield or risk everything." Cloud Computing has the potential to significantly degrade data quality across the enterprise over time. Deploying a Master Data Management solution prior to or in conjunction with a move to the Cloud can insure that the data flowing into the enterprise from the Cloud is clean and governed. This will in turn insure that expected returns on the investment in Cloud Computing will be realized. Oracle MDM has proven its metal in this area and has the customers to back that up. In fact, I will be hosting a webcast on Tuesday, April 10th at 10 am PT with one of our top Cloud customers, the Church Pension Group. They have moved all mainline applications to a hosted model and use Oracle MDM to insure the master data is managed and cleansed before it is propagated to other cloud and internal systems. I invite you join Martin Hossfeld, VP, IT Operations, and Danette Patterson, Enterprise Data Manager as they review business drivers for MDM and hosted applications, how they did it, the benefits achieved, and lessons learned. You can register for this free webcast here. Hope to see you there.

Read the article

OS Analytics - Deep Dive Into Your OS

- by Eran_Steiner

Enterprise Manager Ops Center provides a feature called "OS Analytics". This feature allows you to get a better understanding of how the Operating System is being utilized. You can research the historical usage as well as real time data. This post will show how you can benefit from OS Analytics and how it works behind the scenes. We will have a call to discuss this blog - please join us!Date: Thursday, November 1, 2012Time: 11:00 am, Eastern Daylight Time (New York, GMT-04:00)1. Go to https://oracleconferencing.webex.com/oracleconferencing/j.php?ED=209833067&UID=1512092402&PW=NY2JhMmFjMmFh&RT=MiMxMQ%3D%3D2. If requested, enter your name and email address.3. If a password is required, enter the meeting password: oracle1234. Click "Join". To join the teleconference:Call-in toll-free number: 1-866-682-4770 (US/Canada) Other countries: https://oracle.intercallonline.com/portlets/scheduling/viewNumbers/viewNumber.do?ownerNumber=5931260&audioType=RP&viewGa=true&ga=ONConference Code: 7629343#Security code: 7777# Here is quick summary of what you can do with OS Analytics in Ops Center: View historical charts and real time value of CPU, memory, network and disk utilization Find the top CPU and Memory processes in real time or at a certain historical day Determine proper monitoring thresholds based on historical data View Solaris services status details Drill down into a process details View the busiest zones if applicable Where to start To start with OS Analytics, choose the OS asset in the tree and click the Analytics tab. You can see the CPU utilization, Memory utilization and Network utilization, along with the current real time top 5 processes in each category (click the image to see a larger version): In the above screen, you can click each of the top 5 processes to see a more detailed view of that process. Here is an example of one of the processes: One of the cool things is that you can see the process tree for this process along with some port binding and open file descriptors. On Solaris machines with zones, you get an extra level of tabs, allowing you to get more information on the different zones: This is a good way to see the busiest zones. For example, one zone may not take a lot of CPU but it can consume a lot of memory, or perhaps network bandwidth. To see the detailed Analytics for each of the zones, simply click each of the zones in the tree and go to its Analytics tab. Next, click the "Processes" tab to see real time information of all the processes on the machine: An interesting column is the "Target" column. If you configured Ops Center to work with Enterprise Manager Cloud Control, then the two products will talk to each other and Ops Center will display the correlated target from Cloud Control in this table. If you are only using Ops Center - this column will remain empty. Next, if you view a Solaris machine, you will have a "Services" tab: By default, all services will be displayed, but you can choose to display only certain states, for example, those in maintenance or the degraded ones. You can highlight a service and choose to view the details, where you can see the Dependencies, Dependents and also the location of the service log file (not shown in the picture as you need to scroll down to see the log file). The "Threshold" tab is particularly helpful - you can view historical trends of different monitored values and based on the graph - determine what the monitoring values should be: You can ask Ops Center to suggest monitoring levels based on the historical values or you can set your own. The different colors in the graph represent the current set levels: Red for critical, Yellow for warning and Blue for Information, allowing you to quickly see how they're positioned against real data. It's important to note that when looking at longer periods, Ops Center smooths out the data and uses averages. So when looking at values such as CPU Usage, try shorter time frames which are more detailed, such as one hour or one day. Applying new monitoring values When first applying new values to monitored attributes - a popup will come up asking if it's OK to get you out of the current Monitoring Policy. This is OK if you want to either have custom monitoring for a specific machine, or if you want to use this current machine as a "Gold image" and extract a Monitoring Policy from it. You can later apply the new Monitoring Policy to other machines and also set it as a default Monitoring Profile. Once you're done with applying the different monitoring values, you can review and change them in the "Monitoring" tab. You can also click the "Extract a Monitoring Policy" in the actions pane on the right to save all the new values to a new Monitoring Policy, which can then be found under "Plan Management" -> "Monitoring Policies". Visiting the past Under the "History" tab you can "go back in time". This is very helpful when you know that a machine was busy a few hours ago (perhaps in the middle of the night?), but you were not around to take a look at it in real time. Here's a view into yesterday's data on one of the machines: You can see an interesting CPU spike happening at around 3:30 am along with some memory use. In the bottom table you can see the top 5 CPU and Memory consumers at the requested time. Very quickly you can see that this spike is related to the Solaris 11 IPS repository synchronization process using the "pkgrecv" command. The "time machine" doesn't stop here - you can also view historical data to determine which of the zones was the busiest at a given time: Under the hood The data collected is stored on each of the agents under /var/opt/sun/xvm/analytics/historical/ An "os.zip" file exists for the main OS. Inside you will find many small text files, named after the Epoch time stamp in which they were taken If you have any zones, there will be a file called "guests.zip" containing the same small files for all the zones, as well as a folder with the name of the zone along with "os.zip" in it If this is the Enterprise Controller or the Proxy Controller, you will have folders called "proxy" and "sat" in which you will find the "os.zip" for that controller The actual script collecting the data can be viewed for debugging purposes as well: On Linux, the location is: /opt/sun/xvmoc/private/os_analytics/collect On Solaris, the location is /opt/SUNWxvmoc/private/os_analytics/collect If you would like to redirect all the standard error into a file for debugging, touch the following file and the output will go into it: # touch /tmp/.collect.stderr The temporary data is collected under /var/opt/sun/xvm/analytics/.collectdb until it is zipped. If you would like to review the properties for the Analytics, you can view those per each agent in /opt/sun/n1gc/lib/XVM.properties. Find the section "Analytics configurable properties for OS and VSC" to view the Analytics specific values. I hope you find this helpful! Please post questions in the comments below. Eran Steiner

Read the article

My Thoughts On the Xbox 180

- by Chris Gardner

Originally posted on: http://geekswithblogs.net/freestylecoding/archive/2013/06/21/my-thoughts-on-the-xbox-180.aspx Everyone seems to be putting their 0.00237 cents into the wishing well over Microsoft's recent decision to reverse the DRM policy on the Xbox One. However, there have been a few issues that nobody has touched. As such, I have decided to dig 0.00237 cents out of my pocket. First, let me be clear about this point. I do not support the decision to reverse the DRM policy on the Xbox One. I wanted that point to be expressed first and unambiguously. I will say it again. I do not support the decision to reverse the DRM policy on the Xbox One. Now that I have that out of the way, let me go into my rationale. This decision removes most of the cool features that enticed me to pre-order the console. No, I didn't cancel my pre-order. There is still five months before the release of the console, and there is still a plethora of information that we, as consumers, do not have. With that, it should be noted that much of the talk in this post is speculation and rhetoric. I do not have any insider information that you do not possess. The persistent connection would have allowed the console to do many of the functions for which we have been begging. That demo where someone was playing Ryse, seamlessly accepted a multiplayer challenge in Killer Instinct, played the match (and a rematch,) and then jumped back into Ryse. That's gone, if you bought the game on disc. The new, DRM free system will require the disc in the system to play a game. That bullet point where one Xbox Live account could have up to 10 slave accounts so families could play together, no matter where they were located. That's gone as well. The promise of huge, expansive, dynamically changing worlds that was brought to us with the power of cloud computing. Well, "the people" didn't want there to be a forced, persistent connection. As such, developers can't rely on a connection and, as such, that feature is gone. This is akin to the removal of the hard drive on the Xbox 360. The list continues, but the enthusiast press has enumerated the list far better than I wish. All of this is because the Xbox team saw the HUGE success of Steam and decided to borrow a few ideas. Yes, Steam. The service that everyone hated for the first six months (for the same reasons the Xbox One is getting flack.) There was an initial growing pain. However, it is now lauded as the way games distribution should be handled. Unless you are Microsoft. I do find it curious that many of the features were originally announced for the PS4 during its unveiling. However, much of that was left strangely absent for Sony's E3 press conference. Instead, we received a single, static slide that basically said the exact opposite of Microsoft's plans. It is not farfetched to believe that slide came into existence during the approximately seven hours between the two media briefings. The thing that majorly annoys me over this whole kerfuffle is that the single thing that caused the call to arms is, really, not an issue. Microsoft never said they were going to block used sales. They said it was up to the publisher to make that decision. This would have allowed publishers to reclaim some of the costs of development in subsequent sales of the product. If you sell your game to GameStop for 7 USD, GameStop is going to sell it for 55 USD. That is 48 USD pure profit for them. Some publishers asked GameStop for a small cut. Was this a huge, money grubbing scheme? Well, yes, but the idea was that they have to handle server infrastructure for dormant accounts, etc. Of course, GameStop flatly refused, and the Online Pass was born. Fortunately, this trend didn’t last, and most publishers have stopped the practice. The ability to sell "licenses" has already begun to be challenged. Are you living in the EU? If so, companies must allow you to sell digital property. With this precedent in place, it's only a matter of time before other areas follow suit. If GameStop were smart, they should have immediately contacted every publisher out there to get the rights to become a clearing house for these licenses. Then, they keep their business model and could reduce their brick and mortar footprint. The digital landscape is changing. We need to not block this process. As Seth MacFarlane best said "Some issues are so important that you should drag people kicking and screaming." I believe this was said on an episode of Real Time with Bill Maher about the issue of Gay Marriages. Much like the original source, this is an issue that we need to drag people to the correct, progressive position. Microsoft, as a company, actually has the resources to weather the transition period. They have a great pool of first and second party developers that can leverage this new framework to prove the validity. Over time, the third party developers will get excited to use these tools. As an old C++ guy, I resisted C# for years. Now, I think it's one of the best languages I've ever used. I have a server room and a Co-Lo full of servers, so I originally didn't see the value in Azure. Now, I wish I could move every one of my projects into the cloud. I still LOVE getting physical packaging, which my music and games collection will proudly attest. However, I have started to see the value in pure digital, and have found ways to integrate this into the ways I consume those products. I can, honestly, understand how some parts of the population would be very apprehensive about this new landscape. There were valid arguments about people with no internet access. There are ways to combat these problems. These methods do not require us to throw the baby out with the bathwater. However, the number of people in the computer industry that I have seen cry foul is truly appalling. We are the forward looking people that help show how technology can improve people's lives. If we can't see the value of the brief pain involved with an exciting new ecosystem, than who will?

Read the article

Handling HumanTask attachments in Oracle BPM 11g PS4FP+ (I)

- by ccasares

Adding attachments to a HumanTask is a feature that exists in Oracle HWF (Human Workflow) since 10g. However, in 11g there have been many improvements on this feature and this entry will try to summarize them. Oracle BPM 11g 11.1.1.5.1 (aka PS4 Feature Pack or PS4FP) introduced two great features: Ability to link attachments at a Task scope or at a Process scope: "Task" attachments are only visible within the scope (lifetime) of a task. This means that, initially, any member of the assignment pattern of the Human Task will be able to handle (add, review or remove) attachments. However, once the task is completed, subsequent human tasks will not have access to them. This does not mean those attachments got lost. Once the human task is completed, attachments can be retrieved in order to, i.e., check them in to a Content Server or to inject them to a new and different human task. Aside note: a "re-initiated" human task will inherit comments and attachments, along with history and -optionally- payload. See here for more info. "Process" attachments are visible within the scope of the process. This means that subsequent human tasks in the same process instance will have access to them. Ability to use Oracle WebCenter Content (previously known as "Oracle UCM") as the backend for the attachments instead of using HWF database backend. This feature adds all content server document lifecycle capabilities to HWF attachments (versioning, RBAC, metadata management, etc). As of today, only Oracle WCC is supported. However, Oracle BPM Suite does include a license of Oracle WCC for the solely usage of document management within BPM scope. Here are some code samples that leverage the above features. Retrieving uploaded attachments -Non UCM- Non UCM attachments (default ones or those that have existed from 10g, and are stored "as-is" in HWK database backend) can be retrieved after the completion of the Human Task. Firstly, we need to know whether any attachment has been effectively uploaded to the human task. There are two ways to find it out: Through an XPath function: Checking the execData/attachment[] structure. For example: Once we are sure one ore more attachments were uploaded to the Human Task, we want to get them. In this example, by "get" I mean to get the attachment name and the payload of the file. Aside note: Oracle HWF lets you to upload two kind of [non-UCM] attachments: a desktop document and a Web URL. This example focuses just on the desktop document one. In order to "retrieve" an uploaded Web URL, you can get it directly from the execData/attachment[] structure. Attachment content (payload) is retrieved through the getTaskAttachmentContents() XPath function: This example shows how to retrieve as many attachments as those had been uploaded to the Human Task and write them to the server using the File Adapter service. The sample process excerpt is as follows: A dummy UserTask using "HumanTask1" Human Task followed by a Embedded Subprocess that will retrieve the attachments (we're assuming at least one attachment is uploaded): and once retrieved, we will write each of them back to a file in the server using a File Adapter service: In detail: We've defined an XSD structure that will hold the attachments (both name and payload): Then, we can create a BusinessObject based on such element (attachmentCollection) and create a variable (named attachmentBPM) of such BusinessObject type. We will also need to keep a copy of the HumanTask output's execData structure. Therefore we need to create a variable of type TaskExecutionData... ...and copy the HumanTask output execData to it: Now we get into the embedded subprocess that will retrieve the attachments' payload. First, and using an XSLT transformation, we feed the attachmentBPM variable with the name of each attachment and setting an empty value to the payload: Please note that we're using the XSLT for-each node to create as many target structures as necessary. Also note that we're setting an Empty text to the payload variable. The reason for this is to make sure the <payload></payload> tag gets created. This is needed when we map the payload to the XML variable later. Aside note: We are assuming that we're retrieving non-UCM attachments. However in real life you might want to check the type of attachment you're handling. The execData/attachment[]/storageType contains the values "UCM" for UCM type attachments, "TASK" for non-UCM ones or "URL" for Web URL ones. Those values are part of the "Ext.Com.Oracle.Xmlns.Bpel.Workflow.Task.StorageTypeEnum" enumeration. Once we have fed the attachmentsBPM structure and so it now contains the name of each of the attachments, it is time to iterate through it and get the payload. Therefore we will use a new embedded subprocess of type MultiInstance, that will iterate over the attachmentsBPM/attachment[] element: In every iteration we will use a Script activity to map the corresponding payload element with the result of the XPath function getTaskAttachmentContents(). Please, note how the target array element is indexed with the loopCounter predefined variable, so that we make sure we're feeding the right element during the array iteration: The XPath function used looks as follows: hwf:getTaskAttachmentContents(bpmn:getDataObject('UserTask1LocalExecData')/ns1:systemAttributes/ns1:taskId, bpmn:getDataObject('attachmentsBPM')/ns:attachment[bpmn:getActivityInstanceAttribute('SUBPROCESS3067107484296', 'loopCounter')]/ns:fileName) where the input parameters are: taskId of the just completed Human Task attachment name we're retrieving the payload from array index (loopCounter predefined variable) Aside note: The reason whereby we're iterating the execData/attachment[] structure through embedded subprocess and not, i.e., using XSLT and for-each nodes, is mostly because the getTaskAttachmentContents() XPath function is currently not available in XSLT mappings. So all this example might be considered as a workaround until this gets fixed/enhanced in future releases. Once this embedded subprocess ends, we will have all attachments (name + payload) in the attachmentsBPM variable, which is the main goal of this sample. But in order to test everything runs fine, we finish the sample writing each attachment to a file. To that end we include a final embedded subprocess to concurrently iterate through each attachmentsBPM/attachment[] element: On each iteration we will use a Service activity that invokes a File Adapter write service. In here we have two important parameters to set. First, the payload itself. The file adapter awaits binary data in base64 format (string). We have to map it using XPath (Simple mapping doesn't recognize a String as a base64-binary valid target): Second, we must set the target filename using the Service Properties dialog box: Again, note how we're making use of the loopCounter index variable to get the right element within the embedded subprocess iteration. Handling UCM attachments will be part of a different and upcoming blog entry. Once I finish will all posts on this matter, I will upload the whole sample project to java.net.

Read the article

Granular Clipboard Control in Oracle IRM

- by martin.abrahams

One of the main leak prevention controls that customers are looking for is clipboard control. After all, there is little point in controlling access to a document if authorised users can simply make unprotected copies by use of the cut and paste mechanism. Oddly, for such a fundamental requirement, many solutions only offer very simplistic clipboard control - and require the customer to make an awkward choice between usability and security. In many cases, clipboard control is simply an ON-OFF option. By turning the clipboard OFF, you disable one of the most valuable edit functions known to man. Try working for any length of time without copying and pasting, and you'll soon appreciate how valuable that function is. Worse, some solutions disable the clipboard completely - not just for the protected document but for all of the various applications you have open at the time. Normal service is only resumed when you close the protected document. In this way, policy enforcement bleeds out of the particular assets you need to protect and interferes with the entire user experience. On the other hand, turning the clipboard ON satisfies a fundamental usability requirement - but also makes it really easy for users to create unprotected copies of sensitive information, maliciously or otherwise. All they need to do is paste into another document. If creating unprotected copies is this simple, you have to question how much you are really gaining by applying protection at all. You may not be allowed to edit, forward, or print the protected asset, but all you need to do is create a copy and work with that instead. And that activity would not be tracked in any way. So, a simple ON-OFF control creates a real tension between usability and security. If you are only using IRM on a small scale, perhaps security can outweigh usability - the business can put up with the restriction if it only applies to a handful of important documents. But try extending protection to large numbers of documents and large user communities, and the restriction rapidly becomes really unwelcome. I am aware of one solution that takes a different tack. Rather than disable the clipboard, pasting is always permitted, but protection is automatically applied to any document that you paste into. At first glance, this sounds great - protection travels with the content. However, at any scale this model may not be so appealing once you've had to deal with support calls from users who have accidentally applied protection to documents that really don't need it - which would be all too easily done. This may help control leakage, but it also pollutes the system with documents that have policies applied with no obvious rhyme or reason, and it can seriously inconvenience the business by making non-sensitive documents difficult to access. And what policy applies if you paste some protected content into an already protected document? Which policy applies? There are no prizes for guessing that Oracle IRM takes a rather different approach. The Oracle IRM Approach Oracle IRM offers a spectrum of clipboard controls between the extremes of ON and OFF, and it leverages the classification-based rights model to give granular control that satisfies both security and usability needs. Firstly, we take it for granted that if you have EDIT rights, of course you can use the clipboard within a given document. Why would we force you to retype a piece of content that you want to move from HERE... to HERE...? If the pasted content remains in the same document, it is equally well protected whether it be at the beginning, middle, or end - or all three. So, the first point is that Oracle IRM always enables the clipboard if you have the right to edit the file. Secondly, whether we enable or disable the clipboard, we only affect the protected document. That is, you can continue to use the clipboard in the usual way for unprotected documents and applications regardless of whether the clipboard is enabled or disabled for the protected document(s). And if you have multiple protected documents open, each may have the clipboard enabled or disabled independently, according to whether you have Edit rights for each. So, even for the simplest cases - the ON-OFF cases - Oracle IRM adds value by containing the effect to the protected documents rather than to the whole desktop environment. Now to the granular options between ON and OFF. Thanks to our classification model, we can define rights that enable pasting between documents in the same classification - ie. between documents that are protected by the same policy. So, if you are working on this month's financial report and you want to pull some data from last month's report, you can simply cut and paste between the two documents. The two documents are classified the same way, subject to the same policy, so the content is equally safe in both documents. However, if you try to paste the same data into an unprotected document or a document in a different classification, you can be prevented. Thus, the control balances legitimate user requirements to allow pasting with legitimate information security concerns to keep data protected. We can take this further. You may have the right to paste between related classifications of document. So, the CFO might want to copy some financial data into a board document, where the two documents are sealed to different classifications. The CFO's rights may well allow this, as it is a reasonable thing for a CFO to want to do. But policy might prevent the CFO from copying the same data into a classification that is accessible to external parties. The above option, to copy between classifications, may be for specific classifications or open-ended. That is, your rights might enable you to go from A to B but not to C, or you might be allowed to paste to any classification subject to your EDIT rights. As for so many features of Oracle IRM, our classification-based rights model makes this type of granular control really easy to manage - you simply define that pasting is permitted between classifications A and B, but omit C. Or you might define that pasting is permitted between all classifications, but not to unprotected locations. The classification model enables millions of documents to be controlled by a few such rules. Finally, you MIGHT have the option to paste anywhere - such that unprotected copies may be created. This is rare, but a legitimate configuration for some users, some use cases, and some classifications - but not something that you have to permit simply because the alternative is too restrictive. As always, these rights are defined in user roles - so different users are subject to different clipboard controls as required in different classifications. So, where most solutions offer just two clipboard options - ON-OFF or ON-but-encrypt-everything-you-touch - Oracle IRM offers real granularity that leverages our classification model. Indeed, I believe it is the lack of a classification model that makes such granularity impractical for other IRM solutions, because the matrix of rules for controlling pasting would be impossible to manage - there are so many documents to consider, and more are being created all the time.

Read the article

Building KPIs to monitor your business Its not really about the Technology

When I have discussions with people about Business Intelligence, one of the questions the inevitably come up is about building KPIs and how to accomplish that. From a technical level the concept of a KPI is very simple, almost too simple in that it is like the tip of an iceberg floating above the water. The key to that iceberg is not really the tip, but the mass of the iceberg that is hidden beneath the surface upon which the tip sits. The analogy of the iceberg is not meant to indicate that the foundation of the KPI is overly difficult or complex. The disparity in size in meant to indicate that the larger thing that needs to be defined is not the technical tip, but the underlying business definition of what the KPI means. From a technical perspective the KPI consists of primarily the following items: Actual Value This is the actual value data point that is being measured. An example would be something like the amount of sales. Target Value This is the target goal for the KPI. This is a number that can be measured against Actual Value. An example would be $10,000 in monthly sales. Target Indicator Range This is the definition of ranges that define what type of indicator the user will see comparing the Actual Value to the Target Value. Most often this is defined by stoplight, but can be any indicator that is going to show a status in a quick fashion to the user. Typically this would be something like: Red Light = Actual Value more than 5% below target; Yellow Light = Within 5% of target either direction; Green Light = More than 5% higher than Target Value Status\Trend Indicator This is an optional attribute of a KPI that is typically used to show some kind of trend. The vast majority of these indicators are used to show some type of progress against a previous period. As an example, the status indicator might be used to show how the monthly sales compare to last month. With this type of indicator there needs to be not only a definition of what the ranges are for your status indictor, but then also what value the number needs to be compared against. So now we have an idea of what data points a KPI consists of from a technical perspective lets talk a bit about tools. As you can see technically there is not a whole lot to them and the choice of technology is not as important as the definition of the KPIs, which we will get to in a minute. There are many different types of tools in the Microsoft BI stack that you can use to expose your KPI to the business. These include Performance Point, SharePoint, Excel, and SQL Reporting Services. There are pluses and minuses to each technology and the right technology is based a lot on your goals and how you want to deliver the information to the users. Additionally, there are other non-Microsoft tools that can be used to expose KPI indicators to your business users. Regardless of the technology used as your front end, the heavy lifting of KPI is in the business definition of the values and benchmarks for that KPI. The discussion about KPIs is very dependent on the history of an organization and how much they are exposed to the attributes of a KPI. Often times when discussing KPIs with a business contact who has not been exposed to KPIs the discussion tends to also be a session educating the business user about what a KPI is and what goes into the definition of a KPI. The majority of times the business user has an idea of what their actual values are and they have been tracking those numbers for some time, generally in Excel and all manually. So they will know the amount of sales last month along with sales two years ago in the same month. Where the conversation tends to get stuck is when you start discussing what the target value should be. The actual value is answering the What and How much questions. When you are talking about the Target values you are asking the question Is this number good or bad. Typically, the user will know whether or not the value is good or bad, but most of the time they are not able to quantify what is good or bad. Their response is usually something like I just know. Because they have been watching the sales quantity for years now, they can tell you that a 5% decrease in sales this month might actually be a good thing, maybe because the salespeople are all waiting until next month when the new versions come out. It can sometimes be very hard to break the business people of this habit. One of the fears generally is that the status indicator is not subjective. Thus, in the scenario above, the business user is going to be fearful that their boss, just looking at a negative red indicator, is going to haul them out to the woodshed for a bad month. But, on the flip side, if all you are displaying is the amount of sales, only a person with knowledge of last month sales and the target amount for this month would have any idea if $10,000 in sales is good or not. Here is where a key point about KPIs needs to be communicated to both the business user and any user who might be viewing the results of that KPI. The KPI is just one tool that is used to report on business performance. The KPI is meant as a quick indicator of one business statistic. It is not meant to tell the entire story. It does not answer the question Why. Its primary purpose is to objectively and quickly expose an area of the business that might warrant more review. There is always going to be the need to do further analysis on any potential negative or neutral KPI. So, hopefully, once you have convinced your business user to come up with some target numbers and ranges for status indicators, you then need to take the next step and help them answer the Why question. The main question here to ask is, Okay, you see the indicator and you need to discover why the number is what is, where do you go?. The answer is usually a combination of sources. A sales manager might have some of the following items at their disposal (Marketing report showing a decrease in the promotional discounts for the month, Pricing Report showing the reduction of prices of older models, an Inventory Report showing the discontinuation of a particular product line, or a memo showing the ending of a large affiliate partnership. The answers to the question Why are never as simple as a single indicator value. Bring able to quickly get to this information is all about designing how a user accesses the KPIs and then also how easily they can get to the additional information they need. This is where a Dashboard mentality can come in handy. For example, the business user can have a dashboard that shows their KPIs, but also has links to some of the common reports that they run regarding Sales Data. The users boss may have the same KPIs on their dashboard, but instead of links to individual reports they are going to have a link to a status report that was created by the user that pulls together all the data about the KPI in a summary format the users boss can review. So some of the key things to think about when building or evaluating KPIs for your organization: Technology should not be the driving factor KPIs are of little value without some indicator for whether a value is good, bad or neutral. KPIs only give an answer to the Is this number good\bad? question Make sure the ability to drill into the Why of a KPI is close at hand and relevant to the user who is viewing the KPI. The KPI is a key business tool when defined properly to help monitor business performance across the enterprise in an objective and consistent manner. At times it might feel like the process of defining the business aspects of a KPI can sometimes be arduous, the payoff in the end can far outweigh the costs. Some of the benefits of going through this process are a better understanding of the key metrics for an organization and the measure of those metrics and a consistent snapshot of business performance that can be utilized across the organization. And I think that these are benefits to any organization regardless of the technology or the implementation.Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article

New Feature in ODI 11.1.1.6: ODI for Big Data

- by Julien Testut

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} By Ananth Tirupattur Starting with Oracle Data Integrator 11.1.1.6.0, ODI is offering a solution to process Big Data. This post provides an overview of this feature. With all the buzz around Big Data and before getting into the details of ODI for Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data. So, what is Big Data? Big data includes: structured data (this includes data from relation data stores, xml data stores), semi-structured data (this includes data from weblogs) unstructured data (this includes data from text blob, images) Traditionally, business decisions are based on the information gathered from transactional data. For example, transactional Data from CRM applications is fed to a decision system for analysis and decision making. Products such as ODI play a key role in enabling decision systems. However, with the emergence of massive amounts of semi-structured and unstructured data it is important for decision system to include them in the analysis to achieve better decision making capability. While there is an abundance of opportunities for business for gaining competitive advantages, process of Big Data has challenges. The challenges of processing Big Data include: Volume of data Velocity of data - The high Rate at which data is generated Variety of data In order to address these challenges and convert them into opportunities, we would need an appropriate framework, platform and the right set of tools. Hadoop is an open source framework which is highly scalable, fault tolerant system, for storage and processing large amounts of data. Hadoop provides 2 key services, distributed and reliable storage called Hadoop Distributed File System or HDFS and a framework for parallel data processing called Map-Reduce. Innovations in Hadoop and its related technology continue to rapidly evolve, hence therefore, it is highly recommended to follow information on the web to keep up with latest information. Oracle's vision is to provide a comprehensive solution to address the challenges faced by Big Data. Oracle is providing the necessary Hardware, software and tools for processing Big Data Oracle solution includes: Big Data Appliance Oracle NoSQL Database Cloudera distribution for Hadoop Oracle R Enterprise- R is a statistical package which is very popular among data scientists. ODI solution for Big Data Oracle Loader for Hadoop for loading data from Hadoop to Oracle. Further details can be found here: http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html ODI Solution for Big Data: ODI’s goal is to minimize the need to understand the complexity of Hadoop framework and simplify the adoption of processing Big Data seamlessly in an enterprise. ODI is providing the capabilities for an integrated architecture for processing Big Data. This includes capability to load data in to Hadoop, process data in Hadoop and load data from Hadoop into Oracle. ODI is expanding its support for Big Data by providing the following out of the box Knowledge Modules (KMs). IKM File to Hive (LOAD DATA).Load unstructured data from File (Local file system or HDFS ) into Hive IKM Hive Control AppendTransform and validate structured data on Hive IKM Hive TransformTransform unstructured data on Hive IKM File/Hive to Oracle (OLH)Load processed data in Hive to Oracle RKM HiveReverse engineer Hive tables to generate models Using the Loading KM you can map files (local and HDFS files) to the corresponding Hive tables. For example, you can map weblog files categorized by date into a corresponding partitioned Hive table schema. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Hive control Append KM you can validate and transform data in Hive. In the below example, two source Hive tables are joined and mapped to a target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} The Hive Transform KM facilitates processing of semi-structured data in Hive. In the below example, the data from weblog is processed using a Perl script and mapped to target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Oracle Loader for Hadoop (OLH) KM you can load data from Hive table or HDFS to a corresponding table in Oracle. OLH is available as a standalone product. ODI greatly enhances OLH capability by generating the configuration and mapping files for OLH based on the configuration provided in the interface and KM options. ODI seamlessly invokes OLH when executing the scenario. In the below example, a HDFS file is mapped to a table in Oracle. Development and Deployment:The following diagram illustrates the development and deployment of ODI solution for Big Data. Using the ODI Studio on your development machine create and develop ODI solution for processing Big Data by connecting to a MySQL DB or Oracle database on a BDA machine or Hadoop cluster. Schedule the ODI scenarios to be executed on the ODI agent deployed on the BDA machine or Hadoop cluster. ODI Solution for Big Data provides several exciting new capabilities to facilitate the adoption of Big Data in an enterprise. You can find more information about the Oracle Big Data connectors on OTN. You can find an overview of all the new features introduced in ODI 11.1.1.6 in the following document: ODI 11.1.1.6 New Features Overview

Read the article

Combination of Operating Mode and Commit Strategy

- by Kevin Yang

If you want to populate a source into multiple targets, you may also want to ensure that every row from the source affects all targets uniformly (or separately). Let’s consider the Example Mapping below. If a row from SOURCE causes different changes in multiple targets (TARGET_1, TARGET_2 and TARGET_3), for example, it can be successfully inserted into TARGET_1 and TARGET_3, but failed to be inserted into TARGET_2, and the current Mapping Property TLO (target load order) is “TARGET_1 -> TARGET_2 -> TARGET_3”. What should Oracle Warehouse Builder do, in order to commit the appropriate data to all affected targets at the same time? If it doesn’t behave as you intended, the data could become inaccurate and possibly unusable. Example Mapping In OWB, we can use Mapping Configuration Commit Strategies and Operating Modes together to achieve this kind of requirements. Below we will explore the combination of these two features and how they affect the results in the target tables Before going to the example, let’s review some of the terms we will be using (Details can be found in white paper Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide11g Release 2): Operating Modes: Set-Based Mode: Warehouse Builder generates a single SQL statement that processes all data and performs all operations. Row-Based Mode: Warehouse Builder generates statements that process data row by row. The select statement is in a SQL cursor. All subsequent statements are PL/SQL. Row-Based (Target Only) Mode: Warehouse Builder generates a cursor select statement and attempts to include as many operations as possible in the cursor. For each target, Warehouse Builder inserts each row into the target separately. Commit Strategies: Automatic: Warehouse Builder loads and then automatically commits data based on the mapping design. If the mapping has multiple targets, Warehouse Builder commits and rolls back each target separately and independently of other targets. Use the automatic commit when the consequences of multiple targets being loaded unequally are not great or are irrelevant. Automatic correlated: It is a specialized type of automatic commit that applies to PL/SQL mappings with multiple targets only. Warehouse Builder considers all targets collectively and commits or rolls back data uniformly across all targets. Use the correlated commit when it is important to ensure that every row in the source affects all affected targets uniformly. Manual: select manual commit control for PL/SQL mappings when you want to interject complex business logic, perform validations, or run other mappings before committing data. Combination of the commit strategy and operating mode To understand the effects of each combination of operating mode and commit strategy, I’ll illustrate using the following example Mapping. Firstly we insert 100 rows into the SOURCE table and make sure that the 99th row and 100th row have the same ID value. And then we create a unique key constraint on ID column for TARGET_2 table. So while running the example mapping, OWB tries to load all 100 rows to each of the targets. But the mapping should fail to load the 100th row to TARGET_2, because it will violate the unique key constraint of table TARGET_2. With different combinations of Commit Strategy and Operating Mode, here are the results ¦ Set-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: A single error anywhere in the mapping triggers the rollback of all data. OWB encounters the error inserting into Target_2, it reports an error for the table and does not load the row. OWB rolls back all the rows inserted into Target_1 and does not attempt to load rows to Target_3. No rows are added to any of the target tables. ¦ Row-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately and loads it to all three targets. Loading continues in this way until OWB encounters an error loading row 100th to Target_2. OWB reports the error and does not load the row. It rolls back the row 100th previously inserted into Target_1 and does not attempt to load row 100 to Target_3. Then, if there are remaining rows, OWB will continue loading them, resuming with loading rows to Target_1. The mapping completes with 99 rows inserted into each target. ¦ Set-based/ Automatic Commit: Configuration of Example mapping: Result: What’s happening: When OWB encounters the error inserting into Target_2, it does not load any rows and reports an error for the table. It does, however, continue to insert rows into Target_3 and does not roll back the rows previously inserted into Target_1. The mapping completes with one error message for Target_2, no rows inserted into Target_2, and 100 rows inserted into Target_1 and Target_3 separately. ¦ Row-based/Automatic Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately for loading into the targets. Loading continues in this way until OWB encounters an error loading row 100 to Target_2 and reports the error. OWB does not roll back row 100th from Target_1, does insert it into Target_3. If there are remaining rows, it will continue to load them. The mapping completes with 99 rows inserted into Target_2 and 100 rows inserted into each of the other targets. Note: Automatic Correlated commit is not applicable for row-based (target only). If you design a mapping with the row-based (target only) and correlated commit combination, OWB runs the mapping but does not perform the correlated commit. In set-based mode, correlated commit may impact the size of your rollback segments. Space for rollback segments may be a concern when you merge data (insert/update or update/insert). Correlated commit operates transparently with PL/SQL bulk processing code. The correlated commit strategy is not available for mappings run in any mode that are configured for Partition Exchange Loading or that include a Queue, Match Merge, or Table Function operator. If you want to practice in your own environment, you can follow the steps: 1. Import the MDL file: commit_operating_mode.mdl 2. Fix the location for oracle module ORCL and deploy all tables under it. 3. Insert sample records into SOURCE table, using below plsql code: begin for i in 1..99 loop insert into source values(i, 'col_'||i); end loop; insert into source values(99, 'col_99'); end; 4. Configure MAPPING_1 to any combinations of operating mode and commit strategy you want to test. And make sure feature TLO of mapping is open. 5. Deploy Mapping “MAPPING_1”. 6. Run the mapping and check the result.

Read the article

SQL SERVER – Weekly Series – Memory Lane – #035

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Row Overflow Data Explanation In SQL Server 2005 one table row can contain more than one varchar(8000) fields. One more thing, the exclusions has exclusions also the limit of each individual column max width of 8000 bytes does not apply to varchar(max), nvarchar(max), varbinary(max), text, image or xml data type columns. Comparison Index Fragmentation, Index De-Fragmentation, Index Rebuild – SQL SERVER 2000 and SQL SERVER 2005 An old but like a gold article. Talks about lots of concepts related to Index and the difference from earlier version to the newer version. I strongly suggest that everyone should read this article just to understand how SQL Server has moved forward with the technology. Improvements in TempDB SQL Server 2005 had come up with quite a lots of improvements and this blog post describes them and explains the same. If you ask me what is my the most favorite article from early career. I must point out to this article as when I wrote this one I personally have learned a lot of new things. Recompile All The Stored Procedure on Specific TableI prefer to recompile all the stored procedure on the table, which has faced mass insert or update. sp_recompiles marks stored procedures to recompile when they execute next time. This blog post explains the same with the help of a script. 2008 SQLAuthority Download – SQL Server Cheatsheet You can download and print this cheat sheet and use it for your personal reference. If you have any suggestions, please let me know and I will see if I can update this SQL Server cheat sheet. Difference Between DBMS and RDBMS What is the difference between DBMS and RDBMS? DBMS – Data Base Management System RDBMS – Relational Data Base Management System or Relational DBMS High Availability – Hot Add Memory Hot Add CPU and Hot Add Memory are extremely interesting features of the SQL Server, however, personally I have not witness them heavily used. These features also have few restriction as well. I blogged about them in detail. 2009 Delete Duplicate Rows I have demonstrated in this blog post how one can identify and delete duplicate rows. Interesting Observation of Logon Trigger On All Servers – Solution The question I put forth in my previous article was – In single login why the trigger fires multiple times; it should be fired only once. I received numerous answers in thread as well as in my MVP private news group. Now, let us discuss the answer for the same. The answer is – It happens because multiple SQL Server services are running as well as intellisense is turned on. Blog post demonstrates how we can do the same with the help of SQL scripts. Management Studio New Features I have selected my favorite 5 features and blogged about it. IntelliSense for Query Editing Multi Server Query Query Editor Regions Object Explorer Enhancements Activity Monitors Maximum Number of Index per Table One of the questions I asked in my user group was – What is the maximum number of Index per table? I received lots of answers to this question but only two answers are correct. Let us now take a look at them in this blog post. 2010 Default Statistics on Column – Automatic Statistics on Column The truth is, Statistics can be in a table even though there is no Index in it. If you have the auto- create and/or auto-update Statistics feature turned on for SQL Server database, Statistics will be automatically created on the Column based on a few conditions. Please read my previously posted article, SQL SERVER – When are Statistics Updated – What triggers Statistics to Update, for the specific conditions when Statistics is updated. 2011 T-SQL Scripts to Find Maximum between Two Numbers In this blog post there are two different scripts listed which demonstrates way to find the maximum number between two numbers. I need your help, which one of the script do you think is the most accurate way to find maximum number? Find Details for Statistics of Whole Database – DMV – T-SQL Script I was recently asked is there a single script which can provide all the necessary details about statistics for any database. This question made me write following script. I was initially planning to use sp_helpstats command but I remembered that this is marked to be deprecated in future. 2012 Introduction to Function SIGN SIGN Function is very fundamental function. It will return the value 1, -1 or 0. If your value is negative it will return you negative -1 and if it is positive it will return you positive +1. Let us start with a simple small example. Template Browser – A Very Important and Useful Feature of SSMS Templates are like a quick cheat sheet or quick reference. Templates are available to create objects like databases, tables, views, indexes, stored procedures, triggers, statistics, and functions. Templates are also available for Analysis Services as well. The template scripts contain parameters to help you customize the code. You can Replace Template Parameters dialog box to insert values into the script. An invalid floating point operation occurred If you run any of the above functions they will give you an error related to invalid floating point. Honestly there is no workaround except passing the function appropriate values. SQRT of a negative number will give you result in real numbers which is not supported at this point of time as well LOG of a negative number is not possible (because logarithm is the inverse function of an exponential function and the exponential function is NEVER negative). Validating Spatial Object with IsValidDetailed Function SQL Server 2012 has introduced the new function IsValidDetailed(). This function has made my life very easy. In simple words, this function will check if the spatial object passed is valid or not. If it is valid it will give information that it is valid. If the spatial object is not valid it will return the answer that it is not valid and the reason for the same. This makes it very easy to debug the issue and make the necessary correction. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article

CI Deployment Of Azure Web Roles Using TeamCity

- by srkirkland

After recently migrating an important new website to use Windows Azure “Web Roles” I wanted an easier way to deploy new versions to the Azure Staging environment as well as a reliable process to rollback deployments to a certain “known good” source control commit checkpoint. By configuring our JetBrains’ TeamCity CI server to utilize Windows Azure PowerShell cmdlets to create new automated deployments, I’ll show you how to take control of your Azure publish process. Step 0: Configuring your Azure Project in Visual Studio Before we can start looking at automating the deployment, we should make sure manual deployments from Visual Studio are working properly. Detailed information for setting up deployments can be found at http://msdn.microsoft.com/en-us/library/windowsazure/ff683672.aspx#PublishAzure or by doing some quick Googling, but the basics are as follows: Install the prerequisite Windows Azure SDK Create an Azure project by right-clicking on your web project and choosing “Add Windows Azure Cloud Service Project” (or by manually adding that project type) Configure your Role and Service Configuration/Definition as desired Right-click on your azure project and choose “Publish,” create a publish profile, and push to your web role You don’t actually have to do step #4 and create a publish profile, but it’s a good exercise to make sure everything is working properly. Once your Windows Azure project is setup correctly, we are ready to move on to understanding the Azure Publish process. Understanding the Azure Publish Process The actual Windows Azure project is fairly simple at its core—it builds your dependent roles (in our case, a web role) against a specific service and build configuration, and outputs two files: ServiceConfiguration.Cloud.cscfg: This is just the file containing your package configuration info, for example Instance Count, OsFamily, ConnectionString and other Setting information. ProjectName.Azure.cspkg: This is the package file that contains the guts of your deployment, including all deployable files. When you package your Azure project, these two files will be created within the directory ./[ProjectName].Azure/bin/[ConfigName]/app.publish/. If you want to build your Azure Project from the command line, it’s as simple as calling MSBuild on the “Publish” target: msbuild.exe /target:Publish Windows Azure PowerShell Cmdlets The last pieces of the puzzle that make CI automation possible are the Azure PowerShell Cmdlets (http://msdn.microsoft.com/en-us/library/windowsazure/jj156055.aspx). These cmdlets are what will let us create deployments without Visual Studio or other user intervention. Preparing TeamCity for Azure Deployments Now we are ready to get our TeamCity server setup so it can build and deploy Windows Azure projects, which we now know requires the Azure SDK and the Windows Azure PowerShell Cmdlets. Installing the Azure SDK is easy enough, just go to https://www.windowsazure.com/en-us/develop/net/ and click “Install” Once this SDK is installed, I recommend running a test build to make sure your project is building correctly. You’ll want to setup your build step using MSBuild with the “Publish” target against your solution file. Mine looks like this: Assuming the build was successful, you will now have the two *.cspkg and *cscfg files within your build directory. If the build was red (failed), take a look at the build logs and keep an eye out for “unsupported project type” or other build errors, which will need to be addressed before the CI deployment can be completed. With a successful build we are now ready to install and configure the Windows Azure PowerShell Cmdlets: Follow the instructions at http://msdn.microsoft.com/en-us/library/windowsazure/jj554332 to install the Cmdlets and configure PowerShell After installing the Cmdlets, you’ll need to get your Azure Subscription Info using the Get-AzurePublishSettingsFile command. Store the resulting *.publishsettings file somewhere you can get to easily, like C:\TeamCity, because you will need to reference it later from your deploy script. Scripting the CI Deploy Process Now that the cmdlets are installed on our TeamCity server, we are ready to script the actual deployment using a TeamCity “PowerShell” build runner. Before we look at any code, here’s a breakdown of our deployment algorithm: Setup your variables, including the location of the *.cspkg and *cscfg files produced in the earlier MSBuild step (remember, the folder is something like [ProjectName].Azure/bin/[ConfigName]/app.publish/ Import the Windows Azure PowerShell Cmdlets Import and set your Azure Subscription information (this is basically your authentication/authorization step, so protect your settings file Now look for a current deployment, and if you find one Upgrade it, else Create a new deployment Pretty simple and straightforward. Now let’s look at the code (also available as a gist here: https://gist.github.com/3694398): $subscription = "[Your Subscription Name]" $service = "[Your Azure Service Name]" $slot = "staging" #staging or production $package = "[ProjectName]\bin\[BuildConfigName]\app.publish\[ProjectName].cspkg" $configuration = "[ProjectName]\bin\[BuildConfigName]\app.publish\ServiceConfiguration.Cloud.cscfg" $timeStampFormat = "g" $deploymentLabel = "ContinuousDeploy to $service v%build.number%" Write-Output "Running Azure Imports" Import-Module "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell\Azure\*.psd1" Import-AzurePublishSettingsFile "C:\TeamCity\[PSFileName].publishsettings" Set-AzureSubscription -CurrentStorageAccount $service -SubscriptionName $subscription function Publish(){ $deployment = Get-AzureDeployment -ServiceName $service -Slot $slot -ErrorVariable a -ErrorAction silentlycontinue if ($a[0] -ne $null) { Write-Output "$(Get-Date -f $timeStampFormat) - No deployment is detected. Creating a new deployment. " } if ($deployment.Name -ne $null) { #Update deployment inplace (usually faster, cheaper, won't destroy VIP) Write-Output "$(Get-Date -f $timeStampFormat) - Deployment exists in $servicename. Upgrading deployment." UpgradeDeployment } else { CreateNewDeployment } } function CreateNewDeployment() { write-progress -id 3 -activity "Creating New Deployment" -Status "In progress" Write-Output "$(Get-Date -f $timeStampFormat) - Creating New Deployment: In progress" $opstat = New-AzureDeployment -Slot $slot -Package $package -Configuration $configuration -label $deploymentLabel -ServiceName $service $completeDeployment = Get-AzureDeployment -ServiceName $service -Slot $slot $completeDeploymentID = $completeDeployment.deploymentid write-progress -id 3 -activity "Creating New Deployment" -completed -Status "Complete" Write-Output "$(Get-Date -f $timeStampFormat) - Creating New Deployment: Complete, Deployment ID: $completeDeploymentID" } function UpgradeDeployment() { write-progress -id 3 -activity "Upgrading Deployment" -Status "In progress" Write-Output "$(Get-Date -f $timeStampFormat) - Upgrading Deployment: In progress" # perform Update-Deployment $setdeployment = Set-AzureDeployment -Upgrade -Slot $slot -Package $package -Configuration $configuration -label $deploymentLabel -ServiceName $service -Force $completeDeployment = Get-AzureDeployment -ServiceName $service -Slot $slot $completeDeploymentID = $completeDeployment.deploymentid write-progress -id 3 -activity "Upgrading Deployment" -completed -Status "Complete" Write-Output "$(Get-Date -f $timeStampFormat) - Upgrading Deployment: Complete, Deployment ID: $completeDeploymentID" } Write-Output "Create Azure Deployment" Publish Creating the TeamCity Build Step The only thing left is to create a second build step, after your MSBuild “Publish” step, with the build runner type “PowerShell”. Then set your script to “Source Code,” the script execution mode to “Put script into PowerShell stdin with “-Command” arguments” and then copy/paste in the above script (replacing the placeholder sections with your values). This should look like the following: Wrap Up After combining the MSBuild /target:Publish step (which creates the necessary Windows Azure *.cspkg and *.cscfg files) and a PowerShell script step which utilizes the Azure PowerShell Cmdlets, we have a fully deployable build configuration in TeamCity. You can configure this step to run whenever you’d like using build triggers – for example, you could even deploy whenever a new master branch deploy comes in and passes all required tests. In the script I’ve hardcoded that every deployment goes to the Staging environment on Azure, but you could deploy straight to Production if you want to, or even setup a deployment configuration variable and set it as desired. After your TeamCity Build Configuration is complete, you’ll see something that looks like this: Whenever you click the “Run” button, all of your code will be compiled, published, and deployed to Windows Azure! One additional enormous benefit of automating the process this way is that you can easily deploy any specific source control changeset by clicking the little ellipsis button next to "Run.” This will bring up a dialog like the one below, where you can select the last change to use for your deployment. Since Azure Web Role deployments don’t have any rollback functionality, this is a critical feature. Enjoy!

Read the article

Try a sample: Using the counter predicate for event sampling

- by extended_events

Extended Events offers a rich filtering mechanism, called predicates, that allows you to reduce the number of events you collect by specifying criteria that will be applied during event collection. (You can find more information about predicates in Using SQL Server 2008 Extended Events (by Jonathan Kehayias)) By evaluating predicates early in the event firing sequence we can reduce the performance impact of collecting events by stopping event collection when the criteria are not met. You can specify predicates on both event fields and on a special object called a predicate source. Predicate sources are similar to action in that they typically are related to some type of global information available from the server. You will find that many of the actions available in Extended Events have equivalent predicate sources, but actions and predicates sources are not the same thing. Applying predicates, whether on a field or predicate source, is very similar to what you are used to in T-SQL in terms of how they work; you pick some field/source and compare it to a value, for example, session_id = 52. There is one predicate source that merits special attention though, not just for its special use, but for how the order of predicate evaluation impacts the behavior you see. I’m referring to the counter predicate source. The counter predicate source gives you a way to sample a subset of events that otherwise meet the criteria of the predicate; for example you could collect every other event, or only every tenth event. Simple CountingThe counter predicate source works by creating an in memory counter that increments every time the predicate statement is evaluated. Here is a simple example with my favorite event, sql_statement_completed, that only collects the second statement that is run. (OK, that’s not much of a sample, but this is for demonstration purposes. Here is the session definition: CREATE EVENT SESSION counter_test ON SERVERADD EVENT sqlserver.sql_statement_completed (ACTION (sqlserver.sql_text) WHERE package0.counter = 2)ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS) You can find general information about the session DDL syntax in BOL and from Pedro’s post Introduction to Extended Events. The important part here is the WHERE statement that defines that I only what the event where package0.count = 2; in other words, only the second instance of the event. Notice that I need to provide the package name along with the predicate source. You don’t need to provide the package name if you’re using event fields, only for predicate sources. Let’s say I run the following test queries: -- Run three statements to test the sessionSELECT 'This is the first statement'GOSELECT 'This is the second statement'GOSELECT 'This is the third statement';GO Once you return the event data from the ring buffer and parse the XML (see my earlier post on reading event data) you should see something like this: event_name sql_text sql_statement_completed SELECT ‘This is the second statement’ You can see that only the second statement from the test was actually collected. (Feel free to try this yourself. Check out what happens if you remove the WHERE statement from your session. Go ahead, I’ll wait.) Percentage Sampling OK, so that wasn’t particularly interesting, but you can probably see that this could be interesting, for example, lets say I need a 25% sample of the statements executed on my server for some type of QA analysis, that might be more interesting than just the second statement. All comparisons of predicates are handled using an object called a predicate comparator; the simple comparisons such as equals, greater than, etc. are mapped to the common mathematical symbols you know and love (eg. = and >), but to do the less common comparisons you will need to use the predicate comparators directly. You would probably look to the MOD operation to do this type sampling; we would too, but we don’t call it MOD, we call it divides_by_uint64. This comparator evaluates whether one number is divisible by another with no remainder. The general syntax for using a predicate comparator is pred_comp(field, value), field is always first and value is always second. So lets take a look at how the session changes to answer our new question of 25% sampling: CREATE EVENT SESSION counter_test_25 ON SERVERADD EVENT sqlserver.sql_statement_completed (ACTION (sqlserver.sql_text) WHERE package0.divides_by_uint64(package0.counter,4))ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS)GO Here I’ve replaced the simple equivalency check with the divides_by_uint64 comparator to check if the counter is evenly divisible by 4, which gives us back every fourth record. I’ll leave it as an exercise for the reader to test this session. Why order matters I indicated at the start of this post that order matters when it comes to the counter predicate – it does. Like most other predicate systems, Extended Events evaluates the predicate statement from left to right; as soon as the predicate statement is proven false we abandon evaluation of the remainder of the statement. The counter predicate source is only incremented when it is evaluated so whether or not the counter is incremented will depend on where it is in the predicate statement and whether a previous criteria made the predicate false or not. Here is a generic example: Pred1: (WHERE statement_1 AND package0.counter = 2)Pred2: (WHERE package0.counter = 2 AND statement_1) Let’s say I cause a number of events as follows and examine what happens to the counter predicate source. Iteration Statement Pred1 Counter Pred2 Counter A Not statement_1 0 1 B statement_1 1 2 C Not statement_1 1 3 D statement_1 2 4 As you can see, in the case of Pred1, statement_1 is evaluated first, when it fails (A & C) predicate evaluation is stopped and the counter is not incremented. With Pred2 the counter is evaluated first, so it is incremented on every iteration of the event and the remaining parts of the predicate are then evaluated. In this example, Pred1 would return an event for D while Pred2 would return an event for B. But wait, there is an interesting side-effect here; consider Pred2 if I had run my statements in the following order: Not statement_1 Not statement_1 statement_1 statement_1 In this case I would never get an event back from the system because the point at which counter=2, the rest of the predicate evaluates as false so the event is not returned. If you’re using the counter target for sampling and you’re not getting the expected events, or any events, check the order of the predicate criteria. As a general rule I’d suggest that the counter criteria should be the last element of your predicate statement since that will assure that your sampling rate will apply to the set of event records defined by the rest of your predicate. Aside: I’m interested in hearing about uses for putting the counter predicate criteria earlier in the predicate statement. If you have one, post it in a comment to share with the class. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

How to archive data from a table to a local or remote database in SQL 2005 and SQL 2008

- by simonsabin

Often you have the need to archive data from a table. This leads to a number of challenges 1. How can you do it without impacting users 2. How can I make it transactionally consistent, i.e. the data I put in the archive is the data I remove from the main table 3. How can I get it to perform well Points 1 is very much tied to point 3. If it doesn't perform well then the delete of data is going to cause lots of locks and thus potentially blocking. For points 1 and 3 refer to my previous posts DELETE-TOP-x-rows-avoiding-a-table-scan and UPDATE-and-DELETE-TOP-and-ORDER-BY---Part2. In essence you need to be removing small chunks of data from your table and you want to do that avoiding a table scan. So that deals with the delete approach but archiving is about inserting that data somewhere else. Well in SQL 2008 they introduced a new feature INSERT over DML (Data Manipulation Language, i.e. SQL statements that change data), or composable DML. The ability to nest DML statements within themselves, so you can past the results of an insert to an update to a merge. I've mentioned this before here SQL-Server-2008---MERGE-and-optimistic-concurrency. This feature is currently limited to being able to consume the results of a DML statement in an INSERT statement. There are many restrictions which you can find here http://msdn.microsoft.com/en-us/library/ms177564.aspx look for the section "Inserting Data Returned From an OUTPUT Clause Into a Table" Even with the restrictions what we can do is consume the OUTPUT from a DELETE and INSERT the results into a table in another database. Note that in BOL it refers to not being able to use a remote table, remote means a table on another SQL instance. To show this working use this SQL to setup two databases foo and fooArchive create database foo go --create the source table fred in database foo select * into foo..fred from sys.objects go create database fooArchive go if object_id('fredarchive',DB_ID('fooArchive')) is null begin select getdate() ArchiveDate,* into fooArchive..FredArchive from sys.objects where 1=2 end go And then we can use this simple statement to archive the data insert into fooArchive..FredArchive select getdate(),d.* from (delete top (1) from foo..Fred output deleted.*) d go In this statement the delete can be any delete statement you wish so if you are deleting by ids or a range of values then you can do that. Refer to the DELETE-TOP-x-rows-avoiding-a-table-scan post to ensure that your delete is going to perform. The last thing you want to do is to perform 100 deletes each with 5000 records for each of those deletes to do a table scan. For a solution that works for SQL2005 or if you want to archive to a different server then you can use linked servers or SSIS. This example shows how to do it with linked servers. [ONARC-LAP03] is the source server. begin transaction insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.*') d commit transaction and to prove the transactions work try, you should get the same number of records before and after. select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive begin transaction insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.*') d rollback transaction select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive The transactions are very important with this solution. Look what happens when you don't have transactions and an error occurs select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.* raiserror (''Oh doo doo'',15,15)') d select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive Before running this think what the result would be. I got it wrong. What seems to happen is that the remote query is executed as a transaction, the error causes that to rollback. However the results have already been sent to the client and so get inserted into the

Read the article

SQL SERVER – Weekly Series – Memory Lane – #052

- by Pinal Dave

Let us continue with the final episode of the Memory Lane Series. Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Set Server Level FILLFACTOR Using T-SQL Script Specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or alteration. fillfactor must be an integer value from 1 to 100. The default is 0. Limitation of Online Index Rebuld Operation Online operation means when online operations are happening in the database are in normal operational condition, the processes which are participating in online operations does not require exclusive access to the database. Get Permissions of My Username / Userlogin on Server / Database A few days ago, I was invited to one of the largest database company. I was asked to review database schema and propose changes to it. There was special username or user logic was created for me, so I can review their database. I was very much interested to know what kind of permissions I was assigned per server level and database level. I did not feel like asking Sr. DBA the question about permissions. Simple Example of WHILE Loop With CONTINUE and BREAK Keywords This question is one of those questions which is very simple and most of the users get it correct, however few users find it confusing for the first time. I have tried to explain the usage of simple WHILE loop in the first example. BREAK keyword will exit the stop the while loop and control is moved to the next statement after the while loop. CONTINUE keyword skips all the statement after its execution and control is sent to the first statement of while loop. Forced Parameterization and Simple Parameterization – T-SQL and SSMS When the PARAMETERIZATION option is set to FORCED, any literal value that appears in a SELECT, INSERT, UPDATE or DELETE statement is converted to a parameter during query compilation. When the PARAMETERIZATION database option is SET to SIMPLE, the SQL Server query optimizer may choose to parameterize the queries. 2008 Transaction and Local Variables – Swap Variables – Update All At Once Concept Summary : Transaction have no effect over memory variables. When UPDATE statement is applied over any table (physical or memory) all the updates are applied at one time together when the statement is committed. First of all I suggest that you read the article listed above about the effect of transaction on local variant. As seen there local variables are independent of any transaction effect. Simulate INNER JOIN using LEFT JOIN statement – Performance Analysis Just a day ago, while I was working with JOINs I find one interesting observation, which has prompted me to create following example. Before we continue further let me make very clear that INNER JOIN should be used where it cannot be used and simulating INNER JOIN using any other JOINs will degrade the performance. If there are scopes to convert any OUTER JOIN to INNER JOIN it should be done with priority. 2009 Introduction to Business Intelligence – Important Terms & Definitions Business intelligence (BI) is a broad category of application programs and technologies for gathering, storing, analyzing, and providing access to data from various data sources, thus providing enterprise users with reliable and timely information and analysis for improved decision making. Difference Between Candidate Keys and Primary Key Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key. Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key. 2010 Taking Multiple Backup of Database in Single Command – Mirrored Database Backup I recently had a very interesting experience. In one of my recent consultancy works, I was told by our client that they are going to take the backup of the database and will also a copy of it at the same time. I expressed that it was surely possible if they were going to use a mirror command. In addition, they told me that whenever they take two copies of the database, the size of the database, is always reduced. Now this was something not clear to me, I said it was not possible and so I asked them to show me the script. Corrupted Backup File and Unsuccessful Restore The CTO, who was also present at the location, got very upset with this situation. He then asked when the last successful restore test was done. As expected, the answer was NEVER.There were no successful restore tests done before. During that time, I was present and I could clearly see the stress, confusion, carelessness and anger around me. I did not appreciate the feeling and I was pretty sure that no one in there wanted the atmosphere like me. 2011 TRACEWRITE – Wait Type – Wait Related to Buffer and Resolution SQL Trace is a SQL Server database engine technology which monitors specific events generated when various actions occur in the database engine. When any event is fired it goes through various stages as well various routes. One of the routes is Trace I/O Provider, which sends data to its final destination either as a file or rowset. DATEDIFF – Accuracy of Various Dateparts If you want to have accuracy in seconds, you need to use a different approach. In the first example, the accurate method is to find the number of seconds first and then divide it by 60 to convert it in minutes. Dedicated Access Control for SQL Server Express Edition http://www.youtube.com/watch?v=1k00z82u4OI Book Signing at SQLPASS 2012 Who I Am And How I Got Here – True Story as Blog Post If there was a shortcut to success – I want to know. I learnt SQL Server hard way and I am still learning. There are so many things, I have to learn. There is not enough time to learn everything which we want to learn. I am constantly working on it every day. I welcome you to join my journey as well. Please join me in my journey to learn SQL Server – more the merrier. Vacation, Travel and Study – A New Concept Even those who have advanced degrees and went to college for years, or even decades, find studying hard. There is a difference between studying for a career and studying for a certification. At least to get a degree there is a variety of subjects, with labs, exams, and practice problems to make things more interesting. Order By Numeric Values Formatted as String We have a table which has a column containing alphanumeric data. The data always has first as an integer and later part as a string. The business need is to order the data based on the first part of the alphanumeric data which is an integer. Now the problem is that no matter how we use ORDER BY the result is not produced as expected. Let us understand this with an example. Resolving SQL Server Connection Errors – SQL in Sixty Seconds #030 – Video One of the most famous errors related to SQL Server is about connecting to SQL Server itself. Here is how it goes, most of the time developers have worked with SQL Server and knows pretty much every error which they face during development language. However, hardly they install fresh SQL Server. As the installation of the SQL Server is a rare occasion unless you are a DBA who is responsible for such an instance – the error faced during installations are pretty rare as well. http://www.youtube.com/watch?v=1k00z82u4OI Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article

The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article

Creating vCard action result

- by DigiMortal

I added support for vCards to one of my ASP.NET MVC applications. I worked vCard support out as very simple and intelligent solution that fits perfectly to ASP.NET MVC applications. In this posting I will show you how to send vCards out as response to ASP.NET MVC request. We need three things: some vCard class, vCard action result, controller method to test vCard action result. Everything is very simple, let’s get hands on. vCard class As first thing we need vCard class. Last year I introduced vCard class that supports also images. Let’s take this class because it is easy to use and some dirty work is already done for us. NB! Take a look at ASP.NET example in the blog posting referred above. We need it later when we close the topic. Now think about how useful blogging and information sharing with others can be. With this class available at public I saved pretty much time now. :) vCardResult As we have vCard it is now time to write action result that we can use in our controllers. Here’s the code. public class vCardResult : ActionResult { private vCard _card; protected vCardResult() { } public vCardResult(vCard card) { _card = card; } public override void ExecuteResult(ControllerContext context) { var response = context.HttpContext.Response; response.ContentType = "text/vcard"; response.AddHeader("Content-Disposition", "attachment; fileName=" + _card.FirstName + " " + _card.LastName + ".vcf"); var cardString = _card.ToString(); var inputEncoding = Encoding.Default; var outputEncoding = Encoding.GetEncoding("windows-1257"); var cardBytes = inputEncoding.GetBytes(cardString); var outputBytes = Encoding.Convert(inputEncoding, outputEncoding, cardBytes); response.OutputStream.Write(outputBytes, 0, outputBytes.Length); } } And we are done. Some notes: vCard is sent to browser as downloadable file (user can save or open it with Outlook or any other e-mail client that supports vCards), File name is made of first and last name of contact. Encoding is important because Outlook may not understand vCards otherwise (don’t know if this problem is solved in Outlook 2010). Using vCardResult in controller Now let’s tale a look at simple controller method that accepts person ID and returns vCardResult. public class ContactsController : Controller { // ... other controller methods ... public vCardResult vCard(int id) { var person = _partyRepository.GetPersonById(id); var card = new vCard { FirstName=person.FirstName, LastName = person.LastName, StreetAddress = person.StreetAddress, City = person.City, CountryName = person.Country.Name, Mobile = person.Mobile, Phone = person.Phone, Email = person.Email, }; return new vCardResult(card); } } Now you can run Visual Studio and check out how your vCard is moving from your web application to your e-mail client. Conclusion We took old code that worked well with ASP.NET Forms and we divided it into action result and controller method that uses vCard as bridge between our controller and action result. All functionality is located where it should be and we did nothing complex. We wrote only couple of lines of very easy code to achieve our goal. Do you understand now why I love ASP.NET MVC? :)

Read the article

The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Search Results

Search found 8274 results on 331 pages for 'offtopic but important'.

Page 304/331 | < Previous Page | 300 301 302 303 304 305 306 307 308 309 310 311 | Next Page >

- by Jim Mcglothlin

- by Brian

- by Ruma Sanyal

- by HighAltitudeCoder

- by danx

- by cibrax

- by Testas

- by sreejukg

- by david.butler(at)oracle.com

- by Eran_Steiner

- by Chris Gardner

- by ccasares

- by martin.abrahams

- by Julien Testut

- by Kevin Yang

- by Pinal Dave

- by srkirkland

- by extended_events

- by simonsabin

- by Pinal Dave

- by Rob Farley

- by DigiMortal

- by Rob Farley

< Previous Page | 300 301 302 303 304 305 306 307 308 309 310 311 | Next Page >