Search Results

Search found 7596 results on 304 pages for 'prepared statement'.

Page 291/304 | < Previous Page | 287 288 289 290 291 292 293 294 295 296 297 298 | Next Page >

The SPARC SuperCluster

- by Karoly Vegh

Oracle has been providing a lead in the Engineered Systems business for quite a while now, in accordance with the motto "Hardware and Software Engineered to Work Together." Indeed it is hard to find a better definition of these systems. Allow me to summarize the idea. It is: Build a compute platform optimized to run your technologies Develop application aware, intelligently caching storage components Take an impressively fast network technology interconnecting it with the compute nodes Tune the application to scale with the nodes to yet unseen performance Reduce the amount of data moving via compression Provide this all in a pre-integrated single product with a single-pane management interface All these ideas have been around in IT for quite some time now. The real Oracle advantage is adding the last one to put these all together. Oracle has built quite a portfolio of Engineered Systems, to run its technologies - and run those like they never ran before. In this post I'll focus on one of them that serves as a consolidation demigod, a multi-purpose engineered system. As you probably have guessed, I am talking about the SPARC SuperCluster. It has many great features inherited from its predecessors, and it adds several new ones. Allow me to pick out and elaborate about some of the most interesting ones from a technological point of view. I. It is the SPARC SuperCluster T4-4. That is, as compute nodes, it includes SPARC T4-4 servers that we learned to appreciate and respect for their features: The SPARC T4 CPUs: Each CPU has 8 cores, each core runs 8 threads. The SPARC T4-4 servers have 4 sockets. That is, a single compute node can in parallel, simultaneously execute 256 threads. Now, a full-rack SPARC SuperCluster has 4 of these servers on board. Remember the keyword demigod. While retaining the forerunner SPARC T3's exceptional throughput, the SPARC T4 CPUs raise the bar with single performance too - a humble 5x better one than their ancestors. actually, the SPARC T4 CPU cores run in both single-threaded and multi-threaded mode, and switch between these two on-the-fly, fulfilling not only single-threaded OR multi-threaded applications' needs, but even mixed requirements (like in database workloads!). Data security, anyone? Every SPARC T4 CPU core has a built-in encryption engine, that is, encryption algorithms cast into silicon. A PCI controller right on the chip for customers who need I/O performance. Built-in, no-cost Virtualization: Oracle VM for SPARC (the former LDoms or Logical Domains) is not a server-emulation virtualization technology but rather a serverpartitioning one, the hypervisor runs in the server firmware, and all the VMs' HW resources (I/O, CPU, memory) are accessed natively, without performance overhead. This enables customers to run a number of Solaris 10 and Solaris 11 VMs separated, independent of each other within a physical server II. For Database performance, it includes Exadata Storage Cells - one of the main reasons why the Exadata Database Machine performs at diabolic speed. What makes them important? They provide DB backend storage for your Oracle Databases to run on the SPARC SuperCluster, that is what they are built and tuned for DB performance. These storage cells are SQL-aware. That is, if a SPARC T4 database compute node executes a query, it doesn't simply request tons of raw datablocks from the storage, filters the received data, and throws away most of it where the statement doesn't apply, but provides the SQL query to the storage node too. The storage cell software speaks SQL, that is, it is able to prefilter and through that transfer only the relevant data. With this, the traffic between database nodes and storage cells is reduced immensely. Less I/O is a good thing - as they say, all the CPUs of the world do one thing just as fast as any other - and that is waiting for I/O. They don't only pre-filter, but also provide data preprocessing features - e.g. if a DB-node requests an aggregate of data, they can calculate it, and handover only the results, not the whole set. Again, less data to transfer. They support the magical HCC, (Hybrid Columnar Compression). That is, data can be stored in a precompressed form on the storage. Less data to transfer. Of course one can't simply rely on disks for performance, there is Flash Storage included there for caching. III. The low latency, high-speed backbone network: InfiniBand, that interconnects all the members with: Real High Speed: 40 Gbit/s. Full Duplex, of course. Oh, and a really low latency. RDMA. Remote Direct Memory Access. This technology allows the DB nodes to do exactly that. Remotely, directly placing SQL commands into the Memory of the storage cells. Dodging all the network-stack bottlenecks, avoiding overhead, placing requests directly into the process queue. You can also run IP over InfiniBand if you please - that's the way the compute nodes can communicate with each other. IV. Including a general-purpose storage too: the ZFSSA, which is a unified storage, providing NAS and SAN access too, with the following features: NFS over RDMA over InfiniBand. Nothing is faster network-filesystem-wise. All the ZFS features onboard, hybrid storage pools, compression, deduplication, snapshot, replication, NFS and CIFS shares Storageheads in a HA-Cluster configuration providing availability of the data DTrace Live Analytics in a web-based Administration UI Being a general purpose application data storage for your non-database applications running on the SPARC SuperCluster over whichever protocol they prefer, easily replicating, snapshotting, cloning data for them. There's a lot of great technology included in Oracle's SPARC SuperCluster, we have talked its interior through. As for external scalability: you can start with a half- of full- rack SPARC SuperCluster, and scale out to several racks - that is, stacking not separate full-rack SPARC SuperClusters, but extending always one large instance of the size of several full-racks. Yes, over InfiniBand network. Add racks as you grow. What technologies shall run on it? SPARC SuperCluster is a general purpose scaleout consolidation/cloud environment. You can run Oracle Databases with RAC scaling, or Oracle Weblogic (end enjoy the SPARC T4's advantages to run Java). Remember, Oracle technologies have been integrated with the Oracle Engineered Systems - this is the Oracle on Oracle advantage. But you can run other software environments such as SAP if you please too. Run any application that runs on Oracle Solaris 10 or Solaris 11. Separate them in Virtual Machines, or even Oracle Solaris Zones, monitor and manage those from a central UI. Here the key takeaways once again: The SPARC SuperCluster: Is a pre-integrated Engineered System Contains SPARC T4-4 servers with built-in virtualization, cryptography, dynamic threading Contains the Exadata storage cells that intelligently offload the burden of the DB-nodes Contains a highly available ZFS Storage Appliance, that provides SAN/NAS storage in a unified way Combines all these elements over a high-speed, low-latency backbone network implemented with InfiniBand Can grow from a single half-rack to several full-rack size Supports the consolidation of hundreds of applications To summarize: All these technologies are great by themselves, but the real value is like in every other Oracle Engineered System: Integration. All these technologies are tuned to perform together. Together they are way more than the sum of all - and a careful and actually very time consuming integration process is necessary to orchestrate all these for performance. The SPARC SuperCluster's goal is to enable infrastructure operations and offer a pre-integrated solution that can be architected and delivered in hours instead of months of evaluations and tests. The tedious and most importantly time and resource consuming part of the work - testing and evaluating - has been done. Now go, provide services. -- charlie

Read the article
What more a Business Service can do?

- by Rajesh Sharma

Business services can be accessed from outside the application via XAI inbound service, or from within the application via scripting, Java, or info zones. Below is an example to what you can do with a business service wrapping an info zone. Generally, a business service is specific to a page service program which references a maintenance object, that means one business service = one service program = one maintenance object. There have been quite a few threads in the forum around this topic where the business service is misconstrued to perform services only on a single object, for e.g. only for CILCSVAP - SA Page Maintenance, CILCPRMP - Premise Page Maintenance, CILCACCP - Account Page Maintenance, etc. So what do you do when you want to retrieve some "non-persistent" field or information associated with some object/entity? Consider few business requirements: · Retrieve all the field activities associated to an account. · Retrieve the last bill date for an account. · Retrieve next bill date for an account. It can be as simple as described below, for this post, we'll use the first scenario - Retrieve all the field activities associated to an account. To achieve this we'll have to do the following: Step 1: Define an info zone (A basic Zone of type F1-DE-SINGLE - Info Data Explorer - Single SQL has been used; you can use F1-DE - Info Data Explorer - Multiple SQLs for more complex scenarios) Parameter Description Value To Enter User Filter 1 F1 Initial Display Columns C1 C2 C3 SQL Condition F1 SQL Statement SELECT FA_ID, FA_STATUS_FLG, CRE_DTTM FROM CI_FA WHERE SP_ID IN (SELECT SP_ID FROM CI_SA_SP WHERE SA_ID IN (SELECT SA_ID FROM CI_SA WHERE ACCT_ID = :F1)) Column 1 source=SQLCOL sqlcol=FA_ID Column 2 source=SQLCOL sqlcol=FA_STATUS_FLG Column 3 type=TIME source=SQLCOL sqlcol=CRE_DTTM order=DESC Note: Zone code specified was 'CM_ACCTFA' Step 2: Define a business service Create a business service linked to 'Service Name' FWLZDEXP - Data Explorer. Schema will look like this: <schema> <zoneCd mapField="ZONE_CD" default="CM_ACCTFA"/> <accountId mapField="F1_VALUE"/> <rowCount mapField="ROW_CNT"/> <result type="group"> <selectList type="list" mapList="DE"> <faId mapField="COL_VALUE"> <row mapList="DE_VAL"> <SEQNO is="1"/> </row> </faId> <status mapField="COL_VALUE"> <row mapList="DE_VAL"> <SEQNO is="2"/> </row> </status> <createdDateTime mapField="COL_VALUE"> <row mapList="DE_VAL"> <SEQNO is="3"/> </row> </createdDateTime> </selectList> </result> </schema> What's next? As mentioned above, you can invoke this business service from an outside application via XAI inbound service or call this business service from within a script. Step 3: Create a XAI inbound service for above created business service Step 4: Test the inbound service Go to XAI Submission and test the newly created service <RXS_AccountFA> <accountId>5922116763</accountId> </RXS_AccountFA>

Read the article
Configuration "diff" across Oracle WebCenter Sites instances

- by Mark Fincham-Oracle

Problem Statement With many Oracle WebCenter Sites environments - how do you know if the various configuration assets and settings are in sync across all of those environments? Background At Oracle we typically have a "W" shaped set of environments. For the "Production" environments we typically have a disaster recovery clone as well and sometimes additional QA environments alongside the production management environment. In the case of www.java.com we have 10 different environments. All configuration assets/settings (CSElements, Templates, Start Menus etc..) start life on the Development Management environment and are then published downstream to other environments as part of the software development lifecycle. Ensuring that each of these 10 environments has the same set of Templates, CSElements, StartMenus, TreeTabs etc.. is impossible to do efficiently without automation. Solution Summary The solution comprises of two components. A JSON data feed from each environment. A simple HTML page that consumes these JSON data feeds. Data Feed: Create a JSON WebService on each environment. The WebService is no more than a SiteEntry + CSElement. The CSElement queries various DB tables to obtain details of the assets/settings returning this data in a JSON feed. Report: Create a simple HTML page that uses JQuery to fetch the JSON feed from each environment and display the results in a table. Since all assets (CSElements, Templates etc..) are published between environments they will have the same last modified date. If the last modified date of an asset is different in the JSON feed or is mising from an environment entirely then highlight that in the report table. Example Solution Details Step 1: Create a Site Entry + CSElement that outputs JSON Site Entry & CSElement Setup The SiteEntry should be uncached so that the most recent configuration information is returned at all times. In the CSElement set the contenttype accordingly: Step 2: Write the CSElement Logic The basic logic, that we repeat for each asset or setting that we are interested in, is to query the DB using <ics:sql> and then loop over the resultset with <ics:listloop>. For example: <ics:sql sql="SELECT name,updateddate FROM Template WHERE status != 'VO'" listname="TemplateList" table="Template" /> "templates": [ <ics:listloop listname="TemplateList"> {"name":"<ics:listget listname="TemplateList" fieldname="name"/>", "modified":"<ics:listget listname="TemplateList" fieldname="updateddate"/>"}, </ics:listloop> ], A comprehensive list of SQL queries to fetch each configuration asset/settings can be seen in the appendix at the end of this article. For the generation of the JSON data structure you could use Jettison (the library ships with the 11.1.1.8 version of the product), native Java 7 capabilities or (as the above example demonstrates) you could roll-your-own JSON output but that is not advised. Step 3: Create an HTML Report The JavaScript logic looks something like this.. 1) Create a list of JSON feeds to fetch: ENVS['dev-mgmngt'] = 'http://dev-mngmnt.example.com/sites/ContentServer?d=&pagename=settings.json'; ENVS['dev-dlvry'] = 'http://dev-dlvry.example.com/sites/ContentServer?d=&pagename=settings.json'; ENVS['test-mngmnt'] = 'http://test-mngmnt.example.com/sites/ContentServer?d=&pagename=settings.json'; ENVS['test-dlvry'] = 'http://test-dlvry.example.com/sites/ContentServer?d=&pagename=settings.json'; 2) Create a function to get the JSON feeds: function getDataForEnvironment(url){ return $.ajax({ type: 'GET', url: url, dataType: 'jsonp', beforeSend: function (jqXHR, settings){ jqXHR.originalEnv = env; jqXHR.originalUrl = url; }, success: function(json, status, jqXHR) { console.log('....success fetching: ' + jqXHR.originalUrl); // store the returned data in ALLDATA ALLDATA[jqXHR.originalEnv] = json; }, error: function(jqXHR, status, e) { console.log('....ERROR: Failed to get data from [' + url + '] ' + status + ' ' + e); } }); } 3) Fetch each JSON feed: for (var env in ENVS) { console.log('Fetching data for env [' + env +'].'); var promisedData = getDataForEnvironment(ENVS[env]); promisedData.success(function (data) {}); } 4) For each configuration asset or setting create a table in the report For example, CSElements: 1) Get a list of unique CSElement names from all of the returned JSON data. 2) For each unique CSElement name, create a row in the table 3) Select 1 environment to represent the master or ideal state (e.g. "Everything should be like Production Delivery") 4) For each environment, compare the last modified date of this envs CSElement to the master. Highlight any differences in last modified date or missing CSElements. 5) Repeat... Appendix This section contains various SQL statements that can be used to retrieve configuration settings from the DB. Templates <ics:sql sql="SELECT name,updateddate FROM Template WHERE status != 'VO'" listname="TemplateList" table="Template" /> CSElements <ics:sql sql="SELECT name,updateddate FROM CSElement WHERE status != 'VO'" listname="CSEList" table="CSElement" /> Start Menus <ics:sql sql="select sm.id, sm.cs_name, sm.cs_description, sm.cs_assettype, sm.cs_assetsubtype, sm.cs_itemtype, smr.cs_rolename, p.name from StartMenu sm, StartMenu_Sites sms, StartMenu_Roles smr, Publication p where sm.id=sms.ownerid and sm.id=smr.cs_ownerid and sms.pubid=p.id order by sm.id" listname="startList" table="Publication,StartMenu,StartMenu_Roles,StartMenu_Sites"/> Publishing Configurations <ics:sql sql="select id, name, description, type, dest, factors from PubTarget" listname="pubTargetList" table="PubTarget" /> Tree Tabs <ics:sql sql="select tt.id, tt.title, tt.tooltip, p.name as pubname, ttr.cs_rolename, ttsect.name as sectname from TreeTabs tt, TreeTabs_Roles ttr, TreeTabs_Sect ttsect,TreeTabs_Sites ttsites LEFT JOIN Publication p on p.id=ttsites.pubid where p.id is not null and tt.id=ttsites.ownerid and ttsites.pubid=p.id and tt.id=ttr.cs_ownerid and tt.id=ttsect.ownerid order by tt.id" listname="treeTabList" table="TreeTabs,TreeTabs_Roles,TreeTabs_Sect,TreeTabs_Sites,Publication" /> Filters <ics:sql sql="select name,description,classname from Filters" listname="filtersList" table="Filters" /> Attribute Types <ics:sql sql="select id,valuetype,name,updateddate from AttrTypes where status != 'VO'" listname="AttrList" table="AttrTypes" /> WebReference Patterns <ics:sql sql="select id,webroot,pattern,assettype,name,params,publication from WebReferencesPatterns" listname="WebRefList" table="WebReferencesPatterns" /> Device Groups <ics:sql sql="select id,devicegroupsuffix,updateddate,name from DeviceGroup" listname="DeviceList" table="DeviceGroup" /> Site Entries <ics:sql sql="select se.id,se.name,se.pagename,se.cselement_id,se.updateddate,cse.rootelement from SiteEntry se LEFT JOIN CSElement cse on cse.id = se.cselement_id where se.status != 'VO'" listname="SiteList" table="SiteEntry,CSElement" /> Webroots <ics:sql sql="select id,name,rooturl,updatedby,updateddate from WebRoot" listname="webrootList" table="WebRoot" /> Page Definitions <ics:sql sql="select pd.id, pd.name, pd.updatedby, pd.updateddate, pd.description, pdt.attributeid, pa.name as nameattr, pdt.requiredflag, pdt.ordinal from PageDefinition pd, PageDefinition_TAttr pdt, PageAttribute pa where pd.status != 'VO' and pa.id=pdt.attributeid and pdt.ownerid=pd.id order by pd.id,pdt.ordinal" listname="pageDefList" table="PageDefinition,PageAttribute,PageDefinition_TAttr" /> FW_Application <ics:sql sql="select id,name,updateddate from FW_Application where status != 'VO'" listname="FWList" table="FW_Application" /> Custom Elements <ics:sql sql="select elementname from ElementCatalog where elementname like 'CustomElements%'" listname="elementList" table="ElementCatalog" />

Read the article
The challenge of communicating externally with IRM secured content

- by Simon Thorpe

I am often asked by customers about how they handle sending IRM secured documents to external parties. Their concern is that using IRM to secure sensitive information they need to share outside their business, is troubled with the inability for third parties to install the software which enables them to gain access to the information. It is a very legitimate question and one i've had to answer many times in the past 10 years whilst helping customers plan successful IRM deployments. The operating system does not provide the required level of content security The problem arises from what IRM delivers, persistent security to your sensitive information where ever it resides and whenever it is in use. Oracle IRM gives customers an array of features that help ensure sensitive information in an IRM document or email is always protected and only accessed by authorized users using legitimate applications. Examples of such functionality are; Control of the clipboard, either by disabling completely in the opened document or by allowing the cut and pasting of information between secured IRM documents but not into insecure applications. Protection against programmatic access to the document. Office documents and PDF documents have the ability to be accessed by other applications and scripts. With Oracle IRM we have to protect against this to ensure content cannot be leaked by someone writing a simple program. Securing of decrypted content in memory. At some point during the process of opening and presenting a sealed document to an end user, we must decrypt it and give it to the application (Adobe Reader, Microsoft Word, Excel etc). This process must be secure so that someone cannot simply get access to the decrypted information. The operating system alone just doesn't have the functionality to deliver these types of features. This is why for every IRM technology there must be some extra software installed and typically this software requires administrative rights to do so. The fact is that if you want to have very strong security and access control over a document you are going to send to someone who is beyond your network infrastructure, there must be some software to provide that functionality. Simple installation with Oracle IRM The software used to control access to Oracle IRM sealed content is called the Oracle IRM Desktop. It is a small, free piece of software roughly about 12mb in size. This software delivers functionality for everything a user needs to work with an Oracle IRM solution. It provides the functionality for all formats we support, the storage and transparent synchronization of user rights and unique to Oracle, the ability to search inside sealed files stored on the local computer. In Oracle we've made every technical effort to ensure that installing this software is a simple as possible. In situations where the user's computer is part of the enterprise, this software is typically deployed using existing technologies such as Systems Management Server from Microsoft or by using Active Directory Group Policies. However when sending sealed content externally, you cannot automatically install software on the end users machine. You need to rely on them to download and install themselves. Again we've made every effort for this manual install process to be as simple as we can. Starting with the small download size of the software itself to the simple installation process, most end users are able to install and access sealed content very quickly. You can see for yourself how easily this is done by walking through our free and easy self service demonstration of using sealed content. How to handle objections and ensure there is value However the fact still remains that end users may object to installing, or may simply be unable to install the software themselves due to lack of permissions. This is often a problem with any technology that requires specialized software to access a new type of document. In Oracle, over the past 10 years, we've learned many ways to get over this barrier of getting software deployed by external users. First and I would say of most importance, is the content MUST have some value to the person you are asking to install software. Without some type of value proposition you are going to find it very difficult to get past objections to installing the IRM Desktop. Imagine if you were going to secure the weekly campus restaurant menu and send this to contractors. Their initial response will be, "why on earth are you asking me to download some software just to access your menu!?". A valid objection... there is no value to the user in doing this. Now consider the scenario where you are sending one of your contractors their employment contract which contains their address, social security number and bank account details. Are they likely to take 5 minutes to install the IRM Desktop? You bet they are, because there is real value in doing so and they understand why you are doing it. They want their personal information to be securely handled and a quick download and install of some software is a small task in comparison to dealing with the loss of this information. Be clear in communicating this value So when sending sealed content to people externally, you must be clear in communicating why you are using an IRM technology and why they need to install some software to access the content. Do not try and avoid the issue, you must be clear and upfront about it. In doing so you will significantly reduce the "I didn't know I needed to do this..." responses and also gain respect for being straight forward. One customer I worked with, 6 months after the initial deployment of Oracle IRM, called me panicking that the partner they had started to share their engineering documents with refused to install any software to access this highly confidential intellectual property. I explained they had to communicate to the partner why they were doing this. I told them to go back with the statement that "the company takes protecting its intellectual property seriously and had decided to use IRM to control access to engineering documents." and if the partner didn't respect this decision, they would find another company that would. The result? A few days later the partner had made the Oracle IRM Desktop part of their approved list of software in the company. Companies are successful when sending sealed content to third parties We have many, many customers who send sensitive content to third parties. Some customers actually sell access to Oracle IRM protected content and therefore 99% of their users are external to their business, one in particular has sold content to hundreds of thousands of external users. Oracle themselves use the technology to secure M&A documents, payroll data and security assessments which go beyond the traditional enterprise security perimeter. Pretty much every company who deploys Oracle IRM will at some point be sending those documents to people outside of the company, these customers must be successful otherwise Oracle IRM wouldn't be successful. Because our software is used by a wide variety of companies, some who use it to sell content, i've often run into people i'm sharing a sealed document with and they already have the IRM Desktop installed due to accessing content from another company. The future In summary I would say that yes, this is a hurdle that many customers are concerned about but we see much evidence that in practice, people leap that hurdle with relative ease as long as they are good at communicating the value of using IRM and also take measures to ensure end users can easily go through the process of installation. We are constantly developing new ideas to reducing this hurdle and maybe one day the operating systems will give us enough rich security functionality to have no software installation. Until then, Oracle IRM is by far the easiest solution to balance security and usability for your business. If you would like to evaluate it for yourselves, please contact us.

Read the article
Full-text Indexing Books Online

- by Most Valuable Yak (Rob Volk)

While preparing for a recent SQL Saturday presentation, I was struck by a crazy idea (shocking, I know): Could someone import the content of SQL Server Books Online into a database and apply full-text indexing to it? The answer is yes, and it's really quite easy to do. The first step is finding the installed help files. If you have SQL Server 2012, BOL is installed under the Microsoft Help Library. You can find the install location by opening SQL Server Books Online and clicking the gear icon for the Help Library Manager. When the new window pops up click the Settings link, you'll get the following: You'll see the path under Library Location. Once you navigate to that path you'll have to drill down a little further, to C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store. This is where the help file content is kept if you downloaded it for offline use. Depending on which products you've downloaded help for, you may see a few hundred files. Fortunately they're named well and you can easily find the "SQL_Server_Denali_Books_Online_" files. We are interested in the .MSHC files only, and can skip the Installation and Developer Reference files. Despite the .MHSC extension, these files are compressed with the standard Zip format, so your favorite archive utility (WinZip, 7Zip, WinRar, etc.) can open them. When you do, you'll see a few thousand files in the archive. We are only interested in the .htm files, but there's no harm in extracting all of them to a folder. 7zip provides a command-line utility and the following will extract to a D:\SQLHelp folder previously created: 7z e –oD:\SQLHelp "C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store\SQL_Server_Denali_Books_Online_B780_SQL_110_en-us_1.2.mshc" *.htm Well that's great Rob, but how do I put all those files into a full-text index? I'll tell you in a second, but first we have to set up a few things on the database side. I'll be using a database named Explore (you can certainly change that) and the following setup is a fragment of the script I used in my presentation: USE Explore; GO CREATE SCHEMA help AUTHORIZATION dbo; GO -- Create default fulltext catalog for later FT indexes CREATE FULLTEXT CATALOG FTC AS DEFAULT; GO CREATE TABLE help.files(file_id int not null IDENTITY(1,1) CONSTRAINT PK_help_files PRIMARY KEY, path varchar(256) not null CONSTRAINT UNQ_help_files_path UNIQUE, doc_type varchar(6) DEFAULT('.xml'), content varbinary(max) not null); CREATE FULLTEXT INDEX ON help.files(content TYPE COLUMN doc_type LANGUAGE 1033) KEY INDEX PK_help_files; This will give you a table, default full-text catalog, and full-text index on that table for the content you're going to insert. I'll be using the command line again for this, it's the easiest method I know: for %a in (D:\SQLHelp\*.htm) do sqlcmd -S. -E -d Explore -Q"set nocount on;insert help.files(path,content) select '%a', cast(c as varbinary(max)) from openrowset(bulk '%a', SINGLE_CLOB) as c(c)" You'll need to copy and run that as one line in a command prompt. I'll explain what this does while you run it and watch several thousand files get imported: The "for" command allows you to loop over a collection of items. In this case we want all the .htm files in the D:\SQLHelp folder. For each file it finds, it will assign the full path and file name to the %a variable. In the "do" clause, we'll specify another command to be run for each iteration of the loop. I make a call to "sqlcmd" in order to run a SQL statement. I pass in the name of the server (-S.), where "." represents the local default instance. I specify -d Explore as the database, and -E for trusted connection. I then use -Q to run a query that I enclose in double quotes. The query uses OPENROWSET(BULK…SINGLE_CLOB) to open the file as a data source, and to treat it as a single character large object. In order for full-text indexing to work properly, I have to convert the text content to varbinary. I then INSERT these contents along with the full path of the file into the help.files table created earlier. This process continues for each file in the folder, creating one new row in the table. And that's it! 5 SQL Statements and 2 command line statements to unzip and import SQL Server Books Online! In case you're wondering why I didn't use FILESTREAM or FILETABLE, it's simply because I haven't learned them…yet. I may return to this blog after I figure that out and update it with the steps to do so. I believe that will make it even easier. In the spirit of exploration, I'll leave you to work on some fulltext queries of this content. I also recommend playing around with the sys.dm_fts_xxxx DMVs (I particularly like sys.dm_fts_index_keywords, it's pretty interesting). There are additional example queries in the download material for my presentation linked above. Many thanks to Kevin Boles (t) for his advice on (re)checking the content of the help files. Don't let that .htm extension fool you! The 2012 help files are actually XML, and you'd need to specify '.xml' in your document type column in order to extract the full-text keywords. (You probably noticed this in the default definition for the doc_type column.) You can query sys.fulltext_document_types to get a complete list of the types that can be full-text indexed. I also need to thank Hilary Cotter for giving me the original idea. I believe he used MSDN content in a full-text index for an article from waaaaaaaaaaay back, that I can't find now, and had forgotten about until just a few days ago. He is also co-author of Pro Full-Text Search in SQL Server 2008, which I highly recommend. He also has some FTS articles on Simple Talk: http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features/ http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features,-part-2/

Read the article
DBCC CHECKDB on VVLDB and latches (Or: My Pain is Your Gain)

- by Argenis

Does your CHECKDB hurt, Argenis? There is a classic blog series by Paul Randal [blog|twitter] called “CHECKDB From Every Angle” which is pretty much mandatory reading for anybody who’s even remotely considering going for the MCM certification, or its replacement (the Microsoft Certified Solutions Master: Data Platform – makes my fingers hurt just from typing it). Of particular interest is the post “Consistency Options for a VLDB” – on it, Paul provides solid, timeless advice (I use the word “timeless” because it was written in 2007, and it all applies today!) on how to perform checks on very large databases. Well, here I was trying to figure out how to make CHECKDB run faster on a restored copy of one of our databases, which happens to exceed 7TB in size. The whole thing was taking several days on multiple systems, regardless of the storage used – SAS, SATA or even SSD…and I actually didn’t pay much attention to how long it was taking, or even bothered to look at the reasons why - as long as it was finishing okay and found no consistency errors. Yes – I know. That was a huge mistake, as corruption found in a database several days after taking place could only allow for further spread of the corruption – and potentially large data loss. In the last two weeks I increased my attention towards this problem, as we noticed that CHECKDB was taking EVEN LONGER on brand new all-flash storage in the SAN! I couldn’t really explain it, and were almost ready to blame the storage vendor. The vendor told us that they could initially see the server driving decent I/O – around 450Mb/sec, and then it would settle at a very slow rate of 10Mb/sec or so. “Hum”, I thought – “CHECKDB is just not pushing the I/O subsystem hard enough”. Perfmon confirmed the vendor’s observations. Dreaded @BlobEater What was CHECKDB doing all the time while doing so little I/O? Eating Blobs. It turns out that CHECKDB was taking an extremely long time on one of our frankentables, which happens to be have 35 billion rows (yup, with a b) and sucks up several terabytes of space in the database. We do have a project ongoing to purge/split/partition this table, so it’s just a matter of time before we deal with it. But the reality today is that CHECKDB is coming to a screeching halt in performance when dealing with this particular table. Checking sys.dm_os_waiting_tasks and sys.dm_os_latch_stats showed that LATCH_EX (DBCC_OBJECT_METADATA) was by far the top wait type. I remembered hearing recently about that wait from another post that Paul Randal made, but that was related to computed-column indexes, and in fact, Paul himself reminded me of his article via twitter. But alas, our pathologic table had no non-clustered indexes on computed columns. I knew that latches are used by the database engine to do internal synchronization – but how could I help speed this up? After all, this is stuff that doesn’t have a lot of knobs to tweak. (There’s a fantastic level 500 talk by Bob Ward from Microsoft CSS [blog|twitter] called “Inside SQL Server Latches” given at PASS 2010 – and you can check it out here. DISCLAIMER: I assume no responsibility for any brain melting that might ensue from watching Bob’s talk!) Failed Hypotheses Earlier on this week I flew down to Palo Alto, CA, to visit our Headquarters – and after having a great time with my Monkey peers, I was relaxing on the plane back to Seattle watching a great talk by SQL Server MVP and fellow MCM Maciej Pilecki [twitter] called “Masterclass: A Day in the Life of a Database Transaction” where he discusses many different topics related to transaction management inside SQL Server. Very good stuff, and when I got home it was a little late – that slow DBCC CHECKDB that I had been dealing with was way in the back of my head. As I was looking at the problem at hand earlier on this week, I thought “How about I set the database to read-only?” I remembered one of the things Maciej had (jokingly) said in his talk: “if you don’t want locking and blocking, set the database to read-only” (or something to that effect, pardon my loose memory). I immediately killed the CHECKDB which had been running painfully for days, and set the database to read-only mode. Then I ran DBCC CHECKDB against it. It started going really fast (even a bit faster than before), and then throttled down again to around 10Mb/sec. All sorts of expletives went through my head at the time. Sure enough, the same latching scenario was present. Oh well. I even spent some time trying to figure out if NUMA was hurting performance. Folks on Twitter made suggestions in this regard (thanks, Lonny! [twitter]) …Eureka? This past Friday I was still scratching my head about the whole thing; I was ready to start profiling with XPERF to see if I could figure out which part of the engine was to blame and then get Microsoft to look at the evidence. After getting a bunch of good news I’ll blog about separately, I sat down for a figurative smack down with CHECKDB before the weekend. And then the light bulb went on. A sparse column. I thought that I couldn’t possibly be experiencing the same scenario that Paul blogged about back in March showing extreme latching with non-clustered indexes on computed columns. Did I even have a non-clustered index on my sparse column? As it turns out, I did. I had one filtered non-clustered index – with the sparse column as the index key (and only column). To prove that this was the problem, I went and setup a test. Yup, that'll do it The repro is very simple for this issue: I tested it on the latest public builds of SQL Server 2008 R2 SP2 (CU6) and SQL Server 2012 SP1 (CU4). First, create a test database and a test table, which only needs to contain a sparse column: CREATE DATABASE SparseColTest; GO USE SparseColTest; GO CREATE TABLE testTable (testCol smalldatetime SPARSE NULL); GO INSERT INTO testTable (testCol) VALUES (NULL); GO 1000000 That’s 1 million rows, and even though you’re inserting NULLs, that’s going to take a while. In my laptop, it took 3 minutes and 31 seconds. Next, we run DBCC CHECKDB against the database: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; This runs extremely fast, as least on my test rig – 198 milliseconds. Now let’s create a filtered non-clustered index on the sparse column: CREATE NONCLUSTERED INDEX [badBadIndex] ON testTable (testCol) WHERE testCol IS NOT NULL; With the index in place now, let’s run DBCC CHECKDB one more time: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; In my test system this statement completed in 11433 milliseconds. 11.43 full seconds. Quite the jump from 198 milliseconds. I went ahead and dropped the filtered non-clustered indexes on the restored copy of our production database, and ran CHECKDB against that. We went down from 7+ days to 19 hours and 20 minutes. Cue the “Argenis is not impressed” meme, please, Mr. LaRock. My pain is your gain, folks. Go check to see if you have any of such indexes – they’re likely causing your consistency checks to run very, very slow. Happy CHECKDBing, -Argenis ps: I plan to file a Connect item for this issue – I consider it a pretty serious bug in the engine. After all, filtered indexes were invented BECAUSE of the sparse column feature – and it makes a lot of sense to use them together. Watch this space and my twitter timeline for a link.

Read the article
SQL University: What and why of database refactoring

- by Mladen Prajdic

This is a post for a great idea called SQL University started by Jorge Segarra also famously known as SqlChicken on Twitter. It’s a collection of blog posts on different database related topics contributed by several smart people all over the world. So this week is mine and we’ll be talking about database testing and refactoring. In 3 posts we’ll cover: SQLU part 1 - What and why of database testing SQLU part 2 - What and why of database refactoring SQLU part 3 - Tools of the trade This is a second part of the series and in it we’ll take a look at what database refactoring is and why do it. Why refactor a database To know why refactor we first have to know what refactoring actually is. Code refactoring is a process where we change module internals in a way that does not change that module’s input/output behavior. For successful refactoring there is one crucial thing we absolutely must have: Tests. Automated unit tests are the only guarantee we have that we haven’t broken the input/output behavior before refactoring. If you haven’t go back ad read my post on the matter. Then start writing them. Next thing you need is a code module. Those are views, UDFs and stored procedures. By having direct table access we can kiss fast and sweet refactoring good bye. One more point to have a database abstraction layer. And no, ORM’s don’t fall into that category. But also know that refactoring is NOT adding new functionality to your code. Many have fallen into this trap. Don’t be one of them and resist the lure of the dark side. And it’s a strong lure. We developers in general love to add new stuff to our code, but hate fixing our own mistakes or changing existing code for no apparent reason. To be a good refactorer one needs discipline and focus. Now we know that refactoring is all about changing inner workings of existing code. This can be due to performance optimizations, changing internal code workflows or some other reason. This is a typical black box scenario to the outside world. If we upgrade the car engine it still has to drive on the road (preferably faster) and not fly (no matter how cool that would be). Also be aware that white box tests will break when we refactor. What to refactor in a database Refactoring databases doesn’t happen that often but when it does it can include a lot of stuff. Let us look at a few common cases. Adding or removing database schema objects Adding, removing or changing table columns in any way, adding constraints, keys, etc… All of these can be counted as internal changes not visible to the data consumer. But each of these carries a potential input/output behavior change. Dropping a column can result in views not working anymore or stored procedure logic crashing. Adding a unique constraint shows duplicated data that shouldn’t exist. Foreign keys break a truncate table command executed from an application that runs once a month. All these scenarios are very real and can happen. With the proper database abstraction layer fully covered with black box tests we can make sure something like that does not happen (hopefully at all). Changing physical structures Physical structures include heaps, indexes and partitions. We can pretty much add or remove those without changing the data returned by the database. But the performance can be affected. So here we use our performance tests. We do have them, right? Just by adding a single index we can achieve orders of magnitude performance improvement. Won’t that make users happy? But what if that index causes our write operations to crawl to a stop. again we have to test this. There are a lot of things to think about and have tests for. Without tests we can’t do successful refactoring! Fixing bad code We all have some bad code in our systems. We usually refer to that code as code smell as they violate good coding practices. Examples of such code smells are SQL injection, use of SELECT *, scalar UDFs or cursors, etc… Each of those is huge code smell and can result in major code changes. Take SELECT * from example. If we remove a column from a table the client using that SELECT * statement won’t have a clue about that until it runs. Then it will gracefully crash and burn. Not to mention the widely unknown SELECT * view refresh problem that Tomas LaRock (@SQLRockstar on Twitter) and Colin Stasiuk (@BenchmarkIT on Twitter) talk about in detail. Go read about it, it’s informative. Refactoring this includes replacing the * with column names and most likely change to application using the database. Breaking apart huge stored procedures Have you ever seen seen a stored procedure that was 2000 lines long? I have. It’s not pretty. It hurts the eyes and sucks the will to live the next 10 minutes. They are a maintenance nightmare and turn into things no one dares to touch. I’m willing to bet that 100% of time they don’t have a single test on them. Large stored procedures (and functions) are a clear sign that they contain business logic. General opinion on good database coding practices says that business logic has no business in the database. That’s the applications part. Refactoring such behemoths requires writing lots of edge case tests for the stored procedure input/output behavior and then start to refactor it. First we split the logic inside into smaller parts like new stored procedures and UDFs. Those then get called from the master stored procedure. Once we’ve successfully modularized the database code it’s best to transfer that logic into the applications consuming it. This only leaves the stored procedure with common data manipulation logic. Of course this isn’t always possible so having a plethora of performance and behavior unit tests is absolutely necessary to confirm we’ve actually improved the codebase in some way. Refactoring is not a popular chore amongst developers or managers. The former don’t like fixing old code, the latter can’t see the financial benefit. Remember how we talked about being lousy at estimating future costs in the previous post? But there comes a time when it must be done. Hopefully I’ve given you some ideas how to get started. In the last post of the series we’ll take a look at the tools to use and an example of testing and refactoring.

Read the article
Finally, upgrade from Nokia X3 to Samsung Galaxy S III

This time, something slightly different but nonetheless not less interesting, hopefully. Living on a remote island like Mauritius, ill-praised 'Cyber Island' in the Indian Ocean, has its advantages in life style and relaxed environment to life in but in terms of technological aspects it can be quite a nightmare. Well, I guess this might be different story to report about... one day. Cyber Island Mauritius Despite it's shiny advertisement as Cyber Island and business in ICT hub to Africa, Mauritius is not on the latest track of available models in computer hardware or, in the context of this article, cellulars or smart-phone, or communication technology in general. Okay, I have to admit that this statement is only partly true. Money can buy, even here in Mauritius. Luckily, there are ways and ways to deal with this outcry of modern, read: technological, civilisation issues. Online shopping you might think? Yes, for sure, until you discover in your checkout procedure that a small island in the Indian Ocean isn't a preferred destination for delivery and the precious time you spent on putting your items into your cart and feeding your personal level of anticipation gets ruined on the last stint. Ordering from abroad saves you money Anyway, I got in touch with my personal courier and luckily there were some extra-kilos left in the luggage. First obstacle sorted, we have a Transporter! Okay, on the next occasion off to Amazon online and using their Prime service for fast delivery. Actually, the order was placed on Saturday evening and everything got delivered on Tuesday morning - nice job in less than 72 hours. Okay, among the items of that shopping rush I ordered a shiny Samsung Galaxy S III 16GB in oceanic blue - did I mention, that you hardly get a blue model in Mauritius? - for my BWE. Interesting side-notes: First, Amazon Germany dropped the prices for roughly 30% on the S3, and we got the 16GB model for less than 500 Euro (or approx. Rs. 19.500,-) compared to the usual Rs. 27.000,- on the local market. It even varies whether the local price is inclusive or exclusive VAT (15%). Second, since a while she was bothering me to get an iPhone and an iPad for her, fair enough I thought, decent hardware, posh design and reliable services. Until we watched the 'magical' introduction of Samsung's new models at the IFA exhibition, she read the bashing comments on Google+ on the iPhone 5 and I gave her a brief summary on the law suit between Apple and Samsung in the USA. So, yes, Samsung USA is right, the next big thing is already here - literally. My BWE loves the look and touch of the Galaxy S3. And for me it was more cost-effective in terms of purchases done at the App Store, ups, Play Store. Transfer of contacts, text messages and media files Okay, now that the hardware is in place, how to transfer all those contacts, text messages, media files, etc. between those two devices? In the past, I used to use the Nokia Communication Suite between various models but now for Android? Well, as usual Google and Bing are reliable friends and among the first hits I came across an article about How to Transfer Contacts from Nokia to Android. Couldn't be easier, right? Well, sort of... my main Windows systems are already running on Windows 8, and this actually caused problems with the mobile/smart-phone device drivers. The article provides the download for an older version 1.10 which upgrades to 2.11 (as time of writing this entry) but both couldn't get the Galaxy S3 and the Nokia connected. Shame on me... the product page clearly doesn't mention Windows 8 (for now) and Windows 8 isn't available for the general audience at all... After I took a spare machine running on Windows Vista everything went smooth. Software installed, upgrade done, device drivers for Android automatically downloaded and installed, and the same painless routine for the Nokia part. I think, I rebooted the system twice during the whole setup procedure but hey, it was more or less a distraction while coding some stuff in ASP.NET MVC and Telerik Kendo UI. The transfer of contacts and text messages was done via Wondershare MobileGo for Android, and all media files by moving the additional microSD card from one device to the other. But even without an external SD card, it would have been very easy to copy the files via Windows Explorer directly. Little catch and excellent service Fine, we are almost done and the only step left is to shift the SIM card... Ouch, gotcha! The X3 uses a standard size SIM card while the S III only accepts microSIM form factor. What an irony, bigger smartphone needs smaller SIM card. Luckily, the next showroom of Emtel is just 5 mins away up the road, and the service staff over there know their job. Finally, after roughly 10 mins of paper work, activation and small chit-chat, the S3 came to life on the mobile network. Owning a smart-phone now and knowing that my BWE would like to interact more on social networks away from home, especially to upload pictures and provide local 'check-ins', I activated a data package for her in advance, too. Even that it is Saturday, everything was already done and ready to be used. Nice bonus: The Emtel clerk directly offered me to set up the configuration for the Emtel data services, yes sure, go ahead, this saves me to search for that in the settings. Okay, spoiler-alert here, setting a static APN to access the Emtel network and the internet wouldn't be a challenge. But hey, she already had the phone in her hands and I could keep my eyes on the children. Well done, Emtel! Resume Thanks to the useful software package by Wondershare is was a hands-free experience to transfer all the data from a Nokia mobile on Symbian S60 to a Samsung Galaxy S III on Android Ice Cream Sandwich (ICS). In the future, this wont be a serious issue at all anymore thanks to synchronisation services and cloud storage. And for now, I'm only waiting for the official upgrades for Jelly Bean.

Read the article
SQL SERVER – Windows File/Folder and Share Permissions – Notes from the Field #029

- by Pinal Dave

[Note from Pinal]: This is a 29th episode of Notes from the Field series. Security is the task which we should give it to the experts. If there is a small overlook or misstep, there are good chances that security of the organization is compromised. This is very true, but there are always devils’s advocates who believe everyone should know the security. As a DBA and Administrator, I often see people not taking interest in the Windows Security hiding behind the reason of not expert of Windows Server. We all often miss the important mission statement for the success of any organization – Teamwork. In this blog post Brian tells the story in very interesting lucid language. Read On! In this episode of the Notes from the Field series database expert Brian Kelley explains a very crucial issue DBAs and Developer faces on their production server. Linchpin People are database coaches and wellness experts for a data driven world. Read the experience of Brian in his own words. When I talk security among database professionals, I find that most have at least a working knowledge of how to apply security within a database. When I talk with DBAs in particular, I find that most have at least a working knowledge of security at the server level if we’re speaking of SQL Server. One area I see continually that is weak is in the area of Windows file/folder (NTFS) and share permissions. The typical response is, “I’m a database developer and the Windows system administrator is responsible for that.” That may very well be true – the system administrator may have the primary responsibility and accountability for file/folder and share security for the server. However, if you’re involved in the typical activities surrounding databases and moving data around, you should know these permissions, too. Otherwise, you could be setting yourself up where someone is able to get to data he or she shouldn’t, or you could be opening the door where human error puts bad data in your production system. File/Folder Permission Basics: I wrote about file/folder permissions a few years ago to give the basic permissions that are most often seen. Here’s what you must know as a minimum at the file/folder level: Read - Allows you to read the contents of the file or folder. Having read permissions allows you to copy the file or folder. Write – Again, as the name implies, it allows you to write to the file or folder. This doesn’t include the ability to delete, however, nothing stops a person with this access from writing an empty file. Delete - Allows the file/folder to be deleted. If you overwrite files, you may need this permission. Modify - Allows read, write, and delete. Full Control - Same as modify + the ability to assign permissions. File/Folder permissions aggregate, unless there is a DENY (where it trumps, just like within SQL Server), meaning if a person is in one group that gives Read and antoher group that gives Write, that person has both Read and Write permissions. As you might expect me to say, always apply the Principle of Least Privilege. This likely means that any additional permission you might add does not need Full Control. Share Permission Basics: At the share level, here are the permissions. Read - Allows you to read the contents on the share. Change - Allows you to read, write, and delete contents on the share. Full control - Change + the ability to modify permissions. Like with file/folder permissions, these permissions aggregate, and DENY trumps. So What Access Does a Person / Process Have? Figuring out what someone or some process has depends on how the location is being accessed: Access comes through the share (\\ServerName\Share) – a combination of permissions is considered. Access is through a drive letter (C:\, E:\, S:\, etc.) – only the file/folder permissions are considered. The only complicated one here is access through the share. Here’s what Windows does: Figures out what the aggregated permissions are at the file/folder level. Figures out what the aggregated permissions are at the share level. Takes the most restrictive of the two sets of permissions. You can test this by granting Full Control over a folder (this is likely already in place for the Users local group) and then setting up a share. Give only Read access through the share, and that includes to Administrators (if you’re creating a share, likely you have membership in the Administrators group). Try to read a file through the share. Now try to modify it. The most restrictive permission is the Share level permissions. It’s set to only allow Read. Therefore, if you come through the share, it’s the most restrictive. Does This Knowledge Really Help Me? In my experience, it does. I’ve seen cases where sensitive files were accessible by every authenticated user through a share. Auditors, as you might expect, have a real problem with that. I’ve also seen cases where files to be imported as part of the nightly processing were overwritten by files intended from development. And I’ve seen cases where a process can’t get to the files it needs for a process because someone changed the permissions. If you know file/folder and share permissions, you can spot and correct these types of security flaws. Given that there are a lot of database professionals that don’t understand these permissions, if you know it, you set yourself apart. And if you’re able to help on critical processes, you begin to set yourself up as a linchpin (link to .pdf) for your organization. If you want to get started with performance tuning and database security with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Security, SQL Server, SQL Tips and Tricks, T SQL

Read the article
DRY and SRP

- by Timothy Klenke

Originally posted on: http://geekswithblogs.net/TimothyK/archive/2014/06/11/dry-and-srp.aspxKent Beck’s XP Simplicity Rules (aka Four Rules of Simple Design) are a prioritized list of rules that when applied to your code generally yield a great design. As you’ll see from the above link the list has slightly evolved over time. I find today they are usually listed as: All Tests Pass Don’t Repeat Yourself (DRY) Express Intent Minimalistic These are prioritized. If your code doesn’t work (rule 1) then everything else is forfeit. Go back to rule one and get the code working before worrying about anything else. Over the years the community have debated whether the priority of rules 2 and 3 should be reversed. Some say a little duplication in the code is OK as long as it helps express intent. I’ve debated it myself. This recent post got me thinking about this again, hence this post. I don’t think it is fair to compare “Expressing Intent” against “DRY”. This is a comparison of apples to oranges. “Expressing Intent” is a principal of code quality. “Repeating Yourself” is a code smell. A code smell is merely an indicator that there might be something wrong with the code. It takes further investigation to determine if a violation of an underlying principal of code quality has actually occurred. For example “using nouns for method names”, “using verbs for property names”, or “using Booleans for parameters” are all code smells that indicate that code probably isn’t doing a good job at expressing intent. They are usually very good indicators. But what principle is the code smell of Duplication pointing to and how good of an indicator is it? Duplication in the code base is bad for a couple reasons. If you need to make a change and that needs to be made in a number of locations it is difficult to know if you have caught all of them. This can lead to bugs if/when one of those locations is overlooked. By refactoring the code to remove all duplication there will be left with only one place to change, thereby eliminating this problem. With most projects the code becomes the single source of truth for a project. If a production code base is inconsistent with a five year old requirements or design document the production code that people are currently living with is usually declared as the current reality (or truth). Requirement or design documents at this age in a project life cycle are usually of little value. Although comparing production code to external documentation is usually straight forward, duplication within the code base muddles this declaration of truth. When code is duplicated small discrepancies will creep in between the two copies over time. The question then becomes which copy is correct? As different factions debate how the software should work, trust in the software and the team behind it erodes. The code smell of Duplication points to a violation of the “Single Source of Truth” principle. Let me define that as: A stakeholder’s requirement for a software change should never cause more than one class to change. Violation of the Single Source of Truth principle will always result in duplication in the code. However, the inverse is not always true. Duplication in the code does not necessarily indicate that there is a violation of the Single Source of Truth principle. To illustrate this, let’s look at a retail system where the system will (1) send a transaction to a bank and (2) print a receipt for the customer. Although these are two separate features of the system, they are closely related. The reason for printing the receipt is usually to provide an audit trail back to the bank transaction. Both features use the same data: amount charged, account number, transaction date, customer name, retail store name, and etcetera. Because both features use much of the same data, there is likely to be a lot of duplication between them. This duplication can be removed by making both features use the same data access layer. Then start coming the divergent requirements. The receipt stakeholder wants a change so that the account number has the last few digits masked out to protect the customer’s privacy. That can be solve with a small IF statement whilst still eliminating all duplication in the system. Then the bank wants to take a picture of the customer as well as capture their signature and/or PIN number for enhanced security. Then the receipt owner wants to pull data from a completely different system to report the customer’s loyalty program point total. After a while you realize that the two stakeholders have somewhat similar, but ultimately different responsibilities. They have their own reasons for pulling the data access layer in different directions. Then it dawns on you, the Single Responsibility Principle: There should never be more than one reason for a class to change. In this example we have two stakeholders giving two separate reasons for the data access class to change. It is clear violation of the Single Responsibility Principle. That’s a problem because it can often lead the project owner pitting the two stakeholders against each other in a vein attempt to get them to work out a mutual single source of truth. But that doesn’t exist. There are two completely valid truths that the developers need to support. How is this to be supported and honour the Single Responsibility Principle? The solution is to duplicate the data access layer and let each stakeholder control their own copy. The Single Source of Truth and Single Responsibility Principles are very closely related. SST tells you when to remove duplication; SRP tells you when to introduce it. They may seem to be fighting each other, but really they are not. The key is to clearly identify the different responsibilities (or sources of truth) over a system. Sometimes there is a single person with that responsibility, other times there are many. This can be especially difficult if the same person has dual responsibilities. They might not even realize they are wearing multiple hats. In my opinion Single Source of Truth should be listed as the second rule of simple design with Express Intent at number three. Investigation of the DRY code smell should yield to the proper application SST, without violating SRP. When necessary leave duplication in the system and let the class names express the different people that are responsible for controlling them. Knowing all the people with responsibilities over a system is the higher priority because you’ll need to know this before you can express it. Although it may be a code smell when there is duplication in the code, it does not necessarily mean that the coder has chosen to be expressive over DRY or that the code is bad.

Read the article
Agile Testing Days 2012 – Day 3 – Agile or agile?

- by Chris George

Another early start for my last Lean Coffee of the conference, and again it was not wasted. We had some really interesting discussions around how to determine what test automation is useful, if agile is not faster, why do it? and a rather existential discussion on whether unicorns exist! First keynote of the day was entitled “Fast Feedback Teams” by Ola Ellnestam. Again this relates nicely to the releasing faster talk on day 2, and something that we are looking at and some teams are actively trying. Introducing the notion of feedback, Ola describes a game he wrote for his eldest child. It was a simple game where every time he clicked a button, it displayed “You’ve Won!”. He then changed it to be a Win-Lose-Win-Lose pattern and watched the feedback from his son who then twigged the pattern and got his younger brother to play, alternating turns… genius! (must do that with my children). The idea behind this was that you need that feedback loop to learn and progress. If you are not getting the feedback you need to close that loop. An interesting point Ola made was to solve problems BEFORE writing software. It may be that you don’t have to write anything at all, perhaps it’s a communication/training issue? Perhaps the problem can be solved another way. Writing software, although it’s the business we are in, is expensive, and this should be taken into account. He again mentions frequent releases, and how they should be made as soon as stuff is ready to be released, don’t leave stuff on the shelf cause it’s not earning you anything, money or data. I totally agree with this and it’s something that we will be aiming for moving forwards. “Exceptions, Assumptions and Ambiguity: Finding the truth behind the story” by David Evans started off very promising by making references to ‘Grim up North’ referring to the north of England. Not sure it was appreciated by most of the audience, but it made me laugh! David explained how there are always risks associated with exceptions, giving the example of a one-way road near where he lives, with an exception sign giving rights to coaches to go the wrong way. Therefore you could merrily swing around the corner of the one way road straight into a coach! David showed the danger in making assumptions with lyrical quotes from Lola by The Kinks “I’m glad I’m a man, and so is Lola” and with a picture of a toilet flush that needed instructions to operate the full and half flush. With this particular flush, you pulled the handle all the way down to half flush, and half way down to full flush! hmmm, a bit of a crappy user experience methinks! Then through a clever use of a passage from the Jabberwocky, David then went onto show how mis-translation/ambiguity is the can completely distort the original meaning of something, and this is a real enemy of software development. This was all helping to demonstrate that the term Story is often heavily overloaded in the Agile world, and should really be stripped back to what it is really for, stating a business problem, and offering a technical solution. Therefore a story could be worded as “In order to {make some improvement}, we will { do something}”. The first ‘in order to’ statement is stakeholder neutral, and states the problem through requesting an improvement to the software/process etc. The second part of the story is the verb, the doing bit. So to achieve the ‘improvement’ which is not currently true, we will do something to make this true in the future. My PM is very interested in this, and he’s observed some of the problems of overloading stories so I’m hoping between us we can use some of David’s suggestions to help clarify our stories better. The second keynote of the day (and our last) proved to be the most entertaining and exhausting of the conference for me. “The ongoing evolution of testing in agile development” by Scott Barber. I’ve never had the pleasure of seeing Scott before… OMG I would love to have even half of the energy he has! What struck me during this presentation was Scott’s explanation of how testing has become the role/job that it is (largely) today, and how this has led to the need for ‘methodologies’ to make dev and test work! The argument that we should be trying to converge the roles again is a very valid one, and one that a couple of the teams at work are actively doing with great results. Making developers as responsible for quality as testers is something that has been lost over the years, but something that we are now striving to achieve. The idea that we (testers) should be testing experts/specialists, not testing ‘union members’, supports this idea so the entire team works on all aspects of a feature/product, with the ‘specialists’ taking the lead and advising/coaching the others. This leads to better propagation of information around the team, a greater holistic understanding of the project and it allows the team to continue functioning if some of it’s members are off sick, for example. Feeling somewhat drained from Scott’s keynote (but at the same time excited that alot of the points he raised supported actions we are taking at work), I headed into my last presentation for Agile Testing Days 2012 before having to make my way to Tegel to catch the flight home. “Thinking and working agile in an unbending world” with Pete Walen was a talk I was not going to miss! Having spoken to Pete several times during the past few days, I was looking forward to hearing what he was going to say, and I was not disappointed. Pete started off by trying to separate the definitions of ‘Agile’ as in the methodology, and ‘agile’ as in the adjective by pronouncing them the ‘english’ and ‘american’ ways. So Agile pronounced (Ajyle) and agile pronounced (ajul). There was much confusion around what the hell he was talking about, although I thought it was quite clear. Agile – Software development methodology agile – Marked by ready ability to move with quick easy grace; Having a quick resourceful and adaptable character. Anyway, that aside (although it provided a few laughs during the presentation), the point was that many teams that claim to be ‘Agile’ but are not, in fact, ‘agile’ by nature. Implementing ‘Agile’ methodologies that are so prescriptive actually goes against the very nature of Agile development where a team should anticipate, adapt and explore. Pete made a valid point that very few companies intentionally put up roadblocks to impede work, so if work is being blocked/delayed, why? This is where being agile as a team pays off because the team can inspect what’s going on, explore options and adapt their processes. It is through experimentation (and that means trying and failing as well as trying and succeeding) that a team will improve and grow leading to focussing on what really needs to be done to achieve X. So, that was it, the last talk of our conference. I was gutted that we had to miss the closing keynote from Matt Heusser, as Matt was another person I had spoken too a few times during the conference, but the flight would not wait, and just as well we left when we did because the traffic was a nightmare! My Takeaway Triple from Day 3: Release often and release small – don’t leave stuff on the shelf Keep the meaning of the word ‘agile’ in mind when working in ‘Agile Look at testing as more of a skill than a role

Read the article
Thou shalt not put code on a piedestal - Code is a tool, no more, no less

- by Ralf Westphal

“Write great code and everything else becomes easier” is what Paul Pagel believes in. That´s his version of an adage by Brian Marick he cites: “treat code as an end, not just a means.” And he concludes: “My post-Agile world is software craftsmanship.” I wonder, if that´s really the way to go. Will “simply” writing great code lead the software industry into the light? He´s alluding to the philosopher Kant who proposed, a human beings should never be treated as a means, but always as an end. But should we transfer this ethical statement into the world of software? I doubt it. Reason #1: Human beings are categorially different from code. They are autonomous entities who need to find a way of living happily together. To Kant it seemed this goal could only be reached if nobody (ab)used a human being for his/her purposes. Because using a human being, i.e. treating it as a means, would contradict the fundamental autonomy and freedom of human beings. People should hold up a symmetric view of their relationships: Since nobody wants to be (ab)used, nobody should (ab)use anybody else. If you want to be treated decently, with respect, in accordance with your own free will - which means as an end - then do the same to other people. Code is dead, it´s a product, it´s a tool for people to reach their goals. No company spends any money on code other than to save money or earn money in the long run. Code is not a puppy. Enterprises do not commission software development to just feel good in its company. Code is not a buddy. Code is a slave, if you will. A mechanical slave, a non-tangible robot. Code is a tool, is a tool. And if we start to treat it differently, if we elevate its status unduely… I guess that will contort our relationship in a contraproductive way. Please get me right: Just because something is “just a tool”, “just a product” does not mean we should not be careful while designing, building, using it. Right to the contrary. We should be very careful when writing code – but not for the code´s sake! We should be careful because we respect our customers who are fellow human beings who should be treated as an end. If we are careless, neglectful, ignorant when producing code on their behalf, then we´re using them. Being sloppy means you´re caring more for yourself that for your customer. You´re then treating the customer as a means to fulfill some of your own needs. That´s plain unethical behavior. Reason #2: The focus should always be on your purpose, not on any tool. But if code is treated as an end, then the focus is on the code. That might sound right, because where else should be your focus as a software developer? But, well, I´d say, your focus should be on delivering value to your customer. Because in the end your customer does not care if you write a single line of code. She just wants her problem to be solved. Solving problems is the purpose of any contractor. Code must be treated just as a means, a tool we know how to handle very well. But if we´re really trying to be craftsmen then we should be conscious about exactly that and act ethically. That means we must never be so focused on our tool as to be unable to suggest better solutions to the problems of our customers than code. I´m all with Paul when he urges us to “Write great code”. Sure, if you need to write code, then by all means do so. Write the best code you can think of – and then try to improve it. Paul has all the best intentions when he signs Brians “treat code as an end” - but as we all know: “The road to hell is paved with best intentions” ;-) Yes, I can imagine a “hell of code focus”. In fact, I don´t need to imagine it, I´m seeing it quite often. Because code hell is whereever two developers stand together and are so immersed in talking about all sorts of coding tricks, design patterns, code smells, technologies, platforms, tools that they lose sight of the big picture. Talking about TDD or SOLID or refactoring is a sign of consciousness – relative to the “cowboy coders” view of the world. But from yet another point of view TDD, SOLID, and refactoring are just cures for ailments within a system. And I fear, if “Writing great code” is the only focus or the main focus of software development, then we as an industry lose the ability to see that. Focus draws a line around something, it defines a horizon for perceptions and thinking. So if we focus on code our horizon ends where “the land of code” ends. I don´t think that should be our professional attitude. So what about Software Craftsmanship as the next big thing after Agility? I think Software Craftsmanship has an important message for all software developers and beyond. But to make it the successor of the Agility movement seems to miss a point. Agility never claimed to solve all software development problems, I´d say. So to blame it for having missed out on certain aspects of it is wrong. If I had to summarize Agility in one word I´d say “Value”. Agility put value for the customer back in software development. Focus on delivering value early and often – that´s Agility´s mantra. All else follows from that. And I ask you: Is that obsolete? Is delivering value not hip anymore? No, sure not. That´s our very purpose as software developers. So how can Agility become obsolete and need to be replaced? We need to do away with this “either/or”-thinking. It´s either Agility or Lean or Software Craftsmanship or whatnot. Instead we should start integrating concepts and movements. Think “both/and”. Think Agility plus Software Craftsmanship plus Lean plus whatnot. We don´t neet to tear down anything from a piedestal and replace it with a new idol. Instead we should do away with piedestals and arrange whatever is helpful is a circle. Then we can turn to concepts, movements for whatever they are best. After 10 years of Agility we should be able to identify what it was good at – and keep that. Keep Agility around and add whatever Agility was lacking or never concerned with. Add whatever is at the core of Software Craftsmanship. Add whatever is at the core of Lean etc. But don´t call out the age of Post-Agility. Because it better never will end. Because once we start to lose Agility´s core we´re losing focus of the customer.

Read the article
Finally, upgrade from Nokia X3 to Samsung Galaxy S III

This time, something slightly different but nonetheless not less interesting, hopefully. Living on a remote island like Mauritius, ill-praised 'Cyber Island' in the Indian Ocean, has its advantages in life style and relaxed environment to life in but in terms of technological aspects it can be quite a nightmare. Well, I guess this might be different story to report about... one day. Cyber Island Mauritius Despite it's shiny advertisement as Cyber Island and business in ICT hub to Africa, Mauritius is not on the latest track of available models in computer hardware or, in the context of this article, cellulars or smart-phone, or communication technology in general. Okay, I have to admit that this statement is only partly true. Money can buy, even here in Mauritius. Luckily, there are ways and ways to deal with this outcry of modern, read: technological, civilisation issues. Online shopping you might think? Yes, for sure, until you discover in your checkout procedure that a small island in the Indian Ocean isn't a preferred destination for delivery and the precious time you spent on putting your items into your cart and feeding your personal level of anticipation gets ruined on the last stint. Ordering from abroad saves you money Anyway, I got in touch with my personal courier and luckily there were some extra-kilos left in the luggage. First obstacle sorted, we have a Transporter! Okay, on the next occasion off to Amazon online and using their Prime service for fast delivery. Actually, the order was placed on Saturday evening and everything got delivered on Tuesday morning - nice job in less than 72 hours. Okay, among the items of that shopping rush I ordered a shiny Samsung Galaxy S III 16GB in oceanic blue - did I mention, that you hardly get a blue model in Mauritius? - for my BWE. Interesting side-notes: First, Amazon Germany dropped the prices for roughly 30% on the S3, and we got the 16GB model for less than 500 Euro (or approx. Rs. 19.500,-) compared to the usual Rs. 27.000,- on the local market. It even varies whether the local price is inclusive or exclusive VAT (15%). Second, since a while she was bothering me to get an iPhone and an iPad for her, fair enough I thought, decent hardware, posh design and reliable services. Until we watched the 'magical' introduction of Samsung's new models at the IFA exhibition, she read the bashing comments on Google+ on the iPhone 5 and I gave her a brief summary on the law suit between Apple and Samsung in the USA. So, yes, Samsung USA is right, the next big thing is already here - literally. My BWE loves the look and touch of the Galaxy S3. And for me it was more cost-effective in terms of purchases done at the App Store, ups, Play Store. Transfer of contacts, text messages and media files Okay, now that the hardware is in place, how to transfer all those contacts, text messages, media files, etc. between those two devices? In the past, I used to use the Nokia Communication Suite between various models but now for Android? Well, as usual Google and Bing are reliable friends and among the first hits I came across an article about How to Transfer Contacts from Nokia to Android. Couldn't be easier, right? Well, sort of... my main Windows systems are already running on Windows 8, and this actually caused problems with the mobile/smart-phone device drivers. The article provides the download for an older version 1.10 which upgrades to 2.11 (as time of writing this entry) but both couldn't get the Galaxy S3 and the Nokia connected. Shame on me... the product page clearly doesn't mention Windows 8 (for now) and Windows 8 isn't available for the general audience at all... After I took a spare machine running on Windows Vista everything went smooth. Software installed, upgrade done, device drivers for Android automatically downloaded and installed, and the same painless routine for the Nokia part. I think, I rebooted the system twice during the whole setup procedure but hey, it was more or less a distraction while coding some stuff in ASP.NET MVC and Telerik Kendo UI. The transfer of contacts and text messages was done via Wondershare MobileGo for Android, and all media files by moving the additional microSD card from one device to the other. But even without an external SD card, it would have been very easy to copy the files via Windows Explorer directly. Little catch and excellent service Fine, we are almost done and the only step left is to shift the SIM card... Ouch, gotcha! The X3 uses a standard size SIM card while the S III only accepts microSIM form factor. What an irony, bigger smartphone needs smaller SIM card. Luckily, the next showroom of Emtel is just 5 mins away up the road, and the service staff over there know their job. Finally, after roughly 10 mins of paper work, activation and small chit-chat, the S3 came to life on the mobile network. Owning a smart-phone now and knowing that my BWE would like to interact more on social networks away from home, especially to upload pictures and provide local 'check-ins', I activated a data package for her in advance, too. Even that it is Saturday, everything was already done and ready to be used. Nice bonus: The Emtel clerk directly offered me to set up the configuration for the Emtel data services, yes sure, go ahead, this saves me to search for that in the settings. Okay, spoiler-alert here, setting a static APN to access the Emtel network and the internet wouldn't be a challenge. But hey, she already had the phone in her hands and I could keep my eyes on the children. Well done, Emtel! Resume Thanks to the useful software package by Wondershare is was a hands-free experience to transfer all the data from a Nokia mobile on Symbian S60 to a Samsung Galaxy S III on Android Ice Cream Sandwich (ICS). In the future, this wont be a serious issue at all anymore thanks to synchronisation services and cloud storage. And for now, I'm only waiting for the official upgrades for Jelly Bean.

Read the article
Emtel Knowledge Series - Q2/2014

From Cyber Island to Smart Mauritius Cyber Island? Smart Mauritius? - What is Emtel talking about? "With the majority of the population living in urban environments today, the concept of "Smart Cities" has become an urgent necessity. "Smart Cities" refer to an urban transformation which, by using latest ICT technologies makes cities more efficient. Many Governments are setting out ambitious plans to build the cities of the future based on massive connectivity, high bandwidth communications, intelligent sensors and analysis of huge volumes of data. Various researches have shown four key enablers for smart city success - Government leadership, suitable technology infrastructure, solid public-private partnerships and engaged citizens. It is around these enabling factors that telecoms companies can play a vital role in assisting governments to deliver on the smart city vision." The Emtel Knowledge Series goes in compliance with Emtel's 25th anniversary celebrations throughout the year and the master of ceremony, Kim Andersen, mentioned that there will be more upcoming events on a quarterly base. As a representative of the Mauritius Software Craftsmanship Community (MSCC) there was absolutely no hesitation to join in again. Following my visit to the first Emtel Knowledge Series workshop back in February this year, it was great to have another opportunity to meet and exchange with technology experts. But quite frankly what is it with those buzz words... As far as I remember and how it was mentioned "Cyber Island" is an old initiative from around 2005/2006 which has been refreshed in 2010. It implies the empowerment of Information & Communication Technologies (ICT) as an essential factor of growth by the government here in Mauritius. Actually, the first promotional period of Cyber Island brought me here but that's another story. The venue and its own problems Like last time the event was organised and held at the Conference Hall at Cyber Tower I in Ebene. As I've been working there for some years, I know about the frustrating situation of finding a proper parking. So, does Smart Island include better solutions for the search of parking spaces? Maybe, let's see whether I will be able to answer that question at the end of the article. Anyway, after circling around the tower almost two times, I finally got a decent space to put the car, without risking to get a ticket or damage actually. International speakers and their experience Once again, Emtel did a great job to get international expertise onto the stage to share their experience and vision on this kind of embarkment. Personally, I really appreciated the fact they were speakers of global reach and could provide own-experience knowledge. Johan Gott spoke about the fundamental change that the Swedish government ignited in order to move their society and workers' environment away from heavy industry towards a knowledge-based approach. Additionally, we spoke about the effort and transformation of New York City into a greener and more efficient Smart City. Given modern technology he also advised that any kind of available Big Data should be opened to the general public - this openness would provide a playground for anyone to garner new ideas and most probably solid solutions of which no one else thought about before. Emtel Knowledge Series on moving from Cyber Island to Smart Mauritus Later during the afternoon that exact statement regarding openness to and transparency of government-owned Big Data has been emphasised again by the Danish speaker Kim Andersen and his former colleague Mika Jantunen from Finland. Mika continued to underline the important role of the government to provide a solid foundation for a knowledge-based society and mentioned that Finnish citizens have a constitutional right to broadband connectivity. Next to free higher (tertiary) education Finland already produced a good number of innovations, among them are: First country to grant voting rights to women Free higher education Constitutional right to broadband connectivity Nokia Linux Angry Birds Sauna and others... General access to internet via broadband and/or mobile connectivity is surely a key factor towards Smart Cities, or better said Smart Mauritius given the area dimensions and size of population. CTO Paul Valette gave the audience a brief overview of the essential role that Emtel will have to move Mauritius forward towards a knowledge-based and innovation-driven environment for its citizen. What I have seen looks really promising and with recently published information that Mauritians have 127% of mobile capacity - meaning more than 1 mobile, smartphone or tablet per person - it will be crucial to have the right infrastructure for these connected devices. How would it be possible to achieve a knowledge-based society? YouTube to the rescue!Seriously, gaining more knowledge will require to have fast access to educational course material as explained by Dr Kaviraj Sukon, General Director of the Open University of Mauritius. According to him a good number of high-profile universities in the world have opened their course libraries to the general public, among them EDX, Coursera and Open University. Nowadays, you're actually able and enabled to learn for and earn a BSc or even MSc certification on your own pace - no need to attend classed on campus. It was really impressive to see the number of available hours - more than enough for a life-long learning experience! {loadposition content_adsense} Networking in the name of MSCC As briefly mentioned above I was about to combine two approaches for this workshop. Of course, getting latest information and updates on Emtel services available, especially for my business here on the west coast of the island, but also to meet and greet new people for the MSCC. And I think it was very positive on both sides. Let me quickly describe some of the key aspects that happened during the day: Met with Arnaud Meslier and Kellie, both Microsoft to swap latest information on IT events. Hereby, I got an invite to Microsoft Windows Phone 8.1 Dev Camp. Got in touch with Arvin Lockee, Emtel to check our options to meet with the data team, and seizing the opportunity to have a visiting tour at the Emtel Data Centre. Had a great chat with Avinash Meetoo, Knowledge 7, Kim Andersen and Mika Jantunen about the situation of teaching and learning in general and specifically in the private sector here in Mauritius. Additionally, a number of various other interesting chats... Once again, I'm catching up on a couple of business cards in order to provide more background information about the MSCC, and to create a better awareness of MSCC within the local IT businesses. There is more to come soon! Resume of the day The number of attendees during this event has been doubled or even tripled this time. The whole organisation has been improved massively and the combination of presentation and summarizing panel discussions was better than during the previous workshop back in February. Overall, once again a well-organised workshop and I'm already looking forward to join the next workshop in Q3. Update End of July we finally managed to visit the Emtel Data Centre in Arsenal. It was an interesting opportunity for some of our MSCC members.

Read the article
Why you need to learn async in .NET

- by PSteele

I had an opportunity to teach a quick class yesterday about what’s new in .NET 4.0. One of the topics was the TPL (Task Parallel Library) and how it can make async programming easier. I also stressed that this is the direction Microsoft is going with for C# 5.0 and learning the TPL will greatly benefit their understanding of the new async stuff. We had a little time left over and I was able to show some code that uses the Async CTP to accomplish some stuff, but it wasn’t a simple demo that you could jump in to and understand so I thought I’d thrown one together and put it in a blog post. The entire solution file with all of the sample projects is located here. A Simple Example Let’s start with a super-simple example (WindowsApplication01 in the solution). I’ve got a form that displays a label and a button. When the user clicks the button, I want to start displaying the current time for 15 seconds and then stop. What I’d like to write is this: lblTime.ForeColor = Color.Red; for (var x = 0; x < 15; x++) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); Thread.Sleep(1000); } lblTime.ForeColor = SystemColors.ControlText; (Note that I also changed the label’s color while counting – not quite an ILM-level effect, but it adds something to the demo!) As I’m sure most of my readers are aware, you can’t write WinForms code this way. WinForms apps, by default, only have one thread running and it’s main job is to process messages from the windows message pump (for a more thorough explanation, see my Visual Studio Magazine article on multithreading in WinForms). If you put a Thread.Sleep in the middle of that code, your UI will be locked up and unresponsive for those 15 seconds. Not a good UX and something that needs to be fixed. Sure, I could throw an “Application.DoEvents()” in there, but that’s hacky. The Windows Timer Then I think, “I can solve that. I’ll use the Windows Timer to handle the timing in the background and simply notify me when the time has changed”. Let’s see how I could accomplish this with a Windows timer (WindowsApplication02 in the solution): public partial class Form1 : Form { private readonly Timer clockTimer; private int counter; public Form1() { InitializeComponent(); clockTimer = new Timer {Interval = 1000}; clockTimer.Tick += UpdateLabel; } private void UpdateLabel(object sender, EventArgs e) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); counter++; if (counter == 15) { clockTimer.Enabled = false; lblTime.ForeColor = SystemColors.ControlText; } } private void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; counter = 0; clockTimer.Start(); } } Holy cow – things got pretty complicated here. I use the timer to fire off a Tick event every second. Inside there, I can update the label. Granted, I can’t use a simple for/loop and have to maintain a global counter for the number of iterations. And my “end” code (when the loop is finished) is now buried inside the bottom of the Tick event (inside an “if” statement). I do, however, get a responsive application that doesn’t hang or stop repainting while the 15 seconds are ticking away. But doesn’t .NET have something that makes background processing easier? The BackgroundWorker Next I try .NET’s BackgroundWorker component – it’s specifically designed to do processing in a background thread (leaving the UI thread free to process the windows message pump) and allows updates to be performed on the main UI thread (WindowsApplication03 in the solution): public partial class Form1 : Form { private readonly BackgroundWorker worker; public Form1() { InitializeComponent(); worker = new BackgroundWorker {WorkerReportsProgress = true}; worker.DoWork += StartUpdating; worker.ProgressChanged += UpdateLabel; worker.RunWorkerCompleted += ResetLabelColor; } private void StartUpdating(object sender, DoWorkEventArgs e) { var workerObject = (BackgroundWorker) sender; for (int x = 0; x < 15; x++) { workerObject.ReportProgress(0); Thread.Sleep(1000); } } private void UpdateLabel(object sender, ProgressChangedEventArgs e) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); } private void ResetLabelColor(object sender, RunWorkerCompletedEventArgs e) { lblTime.ForeColor = SystemColors.ControlText; } private void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; worker.RunWorkerAsync(); } } Well, this got a little better (I think). At least I now have my simple for/next loop back. Unfortunately, I’m still dealing with event handlers spread throughout my code to co-ordinate all of this stuff in the right order. Time to look into the future. The async way Using the Async CTP, I can go back to much simpler code (WindowsApplication04 in the solution): private async void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; for (var x = 0; x < 15; x++) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); await TaskEx.Delay(1000); } lblTime.ForeColor = SystemColors.ControlText; } This code will run just like the Timer or BackgroundWorker versions – fully responsive during the updates – yet is way easier to implement. In fact, it’s almost a line-for-line copy of the original version of this code. All of the async plumbing is handled by the compiler and the framework. My code goes back to representing the “what” of what I want to do, not the “how”. I urge you to download the Async CTP. All you need is .NET 4.0 and Visual Studio 2010 sp1 – no need to set up a virtual machine with the VS2011 beta (unless, of course, you want to dive right in to the C# 5.0 stuff!). Starting playing around with this today and see how much easier it will be in the future to write async-enabled applications.

Read the article
Benchmarking MySQL Replication with Multi-Threaded Slaves

- by Mat Keep

0 0 1 1145 6530 Homework 54 15 7660 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as a part MySQL 5.6. As the results demonstrate, Multi-Threaded Slaves delivers 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to: · Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data) · Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog) The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments. Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include: · Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion · Crash Safe Slaves and Binlog · Optimized Row Based Replication · Replication Event Checksums · Time Delayed Replication These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article Back to the benchmark - details are as follows. Environment The test environment consisted of two Linux servers: · one running the replication master · one running the replication slave. Only the slave was involved in the actual measurements, and was based on the following configuration: - Hardware: Oracle Sun Fire X4170 M2 Server - CPU: 2 sockets, 6 cores with hyper-threading, 2930 MHz. - OS: 64-bit Oracle Enterprise Linux 6.1 - Memory: 48 GB Test Procedure Initial Setup: Two MySQL servers were started on two different hosts, configured as replication master and slave. 10 sysbench schemas were created, each with a single table: CREATE TABLE `sbtest` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k` (`k`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 10,000 rows were inserted in each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location and the slave threads position in the master binlog noted. 10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate a random schema load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements: UPDATE sbtest set k=k+1 WHERE id = <random row> In total, this generated 100,000 update statements to later replicate during the test itself. Test Methodology: The number of slave workers to test with was configured using: SET GLOBAL slave_parallel_workers=<workers> Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave. The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries, doing "select master_pos_wait()". When master_pos_wait() returned, the benchmark clock was stopped and the duration calculated. The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS). Test Reset: The test-reset cycle was implemented as follows: · the slave was stopped · the slave data directory replaced with the previous backup · the slave restarted with the slave threads replication pointer repositioned to the point before the update queries in the binlog. The test could then be repeated with identical set of queries but a different number of slave worker threads, enabling a fair comparison. The Test-Reset cycle was repeated 3 times for 0-24 number of workers and the QPS metric calculated and averaged for each worker count. MySQL Configuration The relevant configuration settings used for MySQL are as follows: binlog-format=STATEMENT relay-log-info-repository=TABLE master-info-repository=TABLE As described in the test procedure, the slave_parallel_workers setting was modified as part of the test logic. The consequence of changing this setting is: 0 worker threads: - current (i.e. single threaded) sequential mode - 1 x IO thread and 1 x SQL thread - SQL thread both reads and executes the events 1 worker thread: - sequential mode - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread - coordinator reads the event and hands it to the worker who executes 2+ worker threads: - parallel execution - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads - coordinator reads events and hands them to the workers who execute them Results Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 x schemas. This result is compared to the current replication implementation which is based on a single SQL thread only (i.e. zero worker threads). Figure 1: 5x Higher Performance with Multi-Threaded Slaves The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented. The raw numbers behind this graph are reported in the Appendix section of this post. Figure 2: Detailed Results As the results above show, the configuration does not scale noticably from 5 to 9 worker threads. When configured with 10 worker threads however, scalability increases significantly. The conclusion therefore is that it is desirable to configure the same number of worker threads as schemas. Other conclusions from the results: · Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution. · As expected, having more workers than schemas adds no visible benefit. Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance: relay-log-info-repository=TABLE master-info-repository=TABLE For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE. Conclusion As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas. This, and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select Development Release tab). You can learn more about MySQL 5.6 from the documentation Please don’t hesitate to comment on this or other replication blogs with feedback and questions. Appendix – Detailed Results

Read the article
I see no LOBs!

- by Paul White

Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns? I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place. The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose. Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types. To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute. Ok, enough of the pre-amble. I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect: IF OBJECT_ID(N'dbo.Test', N'U') IS NOT NULL DROP TABLE dbo.Test GO CREATE TABLE dbo.Test ( row_id NUMERIC IDENTITY NOT NULL, col01 NVARCHAR(450) NOT NULL, col02 NVARCHAR(450) NOT NULL, col03 NVARCHAR(450) NOT NULL, col04 NVARCHAR(450) NOT NULL, col05 NVARCHAR(450) NOT NULL, col06 NVARCHAR(450) NOT NULL, col07 NVARCHAR(450) NOT NULL, col08 NVARCHAR(450) NOT NULL, col09 NVARCHAR(450) NOT NULL, col10 NVARCHAR(450) NOT NULL, CONSTRAINT [PK dbo.Test row_id] PRIMARY KEY CLUSTERED (row_id) ) ; The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row: WITH Numbers AS ( -- Generates numbers 1 - 450 inclusive SELECT TOP (450) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) INSERT dbo.Test WITH (TABLOCKX) SELECT REPLICATE(N'A', N.n), REPLICATE(N'B', N.n), REPLICATE(N'C', N.n), REPLICATE(N'D', N.n), REPLICATE(N'E', N.n), REPLICATE(N'F', N.n), REPLICATE(N'G', N.n), REPLICATE(N'H', N.n), REPLICATE(N'I', N.n), REPLICATE(N'J', N.n) FROM Numbers AS N ORDER BY N.n ASC ; Once those two scripts have run, the table contains 450 rows and 10 columns of data like this: Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example: -- Find the maximum length of the data in -- column 5 for a range of rows SELECT result = MAX(DATALENGTH(T.col05)) FROM dbo.Test AS T WHERE row_id BETWEEN 50 AND 100 ; But with a different query… -- Read all the data in column 1 SELECT result = MAX(DATALENGTH(T.col01)) FROM dbo.Test AS T ; …suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect. The Explanation If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes. If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE. These special data types store the large data values in a separate structure, with just a small pointer left in the original row. Row Overflow SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes. You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit. Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types). Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage. The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others. In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row. Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved! Sneaky LOBs Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals. The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage. Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row. Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now. Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types. The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB. And Finally… In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes. A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data. The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row. You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when. You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this). Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance. It makes more sense than ever to choose column data types sensibly. If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word. Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs. I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone? © Paul White 2011 email: [email protected] twitter: @SQL_Kiwi

Read the article
How do I restrict concurrent statistics gathering to a small set of tables from a single schema?

- by Maria Colgan

I got an interesting question from one of my colleagues in the performance team last week about how to restrict a concurrent statistics gather to a small subset of tables from one schema, rather than the entire schema. I thought I would share the solution we came up with because it was rather elegant, and took advantage of concurrent statistics gathering, incremental statistics, and the not so well known “obj_filter_list” parameter in DBMS_STATS.GATHER_SCHEMA_STATS procedure. You should note that the solution outline below with “obj_filter_list” still applies, even when concurrent statistics gathering and/or incremental statistics gathering is disabled. The reason my colleague had asked the question in the first place was because he wanted to enable incremental statistics for 5 large partitioned tables in one schema. The first time you gather statistics after you enable incremental statistics on a table, you have to gather statistics for all of the existing partitions so that a synopsis may be created for them. If the partitioned table in question is large and contains a lot of partition, this could take a considerable amount of time. Since my colleague only had the Exadata environment at his disposal overnight, he wanted to re-gather statistics on 5 partition tables as quickly as possible to ensure that it all finished before morning. Prior to Oracle Database 11g Release 2, the only way to do this would have been to write a script with an individual DBMS_STATS.GATHER_TABLE_STATS command for each partition, in each of the 5 tables, as well as another one to gather global statistics on the table. Then, run each script in a separate session and manually manage how many of this session could run concurrently. Since each table has over one thousand partitions that would definitely be a daunting task and would most likely keep my colleague up all night! In Oracle Database 11g Release 2 we can take advantage of concurrent statistics gathering, which enables us to gather statistics on multiple tables in a schema (or database), and multiple (sub)partitions within a table concurrently. By using concurrent statistics gathering we no longer have to run individual statistics gathering commands for each partition. Oracle will automatically create a statistics gathering job for each partition, and one for the global statistics on each partitioned table. With the use of concurrent statistics, our script can now be simplified to just five DBMS_STATS.GATHER_TABLE_STATS commands, one for each table. This approach would work just fine but we really wanted to get this down to just one command. So how can we do that? You may be wondering why we didn’t just use the DBMS_STATS.GATHER_SCHEMA_STATS procedure with the OPTION parameter set to ‘GATHER STALE’. Unfortunately the statistics on the 5 partitioned tables were not stale and enabling incremental statistics does not mark the existing statistics stale. Plus how would we limit the schema statistics gather to just the 5 partitioned tables? So we went to ask one of the statistics developers if there was an alternative way. The developer told us the advantage of the “obj_filter_list” parameter in DBMS_STATS.GATHER_SCHEMA_STATS procedure. The “obj_filter_list” parameter allows you to specify a list of objects that you want to gather statistics on within a schema or database. The parameter takes a collection of type DBMS_STATS.OBJECTTAB. Each entry in the collection has 5 feilds; the schema name or the object owner, the object type (i.e., ‘TABLE’ or ‘INDEX’), object name, partition name, and subpartition name. You don't have to specify all five fields for each entry. Empty fields in an entry are treated as if it is a wildcard field (similar to ‘*’ character in LIKE predicates). Each entry corresponds to one set of filter conditions on the objects. If you have more than one entry, an object is qualified for statistics gathering as long as it satisfies the filter conditions in one entry. You first must create the collection of objects, and then gather statistics for the specified collection. It’s probably easier to explain this with an example. I’m using the SH sample schema but needed a couple of additional partitioned table tables to get recreate my colleagues scenario of 5 partitioned tables. So I created SALES2, SALES3, and COSTS2 as copies of the SALES and COSTS table respectively (setup.sql). I also deleted statistics on all of the tables in the SH schema beforehand to more easily demonstrate our approach. Step 0. Delete the statistics on the tables in the SH schema. Step 1. Enable concurrent statistics gathering. Remember, this has to be done at the global level. Step 2. Enable incremental statistics for the 5 partitioned tables. Step 3. Create the DBMS_STATS.OBJECTTAB and pass it to the DBMS_STATS.GATHER_SCHEMA_STATS command. Here, you will notice that we defined two variables of DBMS_STATS.OBJECTTAB type. The first, filter_lst, will be used to pass the list of tables we want to gather statistics on, and will be the value passed to the obj_filter_list parameter. The second, obj_lst, will be used to capture the list of tables that have had statistics gathered on them by this command, and will be the value passed to the objlist parameter. In Oracle Database 11g Release 2, you need to specify the objlist parameter in order to get the obj_filter_list parameter to work correctly due to bug 14539274. Will also needed to define the number of objects we would supply in the obj_filter_list. In our case we ere specifying 5 tables (filter_lst.extend(5)). Finally, we need to specify the owner name and object name for each of the objects in the list. Once the list definition is complete we can issue the DBMS_STATS.GATHER_SCHEMA_STATS command. Step 4. Confirm statistics were gathered on the 5 partitioned tables. Here are a couple of other things to keep in mind when specifying the entries for the obj_filter_list parameter. If a field in the entry is empty, i.e., null, it means there is no condition on this field. In the above example , suppose you remove the statement Obj_filter_lst(1).ownname := ‘SH’; You will get the same result since when you have specified gather_schema_stats so there is no need to further specify ownname in the obj_filter_lst. All of the names in the entry are normalized, i.e., uppercased if they are not double quoted. So in the above example, it is OK to use Obj_filter_lst(1).objname := ‘sales’;. However if you have a table called ‘MyTab’ instead of ‘MYTAB’, then you need to specify Obj_filter_lst(1).objname := ‘”MyTab”’; As I said before, although we have illustrated the usage of the obj_filter_list parameter for partitioned tables, with concurrent and incremental statistics gathering turned on, the obj_filter_list parameter is generally applicable to any gather_database_stats, gather_dictionary_stats and gather_schema_stats command. You can get a copy of the script I used to generate this post here. +Maria Colgan

Read the article
Not attending the LUGM mini-meetup - 05. Oct 2013

Not attending a meeting of the LUGM can be fun, too. It's getting a bit of a habit that Ish is organising small gatherings, aka mini-meetups, of the Linux User Group Mauritius/Meta (LUGM) almost every Saturday. There they mainly discuss and talk about various elements of using Linux as ones main operating systems and the possibilities you are going to have. On top of course, some tips & tricks about mastering the command line and initial steps in scripting or even writing HTML. In general, sounds like a good portion of fun and great spirit of community. Unfortunately, I'm usually quite busy with private and family matters during the weekend and so I already signalised that I wouldn't be around. Well, at least not physically... But this Saturday a couple of things worked out faster than expected and so I was hanging out on my machine. I made virtual contact with one of Pawan's messages over on Facebook... And somehow that kicked off some kind of an online game fun on basic configuration of Apache HTTPd 2.2.x, PHP 5.x and how to improve the overall performance of a newly installed blog based on WordPress. Default configuration files Nitin's website finally came alive and despite the dark theme and the hidden Apple 'fanboy' advertisement I was more interested in the technical situation. As with any new installation there is usually quite some adjustment to be done. And Nitin's page was no exception. Unfortunately, out of the box installations of Apache httpd and PHP are too verbose and expose too much information under the hood. You might think that this isn't really a problem at all, well, think about it again after completely reading this article. First, I checked the HTTP response headers - using either Chrome Developer Tools or Firefox Web Developer extension - of Nitin's page and based on that I advised him to lower the noise levels a little bit. It's not really necessary that detailed information about web server software and scripting language has to be published in every response made. Quite a number of script kiddies and exploits actually check for version specifics prior to an attack. So, removing at least version details hardens the system a little bit. In particular, I'm talking about these response values: Server X-Powered-By How to achieve that? By tweaking the configuration files... Namely, we are going to look into the following ones: apache2.conf httpd.conf .htaccess php.ini The above list contains some additional files, I'm talking about in the next paragraphs. Anyway, those are the ones involved. Tweaking Apache Open your favourite text editor and start to modify the apache2.conf. Eventually, you might like to have a quick peak at the file to see whether it is necessary to adjust it or not. Following is a handy combination of commands to get an overview of your active directives: # sudo grep -v '#' /etc/apache2/apache2.conf | grep -v '^$' | less There you keep an eye on those two Apache directives: ServerSignature Off ServerTokens Prod If that's not the case, change them as highlighted above. In order to activate your modifications you have to restart Apache httpd server. On Debian and Ubuntu you might use apache2ctl for that, on other distributions you might have to use service or run the init-scripts again: # sudo apache2ctl configtestSyntax OK# sudo apache2ctl restart Refresh your website and check the HTTP response header. Tweaking PHP5 (a little bit) Next, check your php.ini file with the following statement: # sudo grep -v ';' /etc/php5/apache2/php.ini | grep -v '^$' | less And check the value of expose_php = Off Again, if it's not as highlighted, change it... Some more Apache love Okay, back to Apache it might also be interesting to improve the situation about browser caching and removing more obsolete information. When you run your website against the usual performance checks like Google Page Speed and Yahoo YSlow you might see those check points with bad grades on a standard, default configuration. Well, this can be done easily. Configure entity tags (ETags) ETags are only interesting when you run your websites on a farm of multiple web servers. Removing this data for your static resources is very simple in Apache. As we are going to deal with the HTTP response header information you have to ensure that Apache is capable to manipulate them. First, check your enabled modules: # sudo ls -al /etc/apache2/mods-enabled/ | grep headers And in case that the 'headers' module is not listed, you have to enable it from the available ones: # sudo a2enmod headers Second, check your httpd.conf file (in case it exists): # sudo grep -v '#' /etc/apache2/httpd.conf | grep -v '^$' | less In newer (better said fresh) installations you might have to create a new configuration file below your conf.d folder with your favourite text editor like so: # sudo nano /etc/apache2/conf.d/headers.conf Then, in order to tweak your HTTP responses either check for those lines or add them: Header unset ETagFileETag None In case that your file doesn't exist or those lines are missing, feel free to create/add them. Afterwards, check your Apache configuration syntax and restart your running instances as already shown above: # sudo apache2ctl configtestSyntax OK# sudo apache2ctl restart Add Expires headers To improve the loading performance of your website, you should take some care into the proper configuration of how to leverage the browser's ability to cache certain resources and files. This is done by adding an Expires: value to the HTTP response header. Generally speaking it is advised that you specify a near-future, read: 1 week or a little bit more, for your static content like JavaScript files or Cascading Style Sheets. One solution to adjust this is to put some instructions into the .htaccess file in the root folder of your web site. Of course, this could also be placed into a more generic location of your Apache installation but honestly, I'd like to keep this at the web site level. Following some adjustments I'm currently using on this blog site: # Turn on Expires and set default to 0ExpiresActive OnExpiresDefault A0 # Set up caching on media files for 1 year (forever?)<FilesMatch "\.(flv|ico|pdf|avi|mov|ppt|doc|mp3|wmv|wav)$">ExpiresDefault A29030400Header append Cache-Control "public"</FilesMatch> # Set up caching on media files for 1 week<FilesMatch "\.(js|css)$">ExpiresDefault A604800Header append Cache-Control "public"</FilesMatch> # Set up caching on media files for 31 days<FilesMatch "\.(gif|jpg|jpeg|png|swf)$">ExpiresDefault A2678400Header append Cache-Control "public"</FilesMatch> As we are editing the .htaccess files, it is not necessary to restart Apache. In case that your web site doesn't load anymore or you're experiencing an error while trying to restart your httpd, check that the 'expires' module is actually an enabled module: # ls -al /etc/apache2/mods-enabled/ | grep expires# sudo a2enmod expires Of course, the instructions above a re not feature complete but I hope that they might provide a better default configuration for your LAMP stack. Resume of the day Within a couple of hours, and while being occupied with an eLearning course on SQL Server 2012, I had some good fun in helping and assisting other LUGM members while they were some kilometers away at Bagatelle. According to other blog articles it seems that Nitin had quite some moments of desperation. Just for the records: At no time it was my intention to either kick his butt or pull a leg on him. Simply, providing some input based on the lessons I've learned over the last couple of years configuring Apache HTTPd and PHP. Check out the other blogs, too: LUGM mini-meetup... Epic! Superb Saturday Linux Meetup And last but not least, the man himself: The end of a new beginning Cheers, and happy community'ing! Updates Due to our weekly Code & Coffee sessions in the MSCC community, I had a chance to talk to Nitin directly and he showed me the problems directly on his machine. This led to update this article hence the paragraphs on enabling the modules 'headers' and 'expires'.

Read the article
When is a Seek not a Seek?

- by Paul White

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10. One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before. The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return. The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0). Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate. Adding the opaque modulo expression results in SQL Server guessing at the selectivity. As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement. For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models. Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054. The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution. We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1). It is no coincidence that 126 = 63 * 2, by the way. It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing. There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon. To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query. I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101. Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value. Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf). We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101). It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169). Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks. It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster. Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms. The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek? When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

Read the article
How to Recover From a Virus Infection: 3 Things You Need to Do

- by Chris Hoffman

If your computer becomes infected with a virus or another piece of malware, removing the malware from your computer is only the first step. There’s more you need to do to ensure you’re secure. Note that not every antivirus alert is an actual infection. If your antivirus program catches a virus before it ever gets a chance to run on your computer, you’re safe. If it catches the malware later, you have a bigger problem. Change Your Passwords You’ve probably used your computer to log into your email, online banking websites, and other important accounts. Assuming you had malware on your computer, the malware could have logged your passwords and uploaded them to a malicious third party. With just your email account, the third party could reset your passwords on other websites and gain access to almost any of your online accounts. To prevent this, you’ll want to change the passwords for your important accounts — email, online banking, and whatever other important accounts you’ve logged into from the infected computer. You should probably use another computer that you know is clean to change the passwords, just to be safe. When changing your passwords, consider using a password manager to keep track of strong, unique passwords and two-factor authentication to prevent people from logging into your important accounts even if they know your password. This will help protect you in the future. Ensure the Malware Is Actually Removed Once malware gets access to your computer and starts running, it has the ability to do many more nasty things to your computer. For example, some malware may install rootkit software and attempt to hide itself from the system. Many types of Trojans also “open the floodgates” after they’re running, downloading many different types of malware from malicious web servers to the local system. In other words, if your computer was infected, you’ll want to take extra precautions. You shouldn’t assume it’s clean just because your antivirus removed what it found. It’s probably a good idea to scan your computer with multiple antivirus products to ensure maximum detection. You may also want to run a bootable antivirus program, which runs outside of Windows. Such bootable antivirus programs will be able to detect rootkits that hide themselves from Windows and even the software running within Windows. avast! offers the ability to quickly create a bootable CD or USB drive for scanning, as do many other antivirus programs. You may also want to reinstall Windows (or use the Refresh feature on Windows 8) to get your computer back to a clean state. This is more time-consuming, especially if you don’t have good backups and can’t get back up and running quickly, but this is the only way you can have 100% confidence that your Windows system isn’t infected. It’s all a matter of how paranoid you want to be. Figure Out How the Malware Arrived If your computer became infected, the malware must have arrived somehow. You’ll want to examine your computer’s security and your habits to prevent more malware from slipping through in the same way. Windows is complex. For example, there are over 50 different types of potentially dangerous file extensions that can contain malware to keep track of. We’ve tried to cover many of the most important security practices you should be following, but here are some of the more important questions to ask: Are you using an antivirus? – If you don’t have an antivirus installed, you should. If you have Microsoft Security Essentials (known as Windows Defender on Windows 8), you may want to switch to a different antivirus like the free version of avast!. Microsoft’s antivirus product has been doing very poorly in tests. Do you have Java installed? – Java is a huge source of security problems. The majority of computers on the Internet have an out-of-date, vulnerable version of Java installed, which would allow malicious websites to install malware on your computer. If you have Java installed, uninstall it. If you actually need Java for something (like Minecraft), at least disable the Java browser plugin. If you’re not sure whether you need Java, you probably don’t. Are any browser plugins out-of-date? – Visit Mozilla’s Plugin Check website (yes, it also works in other browsers, not just Firefox) and see if you have any critically vulnerable plugins installed. If you do, ensure you update them — or uninstall them. You probably don’t need older plugins like QuickTime or RealPlayer installed on your computer, although Flash is still widely used. Are your web browser and operating system set to automatically update? – You should be installing updates for Windows via Windows Update when they appear. Modern web browsers are set to automatically update, so they should be fine — unless you went out of your way to disable automatic updates. Using out-of-date web browsers and Windows versions is dangerous. Are you being careful about what you run? – Watch out when downloading software to ensure you don’t accidentally click sketchy advertisements and download harmful software. Avoid pirated software that may be full of malware. Don’t run programs from email attachments. Be careful about what you run and where you get it from in general. If you can’t figure out how the malware arrived because everything looks okay, there’s not much more you can do. Just try to follow proper security practices. You may also want to keep an extra-close eye on your credit card statement for a while if you did any online-shopping recently. As so much malware is now related to organized crime, credit card numbers are a popular target.

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
Real tortoises keep it slow and steady. How about the backups?

- by Maria Zakourdaev

… Four tortoises were playing in the backyard when they decided they needed hibiscus flower snacks. They pooled their money and sent the smallest tortoise out to fetch the snacks. Two days passed and there was no sign of the tortoise. "You know, she is taking a lot of time", said one of the tortoises. A little voice from just out side the fence said, "If you are going to talk that way about me I won't go." Is it too much to request from the quite expensive 3rd party backup tool to be a way faster than the SQL server native backup? Or at least save a respectable amount of storage by producing a really smaller backup files? By saying “really smaller”, I mean at least getting a file in half size. After Googling the internet in an attempt to understand what other “sql people” are using for database backups, I see that most people are using one of three tools which are the main players in SQL backup area: LiteSpeed by Quest SQL Backup by Red Gate SQL Safe by Idera The feedbacks about those tools are truly emotional and happy. However, while reading the forums and blogs I have wondered, is it possible that many are accustomed to using the above tools since SQL 2000 and 2005. This can easily be understood due to the fact that a 300GB database backup for instance, using regular a SQL 2005 backup statement would have run for about 3 hours and have produced ~150GB file (depending on the content, of course). Then you take a 3rd party tool which performs the same backup in 30 minutes resulting in a 30GB file leaving you speechless, you run to management persuading them to buy it due to the fact that it is definitely worth the price. In addition to the increased speed and disk space savings you would also get backup file encryption and virtual restore - features that are still missing from the SQL server. But in case you, as well as me, don’t need these additional features and only want a tool that performs a full backup MUCH faster AND produces a far smaller backup file (like the gain you observed back in SQL 2005 days) you will be quite disappointed. SQL Server backup compression feature has totally changed the market picture. Medium size database. Take a look at the table below, check out how my SQL server 2008 R2 compares to other tools when backing up a 300GB database. It appears that when talking about the backup speed, SQL 2008 R2 compresses and performs backup in similar overall times as all three other tools. 3rd party tools maximum compression level takes twice longer. Backup file gain is not that impressive, except the highest compression levels but the price that you pay is very high cpu load and much longer time. Only SQL Safe by Idera was quite fast with it’s maximum compression level but most of the run time have used 95% cpu on the server. Note that I have used two types of destination storage, SATA 11 disks and FC 53 disks and, obviously, on faster storage have got my backup ready in half time. Looking at the above results, should we spend money, bother with another layer of complexity and software middle-man for the medium sized databases? I’m definitely not going to do so. Very large database As a next phase of this benchmark, I have moved to a 6 terabyte database which was actually my main backup target. Note, how multiple files usage enables the SQL Server backup operation to use parallel I/O and remarkably increases it’s speed, especially when the backup device is heavily striped. SQL Server supports a maximum of 64 backup devices for a single backup operation but the most speed is gained when using one file per CPU, in the case above 8 files for a 2 Quad CPU server. The impact of additional files is minimal. However, SQLsafe doesn’t show any speed improvement between 4 files and 8 files. Of course, with such huge databases every half percent of the compression transforms into the noticeable numbers. Saving almost 470GB of space may turn the backup tool into quite valuable purchase. Still, the backup speed and high CPU are the variables that should be taken into the consideration. As for us, the backup speed is more critical than the storage and we cannot allow a production server to sustain 95% cpu for such a long time. Bottomline, 3rd party backup tool developers, we are waiting for some breakthrough release. There are a few unanswered questions, like the restore speed comparison between different tools and the impact of multiple backup files on restore operation. Stay tuned for the next benchmarks. Benchmark server: SQL Server 2008 R2 sp1 2 Quad CPU Database location: NetApp FC 15K Aggregate 53 discs Backup statements: No matter how good that UI is, we need to run the backup tasks from inside of SQL Server Agent to make sure they are covered by our monitoring systems. I have used extended stored procedures (command line execution also is an option, I haven’t noticed any impact on the backup performance). SQL backup LiteSpeed SQL Backup SQL safe backup database <DBNAME> to disk= '\\<networkpath>\par1.bak' , disk= '\\<networkpath>\par2.bak', disk= '\\<networkpath>\par3.bak' with format, compression EXECUTE master.dbo.xp_backup_database @database = N'<DBName>', @backupname= N'<DBName> full backup', @desc = N'Test', @compressionlevel=8, @filename= N'\\<networkpath>\par1.bak', @filename= N'\\<networkpath>\par2.bak', @filename= N'\\<networkpath>\par3.bak', @init = 1 EXECUTE master.dbo.sqlbackup '-SQL "BACKUP DATABASE <DBNAME> TO DISK= ''\\<networkpath>\par1.sqb'', DISK= ''\\<networkpath>\par2.sqb'', DISK= ''\\<networkpath>\par3.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 4, INIT"' EXECUTE master.dbo.xp_ss_backup @database = 'UCMSDB', @filename = '\\<networkpath>\par1.bak', @backuptype = 'Full', @compressionlevel = 4, @backupfile = '\\<networkpath>\par2.bak', @backupfile = '\\<networkpath>\par3.bak' If you still insist on using 3rd party tools for the backups in your production environment with maximum compression level, you will definitely need to consider limiting cpu usage which will increase the backup operation time even more: RedGate : use THREADPRIORITY option ( values 0 – 6 ) LiteSpeed : use @throttle ( percentage, like 70%) SQL safe : the only thing I have found was @Threads option. Yours, Maria

Read the article
Nvidia drivers don't work with mainline kernel

- by dutchie

I want to try some of the new features in the btrfs filesystem, and to do that I need to use a newer kernel than is included in Ubuntu 12.04. To do that, I have installed linux-headers-3.4.0-030400_3.4.0-030400.201205210521_all.deb, linux-headers-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb, and linux-image-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb from the mainline kernel download here. However, on rebooting into the 3.4 kernel, my desktop is stuck at a very low resolution and I cannot increase it to the full. This did happen when I first installed, but a simple install of the nvidia-current package got everything working nicely with my GTX570 card. There were appear to be some DKMS errors when I installed the kernel, and they indicated I should look at /var/lib/dkms/nvidia-current/295.40/build/make.log: josh@sirius:~/Downloads$ sudo dpkg -i linux-*.deb Selecting previously unselected package linux-headers-3.4.0-030400. (Reading database ... 309400 files and directories currently installed.) Unpacking linux-headers-3.4.0-030400 (from linux-headers-3.4.0-030400_3.4.0-030400.201205210521_all.deb) ... Selecting previously unselected package linux-headers-3.4.0-030400-generic. Unpacking linux-headers-3.4.0-030400-generic (from linux-headers-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb) ... Selecting previously unselected package linux-image-3.4.0-030400-generic. Unpacking linux-image-3.4.0-030400-generic (from linux-image-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb) ... Done. Setting up linux-headers-3.4.0-030400 (3.4.0-030400.201205210521) ... Setting up linux-headers-3.4.0-030400-generic (3.4.0-030400.201205210521) ... Examining /etc/kernel/header_postinst.d. run-parts: executing /etc/kernel/header_postinst.d/dkms 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic ERROR (dkms apport): kernel package linux-headers-3.4.0-030400-generic is not supported Error! Bad return status for module build on kernel: 3.4.0-030400-generic (x86_64) Consult /var/lib/dkms/nvidia-current/295.40/build/make.log for more information. Setting up linux-image-3.4.0-030400-generic (3.4.0-030400.201205210521) ... Running depmod. update-initramfs: deferring update (hook will be called later) Examining /etc/kernel/postinst.d. run-parts: executing /etc/kernel/postinst.d/dkms 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic ERROR (dkms apport): kernel package linux-headers-3.4.0-030400-generic is not supported Error! Bad return status for module build on kernel: 3.4.0-030400-generic (x86_64) Consult /var/lib/dkms/nvidia-current/295.40/build/make.log for more information. run-parts: executing /etc/kernel/postinst.d/initramfs-tools 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic update-initramfs: Generating /boot/initrd.img-3.4.0-030400-generic run-parts: executing /etc/kernel/postinst.d/pm-utils 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic run-parts: executing /etc/kernel/postinst.d/update-notifier 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic run-parts: executing /etc/kernel/postinst.d/zz-update-grub 3.4.0-030400-generic /boot/vmlinuz-3.4.0-030400-generic Generating grub.cfg ... Found linux image: /boot/vmlinuz-3.4.0-030400-generic Found initrd image: /boot/initrd.img-3.4.0-030400-generic Found linux image: /boot/vmlinuz-3.2.0-24-generic Found initrd image: /boot/initrd.img-3.2.0-24-generic Found memtest86+ image: /memtest86+.bin Found Ubuntu 12.04 LTS (12.04) on /dev/sda1 Found Windows 7 (loader) on /dev/sda2 Found Windows 7 (loader) on /dev/sda3 done /var/lib/dkms/nvidia-current/295.40/build/make.log: DKMS make.log for nvidia-current-295.40 for kernel 3.4.0-030400-generic (x86_64) Thu Jun 7 00:58:39 BST 2012 NVIDIA: calling KBUILD... test -e include/generated/autoconf.h -a -e include/config/auto.conf || ( \ echo; \ echo " ERROR: Kernel configuration is invalid."; \ echo " include/generated/autoconf.h or include/config/auto.conf are missing.";\ echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ echo; \ /bin/false) mkdir -p /var/lib/dkms/nvidia-current/295.40/build/.tmp_versions ; rm -f /var/lib/dkms/nvidia-current/295.40/build/.tmp_versions/* make -f scripts/Makefile.build obj=/var/lib/dkms/nvidia-current/295.40/build cc -Wp,-MD,/var/lib/dkms/nvidia-current/295.40/build/.nv.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/include -I/usr/src/linux-headers-3.4.0-030400-generic/arch/x86/include -Iarch/x86/include/generated -Iinclude -include /usr/src/linux-headers-3.4.0-030400-generic/include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -fstack-protector -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -Wframe-larger-than=1024 -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -DCC_HAVE_ASM_GOTO -I/var/lib/dkms/nvidia-current/295.40/build -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"295.40\" -Wno-unused-function -Wuninitialized -mno-red-zone -mcmodel=kernel -UDEBUG -U_DEBUG -DNDEBUG -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(nv)" -D"KBUILD_MODNAME=KBUILD_STR(nvidia)" -c -o /var/lib/dkms/nvidia-current/295.40/build/.tmp_nv.o /var/lib/dkms/nvidia-current/295.40/build/nv.c In file included from include/linux/kernel.h:19:0, from include/linux/sched.h:55, from include/linux/utsname.h:35, from /var/lib/dkms/nvidia-current/295.40/build/nv-linux.h:38, from /var/lib/dkms/nvidia-current/295.40/build/nv.c:13: include/linux/bitops.h: In function ‘hweight_long’: include/linux/bitops.h:66:41: warning: signed and unsigned type in conditional expression [-Wsign-compare] In file included from /usr/src/linux-headers-3.4.0-030400-generic/arch/x86/include/asm/uaccess.h:577:0, from include/linux/poll.h:14, from /var/lib/dkms/nvidia-current/295.40/build/nv-linux.h:97, from /var/lib/dkms/nvidia-current/295.40/build/nv.c:13: /usr/src/linux-headers-3.4.0-030400-generic/arch/x86/include/asm/uaccess_64.h: In function ‘copy_from_user’: /usr/src/linux-headers-3.4.0-030400-generic/arch/x86/include/asm/uaccess_64.h:53:6: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] In file included from /var/lib/dkms/nvidia-current/295.40/build/nv.c:13:0: /var/lib/dkms/nvidia-current/295.40/build/nv-linux.h: At top level: /var/lib/dkms/nvidia-current/295.40/build/nv-linux.h:114:75: fatal error: asm/system.h: No such file or directory compilation terminated. make[3]: *** [/var/lib/dkms/nvidia-current/295.40/build/nv.o] Error 1 make[2]: *** [_module_/var/lib/dkms/nvidia-current/295.40/build] Error 2 NVIDIA: left KBUILD. nvidia.ko failed to build! make[1]: *** [module] Error 1 make: *** [module] Error 2

Read the article
D2K to OA Framework Transition

- by PRajkumar

What is the difference between D2K form and OA Framework? It is a very innocent but important question for someone that desires to make transition from D2K to OA Framework. I hope you have already read and implemented OA Framework Getting Started. I will re-visit my own experience of implementing HelloWorld program in "OA Framework". When I implemented HelloWorld a year ago, I had no clue as to what I was doing & why I was doing those steps. I merely copied the steps from Oracle Tutorial without understanding them. Hence in this blog, I will try to explain in simple manner the meaning of OA Framework HelloWorld Program and compare the steps to D2K form [where possible]. To keep things simple, only basics will be discussed. Following key Steps were needed for HelloWorld Step 1 Create a new Workspace and a new Project as dictated by Oracle's tutorial. When defining project, you will specify a default package, which in this case was oracle.apps.ak.hello This means the following: - ak is the short name of the Application in Oracle [means fnd_applications.short_name] hello is the name of your project Step 2 Next, you will create a OA Page within hello project Think OA Page as the fmx file itself in D2K. I am saying so because this page gets attached to the form function. This page will be created within hello project, hence the package name oracle.apps.ak.hello.webui Note the webui, it is a convention to have page in webui, means this page represents the Web User Interface You will assign the default AM [OAApplicationModule]. Think of AM "Connection Manager" and "Transaction State Manager" for your page I can't co-relate this to anything in D2k, as there is no concept of Connection Pooling and that D2k is not stateless. Reason being that as soon as you kick off a D2K Form, it connects to a single session of Oracle and sticks to that single Oracle database session. So is not the case in OAF, hence AM is needed. Step 3 You create Region within the Page. · Region is what will store your fields. Text input fields will be of type messageTextInput. Think of Canvas in D2K. You can have nested regions. Stacked Canvas in D2K comes the closest to this component of OA Framework Step 4 Add a button to one of the nested regions The itemStyle should be submitButton, in case you want the page to be submitted when this button is clicked There is no WHEN-BUTTON-PRESSED trigger in OAF. In Framework, you will add a controller java code to handle events like Form Submit button clicks. JDeveloper generates the default code for you. Primarily two functions [should I call methods] will be created processRequest [for UI Rendering Handling] and processFormRequest Think of processRequest as WHEN-NEW-FORM-INSTANCE, though processRequest is very restrictive. Note What is the difference between processRequest and processFormRequest? These two methods are available in the Default Controller class that gets created. processFormRequest This method is commonly used to react/respond to the event that has taken place, for example click of a button. Some examples are if(oapagecontext.getParameter("Cancel") != null) (Do your processing for Cancellation/ Rollback) if(oapagecontext.getParameter("Submit") != null) (Do your validations and commit here) if(oapagecontext.getParameter("Update") != null) (Do your validations and commit here) In the above three examples, you could be calling oapagecontext.forwardImmediately to re-direct the page navigation to some other page if needed. processRequest In this method, usually page rendering related code is written. Effectively, each GUI component is a bean that gets initialised during processRequest. Those who are familiar with D2K forms, something like pre-query may be written in this method. Step 5 In the controller to access the value in field "HelloName" the command is String userContent = pageContext.getParameter("HelloName"); In D2k, we used :block.field. In OAFramework, at submission of page, all the field values get passed into to OAPageContext object. Use getParameter to access the field value To set the value of the field, use OAMessageTextInputBean field HelloName = (OAMessageTextInputBean)webBean.findChildRecursive("HelloName"); fieldHelloName.setText(pageContext,"Setting the default value" ); Note when setting field value in controller: Note 1. Do not set the value in processFormRequest Note 2. If the field comes from View Object, then do not use setText in controller Note 3. For control fields [that are not based on View Objects], you can use setText to assign values in processRequest method Lets take some notes to expand beyond the HelloWorld Project Note 1 In D2K-forms we sort of created a Window, attached to Canvas, and then fields within that Canvas. However in OA Framework, think of Page being fmx/Window, think of Region being a Canvas, and fields being within Regions. This is not a formal/accurate understanding of analogy between D2k and Framework, but is close to being logical. Note 2 In D2k, your Forms fmb file was compiled to fmx. It was fmx file that was deployed on mid-tier. In case of OAF, your OA Page is nothing but a XML file. We call this MDS [meta data]. Whatever name you give to "Page" in OAF, an XML file of the same name gets created. This xml file must then be loaded into database by using XML Importer command. Note 3 Apart from MDS XML file, almost everything else is merely deployed to your mid-tier. Usually this is underneath $JAVA_TOP/oracle/apps/../.. All java files will go underneath java top/oracle/apps/../.. etc. Note 4 When building tutorial, ignore the steps for setting "Attribute Sets". These are not mandatory. Oracle might just have developed their tutorials without including these. Think of these like Visual Attributes of D2K forms Note 5 Controller is where you will write any java code in OA Framework. You can create a Controller per Page or have a different Controller for each of the Regions with the same Page. Note 6 In the method processFormRequest of the Controller, you can access the values of the page by using notation pageContext.getParameter("<fieldname here>"). This method processFormRequest is executed when the OAF Screen/Page is submitted by click of a button. Note 7 Inside the controller, all the Database Related interactions for example interaction with View Objects happen via Application Module. But why so? Because Application Module Manages the transaction state of the Application. OAApplicationModuleImpl oaapplicationmoduleimpl = OAApplicationModuleImpl)oapagecontext.getApplicationModule(oawebbean); OADBTransaction oadbtransaction = OADBTransaction)oaapplicationmoduleimpl.getDBTransaction(); Note 8 In D2K, we have control block or a block based on database view. Similarly, in OA Framework, if the field does not have view Object attached, then it is like a control field. Hence in HelloWorld example, field HelloName is a control field [in D2K terminology]. A view Object can either be based on a view/table, synonym or on a SQL statement. Note 9 I wish to access the fields in multi record block that is based on view Object. Can I do this in Controller? Sure you can. To traverse through those records, do the below · Get the reference to the View Object using (OAViewObject)oapagecontext.getApplicationModule(oawebbean).findViewObject("VO Name Here") · Loop through the records in View Objects using count returned from oaviewobject.getFetchedRowCount() · For each record, fetch the value of the fields within the loop as oracle.jbo.Row row = oaviewobject.getRowAtRangeIndex(loop index here); (String)row.getAttribute("Column name of VO here ");

Read the article

Search Results

Search found 7596 results on 304 pages for 'prepared statement'.

Page 291/304 | < Previous Page | 287 288 289 290 291 292 293 294 295 296 297 298 | Next Page >

- by Karoly Vegh

- by Rajesh Sharma

- by Mark Fincham-Oracle

- by Simon Thorpe

- by Most Valuable Yak (Rob Volk)

- by Argenis

- by Mladen Prajdic

- by Pinal Dave

- by Timothy Klenke

- by Chris George

- by Ralf Westphal

- by PSteele

- by Mat Keep

- by Paul White

- by Maria Colgan

- by Paul White

- by Chris Hoffman

- by Paul White

- by Maria Zakourdaev

- by dutchie

- by PRajkumar

< Previous Page | 287 288 289 290 291 292 293 294 295 296 297 298 | Next Page >