Search Results

Search found 14182 results on 568 pages for 'andrew answer'.

Page 546/568 | < Previous Page | 542 543 544 545 546 547 548 549 550 551 552 553 | Next Page >

SQL Server Split() Function

- by HighAltitudeCoder

Title goes here Ever wanted a dbo.Split() function, but not had the time to debug it completely? Let me guess - you are probably working on a stored procedure with 50 or more parameters; two or three of them are parameters of differing types, while the other 47 or so all of the same type (id1, id2, id3, id4, id5...). Worse, you've found several other similar stored procedures with the ONLY DIFFERENCE being the number of like parameters taped to the end of the parameter list. If this is the situation you find yourself in now, you may be wondering, "why am I working with three different copies of what is basically the same stored procedure, and why am I having to maintain changes in three different places? Can't I have one stored procedure that accomplishes the job of all three? My answer to you: YES! Here is the Split() function I've created. /****************************************************************************** Split.sql ******************************************************************************/ /****************************************************************************** Split a delimited string into sub-components and return them as a table. Parameter 1: Input string which is to be split into parts. Parameter 2: Delimiter which determines the split points in input string. Works with space or spaces as delimiter. Split() is apostrophe-safe. SYNTAX: SELECT * FROM Split('Dvorak,Debussy,Chopin,Holst', ',') SELECT * FROM Split('Denver|Seattle|San Diego|New York', '|') SELECT * FROM Split('Denver is the super-awesomest city of them all.', ' ') ******************************************************************************/ USE AdventureWorks GO IF EXISTS (SELECT * FROM sysobjects WHERE xtype = 'TF' AND name = 'Split' ) BEGIN DROP FUNCTION Split END GO CREATE FUNCTION Split ( @InputString VARCHAR(8000), @Delimiter VARCHAR(50) ) RETURNS @Items TABLE ( Item VARCHAR(8000) ) AS BEGIN IF @Delimiter = ' ' BEGIN SET @Delimiter = ',' SET @InputString = REPLACE(@InputString, ' ', @Delimiter) END IF (@Delimiter IS NULL OR @Delimiter = '') SET @Delimiter = ',' --INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic --INSERT INTO @Items VALUES (@InputString) -- Diagnostic DECLARE @Item VARCHAR(8000) DECLARE @ItemList VARCHAR(8000) DECLARE @DelimIndex INT SET @ItemList = @InputString SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) WHILE (@DelimIndex != 0) BEGIN SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex) INSERT INTO @Items VALUES (@Item) -- Set @ItemList = @ItemList minus one less item SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+1, LEN(@ItemList)-@DelimIndex) SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) END -- End WHILE IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString BEGIN SET @Item = @ItemList INSERT INTO @Items VALUES (@Item) END -- No delimiters were encountered in @InputString, so just return @InputString ELSE INSERT INTO @Items VALUES (@InputString) RETURN END -- End Function GO ---- Set Permissions --GRANT SELECT ON Split TO UserRole1 --GRANT SELECT ON Split TO UserRole2 --GO The syntax is basically as follows: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C AND TABLE2.Id IN (SELECT * FROM Split(@IdList, ',')) @IdList is a parameter passed into the stored procedure, and the comma (',') is the delimiter you have chosen to split the parameter list on. You can also use it like this: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C HAVING COUNT(SELECT * FROM Split(@IdList, ',') Similarly, it can be used in other aggregate functions at run-time: SELECT MIN(SELECT * FROM Split(@IdList, ','), <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C GROUP BY <fields> Now that I've (hopefully effectively) explained the benefits to using this function and implementing it in one or more of your database objects, let me warn you of a caveat that you are likely to encounter. You may have a team member who waits until the right moment to ask you a pointed question: "Doesn't this function just do the same thing as using the IN function? Why didn't you just use that instead? In other words, why bother with this function?" What's happening is, one or more team members has failed to understand the reason for implementing this kind of function in the first place. (Note: this is THE MOST IMPORTANT ASPECT OF THIS POST). Allow me to outline a few pros to implementing this function, so you may effectively parry this question. Touche. 1) Code consolidation. You don't have to maintain what is basically the same code and logic, but with varying numbers of the same parameter in several SQL objects. I'm not going to go into the cons related to using this function, because the afore mentioned team member is probably more than adept at pointing these out. Remember, the real positive contribution is ou are decreasing the liklihood that your team fails to update all (x) duplicate copies of what are basically the same stored procedure, and so on... This is the classic downside to duplicate code. It is a virus, and you should kill it. You might be better off rejecting your team member's question, and responding with your own: "Would you rather maintain the same logic in multiple different stored procedures, and hope that the team doesn't forget to always update all of them at the same time?". In his head, he might be thinking "yes, I would like to maintain several different copies of the same stored procedure", although you probably will not get such a direct response. 2) Added flexibility - you can use the Split function elsewhere, and for splitting your data in different ways. Plus, you can use any kind of delimiter you wish. How can you know today the ways in which you might want to examine your data tomorrow? Segue to my next point. 3) Because the function takes a delimiter parameter, you can split the data in any number of ways. This greatly increases the utility of such a function and enables your team to work with the data in a variety of different ways in the future. You can split on a single char, symbol, word, or group of words. You can split on spaces. (The list goes on... test it out). Finally, you can dynamically define the behavior of a stored procedure (or other SQL object) at run time, through the use of this function. Rather than have several objects that accomplish almost the same thing, why not have only one instead?

Read the article
How do I set up MVP for a Winforms solution?

- by JonWillis

Question moved from Stackoverflow - http://stackoverflow.com/questions/4971048/how-do-i-set-up-mvp-for-a-winforms-solution I have used MVP and MVC in the past, and I prefer MVP as it controls the flow of execution so much better in my opinion. I have created my infrastructure (datastore/repository classes) and use them without issue when hard coding sample data, so now I am moving onto the GUI and preparing my MVP. Section A I have seen MVP using the view as the entry point, that is in the views constructor method it creates the presenter, which in turn creates the model, wiring up events as needed. I have also seen the presenter as the entry point, where a view, model and presenter are created, this presenter is then given a view and model object in its constructor to wire up the events. As in 2, but the model is not passed to the presenter. Instead the model is a static class where methods are called and responses returned directly. Section B In terms of keeping the view and model in sync I have seen. Whenever a value in the view in changed, i.e. TextChanged event in .Net/C#. This fires a DataChangedEvent which is passed through into the model, to keep it in sync at all times. And where the model changes, i.e. a background event it listens to, then the view is updated via the same idea of raising a DataChangedEvent. When a user wants to commit changes a SaveEvent it fires, passing through into the model to make the save. In this case the model mimics the view's data and processes actions. Similar to #b1, however the view does not sync with the model all the time. Instead when the user wants to commit changes, SaveEvent is fired and the presenter grabs the latest details and passes them into the model. in this case the model does not know about the views data until it is required to act upon it, in which case it is passed all the needed details. Section C Displaying of business objects in the view, i.e. a object (MyClass) not primitive data (int, double) The view has property fields for all its data that it will display as domain/business objects. Such as view.Animals exposes a IEnumerable<IAnimal> property, even though the view processes these into Nodes in a TreeView. Then for the selected animal it would expose SelectedAnimal as IAnimal property. The view has no knowledge of domain objects, it exposes property for primitive/framework (.Net/Java) included objects types only. In this instance the presenter will pass an adapter object the domain object, the adapter will then translate a given business object into the controls visible on the view. In this instance the adapter must have access to the actual controls on the view, not just any view so becomes more tightly coupled. Section D Multiple views used to create a single control. i.e. You have a complex view with a simple model like saving objects of different types. You could have a menu system at the side with each click on an item the appropriate controls are shown. You create one huge view, that contains all of the individual controls which are exposed via the views interface. You have several views. You have one view for the menu and a blank panel. This view creates the other views required but does not display them (visible = false), this view also implements the interface for each view it contains (i.e. child views) so it can expose to one presenter. The blank panel is filled with other views (Controls.Add(myview)) and ((myview.visible = true). The events raised in these "child"-views are handled by the parent view which in turn pass the event to the presenter, and visa versa for supplying events back down to child elements. Each view, be it the main parent or smaller child views are each wired into there own presenter and model. You can literately just drop a view control into an existing form and it will have the functionality ready, just needs wiring into a presenter behind the scenes. Section E Should everything have an interface, now based on how the MVP is done in the above examples will affect this answer as they might not be cross-compatible. Everything has an interface, the View, Presenter and Model. Each of these then obviously has a concrete implementation. Even if you only have one concrete view, model and presenter. The View and Model have an interface. This allows the views and models to differ. The presenter creates/is given view and model objects and it just serves to pass messages between them. Only the View has an interface. The Model has static methods and is not created, thus no need for an interface. If you want a different model, the presenter calls a different set of static class methods. Being static the Model has no link to the presenter. Personal thoughts From all the different variations I have presented (most I have probably used in some form) of which I am sure there are more. I prefer A3 as keeping business logic reusable outside just MVP, B2 for less data duplication and less events being fired. C1 for not adding in another class, sure it puts a small amount of non unit testable logic into a view (how a domain object is visualised) but this could be code reviewed, or simply viewed in the application. If the logic was complex I would agree to an adapter class but not in all cases. For section D, i feel D1 creates a view that is too big atleast for a menu example. I have used D2 and D3 before. Problem with D2 is you end up having to write lots of code to route events to and from the presenter to the correct child view, and its not drag/drop compatible, each new control needs more wiring in to support the single presenter. D3 is my prefered choice but adds in yet more classes as presenters and models to deal with the view, even if the view happens to be very simple or has no need to be reused. i think a mixture of D2 and D3 is best based on circumstances. As to section E, I think everything having an interface could be overkill I already do it for domain/business objects and often see no advantage in the "design" by doing so, but it does help in mocking objects in tests. Personally I would see E2 as a classic solution, although have seen E3 used in 2 projects I have worked on previously. Question Am I implementing MVP correctly? Is there a right way of going about it? I've read Martin Fowler's work that has variations, and I remember when I first started doing MVC, I understood the concept, but could not originally work out where is the entry point, everything has its own function but what controls and creates the original set of MVC objects.

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
SQL SERVER – SSMS: Disk Usage Report

- by Pinal Dave

Let us start with humor! I think we the series on various reports, we come to a logical point. We covered all the reports at server level. This means the reports we saw were targeted towards activities that are related to instance level operations. These are mostly like how a doctor diagnoses a patient. At this point I am reminded of a dialog which I read somewhere: Patient: Doc, It hurts when I touch my head. Doc: Ok, go on. What else have you experienced? Patient: It hurts even when I touch my eye, it hurts when I touch my arms, it even hurts when I touch my feet, etc. Doc: Hmmm … Patient: I feel it hurts when I touch anywhere in my body. Doc: Ahh … now I get it. You need a plaster to your finger John. Sometimes the server level gives an indicator to what is happening in the system, but we need to get to the root cause for a specific database. So, this is the first blog in series where we would start discussing about database level reports. To launch database level reports, expand selected server in Object Explorer, expand the Databases folder, and then right-click any database for which we want to look at reports. From the menu, select Reports, then Standard Reports, and then any of database level reports. In this blog, we would talk about four “disk” reports because they are similar: Disk Usage Disk Usage by Top Tables Disk Usage by Table Disk Usage by Partition Disk Usage This report shows multiple information about the database. Let us discuss them one by one. We have divided the output into 5 different sections. Section 1 shows the high level summary of the database. It shows the space used by database files (mdf and ldf). Under the hood, the report uses, various DMVs and DBCC Commands, it is using sys.data_spaces and DBCC SHOWFILESTATS. Section 2 and 3 are pie charts. One for data file allocation and another for the transaction log file. Pie chart for “Data Files Space Usage (%)” shows space consumed data, indexes, allocated to the SQL Server database, and unallocated space which is allocated to the SQL Server database but not yet filled with anything. “Transaction Log Space Usage (%)” used DBCC SQLPERF (LOGSPACE) and shows how much empty space we have in the physical transaction log file. Section 4 shows the data from Default Trace and looks at Event IDs 92, 93, 94, 95 which are for “Data File Auto Grow”, “Log File Auto Grow”, “Data File Auto Shrink” and “Log File Auto Shrink” respectively. Here is an expanded view for that section. If default trace is not enabled, then this section would be replaced by the message “Trace Log is disabled” as highlighted below. Section 5 of the report uses DBCC SHOWFILESTATS to get information. Here is the enhanced version of that section. This shows the physical layout of the file. In case you have In-Memory Objects in the database (from SQL Server 2014), then report would show information about those as well. Here is the screenshot taken for a different database, which has In-Memory table. I have highlighted new things which are only shown for in-memory database. The new sections which are highlighted above are using sys.dm_db_xtp_checkpoint_files, sys.database_files and sys.data_spaces. The new type for in-memory OLTP is ‘FX’ in sys.data_space. The next set of reports is targeted to get information about a table and its storage. These reports can answer questions like: Which is the biggest table in the database? How many rows we have in table? Is there any table which has a lot of reserved space but its unused? Which partition of the table is having more data? Disk Usage by Top Tables This report provides detailed data on the utilization of disk space by top 1000 tables within the Database. The report does not provide data for memory optimized tables. Disk Usage by Table This report is same as earlier report with few difference. First Report shows only 1000 rows First Report does order by values in DMV sys.dm_db_partition_stats whereas second one does it based on name of the table. Both of the reports have interactive sort facility. We can click on any column header and change the sorting order of data. Disk Usage by Partition This report shows the distribution of the data in table based on partition in the table. This is so similar to previous output with the partition details now. Here is the query taken from profiler. SELECT row_number() OVER (ORDER BY a1.used_page_count DESC, a1.index_id) AS row_number , (dense_rank() OVER (ORDER BY a5.name, a2.name))%2 AS l1 , a1.OBJECT_ID , a5.name AS [schema] , a2.name , a1.index_id , a3.name AS index_name , a3.type_desc , a1.partition_number , a1.used_page_count * 8 AS total_used_pages , a1.reserved_page_count * 8 AS total_reserved_pages , a1.row_count FROM sys.dm_db_partition_stats a1 INNER JOIN sys.all_objects a2 ON ( a1.OBJECT_ID = a2.OBJECT_ID) AND a1.OBJECT_ID NOT IN (SELECT OBJECT_ID FROM sys.tables WHERE is_memory_optimized = 1) INNER JOIN sys.schemas a5 ON (a5.schema_id = a2.schema_id) LEFT OUTER JOIN sys.indexes a3 ON ( (a1.OBJECT_ID = a3.OBJECT_ID) AND (a1.index_id = a3.index_id) ) WHERE (SELECT MAX(DISTINCT partition_number) FROM sys.dm_db_partition_stats a4 WHERE (a4.OBJECT_ID = a1.OBJECT_ID)) >= 1 AND a2.TYPE <> N'S' AND a2.TYPE <> N'IT' ORDER BY a5.name ASC, a2.name ASC, a1.index_id, a1.used_page_count DESC, a1.partition_number Using all of the above reports, you should be able to get the usage of database files and also space used by tables. I think this is too much disk information for a single blog and I hope you have used them in the past to get data. Do let me know if you found anything interesting using these reports in your environments. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Reports

Read the article
Rockmelt, the technology adoption model, and Facebook's spare internet

- by Roger Hart

Regardless of how good it is, you'd have to have a heart of stone not to make snide remarks about Rockmelt. After all, on the surface it looks a lot like some people spent two years building a browser instead of just bashing out a Chrome extension over a wet weekend. It probably does some more stuff. I don't know for sure because artificial scarcity is cool, apparently, so the "invitation" is still in the post*. I may in fact never know for sure, because I'm not wild about Facebook sign-in as a prerequisite for anything. From the video, and some initial reviews, my early reaction was: I have a browser, I have a Twitter client; what on earth is this for? The answer, of course, is "not me". Rockmelt is, in a way, quite audacious. Oh, sure, on launch day it's Bay Area bar-chat for the kids with no lenses in their retro specs and trousers that give you deep-vein thrombosis, but it's not really about them. Likewise, Facebook just launched Google Wave, or something. And all the tech snobbery and scorn packed into describing it that way is irrelevant next to what they're doing with their platform. Here's something I drew in MS Paint** because I don't want to get sued: (see: The technology adoption lifecycle) A while ago in the Guardian, John Lanchester dusted off the idiom that "technology is stuff that doesn't work yet". The rest of the article would be quite interesting if it wasn't largely about MySpace, and he's sort of got a point. If you bolt on the sentiment that risk-averse businessmen like things that work, you've got the essence of Crossing the Chasm. Products for the mainstream market don't look much like technology. Think for a second about early (1980s ish) hi-fi systems, with all the knobs and fiddly bits, their ostentatious technophile aesthetic. Then consider their sleeker and less (or at least less conspicuously) functional successors in the 1990s. The theory goes that innovators and early adopters like technology, it's a hobby in itself. The rest of the humans seem to like magic boxes with very few buttons that make stuff happen and never trouble them about why. Personally, I consider Apple's maddening insistence that iTunes is an acceptable way to move files around to be more or less morally unacceptable. Most people couldn't care less. Hence Rockmelt, and hence Facebook's continued growth. Rockmelt looks pointless to me, because I aggregate my social gubbins with Digsby, or use TweetDeck. But my use case is different and so are my enthusiasms. If I want to share photos, I'll use Flickr - but Facebook has photo sharing. If I want a short broadcast message, I'll use Twitter - Facebook has status updates. If I want to sell something with relatively little hassle, there's eBay - or Facebook marketplace. YouTube - check, FB Video. Email - messaging. Calendaring apps, yeah there are loads, or FB Events. What if I want to host a simple web page? Sure, they've got pages. Also Notes for blogging, and more games than I can count. This stuff is right there, where millions and millions of users are already, and for what they need it just works. It's not about me, because I'm not in the big juicy area under the curve. It's what 1990s portal sites could never have dreamed of achieving. Facebook is AOL on speed, crack, and some designer drugs it had specially imported from the future. It's a n00b-friendly gateway to the internet that just happens to serve up all the things you want to do online, right where you are. Oh, and everybody else is there too. The price of having all this and the social graph too is that you have all of this, and the social graph too. But plenty of folks have more incisive things to say than me about the whole privacy shebang, and it's not really what I'm talking about. Facebook is maintaining a vast, and fairly fully-featured training-wheels internet. And it makes up a large proportion of the online experience for a lot of people***. It's the entire web (2.0?) experience for the early and late majority. And sure, no individual bit of it is quite as slick or as fully-realised as something like Flickr (which wows me a bit every time I use it. Those guys are good at the web), but it doesn't have to be. It has to be unobtrusively good enough for the regular humans. It has to not feel like technology. This is what Rockmelt sort of is. You're online, you want something nebulously social, and you don't want to faff about with, say, Twitter clients. Wow! There it is on a really distracting sidebar, right in your browser. No effort! Yeah - fish nor fowl, much? It might work, I guess. There may be a demographic who want their social web experience more simply than tech tinkering, and who aren't just getting it from Facebook (or, for that matter, mobile devices). But I'd be surprised. Rockmelt feels like an attempt to grab a slice of Facebook-style "Look! It's right here, where you already are!", but it's still asking the mature market to install a new browser. Presumably this is where that Facebook sign-in predicate comes in handy, though it'll take some potent awareness marketing to make it fly. Meanwhile, Facebook quietly has the entire rest of the internet as a product management resource, and can continue to give most of the people most of what they want. Something that has not gone un-noticed in its potential to look a little sinister. But heck, they might even make Google Wave popular. *This was true last week when I drafted this post. I got an invite subsequently, hence the screenshot. **MS Paint is no fun any more. It's actually good in Windows 7. Farewell ironically-shonky diagrams. *** It's also behind a single sign-in, lending a veneer of confidence, and partially solving the problem of usernames being crummy unique identifiers. I'll be blogging about that at some point.

Read the article
Installing a DHCP Service On Win2k8 ( Windows Server 2008 )

- by Akshay Deep Lamba

Introduction Dynamic Host Configuration Protocol (DHCP) is a core infrastructure service on any network that provides IP addressing and DNS server information to PC clients and any other device. DHCP is used so that you do not have to statically assign IP addresses to every device on your network and manage the issues that static IP addressing can create. More and more, DHCP is being expanded to fit into new network services like the Windows Health Service and Network Access Protection (NAP). However, before you can use it for more advanced services, you need to first install it and configure the basics. Let’s learn how to do that. Installing Windows Server 2008 DHCP Server Installing Windows Server 2008 DCHP Server is easy. DHCP Server is now a “role” of Windows Server 2008 – not a windows component as it was in the past. To do this, you will need a Windows Server 2008 system already installed and configured with a static IP address. You will need to know your network’s IP address range, the range of IP addresses you will want to hand out to your PC clients, your DNS server IP addresses, and your default gateway. Additionally, you will want to have a plan for all subnets involved, what scopes you will want to define, and what exclusions you will want to create. To start the DHCP installation process, you can click Add Roles from the Initial Configuration Tasks window or from Server Manager à Roles à Add Roles. Figure 1: Adding a new Role in Windows Server 2008 When the Add Roles Wizard comes up, you can click Next on that screen. Next, select that you want to add the DHCP Server Role, and click Next. Figure 2: Selecting the DHCP Server Role If you do not have a static IP address assigned on your server, you will get a warning that you should not install DHCP with a dynamic IP address. At this point, you will begin being prompted for IP network information, scope information, and DNS information. If you only want to install DHCP server with no configured scopes or settings, you can just click Next through these questions and proceed with the installation. On the other hand, you can optionally configure your DHCP Server during this part of the installation. In my case, I chose to take this opportunity to configure some basic IP settings and configure my first DHCP Scope. I was shown my network connection binding and asked to verify it, like this: Figure 3: Network connection binding What the wizard is asking is, “what interface do you want to provide DHCP services on?” I took the default and clicked Next. Next, I entered my Parent Domain, Primary DNS Server, and Alternate DNS Server (as you see below) and clicked Next. Figure 4: Entering domain and DNS information I opted NOT to use WINS on my network and I clicked Next. Then, I was promoted to configure a DHCP scope for the new DHCP Server. I have opted to configure an IP address range of 192.168.1.50-100 to cover the 25+ PC Clients on my local network. To do this, I clicked Add to add a new scope. As you see below, I named the Scope WBC-Local, configured the starting and ending IP addresses of 192.168.1.50-192.168.1.100, subnet mask of 255.255.255.0, default gateway of 192.168.1.1, type of subnet (wired), and activated the scope. Figure 5: Adding a new DHCP Scope Back in the Add Scope screen, I clicked Next to add the new scope (once the DHCP Server is installed). I chose to Disable DHCPv6 stateless mode for this server and clicked Next. Then, I confirmed my DHCP Installation Selections (on the screen below) and clicked Install. Figure 6: Confirm Installation Selections After only a few seconds, the DHCP Server was installed and I saw the window, below: Figure 7: Windows Server 2008 DHCP Server Installation succeeded I clicked Close to close the installer window, then moved on to how to manage my new DHCP Server. How to Manage your new Windows Server 2008 DHCP Server Like the installation, managing Windows Server 2008 DHCP Server is also easy. Back in my Windows Server 2008 Server Manager, under Roles, I clicked on the new DHCP Server entry. Figure 8: DHCP Server management in Server Manager While I cannot manage the DHCP Server scopes and clients from here, what I can do is to manage what events, services, and resources are related to the DHCP Server installation. Thus, this is a good place to go to check the status of the DHCP Server and what events have happened around it. However, to really configure the DHCP Server and see what clients have obtained IP addresses, I need to go to the DHCP Server MMC. To do this, I went to Start à Administrative Tools à DHCP Server, like this: Figure 9: Starting the DHCP Server MMC When expanded out, the MMC offers a lot of features. Here is what it looks like: Figure 10: The Windows Server 2008 DHCP Server MMC The DHCP Server MMC offers IPv4 & IPv6 DHCP Server info including all scopes, pools, leases, reservations, scope options, and server options. If I go into the address pool and the scope options, I can see that the configuration we made when we installed the DHCP Server did, indeed, work. The scope IP address range is there, and so are the DNS Server & default gateway. Figure 11: DHCP Server Address Pool Figure 12: DHCP Server Scope Options So how do we know that this really works if we do not test it? The answer is that we do not. Now, let’s test to make sure it works. How do we test our Windows Server 2008 DHCP Server? To test this, I have a Windows Vista PC Client on the same network segment as the Windows Server 2008 DHCP server. To be safe, I have no other devices on this network segment. I did an IPCONFIG /RELEASE then an IPCONFIG /RENEW and verified that I received an IP address from the new DHCP server, as you can see below: Figure 13: Vista client received IP address from new DHCP Server Also, I went to my Windows 2008 Server and verified that the new Vista client was listed as a client on the DHCP server. This did indeed check out, as you can see below: Figure 14: Win 2008 DHCP Server has the Vista client listed under Address Leases With that, I knew that I had a working configuration and we are done!

Read the article
techniques for an AI for a highly cramped turn-based tactics game

- by Adam M.

I'm trying to write an AI for a tactics game in the vein of Final Fantasy Tactics or Vandal Hearts. I can't change the game rules in any way, only upgrade the AI. I have experience programming AI for classic board games (basically minimax and its variants), but I think the branching factor is too great for the approach to be reasonable here. I'll describe the game and some current AI flaws that I'd like to fix. I'd like to hear ideas for applicable techniques. I'm a decent enough programmer, so I only need the ideas, not an implementation (though that's always appreciated). I'd rather not expend effort chasing (too many) dead ends, so although speculation and brainstorming are good and probably helpful, I'd prefer to hear from somebody with actual experience solving this kind of problem. For those who know it, the game is the land battle mini-game in Sid Meier's Pirates! (2004) and you can skim/skip the next two paragraphs. For those who don't, here's briefly how it works. The battle is turn-based and takes place on a 16x16 grid. There are three terrain types: clear (no hindrance), forest (hinders movement, ranged attacks, and sight), and rock (impassible, but does not hinder attacks or sight). The map is randomly generated with roughly equal amounts of each type of terrain. Because there are many rock and forest tiles, movement is typically very cramped. This is tactically important. The terrain is not flat; higher terrain gives minor bonuses. The terrain is known to both sides. The player is always the attacker and the AI is always the defender, so it's perfectly valid for the AI to set up a defensive position and just wait. The player wins by killing all defenders or by getting a unit to the city gates (a tile on the other side of the map). There are very few units on each side, usually 4-8. Because of this, it's crucial not to take damage without gaining some advantage from it. Units can take multiple actions per turn. All units on one side move before any units on the other side. Order of execution is important, and interleaving of actions between units is often useful. Units have melee and ranged attacks. Melee attacks vary widely in strength; ranged attacks have the same strength but vary in range. The main challenges I face are these: Lots of useful move combinations start with a "useless" move that gains no immediate advantage, or even loses advantage, in order to set up a powerful flank attack in the future. And, since the player units are stronger and have longer range, the AI pretty much always has to take some losses before they can start to gain kills. The AI must be able to look ahead to distinguish between sacrificial actions that provide a future benefit and those that don't. Because the terrain is so cramped, most of the tactics come down to achieving good positioning with multiple units that work together to defend an area. For instance, two defenders can often dominate a narrow pass by positioning themselves so an enemy unit attempting to pass must expose itself to a flank attack. But one defender in the same pass would be useless, and three units can defend a slightly larger pass. Etc. The AI should be able to figure out where the player must go to reach the city gates and how to best position its few units to cover the approaches, shifting, splitting, or combining them appropriately as the player moves. Because flank attacks are extremely deadly (and engineering flank attacks is key to the player strategy), the AI should be competent at moving its units so that they cover each other's flanks unless the sacrifice of a unit would give a substantial benefit. They should also be able to force flank attacks on players, for instance by threatening a unit from two different directions such that responding to one threat exposes the flank to the other. The AI should attack if possible, but sometimes there are no good ways to approach the player's position. In that case, the AI should be able to recognize this and set up a defensive position of its own. But the AI shouldn't be vulnerable to a trivial exploit where the player repeatedly opens and closes a hole in his defense and shoots at the AI as it approaches and retreats. That is, the AI should ideally be able to recognize that the player is capable of establishing a solid defense of an area, even if the defense is not currently in place. (I suppose if a good unit allocation algorithm existed, as needed for the second bullet point, the AI could run it on the player units to see where they could defend.) Because it's important to choose a good order of action and interleave actions between units, it's not as simple as just finding the best move for each unit in turn. All of these can be accomplished with a minimax search in theory, but the search space is too large, so specialized techniques are needed. I thought about techniques such as influence mapping, but I don't see how to use the technique to great effect. I thought about assigning goals to the units. This can help them work together in some limited way, and the problem of "how do I accomplish this goal?" is easier to solve than "how do I win this battle?", but assigning good goals is a hard problem in itself, because it requires knowing whether the goal is achievable and whether it's a good use of resources. So, does anyone have specific ideas for techniques that can help cleverize this AI? Update: I found a related question on Stackoverflow: http://stackoverflow.com/questions/3133273/ai-for-a-final-fantasy-tactics-like-game The selected answer gives a decent approach to choosing between alternative actions, but it doesn't seem to have much ability to look into the future and discern beneficial sacrifices from wasteful ones. It also focuses on a single unit at a time and it's not clear how it could be extended to support cooperation between units in defending or attacking.

Read the article
Thread placement policies on NUMA systems - update

- by Dave

In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded node NUMA node (the node with lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect. On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creating between the processes, so I was relatively sure the effect didn't related to thread creation time, but rather that placement was a function of process membership. I this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different that what we've seen on Solaris 10. So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

Read the article
Try a sample: Using the counter predicate for event sampling

- by extended_events

Extended Events offers a rich filtering mechanism, called predicates, that allows you to reduce the number of events you collect by specifying criteria that will be applied during event collection. (You can find more information about predicates in Using SQL Server 2008 Extended Events (by Jonathan Kehayias)) By evaluating predicates early in the event firing sequence we can reduce the performance impact of collecting events by stopping event collection when the criteria are not met. You can specify predicates on both event fields and on a special object called a predicate source. Predicate sources are similar to action in that they typically are related to some type of global information available from the server. You will find that many of the actions available in Extended Events have equivalent predicate sources, but actions and predicates sources are not the same thing. Applying predicates, whether on a field or predicate source, is very similar to what you are used to in T-SQL in terms of how they work; you pick some field/source and compare it to a value, for example, session_id = 52. There is one predicate source that merits special attention though, not just for its special use, but for how the order of predicate evaluation impacts the behavior you see. I’m referring to the counter predicate source. The counter predicate source gives you a way to sample a subset of events that otherwise meet the criteria of the predicate; for example you could collect every other event, or only every tenth event. Simple CountingThe counter predicate source works by creating an in memory counter that increments every time the predicate statement is evaluated. Here is a simple example with my favorite event, sql_statement_completed, that only collects the second statement that is run. (OK, that’s not much of a sample, but this is for demonstration purposes. Here is the session definition: CREATE EVENT SESSION counter_test ON SERVERADD EVENT sqlserver.sql_statement_completed (ACTION (sqlserver.sql_text) WHERE package0.counter = 2)ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS) You can find general information about the session DDL syntax in BOL and from Pedro’s post Introduction to Extended Events. The important part here is the WHERE statement that defines that I only what the event where package0.count = 2; in other words, only the second instance of the event. Notice that I need to provide the package name along with the predicate source. You don’t need to provide the package name if you’re using event fields, only for predicate sources. Let’s say I run the following test queries: -- Run three statements to test the sessionSELECT 'This is the first statement'GOSELECT 'This is the second statement'GOSELECT 'This is the third statement';GO Once you return the event data from the ring buffer and parse the XML (see my earlier post on reading event data) you should see something like this: event_name sql_text sql_statement_completed SELECT ‘This is the second statement’ You can see that only the second statement from the test was actually collected. (Feel free to try this yourself. Check out what happens if you remove the WHERE statement from your session. Go ahead, I’ll wait.) Percentage Sampling OK, so that wasn’t particularly interesting, but you can probably see that this could be interesting, for example, lets say I need a 25% sample of the statements executed on my server for some type of QA analysis, that might be more interesting than just the second statement. All comparisons of predicates are handled using an object called a predicate comparator; the simple comparisons such as equals, greater than, etc. are mapped to the common mathematical symbols you know and love (eg. = and >), but to do the less common comparisons you will need to use the predicate comparators directly. You would probably look to the MOD operation to do this type sampling; we would too, but we don’t call it MOD, we call it divides_by_uint64. This comparator evaluates whether one number is divisible by another with no remainder. The general syntax for using a predicate comparator is pred_comp(field, value), field is always first and value is always second. So lets take a look at how the session changes to answer our new question of 25% sampling: CREATE EVENT SESSION counter_test_25 ON SERVERADD EVENT sqlserver.sql_statement_completed (ACTION (sqlserver.sql_text) WHERE package0.divides_by_uint64(package0.counter,4))ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS)GO Here I’ve replaced the simple equivalency check with the divides_by_uint64 comparator to check if the counter is evenly divisible by 4, which gives us back every fourth record. I’ll leave it as an exercise for the reader to test this session. Why order matters I indicated at the start of this post that order matters when it comes to the counter predicate – it does. Like most other predicate systems, Extended Events evaluates the predicate statement from left to right; as soon as the predicate statement is proven false we abandon evaluation of the remainder of the statement. The counter predicate source is only incremented when it is evaluated so whether or not the counter is incremented will depend on where it is in the predicate statement and whether a previous criteria made the predicate false or not. Here is a generic example: Pred1: (WHERE statement_1 AND package0.counter = 2)Pred2: (WHERE package0.counter = 2 AND statement_1) Let’s say I cause a number of events as follows and examine what happens to the counter predicate source. Iteration Statement Pred1 Counter Pred2 Counter A Not statement_1 0 1 B statement_1 1 2 C Not statement_1 1 3 D statement_1 2 4 As you can see, in the case of Pred1, statement_1 is evaluated first, when it fails (A & C) predicate evaluation is stopped and the counter is not incremented. With Pred2 the counter is evaluated first, so it is incremented on every iteration of the event and the remaining parts of the predicate are then evaluated. In this example, Pred1 would return an event for D while Pred2 would return an event for B. But wait, there is an interesting side-effect here; consider Pred2 if I had run my statements in the following order: Not statement_1 Not statement_1 statement_1 statement_1 In this case I would never get an event back from the system because the point at which counter=2, the rest of the predicate evaluates as false so the event is not returned. If you’re using the counter target for sampling and you’re not getting the expected events, or any events, check the order of the predicate criteria. As a general rule I’d suggest that the counter criteria should be the last element of your predicate statement since that will assure that your sampling rate will apply to the set of event records defined by the rest of your predicate. Aside: I’m interested in hearing about uses for putting the counter predicate criteria earlier in the predicate statement. If you have one, post it in a comment to share with the class. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
C#: LINQ vs foreach - Round 1.

- by James Michael Hare

So I was reading Peter Kellner's blog entry on Resharper 5.0 and its LINQ refactoring and thought that was very cool. But that raised a point I had always been curious about in my head -- which is a better choice: manual foreach loops or LINQ? The answer is not really clear-cut. There are two sides to any code cost arguments: performance and maintainability. The first of these is obvious and quantifiable. Given any two pieces of code that perform the same function, you can run them side-by-side and see which piece of code performs better. Unfortunately, this is not always a good measure. Well written assembly language outperforms well written C++ code, but you lose a lot in maintainability which creates a big techncial debt load that is hard to offset as the application ages. In contrast, higher level constructs make the code more brief and easier to understand, hence reducing technical cost. Now, obviously in this case we're not talking two separate languages, we're comparing doing something manually in the language versus using a higher-order set of IEnumerable extensions that are in the System.Linq library. Well, before we discuss any further, let's look at some sample code and the numbers. First, let's take a look at the for loop and the LINQ expression. This is just a simple find comparison: // find implemented via LINQ public static bool FindViaLinq(IEnumerable<int> list, int target) { return list.Any(item => item == target); } // find implemented via standard iteration public static bool FindViaIteration(IEnumerable<int> list, int target) { foreach (var i in list) { if (i == target) { return true; } } return false; } Okay, looking at this from a maintainability point of view, the Linq expression is definitely more concise (8 lines down to 1) and is very readable in intention. You don't have to actually analyze the behavior of the loop to determine what it's doing. So let's take a look at performance metrics from 100,000 iterations of these methods on a List<int> of varying sizes filled with random data. For this test, we fill a target array with 100,000 random integers and then run the exact same pseudo-random targets through both searches. List<T> On 100,000 Iterations Method Size Total (ms) Per Iteration (ms) % Slower Any 10 26 0.00046 30.00% Iteration 10 20 0.00023 - Any 100 116 0.00201 18.37% Iteration 100 98 0.00118 - Any 1000 1058 0.01853 16.78% Iteration 1000 906 0.01155 - Any 10,000 10,383 0.18189 17.41% Iteration 10,000 8843 0.11362 - Any 100,000 104,004 1.8297 18.27% Iteration 100,000 87,941 1.13163 - The LINQ expression is running about 17% slower for average size collections and worse for smaller collections. Presumably, this is due to the overhead of the state machine used to track the iterators for the yield returns in the LINQ expressions, which seems about right in a tight loop such as this. So what about other LINQ expressions? After all, Any() is one of the more trivial ones. I decided to try the TakeWhile() algorithm using a Count() to get the position stopped like the sample Pete was using in his blog that Resharper refactored for him into LINQ: // Linq form public static int GetTargetPosition1(IEnumerable<int> list, int target) { return list.TakeWhile(item => item != target).Count(); } // traditionally iterative form public static int GetTargetPosition2(IEnumerable<int> list, int target) { int count = 0; foreach (var i in list) { if(i == target) { break; } ++count; } return count; } Once again, the LINQ expression is much shorter, easier to read, and should be easier to maintain over time, reducing the cost of technical debt. So I ran these through the same test data: List<T> On 100,000 Iterations Method Size Total (ms) Per Iteration (ms) % Slower TakeWhile 10 41 0.00041 128% Iteration 10 18 0.00018 - TakeWhile 100 171 0.00171 88% Iteration 100 91 0.00091 - TakeWhile 1000 1604 0.01604 94% Iteration 1000 825 0.00825 - TakeWhile 10,000 15765 0.15765 92% Iteration 10,000 8204 0.08204 - TakeWhile 100,000 156950 1.5695 92% Iteration 100,000 81635 0.81635 - Wow! I expected some overhead due to the state machines iterators produce, but 90% slower? That seems a little heavy to me. So then I thought, well, what if TakeWhile() is not the right tool for the job? The problem is TakeWhile returns each item for processing using yield return, whereas our for-loop really doesn't care about the item beyond using it as a stop condition to evaluate. So what if that back and forth with the iterator state machine is the problem? Well, we can quickly create an (albeit ugly) lambda that uses the Any() along with a count in a closure (if a LINQ guru knows a better way PLEASE let me know!), after all , this is more consistent with what we're trying to do, we're trying to find the first occurence of an item and halt once we find it, we just happen to be counting on the way. This mostly matches Any(). // a new method that uses linq but evaluates the count in a closure. public static int TakeWhileViaLinq2(IEnumerable<int> list, int target) { int count = 0; list.Any(item => { if(item == target) { return true; } ++count; return false; }); return count; } Now how does this one compare? List<T> On 100,000 Iterations Method Size Total (ms) Per Iteration (ms) % Slower TakeWhile 10 41 0.00041 128% Any w/Closure 10 23 0.00023 28% Iteration 10 18 0.00018 - TakeWhile 100 171 0.00171 88% Any w/Closure 100 116 0.00116 27% Iteration 100 91 0.00091 - TakeWhile 1000 1604 0.01604 94% Any w/Closure 1000 1101 0.01101 33% Iteration 1000 825 0.00825 - TakeWhile 10,000 15765 0.15765 92% Any w/Closure 10,000 10802 0.10802 32% Iteration 10,000 8204 0.08204 - TakeWhile 100,000 156950 1.5695 92% Any w/Closure 100,000 108378 1.08378 33% Iteration 100,000 81635 0.81635 - Much better! It seems that the overhead of TakeAny() returning each item and updating the state in the state machine is drastically reduced by using Any() since Any() iterates forward until it finds the value we're looking for -- for the task we're attempting to do. So the lesson there is, make sure when you use a LINQ expression you're choosing the best expression for the job, because if you're doing more work than you really need, you'll have a slower algorithm. But this is true of any choice of algorithm or collection in general. Even with the Any() with the count in the closure it is still about 30% slower, but let's consider that angle carefully. For a list of 100,000 items, it was the difference between 1.01 ms and 0.82 ms roughly in a List<T>. That's really not that bad at all in the grand scheme of things. Even running at 90% slower with TakeWhile(), for the vast majority of my projects, an extra millisecond to save potential errors in the long term and improve maintainability is a small price to pay. And if your typical list is 1000 items or less we're talking only microseconds worth of difference. It's like they say: 90% of your performance bottlenecks are in 2% of your code, so over-optimizing almost never pays off. So personally, I'll take the LINQ expression wherever I can because they will be easier to read and maintain (thus reducing technical debt) and I can rely on Microsoft's development to have coded and unit tested those algorithm fully for me instead of relying on a developer to code the loop logic correctly. If something's 90% slower, yes, it's worth keeping in mind, but it's really not until you start get magnitudes-of-order slower (10x, 100x, 1000x) that alarm bells should really go off. And if I ever do need that last millisecond of performance? Well then I'll optimize JUST THAT problem spot. To me it's worth it for the readability, speed-to-market, and maintainability.

Read the article
SQL SERVER – Weekly Series – Memory Lane – #052

- by Pinal Dave

Let us continue with the final episode of the Memory Lane Series. Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Set Server Level FILLFACTOR Using T-SQL Script Specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or alteration. fillfactor must be an integer value from 1 to 100. The default is 0. Limitation of Online Index Rebuld Operation Online operation means when online operations are happening in the database are in normal operational condition, the processes which are participating in online operations does not require exclusive access to the database. Get Permissions of My Username / Userlogin on Server / Database A few days ago, I was invited to one of the largest database company. I was asked to review database schema and propose changes to it. There was special username or user logic was created for me, so I can review their database. I was very much interested to know what kind of permissions I was assigned per server level and database level. I did not feel like asking Sr. DBA the question about permissions. Simple Example of WHILE Loop With CONTINUE and BREAK Keywords This question is one of those questions which is very simple and most of the users get it correct, however few users find it confusing for the first time. I have tried to explain the usage of simple WHILE loop in the first example. BREAK keyword will exit the stop the while loop and control is moved to the next statement after the while loop. CONTINUE keyword skips all the statement after its execution and control is sent to the first statement of while loop. Forced Parameterization and Simple Parameterization – T-SQL and SSMS When the PARAMETERIZATION option is set to FORCED, any literal value that appears in a SELECT, INSERT, UPDATE or DELETE statement is converted to a parameter during query compilation. When the PARAMETERIZATION database option is SET to SIMPLE, the SQL Server query optimizer may choose to parameterize the queries. 2008 Transaction and Local Variables – Swap Variables – Update All At Once Concept Summary : Transaction have no effect over memory variables. When UPDATE statement is applied over any table (physical or memory) all the updates are applied at one time together when the statement is committed. First of all I suggest that you read the article listed above about the effect of transaction on local variant. As seen there local variables are independent of any transaction effect. Simulate INNER JOIN using LEFT JOIN statement – Performance Analysis Just a day ago, while I was working with JOINs I find one interesting observation, which has prompted me to create following example. Before we continue further let me make very clear that INNER JOIN should be used where it cannot be used and simulating INNER JOIN using any other JOINs will degrade the performance. If there are scopes to convert any OUTER JOIN to INNER JOIN it should be done with priority. 2009 Introduction to Business Intelligence – Important Terms & Definitions Business intelligence (BI) is a broad category of application programs and technologies for gathering, storing, analyzing, and providing access to data from various data sources, thus providing enterprise users with reliable and timely information and analysis for improved decision making. Difference Between Candidate Keys and Primary Key Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key. Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key. 2010 Taking Multiple Backup of Database in Single Command – Mirrored Database Backup I recently had a very interesting experience. In one of my recent consultancy works, I was told by our client that they are going to take the backup of the database and will also a copy of it at the same time. I expressed that it was surely possible if they were going to use a mirror command. In addition, they told me that whenever they take two copies of the database, the size of the database, is always reduced. Now this was something not clear to me, I said it was not possible and so I asked them to show me the script. Corrupted Backup File and Unsuccessful Restore The CTO, who was also present at the location, got very upset with this situation. He then asked when the last successful restore test was done. As expected, the answer was NEVER.There were no successful restore tests done before. During that time, I was present and I could clearly see the stress, confusion, carelessness and anger around me. I did not appreciate the feeling and I was pretty sure that no one in there wanted the atmosphere like me. 2011 TRACEWRITE – Wait Type – Wait Related to Buffer and Resolution SQL Trace is a SQL Server database engine technology which monitors specific events generated when various actions occur in the database engine. When any event is fired it goes through various stages as well various routes. One of the routes is Trace I/O Provider, which sends data to its final destination either as a file or rowset. DATEDIFF – Accuracy of Various Dateparts If you want to have accuracy in seconds, you need to use a different approach. In the first example, the accurate method is to find the number of seconds first and then divide it by 60 to convert it in minutes. Dedicated Access Control for SQL Server Express Edition http://www.youtube.com/watch?v=1k00z82u4OI Book Signing at SQLPASS 2012 Who I Am And How I Got Here – True Story as Blog Post If there was a shortcut to success – I want to know. I learnt SQL Server hard way and I am still learning. There are so many things, I have to learn. There is not enough time to learn everything which we want to learn. I am constantly working on it every day. I welcome you to join my journey as well. Please join me in my journey to learn SQL Server – more the merrier. Vacation, Travel and Study – A New Concept Even those who have advanced degrees and went to college for years, or even decades, find studying hard. There is a difference between studying for a career and studying for a certification. At least to get a degree there is a variety of subjects, with labs, exams, and practice problems to make things more interesting. Order By Numeric Values Formatted as String We have a table which has a column containing alphanumeric data. The data always has first as an integer and later part as a string. The business need is to order the data based on the first part of the alphanumeric data which is an integer. Now the problem is that no matter how we use ORDER BY the result is not produced as expected. Let us understand this with an example. Resolving SQL Server Connection Errors – SQL in Sixty Seconds #030 – Video One of the most famous errors related to SQL Server is about connecting to SQL Server itself. Here is how it goes, most of the time developers have worked with SQL Server and knows pretty much every error which they face during development language. However, hardly they install fresh SQL Server. As the installation of the SQL Server is a rare occasion unless you are a DBA who is responsible for such an instance – the error faced during installations are pretty rare as well. http://www.youtube.com/watch?v=1k00z82u4OI Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
IntelliSense for Razor Hosting in non-Web Applications

- by Rick Strahl

When I posted my Razor Hosting article a couple of weeks ago I got a number of questions on how to get IntelliSense to work inside of Visual Studio while editing your templates. The answer to this question is mainly dependent on how Visual Studio recognizes assemblies, so a little background is required. If you open a template just on its own as a standalone file by clicking on it say in Explorer, Visual Studio will open up with the template in the editor, but you won’t get any IntelliSense on any of your related assemblies that you might be using by default. It’ll give Intellisense on base System namespace, but not on your imported assembly types. This makes sense: Visual Studio has no idea what the assembly associations for the single file are. There are two options available to you to make IntelliSense work for templates: Add the templates as included files to your non-Web project Add a BIN folder to your template’s folder and add all assemblies required there Including Templates in your Host Project By including templates into your Razor hosting project, Visual Studio will pick up the project’s assembly references and make IntelliSense available for any of the custom types in your project and on your templates. To see this work I moved the \Templates folder from the samples from the Debug\Bin folder into the project root and included the templates in the WinForm sample project. Here’s what this looks like in Visual Studio after the templates have been included: Notice that I take my original example and type cast the Context object to the specific type that it actually represents – namely CustomContext – by using a simple code block: @{ CustomContext Model = Context as CustomContext; } After that assignment my Model local variable is in scope and IntelliSense works as expected. Note that you also will need to add any namespaces with the using command in this case: @using RazorHostingWinForm which has to be defined at the very top of a Razor document. BTW, while you can only pass in a single Context 'parameter’ to the template with the default template I’ve provided realize that you can also assign a complex object to Context. For example you could have a container object that references a variety of other objects which you can then cast to the appropriate types as needed: @{ ContextContainer container = Context as ContextContainer; CustomContext Model = container.Model; CustomDAO DAO = container.DAO; } and so forth. IntelliSense for your Custom Template Notice also that you can get IntelliSense for the top level template by specifying an inherits tag at the top of the document: @inherits RazorHosting.RazorTemplateFolderHost By specifying the above you can then get IntelliSense on your base template’s properties. For example, in my base template there are Request and Response objects. This is very useful especially if you end up creating custom templates that include your custom business objects as you can get effectively see full IntelliSense from the ‘page’ level down. For Html Help Builder for example, I’d have a Help object on the page and assuming I have the references available I can see all the way into that Help object without even having to do anything fancy. Note that the @inherits key is a GREAT and easy way to override the base template you normally specify as the default template. It allows you to create a custom template and as long as it inherits from the base template it’ll work properly. Since the last post I’ve also made some changes in the base template that allow hooking up some simple initialization logic so it gets much more easy to create custom templates and hook up custom objects with an IntializeTemplate() hook function that gets called with the Context and a Configuration object. These objects are objects you can pass in at runtime from your host application and then assign to custom properties on your template. For example the default implementation for RazorTemplateFolderHost does this: public override void InitializeTemplate(object context, object configurationData) { // Pick up configuration data and stuff into Request object RazorFolderHostTemplateConfiguration config = configurationData as RazorFolderHostTemplateConfiguration; this.Request.TemplatePath = config.TemplatePath; this.Request.TemplateRelativePath = config.TemplateRelativePath; // Just use the entire ConfigData as the model, but in theory // configData could contain many objects or values to set on // template properties this.Model = config.ConfigData as TModel; } to set up a strongly typed Model and the Request object. You can do much more complex hookups here of course and create complex base template pages that contain all the objects that you need in your code with strong typing. Adding a Bin folder to your Template’s Root Path Including templates in your host project works if you own the project and you’re the only one modifying the templates. However, if you are distributing the Razor engine as a templating/scripting solution as part of your application or development tool the original project is likely not available and so that approach is not practical. Another option you have is to add a Bin folder and add all the related assemblies into it. You can also add a Web.Config file with assembly references for any GAC’d assembly references that need to be associated with the templates. Between the web.config and bin folder Visual Studio can figure out how to provide IntelliSense. The Bin folder should contain: The RazorHosting.dll Your host project’s EXE or DLL – renamed to .dll if it’s an .exe Any external (bin folder) dependent assemblies Note that you most likely also want a reference to the host project if it contains references that are going to be used in templates. Visual Studio doesn’t recognize an EXE reference so you have to rename the EXE to DLL to make it work. Apparently the binary signature of EXE and DLL files are identical and it just works – learn something new everyday… For GAC assembly references you can add a web.config file to your template root. The Web.config file then should contain any full assembly references to GAC components: <configuration> <system.web> <compilation debug="true"> <assemblies> <add assembly="System.Web.Mvc, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" /> <add assembly="System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" /> <add assembly="System.Web.Extensions, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" /> </assemblies> </compilation> </system.web> </configuration> And with that you should get full IntelliSense. Note that if you add a BIN folder and you also have the templates in your Visual Studio project Visual Studio will complain about reference conflicts as it’s effectively seeing both the project references and the ones in the bin folder. So it’s probably a good idea to use one or the other but not both at the same time :-) Seeing IntelliSense in your Razor templates is a big help for users of your templates. If you’re shipping an application level scripting solution especially it’ll be real useful for your template consumers/users to be able to get some quick help on creating customized templates – after all that’s what templates are all about – easy customization. Making sure that everything is referenced in your bin folder and web.config is a good idea and it’s great to see that Visual Studio (and presumably WebMatrix/Visual Web Developer as well) will be able to pick up your custom IntelliSense in Razor templates.© Rick Strahl, West Wind Technologies, 2005-2011Posted in Razor

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
The Data Scientist

- by BuckWoody

A new term - well, perhaps not that new - has come up and I’m actually very excited about it. The term is Data Scientist, and since it’s new, it’s fairly undefined. I’ll explain what I think it means, and why I’m excited about it. In general, I’ve found the term deals at its most basic with analyzing data. Of course, we all do that, and the term itself in that definition is redundant. There is no science that I know of that does not work with analyzing lots of data. But the term seems to refer to more than the common practices of looking at data visually, putting it in a spreadsheet or report, or even using simple coding to examine data sets. The term Data Scientist (as far as I can make out this early in it’s use) is someone who has a strong understanding of data sources, relevance (statistical and otherwise) and processing methods as well as front-end displays of large sets of complicated data. Some - but not all - Business Intelligence professionals have these skills. In other cases, senior developers, database architects or others fill these needs, but in my experience, many lack the strong mathematical skills needed to make these choices properly. I’ve divided the knowledge base for someone that would wear this title into three large segments. It remains to be seen if a given Data Scientist would be responsible for knowing all these areas or would specialize. There are pretty high requirements on the math side, specifically in graduate-degree level statistics, but in my experience a company will only have a few of these folks, so they are expected to know quite a bit in each of these areas. Persistence The first area is finding, cleaning and storing the data. In some cases, no cleaning is done prior to storage - it’s just identified and the cleansing is done in a later step. This area is where the professional would be able to tell if a particular data set should be stored in a Relational Database Management System (RDBMS), across a set of key/value pair storage (NoSQL) or in a file system like HDFS (part of the Hadoop landscape) or other methods. Or do you examine the stream of data without storing it in another system at all? This is an important decision - it’s a foundation choice that deals not only with a lot of expense of purchasing systems or even using Cloud Computing (PaaS, SaaS or IaaS) to source it, but also the skillsets and other resources needed to care and feed the system for a long time. The Data Scientist sets something into motion that will probably outlast his or her career at a company or organization. Often these choices are made by senior developers, database administrators or architects in a company. But sometimes each of these has a certain bias towards making a decision one way or another. The Data Scientist would examine these choices in light of the data itself, starting perhaps even before the business requirements are created. The business may not even be aware of all the strategic and tactical data sources that they have access to. Processing Once the decision is made to store the data, the next set of decisions are based around how to process the data. An RDBMS scales well to a certain level, and provides a high degree of ACID compliance as well as offering a well-known set-based language to work with this data. In other cases, scale should be spread among multiple nodes (as in the case of Hadoop landscapes or NoSQL offerings) or even across a Cloud provider like Windows Azure Table Storage. In fact, in many cases - most of the ones I’m dealing with lately - the data should be split among multiple types of processing environments. This is a newer idea. Many data professionals simply pick a methodology (RDBMS with Star Schemas, NoSQL, etc.) and put all data there, regardless of its shape, processing needs and so on. A Data Scientist is familiar not only with the various processing methods, but how they work, so that they can choose the right one for a given need. This is a huge time commitment, hence the need for a dedicated title like this one. Presentation This is where the need for a Data Scientist is most often already being filled, sometimes with more or less success. The latest Business Intelligence systems are quite good at allowing you to create amazing graphics - but it’s the data behind the graphics that are the most important component of truly effective displays. This is where the mathematics requirement of the Data Scientist title is the most unforgiving. In fact, someone without a good foundation in statistics is not a good candidate for creating reports. Even a basic level of statistics can be dangerous. Anyone who works in analyzing data will tell you that there are multiple errors possible when data just seems right - and basic statistics bears out that you’re on the right track - that are only solvable when you understanding why the statistical formula works the way it does. And there are lots of ways of presenting data. Sometimes all you need is a “yes” or “no” answer that can only come after heavy analysis work. In that case, a simple e-mail might be all the reporting you need. In others, complex relationships and multiple components require a deep understanding of the various graphical methods of presenting data. Knowing which kind of chart, color, graphic or shape conveys a particular datum best is essential knowledge for the Data Scientist. Why I’m excited I love this area of study. I like math, stats, and computing technologies, but it goes beyond that. I love what data can do - how it can help an organization. I’ve been fortunate enough in my professional career these past two decades to work with lots of folks who perform this role at companies from aerospace to medical firms, from manufacturing to retail. Interestingly, the size of the company really isn’t germane here. I worked with one very small bio-tech (cryogenics) company that worked deeply with analysis of complex interrelated data. So watch this space. No, I’m not leaving Azure or distributed computing or Microsoft. In fact, I think I’m perfectly situated to investigate this role further. We have a huge set of tools, from RDBMS to Hadoop to allow me to explore. And I’m happy to share what I learn along the way.

Read the article
Workshops, online content show how Oracle infuses simplicity, mobility, extensibility into user experience

- by mvaughan

By Kathy Miedema & Misha Vaughan, Oracle Applications User Experience Oracle has made a huge investment into the user experience of its many different software product families, and recent releases showcase big changes and features that aim to promote end user engagement and efficiency by streamlining navigation and simplifying the user interface. But making Oracle’s enterprise software great-looking and usable doesn’t stop when Oracle products go out the door. The Applications User Experience (UX) team recognizes that our customers may need to customize software to fit their work processes. And that’s why we provide tools such as user experience design patterns to help you maintain the Oracle user experience as you tailor your application to fit your business needs. Often, however, customers may need some context around user experience. How has the Oracle user experience been designed and constructed? Why is a good user experience important for users? How does understanding what goes into the user experience benefit the people who purchase the software for users? There’s a short answer to these questions, and you can read about it on Usable Apps. But truly understanding Oracle’s investment and seeing how it applies across product families occasionally requires a deeper dive into the Oracle user experience, especially if you’re an influencer or decision-maker about Oracle products. To help frame these decisions, the Communications & Outreach team has developed several targeted workshops that explore what Oracle means when it talks about user experience, and provides a roadmap into where the Oracle user experience is going. These workshops require non-disclosure agreements, and have been delivered to Oracle sales folks, Oracle partners, Oracle ACE Directors and ACEs, and a few customers. Some of these audience members have been developers or have a technical background; just as many did not. Here’s a breakdown of the kind of training you can get around the Oracle user experience from the OAUX Communications & Outreach team.For Partners: George Papazzian, Principal, Naviscent with Joyce Ohgi, Oracle Oracle Fusion Applications HCM Pre-Sales Seminar: In concert with Worldwide Alliances and Channels under Applications Partner Enablement Director Jonathan Vinoskey’s guidance, the Applications User Experience team delivers a two-day workshop. Day one focuses on Oracle Fusion Applications HCM and pre-sales strategy, and Day two focuses on positioning and leveraging Oracle’s investment in the Oracle Fusion Applications user experience. The next workshops will occur on the following dates: December 4-5, 2013 @ Manchester, UK January 29-30, 2014 @ Reston, Virginia February 2014 @ Guadalajara, Mexico (email: Shannon Whiteman) March 11-12, 2014 @ Dubai, United Arab Emirates April 1-2, 2014 @ Chicago, Illinois Partner Advisory Board: A two-day board meeting in the U.S. and U.K. to discuss four main user experience areas for Oracle Fusion Applications: simplicity, visualization & analytics, mobility, & futures. This event is limited to Oracle Diamond Partners, UX bloggers, and key UX influencers and requires legal documentation. We will be talking about the Oracle applications UX strategy and roadmap. Partner Implementation Training on User Interface: How to Build Great-Looking, Usable Apps: In this two-day, hands-on workshop built around Oracle’s Application Development Framework, learn how to build desktop and mobile user interfaces and mobile user interfaces based on Oracle’s experience with Fusion Applications. This workshop is for partners with a technology background who are looking for ways to tailor Fusion Applications using ADF, or have built their own custom solutions using ADF. It includes an introduction to UX design patterns and provides tools to build usability-tested UX designs. Nov 5-6, 2013 @ Redwood Shores, CA, USA January 28-29th, 2014 @ Reston, Virginia, USA February 25-26, 2014 @ Guadalajara, Mexico March 9-10, 2014 @ Dubai, United Arab Emirates To register, contact [email protected] Simplified UI Customization & Extensibility: Pilot workshop: We will be reviewing the proposed content for communicating the user experience tool kit available with the next release of Oracle Fusion Applications. Our core focus will be on what toolkit components our system implementors and independent software vendors will need to respond to customer demand, whether they are extending Fusion Applications, or building custom applications, that will need to leverage the simplified UI. Dec 11th, 2013 @ Reading, UK For information: contact [email protected] Private lab tour and demos: Interested in seeing what’s going on in the Apps UX Labs? If you are headed to the San Francisco Bay Area, let us know. We can arrange a spin through our usability labs at headquarters. OAUX Expo: This open-house forum gives partners a look at what the UX team is working on, and showcases the next-generation user experiences in a demo environment where attendees can see and touch the applications. UX Direct: Use the same methods that Oracle uses to develop its own user experiences. We help you define your users and their needs, and then provide direction on how to tailor the best user experience you can for them. For CustomersAngela Johnston, Gozel Aamoth, Teena Singh, and Yen Chan, Oracle Lab tours: See demos of soon-to-be-released products, and take a spin on usability research equipment such as our eye-tracker. Watch this video to get an idea of what you’ll see. Get our newsletter: Learn about newly released products and see where you can meet us at user group conferences. Participate in a feedback session: Join a focus group or customer feedback session to get an early look at user experience designs for the next generation of software, and provide your thoughts on how well it will work. Join the OUAB: The Oracle Usability Advisory Board meets several times a year to discuss trends in the workforce and provide direction on user experience designs. UX Direct: Use the same methods that Oracle uses to develop its own user experiences. We help you define your users and their needs, and then provide direction on how to tailor the best user experience you can for them. For Developers (customers, partners, and consultants): Plinio Arbizu, SP Solutions, Richard Bingham, Oracle, Balaji Kamepalli, EiSTechnoogies, Praveen Pillalamarri, EiSTechnologies How to Build Great-Looking, Usable Apps: This workshop is for attendees with a strong technology background who are looking for ways to tailor customer software using ADF. It includes an introduction to UX design patterns and provides tools to build usability-tested UX designs. See above for dates and times. UX design patterns web site: Cut the length of your project down by months. Use these patterns to build out the task flow you need to develop for your users. The patterns have already been usability-tested and represent the best practices that the Oracle UX research team has found in its studies. UX Direct: Use the same methods that Oracle uses to develop its own user experiences. We help you define your users and their needs, and then provide direction on how to tailor the best user experience you can for them. For Oracle Sales Mike Klein, Jeremy Ashley, Brent White, Oracle Contact your local sales person for more information about the Oracle user experience and the training available from the Applications User Experience Communications & Outreach team. See customer-friendly user experience collateral ranging from the new simplified UI in Oracle Fusion Applications Release 7, to E-Business Suite user experience highlights, to Siebel, PeopleSoft, and JD Edwards user experience highlights. Receive access to the same pre-sales and implementation training we provide to partners. For Oracle Sales only: Oracle-only training on the Oracle Fusion Applications UX Innovation Sales Kit.

Read the article
Using Node.js as an accelerator for WCF REST services

- by Elton Stoneman

Node.js is a server-side JavaScript platform "for easily building fast, scalable network applications". It's built on Google's V8 JavaScript engine and uses an (almost) entirely async event-driven processing model, running in a single thread. If you're new to Node and your reaction is "why would I want to run JavaScript on the server side?", this is the headline answer: in 150 lines of JavaScript you can build a Node.js app which works as an accelerator for WCF REST services*. It can double your messages-per-second throughput, halve your CPU workload and use one-fifth of the memory footprint, compared to the WCF services direct. Well, it can if: 1) your WCF services are first-class HTTP citizens, honouring client cache ETag headers in request and response; 2) your services do a reasonable amount of work to build a response; 3) your data is read more often than it's written. In one of my projects I have a set of REST services in WCF which deal with data that only gets updated weekly, but which can be read hundreds of times an hour. The services issue ETags and will return a 304 if the client sends a request with the current ETag, which means in the most common scenario the client uses its local cached copy. But when the weekly update happens, then all the client caches are invalidated and they all need the same new data. Then the service will get hundreds of requests with old ETags, and they go through the full service stack to build the same response for each, taking up threads and processing time. Part of that processing means going off to a database on a separate cloud, which introduces more latency and downtime potential. We can use ASP.NET output caching with WCF to solve the repeated processing problem, but the server will still be thread-bound on incoming requests, and to get the current ETags reliably needs a database call per request. The accelerator solves that by running as a proxy - all client calls come into the proxy, and the proxy routes calls to the underlying REST service. We could use Node as a straight passthrough proxy and expect some benefit, as the server would be less thread-bound, but we would still have one WCF and one database call per proxy call. But add some smart caching logic to the proxy, and share ETags between Node and WCF (so the proxy doesn't even need to call the servcie to get the current ETag), and the underlying service will only be invoked when data has changed, and then only once - all subsequent client requests will be served from the proxy cache. I've built this as a sample up on GitHub: NodeWcfAccelerator on sixeyed.codegallery. Here's how the architecture looks: The code is very simple. The Node proxy runs on port 8010 and all client requests target the proxy. If the client request has an ETag header then the proxy looks up the ETag in the tag cache to see if it is current - the sample uses memcached to share ETags between .NET and Node. If the ETag from the client matches the current server tag, the proxy sends a 304 response with an empty body to the client, telling it to use its own cached version of the data. If the ETag from the client is stale, the proxy looks for a local cached version of the response, checking for a file named after the current ETag. If that file exists, its contents are returned to the client as the body in a 200 response, which includes the current ETag in the header. If the proxy does not have a local cached file for the service response, it calls the service, and writes the WCF response to the local cache file, and to the body of a 200 response for the client. So the WCF service is only troubled if both client and proxy have stale (or no) caches. The only (vaguely) clever bit in the sample is using the ETag cache, so the proxy can serve cached requests without any communication with the underlying service, which it does completely generically, so the proxy has no notion of what it is serving or what the services it proxies are doing. The relative path from the URL is used as the lookup key, so there's no shared key-generation logic between .NET and Node, and when WCF stores a tag it also stores the "read" URL against the ETag so it can be used for a reverse lookup, e.g: Key Value /WcfSampleService/PersonService.svc/rest/fetch/3 "28cd4796-76b8-451b-adfd-75cb50a50fa6" "28cd4796-76b8-451b-adfd-75cb50a50fa6" /WcfSampleService/PersonService.svc/rest/fetch/3 In Node we read the cache using the incoming URL path as the key and we know that "28cd4796-76b8-451b-adfd-75cb50a50fa6" is the current ETag; we look for a local cached response in /caches/28cd4796-76b8-451b-adfd-75cb50a50fa6.body (and the corresponding .header file which contains the original service response headers, so the proxy response is exactly the same as the underlying service). When the data is updated, we need to invalidate the ETag cache – which is why we need the reverse lookup in the cache. In the WCF update service, we don't need to know the URL of the related read service - we fetch the entity from the database, do a reverse lookup on the tag cache using the old ETag to get the read URL, update the new ETag against the URL, store the new reverse lookup and delete the old one. Running Apache Bench against the two endpoints gives the headline performance comparison. Making 1000 requests with concurrency of 100, and not sending any ETag headers in the requests, with the Node proxy I get 102 requests handled per second, average response time of 975 milliseconds with 90% of responses served within 850 milliseconds; going direct to WCF with the same parameters, I get 53 requests handled per second, mean response time of 1853 milliseconds, with 90% of response served within 3260 milliseconds. Informally monitoring server usage during the tests, Node maxed at 20% CPU and 20Mb memory; IIS maxed at 60% CPU and 100Mb memory. Note that the sample WCF service does a database read and sleeps for 250 milliseconds to simulate a moderate processing load, so this is *not* a baseline Node-vs-WCF comparison, but for similar scenarios where the service call is expensive but applicable to numerous clients for a long timespan, the performance boost from the accelerator is considerable. * - actually, the accelerator will work nicely for any HTTP request, where the URL (path + querystring) uniquely identifies a resource. In the sample, there is an assumption that the ETag is a GUID wrapped in double-quotes (e.g. "28cd4796-76b8-451b-adfd-75cb50a50fa6") – which is the default for WCF services. I use that assumption to name the cache files uniquely, but it is a trivial change to adapt to other ETag formats.

Read the article
SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

- by pinaldave

I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
Packaging Swing apps with integrated JavaFX content

- by igor

JavaFX provides a lot of interesting capabilities for developing rich client applications in Java, but what if you are working on an existing Swing application and you want to take advantage of these new features? Maybe you want to use one or two controls like the LineChart or a MediaView. Maybe you want to embed a large Scene Graph as an initial step in porting your application to FX. A hybrid Swing/FX application might just be the answer. Developing a hybrid Swing + JavaFX application is not terribly difficult, but until recently the deployment of hybrid applications has not simple as a "pure" JavaFX application. The existing tools focused on packaging FX Applications, or Swing applications - they did not account for hybrid applications. But with JavaFX 2.2 the tools include support for this hybrid application use case. Solution In JavaFX 2.2 we extended the packaging ant tasks to greatly simplify deploying hybrid applications. You now use the same deployment approach as you would for pure JavaFX applications. Just bundle your main application jar with the fx:jar ant task and then generate html/jnlp files using fx:deploy. The only difference is setting toolkit attribute for the fx:application tag as shown below: <fx:application id="swingFXApp" mainClass="${main.class}" toolkit="swing"/> The value of ${main.class} in the example above is your application class which has a main method. It does not need to extend JavaFX Application class. The resulting package provides support for the same set of execution modes as a package for a JavaFX application, although the packages which are created are not identical to the packages created for a pure FX application. You will see two JNLP files generated in the case of a hybrid application - one for use from Swing applet and another for the webstart launch. Note that these improvements do not alter the set of features available to Swing applications. The packaging tools just make it easier to use the advanced features of JavaFX in your Swing application. The same limits still apply, for example a Swing application can not use JavaFX Preloaders and code changes are necessary to support HTML splash screens. Why should I use the JavaFX ant tasks for packaging my Swing application? While using FX packaging tool for a Swing application may seem like a mismatch at face value, there are some really good reasons to use this approach. The primary justification for our packaging tools is to simplify the creation of your application artifacts, and to reduce manual errors. Plus, no one should have to write JNLP by hand. Some specific benefits include: Your application jar will include a launcher program. This improves your standalone launch by: checking for the JavaFX runtime guiding the user through any necessary installations setting the system proxy for Java The ant tasks will generate JNLP and HTML files for your swing app: avoids learning unnecessary details about JNLP, and eliminates the error-prone hand editing of JNLP files simplifies using advanced features like embedding JNLP and signing jars as BLOBs to improve launch performance.you can also embed the signing certificate details to improve the user's experience allows the use of web page templates to inject the generated code directly into your actual web page instead of being forced to copy/paste the generated code snippets. What about native packing? Absolutely! The very same ant task can generate a native bundle for a Swing application with JavaFX content. Try running one of these sample native bundles for the "SwingInterop" FX example: exe and dmg. I also used another feature on these examples: a click-through license agreement for .exe installers and OS X DMG drag installers. Small Caveat This packaging procedure is optimized around using the JavaFX packaging tools for your entire Swing application. If you are trying to embed JavaFX content into existing project (with an existing build/packing process) then you may need to experiment in order to find the best way to integrate the JavaFX packaging steps into your existing build procedure. As long as you can use ant in your build process this should be a workable approach. It some cases solution could be less than ideal. For example, you need to use fx:jar to package your main jar file in order to produce a double-clickable jar or a native bundle. The jar will be created from scratch, but you may already be creating the main jar file with a custom manifest. This may lead to some redundant steps in your build process. Hopefully the benefits will outweigh the problems. This is an area of ongoing development for the team, and we will continue to refine and improve both the tools and the process. Please share your experiences and suggestions with us. You can comment here on the blog or file issues to JIRA. Sample code Here is the full ant code used to package SwingInterop. You can grab latest JavaFX samples and try it yourself: <target name="-post-jar"> <taskdef resource="com/sun/javafx/tools/ant/antlib.xml" uri="javafx:com.sun.javafx.tools.ant" classpath="${javafx.tools.ant.jar}"/>  <fx:application id="swingFXApp" mainClass="${main.class}" toolkit="swing"/>  <fx:jar destfile="${dist.jar}"> <fileset dir="${build.classes.dir}"/> <fx:application refid="swingFXApp" name="SwingInterop"/> <manifest> <attribute name="Implementation-Vendor" value="${application.vendor}"/> <attribute name="Implementation-Title" value="${application.title}"/> <attribute name="Implementation-Version" value="1.0"/> </manifest> </fx:jar>  <delete file="${build.dir}/test.keystore"/> <genkey alias="TestAlias" storepass="xyz123" keystore="${build.dir}/test.keystore" dname="CN=Samples, OU=JavaFX Dev, O=Oracle, C=US"/> <fx:signjar keystore="${build.dir}/test.keystore" alias="TestAlias" storepass="xyz123"> <fileset file="${dist.jar}"/> </fx:signjar>  <fx:deploy width="960" height="720" includeDT="true" nativeBundles="all" outdir="${basedir}/${dist.dir}" embedJNLP="true" outfile="${application.title}"> <fx:application refId="swingFXApp"/> <fx:resources> <fx:fileset dir="${basedir}/${dist.dir}" includes="SwingInterop.jar"/> </fx:resources> <fx:permissions/> <info title="Sample app: ${application.title}" vendor="${application.vendor}"/> </fx:deploy> </target>

Read the article
Running Solaris 11 as a control domain on a T2000

- by jsavit

There is increased adoption of Oracle Solaris 11, and many customers are deploying it on systems that previously ran Solaris 10. That includes older T1-processor based systems like T1000 and T2000. Even though they are old (from 2005) and don't have the performance of current SPARC servers, they are still functional, stable servers that customers continue to operate. One reason to install Solaris 11 on them is that older machines are attractive for testing OS upgrades before updating current, production systems. Normally this does not present a challenge, because Solaris 11 runs on any T-series or M-series SPARC server. One scenario adds a complication: running Solaris 11 in a control domain on a T1000 or T2000 hosting logical domains. Solaris 11 pre-installed Oracle VM Server for SPARC incompatible with T1 Unlike Solaris 10, Solaris 11 comes with Oracle VM Server for SPARC preinstalled. The ldomsmanager package contains the logical domains manager for Oracle VM Server for SPARC 2.2, which requires a SPARC T2, T2+, T3, or T4 server. It does not work with T1-processor systems, which are only supported by LDoms Manager 1.2 and earlier. The following screenshot shows what happens (bold font) if you try to use Oracle VM Server for SPARC 2.x commands in a Solaris 11 control domain. The commands were issued in a control domain on a T2000 that previously ran Solaris 10. We also display the version of the logical domains manager installed in Solaris 11: root@t2000 psrinfo -vp The physical processor has 4 virtual processors (0-3) UltraSPARC-T1 (chipid 0, clock 1200 MHz) # prtconf|grep T SUNW,Sun-Fire-T200 # ldm -V Failed to connect to logical domain manager: Connection refused # pkg info ldomsmanager Name: system/ldoms/ldomsmanager Summary: Logical Domains Manager Description: LDoms Manager - Virtualization for SPARC T-Series Category: System/Virtualization State: Installed Publisher: solaris Version: 2.2.0.0 Build Release: 5.11 Branch: 0.175.0.8.0.3.0 Packaging Date: May 25, 2012 10:20:48 PM Size: 2.86 MB FMRI: pkg://solaris/system/ldoms/[email protected],5.11-0.175.0.8.0.3.0:20120525T222048Z The 2.2 version of the logical domains manager will have to be removed, and 1.2 installed, in order to use this as a control domain. Preparing to change - create a new boot environment Before doing anything else, lets create a new boot environment: # beadm list BE Active Mountpoint Space Policy Created -- ------ ---------- ----- ------ ------- solaris NR / 2.14G static 2012-09-25 10:32 # beadm create solaris-1 # beadm activate solaris-1 # beadm list BE Active Mountpoint Space Policy Created -- ------ ---------- ----- ------ ------- solaris N / 4.82M static 2012-09-25 10:32 solaris-1 R - 2.14G static 2012-09-29 11:40 # init 0 Normally an init 6 to reboot would have been sufficient, but in the next step I reset the system anyway in order to put the system in factory default mode for a "clean" domain configuration. Preparing to change - reset to factory default There was a leftover domain configuration on the T2000, so I reset it to the factory install state. Since the ldm command is't working yet, it can't be done from the control domain, so I did it by logging onto to the service processor: $ ssh -X admin@t2000-sc Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved. Oracle Advanced Lights Out Manager CMT v1.7.9 Please login: admin Please Enter password: ******** sc> showhost Sun-Fire-T2000 System Firmware 6.7.10 2010/07/14 16:35 Host flash versions: OBP 4.30.4.b 2010/07/09 13:48 Hypervisor 1.7.3.c 2010/07/09 15:14 POST 4.30.4.b 2010/07/09 14:24 sc> bootmode config="factory-default" sc> poweroff Are you sure you want to power off the system [y/n]? y SC Alert: SC Request to Power Off Host. SC Alert: Host system has shut down. sc> poweron SC Alert: Host System has Reset At this point I rebooted into the new Solaris 11 boot environment, and Solaris commands showed it was running on the factory default configuration of a single domain owning all 32 CPUs and 32GB of RAM (that's what it looked like in 2005.) # psrinfo -vp The physical processor has 8 cores and 32 virtual processors (0-31) The core has 4 virtual processors (0-3) The core has 4 virtual processors (4-7) The core has 4 virtual processors (8-11) The core has 4 virtual processors (12-15) The core has 4 virtual processors (16-19) The core has 4 virtual processors (20-23) The core has 4 virtual processors (24-27) The core has 4 virtual processors (28-31) UltraSPARC-T1 (chipid 0, clock 1200 MHz) # prtconf|grep Mem Memory size: 32640 Megabytes Note that the older processor has 4 virtual CPUs per core, while current processors have 8 per core. Remove ldomsmanager 2.2 and install the 1.2 version The Solaris 11 pkg command is now used to remove the 2.2 version that shipped with Solaris 11: # pkg uninstall ldomsmanager Packages to remove: 1 Create boot environment: No Create backup boot environment: No Services to change: 2 PHASE ACTIONS Removal Phase 130/130 PHASE ITEMS Package State Update Phase 1/1 Package Cache Update Phase 1/1 Image State Update Phase 2/2 Finally, LDoms 1.2 installed via its install script, the same way it was done years ago: # unzip LDoms-1_2-Integration-10.zip # cd LDoms-1_2-Integration-10/Install/ # ./install-ldm Welcome to the LDoms installer. You are about to install the Logical Domains Manager package that will enable you to create, destroy and control other domains on your system. Given the capabilities of the LDoms domain manager, you can now change the security configuration of this Solaris instance using the Solaris Security Toolkit. ... ... normal install messages omitted ... The Solaris Security Toolkit applies to Solaris 10, and cannot be used in Solaris 11 (in which several things hardened by the Toolkit are already hardened by default), so answer b in the choice below: You are about to install the Logical Domains Manager package that will enable you to create, destroy and control other domains on your system. Given the capabilities of the LDoms domain manager, you can now change the security configuration of this Solaris instance using the Solaris Security Toolkit. Select a security profile from this list: a) Hardened Solaris configuration for LDoms (recommended) b) Standard Solaris configuration c) Your custom-defined Solaris security configuration profile Enter a, b, or c [a]: b ... other install messages omitted for brevity... After install I ensure that the necessary services are enabled, and verify the version of the installed LDoms Manager: # svcs ldmd STATE STIME FMRI online 22:00:36 svc:/ldoms/ldmd:default # svcs vntsd STATE STIME FMRI disabled Aug_19 svc:/ldoms/vntsd:default # ldm -V Logical Domain Manager (v 1.2-debug) Hypervisor control protocol v 1.3 Using Hypervisor MD v 1.1 System PROM: Hypervisor v. 1.7.3. @(#)Hypervisor 1.7.3.c 2010/07/09 15:14\015 OpenBoot v. 4.30.4. @(#)OBP 4.30.4.b 2010/07/09 13:48 Set up control domain and domain services At this point we have a functioning LDoms 1.2 environment that can be configured in the usual fashion. One difference is that LDoms 1.2 behavior had 'delayed configuration mode (as expected) during initial configuration before rebooting the control domain. Another minor difference with a Solaris 11 control domain is that you define virtual switches using the 'vanity name' of the network interface, rather than the hardware driver name as in Solaris 10. # ldm list ------------------------------------------------------------------------------ Notice: the LDom Manager is running in configuration mode. Configuration and resource information is displayed for the configuration under construction; not the current active configuration. The configuration being constructed will only take effect after it is downloaded to the system controller and the host is reset. ------------------------------------------------------------------------------ NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-c-- SP 32 32640M 3.2% 4d 2h 50m # ldm add-vdiskserver primary-vds0 primary # ldm add-vconscon port-range=5000-5100 primary-vcc0 primary # ldm add-vswitch net-dev=net0 primary-vsw0 primary # ldm set-mau 2 primary # ldm set-vcpu 8 primary # ldm set-memory 4g primary # ldm add-config initial # ldm list-spconfig factory-default initial [current] That's it, really. After reboot, we are ready to install guest domains. Summary - new wine in old bottles This example shows that (new) Solaris 11 can be installed on (old) T2000 servers and used as a control domain. The main activity is to remove the preinstalled Oracle VM Server for 2.2 and install Logical Domains 1.2 - the last version of LDoms to support T1-processor systems. I tested Solaris 10 and Solaris 11 guest domains running on this server and they worked without any surprises. This is a viable way to get further into Solaris 11 adoption, even on older T-series equipment.

Read the article
Establishing WebLogic Server HTTPS Trust of IIS Using a Microsoft Local Certificate Authority

- by user647124

Everyone agrees that self-signed and demo certificates for SSL and HTTPS should never be used in production and preferred not to be used elsewhere. Most self-signed and demo certificates are provided by vendors with the intention that they are used only to integrate within the same environment. In a vendor’s perfect world all application servers in a given enterprise are from the same vendor, which makes this lack of interoperability in a non-production environment an advantage. For us working in the real world, where not only do we not use a single vendor everywhere but have to make do with self-signed certificates for all but production, testing HTTPS between an IIS ASP.NET service provider and a WebLogic J2EE consumer application can be very frustrating to set up. It was for me, especially having found many blogs and discussion threads where various solutions were described but did not quite work and were all mostly similar but just a little bit different. To save both you and my future (who always seems to forget the hardest-won lessons) all of the pain and suffering, I am recording the steps that finally worked here for reference and sanity. How You Know You Need This The first cold clutches of dread that tells you it is going to be a long day is when you attempt to a WSDL published by IIS in WebLogic over HTTPS and you see the following: <Jul 30, 2012 2:51:31 PM EDT> <Warning> <Security> <BEA-090477> <Certificate chain received from myserver.mydomain.com - 10.555.55.123 was not trusted causing SSL handshake failure.> weblogic.wsee.wsdl.WsdlException: Failed to read wsdl file from url due to -- javax.net.ssl.SSLKeyException: [Security:090477]Certificate chain received from myserver02.mydomain.com - 10.555.55.123 was not trusted causing SSL handshake failure. The above is what started a three day sojourn into searching for a solution. Even people who had solved it before would tell me how they did, and then shrug when I demonstrated that the steps did not end in the success they claimed I would experience. Rather than torture you with the details of everything I did that did not work, here is what finally did work. Export the Certificates from IE First, take the offending WSDL URL and paste it into IE (if you have an internal Microsoft CA, you have IE, even if you don’t use it in favor of some other browser). To state the semi-obvious, if you received the error above there is a certificate configured for the IIS host of the service and the SSL port has been configured properly. Otherwise there would be a different error, usually about the site not found or connection failed. Once the WSDL loads, to the right of the address bar there will be a lock icon. Click the lock and then click View Certificates in the resulting dialog (if you do not have a lock icon but do have a Certificate Error message, see http://support.microsoft.com/kb/931850 for steps to install the certificate then you can continue from the point of finding the lock icon). Figure 1: View Certificates in IE Next, select the Details tab in the resulting dialog Figure 2: Use Certificate Details to Export Certificate Click Copy to File, then Next, then select the Base-64 encoded option for the format Figure 3: Select the Base-64 encoded option for the format For the sake of simplicity, I choose to save this to the root of the WebLogic domain. It will work from anywhere, but later you will need to type in the full path rather than just the certificate name if you save it elsewhere. Figure 4: Browse to Save Location Figure 5: Save the Certificate to the Domain Root for Convenience This is the point where I ran into some confusion. Some articles mentioned exporting the entire chain of certificates. This supposedly works for some types of certificates, or if you have a few other tools and the time to learn them. For the SSL experts out there, they already have these tools, know how to use them well, and should not be wasting their time reading this article meant for folks who just want to get things wired up and back to unit testing and development. For the rest of us, the easiest way to make sure things will work is to just export all the links in the chain individually and let WebLogic Server worry about re-assembling them into a chain (which it does quite nicely). While perhaps not the most elegant solution, the multi-step process is easy to repeat and uses only tools that are immediately available and require no learning curve. So… Next, go to Tools then Internet Options then the Content tab and click Certificates. Go to the Trust Root Certificate Authorities tab and find the certificate root for your Microsoft CA cert (look for the Issuer of the certificate you exported earlier). Figure 6: Trusted Root Certification Authorities Tab Export this one the same way as before, with a different name Figure 7: Use a Unique Name for Each Certificate Repeat this once more for the Intermediate Certificate tab. Import the Certificates to the WebLogic Domain Now, open an command prompt, navigate to [WEBLOGIC_DOMAIN_ROOT]\bin and execute setDomainEnv. You should then be in the root of the domain. If not, CD to the domain root. Assuming you saved the certificate in the domain root, execute the following: keytool -importcert -alias [ALIAS-1] -trustcacerts -file [FULL PATH TO .CER 1] -keystore truststore.jks -storepass [PASSWORD] An example with the variables filled in is: keytool -importcert -alias IIS-1 -trustcacerts -file microsftcert.cer -keystore truststore.jks -storepass password After several lines out output you will be prompted with: Trust this certificate? [no]: The correct answer is ‘yes’ (minus the quotes, of course). You’ll you know you were successful if the response is: Certificate was added to keystore If not, check your typing, as that is generally the source of an error at this point. Repeat this for all three of the certificates you exported, changing the [ALIAS-1] and [FULL PATH TO .CER 1] value each time. For example: keytool -importcert -alias IIS-1 -trustcacerts -file microsftcert.cer -keystore truststore.jks -storepass password keytool -importcert -alias IIS-2 -trustcacerts -file microsftcertRoot.cer -keystore truststore.jks -storepass password keytool -importcert -alias IIS-3 -trustcacerts -file microsftcertIntermediate.cer -keystore truststore.jks -storepass password In the above we created a new JKS key store. You can re-use an existing one by changing the name of the JKS file to one you already have and change the password to the one that matches that JKS file. For the DemoTrust.jks that is included with WebLogic the password is DemoTrustKeyStorePassPhrase. An example here would be: keytool -importcert -alias IIS-1 -trustcacerts -file microsoft.cer -keystore DemoTrust.jks -storepass DemoTrustKeyStorePassPhrase keytool -importcert -alias IIS-2 -trustcacerts -file microsoftRoot.cer -keystore DemoTrust.jks -storepass DemoTrustKeyStorePassPhrase keytool -importcert -alias IIS-2 -trustcacerts -file microsoftInter.cer -keystore DemoTrust.jks -storepass DemoTrustKeyStorePassPhrase Whichever keystore you use, you can check your work with: keytool -list -keystore truststore.jks -storepass password Where “truststore.jks” and “password” can be replaced appropriately if necessary. The output will look something like this: Figure 8: Output from keytool -list -keystore Update the WebLogic Keystore Configuration If you used an existing keystore rather than creating a new one, you can restart your WebLogic Server and skip the rest of this section. For those of us who created a new one because that is the instructions we found online… Next, we need to tell WebLogic to use the JKS file (truststore.jks) we just created. Log in to the WebLogic Server Administration Console and navigate to Servers > AdminServer > Configuration > Keystores. Scroll down to “Custom Trust Keystore:” and change the value to “truststore.jks” and the value of “Custom Trust Keystore Passphrase:” and “Confirm Custom Trust Keystore Passphrase:” to the password you used when earlier, then save your changes. You will get a nice message similar to the following: Figure 9: To Be Safe, Restart Anyways The “No restarts are necessary” is somewhat of an exaggeration. If you want to be able to use the keystore you may need restart the server(s). To save myself aggravation, I always do. Your mileage may vary. Conclusion That should get you there. If there are some erroneous steps included for your situation in particular, I will offer up a semi-apology as the process described above does not take long at all and if there is one step that could be dropped from it, is still much faster than trying to figure this out from other sources.

Read the article
You Say You Want a (Customer Experience) Revolution

- by Christie Flanagan

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} rev-o-lu-tion [rev-uh-loo-shuhn] noun 1. a sudden, radical or complete change 2. fundamental change in the way of thinking about or visualizing something; a change of paradigm 3. a changeover in use or preference especially in technology <the computer revolution> Lately, I've been hearing an awful lot about the customer experience revolution. Tonight Oracle will be hosting The Experience Revolution, an evening of exploration and networking with customer experience executives in New York City where Oracle President Mark Hurd will introduce Oracle Customer Experience, a cross-stack suite of customer experience products that includes Oracle WebCenter and a number of other Oracle technologies. Then on Tuesday and Wednesday, the Forrester Customer Experience Forum East also kicks off in New York City where they'll examine how businesses can "reap the full business benefits of the customer experience revolution." So, are we in the midst of a customer experience revolution? As a consumer, I can answer that question with a definitive “yes.” When I bought my very first car, I had a lot of questions. How do I know if I’m paying a fair price? How do I know if this dealer is honest? Why do I have to sit through these good cop, bad cop shenanigans between sales and sales management at the dealership? Why do I feel like I’m doing these people a favor by giving them my business? In the end the whole experience left me feeling deeply unsatisfied. I didn’t feel that I held all that much power over the experience and the only real negotiating trick I had was to walk out, which I did, many times before actually making a purchase. Fast forward to a year ago and I found myself back in the market for a new car. The very first car that I bought had finally kicked the bucket after many years, many repair bills, and much wear and tear. Man, I had loved that car. It was time to move on, but I had a knot in my stomach when I reflected back on my last car purchase experience and dreaded the thought of going through that again. Could that have been the reason why I drove my old car for so long? But as I started the process of researching new cars, I started to feel really confident. I had a wealth of online information that helped me in my search. I went to Edmunds and plugged in some information on my preferences and left with a short list of vehicles. After an afternoon spent test driving the cars my short list, I had determined my favorite – it was a model I didn’t even know about until my research on Edmunds! But I didn’t want to go back to the dealership where I test drove it. They were clearly old school and wanted me to buy the way that they wanted to sell. No thanks! After that I went back online. I figured out exactly what people had paid for this car in my area. I found out what kind of discount others were able to negotiate from an online community forum dedicated to the make and model. I found out how the sales people were being incentivized by the manufacturer that month. I learned which dealers had the best ratings and reviews. This was actually getting exciting. I was feeling really empowered. My next step was to request online quotes from the some of the highest rated dealers but I already knew exactly how much I was going to pay. This was really a test for the dealers. My new mantra was “let he who delivers the best customer experience win.” An inside sales rep from one dealer responded to my quote request within a couple of hours. I told him I had already decided on the make and model and it was just a matter of figuring out who I would buy it from. I also told them that I was really busy and wouldn’t set foot in the dealership unless we had come to terms beforehand. Lastly, I let him know that I’d prefer to work out the details via email. He promised to get back to me shortly with a detailed quote. Over the next few days I received calls from other dealers. One asked me a host of questions that I had already answered in their lengthy online form. Another blamed their website performance issues for their delay in responding to my request. But by then it didn’t really matter because I’d already bought the car days before from the dealer who responded to me first and who was willing to adjust their sales process to accommodate my buying one. So, yes, I really do believe we are in the midst of a customer experience revolution. And every revolution leaves some victorious and other vanquished. Which side do you want to be on when it comes to the customer experience revolution?

Read the article
Reg Gets a Job at Red Gate (and what happens behind the scenes)

- by red(at)work

Mr Reg Gater works at one of Cambridge’s many high-tech companies. He doesn’t love his job, but he puts up with it because... well, it could be worse. Every day he drives to work around the Red Gate roundabout, wondering what his boss is going to blame him for today, and wondering if there could be a better job out there for him. By late morning he already feels like handing his notice in. He got the hacky look from his boss for being 5 minutes late, and then they ran out of tea. Again. He goes to the local sandwich shop for lunch, and picks up a Red Gate job menu and a Book of Red Gate while he’s waiting for his order. That night, he goes along to Cambridge Geek Nights and sees some very enthusiastic Red Gaters talking about the work they do; it sounds interesting and, of all things, fun. He takes a quick look at the job vacancies on the Red Gate website, and an hour later realises he’s still there – looking at videos, photos and people profiles. He especially likes the Red Gate’s Got Talent page, and is very impressed with Simon Johnson’s marathon time. He thinks that he’d quite like to work with such awesome people. It just so happens that Red Gate recently decided that they wanted to hire another hot shot team member. Behind the scenes, the wheels were set in motion: the recruitment team met with the hiring manager to understand exactly what they’re looking for, and to decide what interview tests to do, who will do the interviews, and to kick-start any interview training those people might need. Next up, a job description and job advert were written, and the job was put on the market. Reg applies, and his CV lands in the Recruitment team’s inbox and they open it up with eager anticipation that Reg could be the next awesome new starter. He looks good, and in a jiffy they’ve arranged an interview. Reg arrives for his interview, and is greeted by a smiley receptionist. She offers him a selection of drinks and he feels instantly relaxed. A couple of interviews and an assessment later, he gets a job offer. We make his day and he makes ours by accepting, and becoming one of the 60 new starters so far this year. Behind the scenes, things start moving all over again. The HR team arranges for a “Welcome” goodie box to be whisked out to him, prepares his contract, sends an email to Information Services (Or IS for short - we’ll come back to them), keeps in touch with Reg to make sure he knows what to expect on his first day, and of course asks him to fill in the all-important wiki questionnaire so his new colleagues can start to get to know him before he even joins. Meanwhile, the IS team see an email in SupportWorks from HR. They see that Reg will be starting in the sales team in a few days’ time, and they know exactly what to do. They pull out a new machine, and within minutes have used their automated deployment software to install every piece of software that a new recruit could ever need. They also check with Reg’s new manager to see if he has any special requirements that they could help with. Reg starts and is amazed to find a fully configured machine sitting on his desk, complete with stationery and all the other tools he’ll need to do his job. He feels even more cared for after he gets a workstation assessment, and realises he’d be comfier with an ergonomic keyboard and a footstool. They arrive minutes later, just like that. His manager starts him off on his induction and sales training. Along with job-specific training, he’ll also have a buddy to help him find his feet, and loads of pre-arranged demos and introductions. Reg settles in nicely, and is great at his job. He enjoys the canteen, and regularly eats one of the 40,000 meals provided each year. He gets used to the selection of teas that are available, develops a taste for champagne launch parties, and has his fair share of the 25,000 cups of coffee downed at Red Gate towers each year. He goes along to some Feel Good Fund events, and donates a little something to charity in exchange for a turn on the chocolate fountain. He’s looking a little scruffy, so he decides to get his hair cut in between meetings, just in time for the Red Gate birthday company photo. Reg starts a new project: identifying existing customers to up-sell to new bundles. He talks with the web team to generate lists of qualifying customers who haven’t recently been sent marketing emails, and sends emails out, using a new in-house developed tool to schedule follow-up calls in CRM for the same group. The customer responds, saying they’d like to upgrade but are having a licensing problem – Reg sends the issue to Support, and it gets routed to the web team. The team identifies a workaround, and the bug gets scheduled into the next maintenance release in a fortnight’s time (hey; they got lucky). With all the new stuff Reg is working on, he realises that he’d be way more efficient if he had a third monitor. He speaks to IS and they get him one - no argument. He also needs a test machine and then some extra memory. Done. He then thinks he needs an iPad, and goes to ask for one. He gets told to stop pushing his luck. Some time later, Reg’s wife has a baby, so Reg gets 2 weeks of paid paternity leave and a bunch of flowers sent to his house. He signs up to the childcare scheme so that he doesn’t have to pay National Insurance on the first £243 of his childcare. The accounts team makes it all happen seamlessly, as they did with his Give As You Earn payments, which come out of his wages and go straight to his favorite charity. Reg’s sales career is going well. He’s grateful for the help that he gets from the product support team. How do they answer all those 900-ish support calls so effortlessly each month? He’s impressed with the patches that are sent out to customers who find “interesting behavior” in their tools, and to the customers who just must have that new feature. A little later in his career at Red Gate, Reg decides that he’d like to learn about management. He goes on some management training specially customised for Red Gate, joins the Management Book Club, and gets together with other new managers to brainstorm how to get the most out of one to one meetings with his team. Reg decides to go for a game of Foosball to celebrate his good fortune with his team, and has to wait for Finance to finish. While he’s waiting, he reflects on the wonderful time he’s had at Red Gate. He can’t put his finger on what it is exactly, but he knows he’s on to a good thing. All of the stuff that happened to Reg didn’t just happen magically. We’ve got teams of people working relentlessly behind the scenes to make sure that everyone here is comfortable, safe, well fed and caffeinated to the max.

Read the article
Five Ways Enterprise 2.0 Can Transform Your Business - Q&A from the Webcast

- by [email protected]

A few weeks ago, Vince Casarez and I presented with KMWorld on the Five Ways Enterprise 2.0 Can Transform Your Business. It was an enjoyable, interactive webcast in which Vince and I discussed the ways Enterprise 2.0 can transform your business and more importantly, highlighted key customer examples of how to do so. If you missed the webcast, you can catch a replay here. We had a lot of audience participation in some of the polls we conducted and in the Q&A session. We weren't able to address all of the questions during the broadcast, so we attempted to answer them here: Q: Which area within your firm focuses on Web 2.0? Meaning, do you find new departments developing just to manage the web 2.0 (Twitter, Facebook, etc.) user experience or are you structuring current departments? A: There are three distinct efforts within Oracle. The first is around delivery of these Web 2.0 services for enterprise deployments. This is the focus of the WebCenter team. The second effort is injecting these Web 2.0 services into use cases that drive the different enterprise applications. This effort is focused on how to manage these external services and bring them into a cohesive flow for marketing programs, customer care, and purchasing. The third effort is how we consume these services internally to enhance Oracle's business delivery. It leverages the technologies and use cases of the first two but also pushes the envelope with regards to future directions of these other two areas. Q: In a business, Web 2.0 is mostly like action logs. How can we leverage the official process practice versus the logs of a recent action? Example: a system configuration modified last night on a call out versus the official practice that everybody would use in the morning.A: The key thing to remember is that most Web 2.0 actions / activity streams today are based on collaboration and communication type actions. At least with public social sites like Facebook and Twitter. What we're delivering as part of the WebCenter Suite are not just these types of activities but also enterprise application activities. These enterprise application activities come from different application modules: purchasing, HR, order entry, sales opportunity, etc. The actions within these systems are normally tied to a business object or process: purchase order/customer, employee or department, customer and supplier, customer and product, respectively. Therefore, the activities or "logs" as you name them are able to be "typed" so that as a viewer, you can filter or decide to see only certain types of information. In your example, you could have a view that only showed you recent "configuration" changes and this could be right next to a view that showed off the items to be watched every morning. Q: It's great to hear about customers using the software but is there any plan for future webinars to show what the products/installs look like? That would be very helpful.A: We don't have a webinar planned to show off the install process. However, we have a viewlet that's posted on Oracle Technology Network. You can see it here:http://www.oracle.com/technetwork/testcontent/wcs-install-098014.htmlAnd we've got excellent documentation that walks you through the steps here:http://download.oracle.com/docs/cd/E14571_01/install.1111/e12001/install.htmAnd there's a whole set of demos and examples of what WebCenter can do at this URL:http://www.oracle.com/technetwork/middleware/webcenter/release11-demos-097468.html Q: How do you anticipate managing metadata across the enterprise to make content findable?A: We need to first make sure we are all talking about the same thing when we use a word like "metadata". Here's why... For a developer, metadata means information that describes key elements of the portal or application and what the portal or application can do. For content systems, metadata means key terms that provide a taxonomy or folksonomy about the information that is being indexed, ordered, and managed. For business intelligence systems, metadata means key terms that provide labels to groups of data that most non-mathematicians need to understand. And for SOA, metadata means labels for parts of the processes that business owners should understand that connect development terminology. There are also additional requirements for metadata to be available to the team building these new solutions as well as requirements to make this metadata available to the running system. These requirements are often separated by "design time" and "run time" respectively. So clearly, a general goal of managing metadata across the enterprise is very challenging. We've invested a huge amount of resources around Oracle Metadata Services (MDS) to be able to provide a more generic system for all of these elements. No other vendor has anything like this technology foundation in their products. This provides a huge benefit to our customers as they will now be able to find content, processes, people, and information from a common set of search interfaces with consistent enterprise wide results. Q: Can you give your definition of terms as to document and content, please?A: Content applies to a broad category of information from Word documents, presentations and reports through attachments to invoices and/or purchase orders. Content is essentially any type of digital asset including images, video, and voice. A document is just one type of content. Q: Do you have special integration tools to realize an interaction between UCM and WebCenter Spaces/Services?A: Yes, we've dedicated a whole team of engineers to exploit the key features of Oracle UCM within WebCenter. While ensuring that WebCenter can connect to other non-Oracle systems, we've made sure that with the combined set of Oracle technology, no other solution can match the combined power and integration. This is part of the Oracle Fusion Middleware strategy which is to provide best in class capabilities for Content and Portals. When combined together, the synergy between the two products enables users to quickly add capabilities when they are needed. For example, simple document sharing is part of the combined product offering, but if legal discovery or archiving is required, Oracle UCM product includes these capabilities that can be quickly added. There's no need to move content around or add another system to support this, it's just a feature that gets turned on within Oracle UCM. Q: All customers have some interaction with their applications and have many older versions, how do you see some of these new Enterprise 2.0 capabilities adding value to existing enterprise application deployments?A: Just as Service Oriented Architectures allowed for connecting the processes of different applications systems to work together, there's a need for a similar approach with regards to these enterprise 2.0 capabilities. Oracle WebCenter is built on a core architecture that allows for SOA of these Enterprise 2.0 services so that one set of scalable services can be used and integrated directly into any type of application. In this way, users can get immediate value out of the Enterprise 2.0 capabilities without having to wait for the next major release or upgrade. These centrally managed WebCenter services expose a set of standard interfaces that make it extremely easy to add them into existing applications no matter what technology the application has been implemented. Q: We've heard about Oracle Next Generation applications called "Fusion Applications", can you tell me how all this works together?A: Oracle WebCenter powers the core collaboration and social computing services found within Fusion Applications. It is the core user experience technology for how all the application screens have been implemented. And the core concept of task flows allows for all the Fusion Applications modules to be adaptable and composable by business users and IT without needing to be a professional developer. Oracle WebCenter is at the heart of the new Fusion Applications. In addition, the same patterns and technologies are now being added to the existing applications including JD Edwards, Siebel, Peoplesoft, and eBusiness Suite. The core technology enables all these customers to have a much smoother upgrade path to Fusion Applications. They get immediate benefits of injecting new user interactions into their existing applications without having to completely move to Fusion Applications. And then when the time comes, their users will already be well versed in how the new capabilities work. Q: Does any of this work with non Oracle software? Other databases? Other application servers? etc.A: We have made sure that Oracle WebCenter delivers the broadest set of development choices so that no matter what technology you developers are using, WebCenter capabilities can be quickly and easily added to the site or application. In addition, we have certified Oracle WebCenter to run against non-Oracle databases like DB2 and SQLServer. We have stated plans for certification against MySQL as well. Later in CY 2011, Oracle will provide certification on non-Oracle application servers such as WebSphere and JBoss. Q: How do we balance User and IT requirements in regards to Enterprise 2.0 technologies?A: Wrong decisions are often made because employee knowledge is not tapped efficiently and opportunities to innovate are often missed because the right people do not work together. Collaboration amongst workers in the right business context is critical for success. While standalone Enterprise 2.0 technologies can improve collaboration for collaboration's sake, using social collaboration tools in the context of business applications and processes will improve business responsiveness and lead companies to a more competitive position. As these systems become more mission critical it is essential that they maintain the highest level of performance and availability while scaling to support larger communities. Q: What are the ways in which Enterprise 2.0 can improve business responsiveness?A: With a wide range of Enterprise 2.0 tools in the marketplace, CIOs need to deploy solutions that will meet the requirements from users as well as address the requirements from IT. Workers want a next-generation user experience that is personalized and aggregates their daily tools and tasks, while IT needs to ensure the solution is secure, scalable, flexible, reliable and easily integrated with existing systems. An open and integrated approach to deploying portals, content management, and collaboration can enhance your business by addressing both the needs of knowledge workers for better information and the IT mandate to conserve resources by simplifying, consolidating and centralizing infrastructure and administration.

Read the article
Recap: Oracle Fusion Middleware Strategies Driving Business Innovation

- by Harish Gaur

Hasan Rizvi, Executive Vice President of Oracle Fusion Middleware & Java took the stage on Tuesday to discuss how Oracle Fusion Middleware helps enable business innovation. Through a series of product demos and customer showcases, Hassan demonstrated how Oracle Fusion Middleware is a complete platform to harness the latest technological innovations (cloud, mobile, social and Fast Data) throughout the application lifecycle. Fig 1: Oracle Fusion Middleware is the foundation of business innovation This Session included 4 demonstrations to illustrate these strategies: 1. Build and deploy native mobile applications using Oracle ADF Mobile 2. Empower business user to model processes, design user interface and have rich mobile experience for process interaction using Oracle BPM Suite PS6. 3. Create collaborative user experience and integrate social sign-on using Oracle WebCenter Portal, Oracle WebCenter Content, Oracle Social Network & Oracle Identity Management 11g R2 4. Deploy and manage business applications on Oracle Exalogic Nike, LA Department of Water & Power and Nintendo joined Hasan on stage to share how their organizations are leveraging Oracle Fusion Middleware to enable business innovation. Managing Performance in the Wrld of Social and Mobile How do you provide predictable scalability and performance for an application that monitors active lifestyle of 8 million users on a daily basis? Nike’s answer is Oracle Coherence, a component of Oracle Fusion Middleware and Oracle Exadata. Fig 2: Oracle Coherence enabled data grid improves performance of Nike+ Digital Sports Platform Nicole Otto, Sr. Director of Consumer Digital Technology discussed the vision of the Nike+ platform, a platform which represents a shift for NIKE from a "product" to a "product +" experience. There are currently nearly 8 million users in the Nike+ system who are using digitally-enabled Nike+ devices. Once data from the Nike+ device is transmitted to Nike+ application, users access the Nike+ website or via the Nike mobile applicatoin, seeing metrics around their daily active lifestyle and even engage in socially compelling experiences to compare, compete or collaborate their data with their friends. Nike expects the number of users to grow significantly this year which will drive an explosion of data and potential new experiences. To deal with this challenge, Nike envisioned building a shared platform that would drive a consumer-centric model for the company. Nike built this new platform using Oracle Coherence and Oracle Exadata. Using Coherence, Nike built a data grid tier as a distributed cache, thereby provide low-latency access to most recent and relevant data to consumers. Nicole discussed how Nike+ Digital Sports Platform is unique in the way that it utilizes the Coherence Grid. Nike takes advantage of Coherence as a traditional cache using both cache-aside and cache-through patterns. This new tier has enabled Nike to create a horizontally scalable distributed event-driven processing architecture. Current data grid volume is approximately 150,000 request per minute with about 40 million objects at any given time on the grid. Improving Customer Experience Across Multiple Channels Customer experience is on top of every CIO's mind. Customer Experience needs to be consistent and secure across multiple devices consumers may use. This is the challenge Matt Lampe, CIO of Los Angeles Department of Water & Power (LADWP) was faced with. Despite being the largest utilities company in the country, LADWP had been relying on a 38 year old customer information system for serving its customers. Their prior system had been unable to keep up with growing customer demands. Last year, LADWP embarked on a journey to improve customer experience for 1.6million LA DWP customers using Oracle WebCenter platform. Figure 3: Multi channel & Multi lingual LADWP.com built using Oracle WebCenter & Oracle Identity Management platform Matt shed light on his efforts to drive customer self-service across 3 dimensions – new website, new IVR platform and new bill payment service. LADWP has built a new portal to increase customer self-service while reducing the transactions via IVR. LADWP's website is powered Oracle WebCenter Portal and is accessible by desktop and mobile devices. By leveraging Oracle WebCenter, LADWP eliminated the need to build, format, and maintain individual mobile applications or websites for different devices. Their entire content is managed using Oracle WebCenter Content and secured using Oracle Identity Management. This new portal automated their paper based processes to web based workflows for customers. This includes automation of Self Service implemented through My Account - like Bill Pay, Payment History, Bill History and Usage Analysis. LADWP's solution went live in April 2012. Matt indicated that LADWP's Self-Service Portal has greatly improved customer satisfaction. In a JD Power Associates website satisfaction survey, results indicate rankings have climbed by 25+ points, marking a remarkable increase in user experience. Bolstering Performance and Simplifying Manageability of Business Applications Ingvar Petursson, Senior Vice Preisdent of IT at Nintendo America joined Hasan on-stage to discuss their choice of Exalogic. Nintendo had significant new requirements coming their way for business systems, both internal and external, in the years to come, especially with new products like the WiiU on the horizon this holiday season. Nintendo needed a platform that could give them performance, availability and ease of management as they deploy business systems. Ingvar selected Engineered Systems for two reasons: 1. High performance 2. Ease of management Figure 4: Nintendo relies on Oracle Exalogic to run ATG eCommerce, Oracle e-Business Suite and several business applications Nintendo made a decision to run their business applications (ATG eCommerce, E-Business Suite) and several Fusion Middleware components on the Exalogic platform. What impressed Ingvar was the "stress” testing results during evaluation. Oracle Exalogic could handle their 3-year load estimates for many functions, which was better than Nintendo expected without any hardware expansion. Faster Processing of Big Data Middleware plays an increasingly important role in Big Data. Last year, we announced at OpenWorld the introduction of Oracle Data Integrator for Hadoop and Oracle Loader for Hadoop which helps in the ability to move, transform, load data to and from Big Data Appliance to Exadata. This year, we’ve added new capabilities to find, filter, and focus data using Oracle Event Processing. This product can natively integrate with Big Data Appliance or runs standalone. Hasan briefly discussed how NTT Docomo, largest mobile operator in Japan, leverages Oracle Event Processing & Oracle Coherence to process mobile data (from 13 million smartphone users) at a speed of 700K events per second before feeding it Hadoop for distributed processing of big data. Figure 5: Mobile traffic data processing at NTT Docomo with Oracle Event Processing & Oracle Coherence

Read the article

Search Results

Search found 14182 results on 568 pages for 'andrew answer'.

Page 546/568 | < Previous Page | 542 543 544 545 546 547 548 549 550 551 552 553 | Next Page >

- by HighAltitudeCoder

- by JonWillis

- by Paul White

- by Pinal Dave

- by Roger Hart

- by Akshay Deep Lamba

- by Adam M.

- by Dave

- by extended_events

- by James Michael Hare

- by Pinal Dave

- by Rick Strahl

- by Rob Farley

- by Rob Farley

- by BuckWoody

- by mvaughan

- by Elton Stoneman

- by pinaldave

- by igor

- by jsavit

- by user647124

- by Christie Flanagan

- by red(at)work

- by [email protected]

- by Harish Gaur

< Previous Page | 542 543 544 545 546 547 548 549 550 551 552 553 | Next Page >