Search Results

Search found 279 results on 12 pages for 'predict'.

Page 11/12 | < Previous Page | 7 8 9 10 11 12 | Next Page >

Cloud MBaaS : The Next Big Thing in Enterprise Mobility

- by shiju

In this blog post, I will take a look at Cloud Mobile Backend as a Service (MBaaS) and how we can leverage Cloud based Mobile Backend as a Service for building enterprise mobile apps. Today, mobile apps are incredibly significant in both consumer and enterprise space and the demand for the mobile apps is unbelievably increasing in day to day business. An enterprise can’t survive in business without a proper mobility strategy. A better mobility strategy and faster delivery of your mobile apps will give you an extra mileage for your business and IT strategy. So organizations and mobile developers are looking for different strategy for meeting this demand and adopting different development strategy for their mobile apps. Some developers are adopting hybrid mobile app development platforms, for delivering their products for multiple platforms, for fast time-to-market. Others are adopting a Mobile enterprise application platform (MEAP) such as Kony for their enterprise mobile apps for fast time-to-market and better business integration. The Challenges of Enterprise Mobility The real challenge of enterprise mobile apps, is not about creating the front-end environment or developing front-end for multiple platforms. The most important thing of enterprise mobile apps is to expose your enterprise data to mobile devices where the real pain is your business data might be residing in lot of different systems including legacy systems, ERP systems etc., and these systems will be deployed with lot of security restrictions. Exposing your data from the on-premises servers, is not a easy thing for most of the business organizations. Many organizations are spending too much time for their front-end development strategy, but they are really lacking for building a strategy on their back-end for exposing the business data to mobile apps. So building a REST services layer and mobile back-end services, on the top of legacy systems and existing middleware systems, is the key part of most of the enterprise mobile apps, where multiple mobile platforms can easily consume these REST services and other mobile back-end services for building mobile apps. For some mobile apps, we can’t predict its user base, especially for products where customers can gradually increase at any time. And for today’s mobile apps, faster time-to-market is very critical so that spending too much time for mobile app’s scalability, will not be worth. The real power of Cloud is the agility and on-demand scalability, where we can scale-up and scale-down our applications very easily. It would be great if we could use the power of Cloud to mobile apps. So using Cloud for mobile apps is a natural fit, where we can use Cloud as the storage for mobile apps and hosting mechanism for mobile back-end services, where we can enjoy the full power of Cloud with greater level of on-demand scalability and operational agility. So Cloud based Mobile Backend as a Service is great choice for building enterprise mobile apps, where enterprises can enjoy the massive scalability power of their mobile apps, provided by public cloud vendors such as Microsoft Windows Azure. Mobile Backend as a Service (MBaaS) We have discussed the key challenges of enterprise mobile apps and how we can leverage Cloud for hosting mobile backend services. MBaaS is a set of cloud-based, server-side mobile services for multiple mobile platforms and HTML5 platform, which can be used as a backend for your mobile apps with the scalability power of Cloud. The information below provides the key features of a typical MBaaS platform: Cloud based storage for your application data. Automatic REST API services on the application data, for CRUD operations. Native push notification services with massive scalability power. User management services for authenticate users. User authentication via Social accounts such as Facebook, Google, Microsoft, and Twitter. Scheduler services for periodically sending data to mobile devices. Native SDKs for multiple mobile platforms such as Windows Phone and Windows Store, Android, Apple iOS, and HTML5, for easily accessing the mobile services from mobile apps, with better security. Typically, a MBaaS platform will provide native SDKs for multiple mobile platforms so that we can easily consume the server-side mobile services. MBaaS based REST APIs can use for integrating to enterprise backend systems. We can use the same mobile services for multiple platform so hat we can reuse the application logic to multiple mobile platforms. Public cloud vendors are building the mobile services on the top of their PaaS offerings. Windows Azure Mobile Services is a great platform for a MBaaS offering that is leveraging Windows Azure Cloud platform’s PaaS capabilities. Hybrid mobile development platform Titanium provides their own MBaaS services. LoopBack is a new MBaaS service provided by Node.js consulting firm StrongLoop, which can be hosted on multiple cloud platforms and also for on-premises servers. The Challenges of MBaaS Solutions If you are building your mobile apps with a new data storage, it will be very easy, since there is not any integration challenges you have to face. But most of the use cases, you have to extract your application data in which stored in on-premises servers which might be under VPNs and firewalls. So exposing these data to your MBaaS solution with a proper security would be a big challenge. The capability of your MBaaS vendor is very important as you have to interact with your legacy systems for many enterprise mobile apps. So you should be very careful about choosing for MBaaS vendor. At the same time, you should have a proper strategy for mobilizing your application data which stored in on-premises legacy systems, where your solution architecture and strategy is more important than platforms and tools. Windows Azure Mobile Services Windows Azure Mobile Services is an MBaaS offerings from Windows Azure cloud platform. IMHO, Microsoft Windows Azure is the best PaaS platform in the Cloud space. Windows Azure Mobile Services extends the PaaS capabilities of Windows Azure, to mobile devices, which can be used as a cloud backend for your mobile apps, which will provide global availability and reach for your mobile apps. Windows Azure Mobile Services provides storage services, user management with social network integration, push notification services and scheduler services and provides native SDKs for all major mobile platforms and HTML5. In Windows Azure Mobile Services, you can write server-side scripts in Node.js where you can enjoy the full power of Node.js including the use of NPM modules for your server-side scripts. In the previous section, we had discussed some challenges of MBaaS solutions. You can leverage Windows Azure Cloud platform for solving many challenges regarding with enterprise mobility. The entire Windows Azure platform can play a key role for working as the backend for your mobile apps where you can leverage the entire Windows Azure platform for your mobile apps. With Windows Azure, you can easily connect to your on-premises systems which is a key thing for mobile backend solutions. Another key point is that Windows Azure provides better integration with services like Active Directory, which makes Windows Azure as the de facto platform for enterprise mobility, for enterprises, who have been leveraging Microsoft ecosystem for their application and IT infrastructure. Windows Azure Mobile Services is going to next evolution where you can expect some exciting features in near future. One area, where Windows Azure Mobile Services should definitely need an improvement, is about the default storage mechanism in which currently it is depends on SQL Server. IMHO, developers should be able to choose multiple default storage option when creating a new mobile service instance. Let’s say, there should be a different storage providers such as SQL Server storage provider and Table storage provider where developers should be able to choose their choice of storage provider when creating a new mobile services project. I have been used Windows Azure and Windows Azure Mobile Services as the backend for production apps for mobile, where it performed very well. MBaaS Over MEAP Recently, many larger enterprises has been adopted Mobile enterprise application platform (MEAP) for their mobile apps. I haven’t worked on any production MEAP solution, but I heard that developers are really struggling with MEAP in different way. The learning curve for a proprietary MEAP platform is very high. I am completely against for using larger proprietary ecosystem for mobile apps. For enterprise mobile apps, I highly recommend to use native iOS/Android/Windows Phone or HTML5 for front-end with a cloud hosted MBaaS solution as the middleware. A MBaaS service can be consumed from multiple mobile apps where REST APIs are using to integrating with enterprise backend systems. Enterprise mobility should start with exposing REST APIs on the enterprise backend systems and these REST APIs can host on Cloud where we can enjoy the power of Cloud for our services. If you are having REST APIs for your enterprise data, then you can easily build mobile frontends for multiple platforms. You can follow me on Twitter @shijucv

Read the article
How Mary Meeker’s Latest Findings May Make You Re-Imagine Commerce

- by Brenna Johnson-Oracle

0 0 1 954 5439 Endeca Technologies 45 12 6381 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} Today, Mary Meeker released her highly anticipated annual “Internet Trends” presentation for 2014. All 164 slides are jam-packed with pretty much everything you need to know about the state of the Internet. And as luck would have it, Oracle is staying ahead of these trends (but we’ll talk about that later). There were a few surprises, some stats to solidify what you likely already know, and Meeker’s novel observations about where we are all going. What interested me the most is not only how people are engaging in their personal lives, but how they engage with brands. As you could probably predict, Internet usage growth is slowing while tablet user and mobile data traffic growth continue their meteoric rise around the globe, with tremendous growth in underpenetrated markets like China, India, Brazil and Indonesia. Now hold those the “Internet is dead” comments. Keep in mind there’s still plenty of room to grow, and a multiscreen model is Meeker’s vision for our future. Despite 1.5x YOY growth for mobile traffic, mobile still only makes up about 23% of all traffic today. With tablet shipments easily outpacing figures for PCs even at their height (in 2007), mobile will only continue on it’s path, but won’t be everything to everyone. Mobile won’t replace every touchpoint, it’s just created our shorter attention spans and demand for simpler, more personal experiences. As Meeker points out TVs, tablets, PCs, and smartphones are used for different activities at present, but lines will blur (for example, 84% of smartphones owners use their device while watching TV). Day-to-day activities are being re-imagining through simple, beautiful user experiences. It seems like every day I discover a new way a brand/site/app made the most mundane or mounting task enjoyable and frictionless – and I’m not alone. Meeker points out the evolution of how we do everything from how we communicate, get information, use money, meet someone, get places, order a meal, and consume media is all done through new user interfaces that make day-to-day tasks simpler. This movement has caused just about everyone’s patience for a poor UX to take a nosedive. And it’s not just the digital user experience, technology is making a lot of people’s offline lives easier, and less expensive. Today 47% of online shopping utilizes free shipping— nearly half. And Meeker predicts same day local delivery will be the “next big thing” (and you can take a guess on who will own that). Content, Community and Commerce creates the “Internet Trifecta.” Meeker pointed out that when content, communities and commerce occur in a single experience it’s embraced by consumers, which translates to big dollars for brands. The magic happens when consumers can get inspired, research, and buy in a single experience. As the buying cycle has changed and touchpoints (Web, mobile, social, store) are no longer tied to “roles” or steps in the customer journey, brands must make all experiences (content and commerce) available in a single, adaptable experience. (We at Oracle Commerce have a lot to say on this topic – stay tuned!) And in what Meeker calls the “biggest re-imagination of all:” consumers enabled with smartphones and sensors are creating troves of findable and sharable data, which she says is in the early stages, by growing rapidly. She notes that transparency and patterns of consumers with this hardware (FYI - there are up to 10 sensors embedded in smartphones now) has created a Big Data treasure chest to be mined to improve business and the life of the consumer. The opportunities are endless. So what does it all mean for a company doing business online? Start thinking about how you can: Re-imagine your experience. Not your online experience and your mobile experience and your social experience – your overall experience. When consumers can research, buy, and advocate from anywhere (and their attention spans are at an all-time low) channels don’t exist. Enable simple and beautiful interactions informed by all of the online and offline data you leverage across your enterprise. Ethically leverage the endless supply of data (user generated content, clicks, purchases, in-store behavior, social activity) to make experiences more beautiful, more accurate, and more personalized (not to mention, more lucrative for you). Re-imagine content and commerce. Content and commerce must co-exist in a single destination where shoppers can get inspired, explore, research, share, and purchase in a collective experience. Think of how you can deliver an experience where all types of experiences (brand stories and commerce) adapt to every customer need. (Look for more on this topic coming soon). Re-imagine your reach. Look to Meeker’s findings to see how the global appetite for digital experiences is growing, but under-served in many places (i.e.: India, Mexico, Indonesia, Brazil, Philippines, etc.). Growing your online business to a new geography doesn’t have to mean starting from scratch or having an entirely new team manage the new endeavor. Expand using what you’ve already built in a multisite framework, with global language support. And of course, make sure it’s optimized for mobile! Re-imagine the possible. After every Meeker report, I’m always left with the thought “we are just at the beginning.” Everyday there is more data, more possibilities, more online consumers, and more opportunities to use new latest technology to get closer to your customers and be more successful. There’s a lot going on in our Product Development and Product Innovations groups to automate innovation for our customers, so that they can continue to stay ahead of these trends, without disrupting their business. Check out a recent interview with our Innovations Team on some of these new possibilities. Staying on track despite the seemingly endless possibilities out there is the hard part. Prioritizing where you will focus based on your unique brand promise, customer and goals is what you do best. To learn how Oracle Commerce can help your business achieve your goals check out oracle.com/commerce. Check out Meeker’s entire report here.

Read the article
Session memory – who’s this guy named Max and what’s he doing with my memory?

- by extended_events

SQL Server MVP Jonathan Kehayias (blog) emailed me a question last week when he noticed that the total memory used by the buffers for an event session was larger than the value he specified for the MAX_MEMORY option in the CREATE EVENT SESSION DDL. The answer here seems like an excellent subject for me to kick-off my new “401 – Internals” tag that identifies posts where I pull back the curtains a bit and let you peek into what’s going on inside the extended events engine. In a previous post (Option Trading: Getting the most out of the event session options) I explained that we use a set of buffers to store the event data before we write the event data to asynchronous targets. The MAX_MEMORY along with the MEMORY_PARTITION_MODE defines how big each buffer will be. Theoretically, that means that I can predict the size of each buffer using the following formula: max memory / # of buffers = buffer size If it was that simple I wouldn’t be writing this post. I’ll take “boundary” for 64K Alex For a number of reasons that are beyond the scope of this blog, we create event buffers in 64K chunks. The result of this is that the buffer size indicated by the formula above is rounded up to the next 64K boundary and that is the size used to create the buffers. If you think visually, this means that the graph of your max_memory option compared to the actual buffer size that results will look like a set of stairs rather than a smooth line. You can see this behavior by looking at the output of dm_xe_sessions, specifically the fields related to the buffer sizes, over a range of different memory inputs: Note: This test was run on a 2 core machine using per_cpu partitioning which results in 5 buffers. (Seem my previous post referenced above for the math behind buffer count.) input_memory_kb total_regular_buffers regular_buffer_size total_buffer_size 637 5 130867 654335 638 5 130867 654335 639 5 130867 654335 640 5 196403 982015 641 5 196403 982015 642 5 196403 982015 This is just a segment of the results that shows one of the “jumps” between the buffer boundary at 639 KB and 640 KB. You can verify the size boundary by doing the math on the regular_buffer_size field, which is returned in bytes: 196403 – 130867 = 65536 bytes 65536 / 1024 = 64 KB The relationship between the input for max_memory and when the regular_buffer_size is going to jump from one 64K boundary to the next is going to change based on the number of buffers being created. The number of buffers is dependent on the partition mode you choose. If you choose any partition mode other than NONE, the number of buffers will depend on your hardware configuration. (Again, see the earlier post referenced above.) With the default partition mode of none, you always get three buffers, regardless of machine configuration, so I generated a “range table” for max_memory settings between 1 KB and 4096 KB as an example. start_memory_range_kb end_memory_range_kb total_regular_buffers regular_buffer_size total_buffer_size 1 191 NULL NULL NULL 192 383 3 130867 392601 384 575 3 196403 589209 576 767 3 261939 785817 768 959 3 327475 982425 960 1151 3 393011 1179033 1152 1343 3 458547 1375641 1344 1535 3 524083 1572249 1536 1727 3 589619 1768857 1728 1919 3 655155 1965465 1920 2111 3 720691 2162073 2112 2303 3 786227 2358681 2304 2495 3 851763 2555289 2496 2687 3 917299 2751897 2688 2879 3 982835 2948505 2880 3071 3 1048371 3145113 3072 3263 3 1113907 3341721 3264 3455 3 1179443 3538329 3456 3647 3 1244979 3734937 3648 3839 3 1310515 3931545 3840 4031 3 1376051 4128153 4032 4096 3 1441587 4324761 As you can see, there are 21 “steps” within this range and max_memory values below 192 KB fall below the 64K per buffer limit so they generate an error when you attempt to specify them. Max approximates True as memory approaches 64K The upshot of this is that the max_memory option does not imply a contract for the maximum memory that will be used for the session buffers (Those of you who read Take it to the Max (and beyond) know that max_memory is really only referring to the event session buffer memory.) but is more of an estimate of total buffer size to the nearest higher multiple of 64K times the number of buffers you have. The maximum delta between your initial max_memory setting and the true total buffer size occurs right after you break through a 64K boundary, for example if you set max_memory = 576 KB (see the green line in the table), your actual buffer size will be closer to 767 KB in a non-partitioned event session. You get “stepped up” for every 191 KB block of initial max_memory which isn’t likely to cause a problem for most machines. Things get more interesting when you consider a partitioned event session on a computer that has a large number of logical CPUs or NUMA nodes. Since each buffer gets “stepped up” when you break a boundary, the delta can get much larger because it’s multiplied by the number of buffers. For example, a machine with 64 logical CPUs will have 160 buffers using per_cpu partitioning or if you have 8 NUMA nodes configured on that machine you would have 24 buffers when using per_node. If you’ve just broken through a 64K boundary and get “stepped up” to the next buffer size you’ll end up with total buffer size approximately 10240 KB and 1536 KB respectively (64K * # of buffers) larger than max_memory value you might think you’re getting. Using per_cpu partitioning on large machine has the most impact because of the large number of buffers created. If the amount of memory being used by your system within these ranges is important to you then this is something worth paying attention to and considering when you configure your event sessions. The DMV dm_xe_sessions is the tool to use to identify the exact buffer size for your sessions. In addition to the regular buffers (read: event session buffers) you’ll also see the details for large buffers if you have configured MAX_EVENT_SIZE. The “buffer steps” for any given hardware configuration should be static within each partition mode so if you want to have a handy reference available when you configure your event sessions you can use the following code to generate a range table similar to the one above that is applicable for your specific machine and chosen partition mode. DECLARE @buf_size_output table (input_memory_kb bigint, total_regular_buffers bigint, regular_buffer_size bigint, total_buffer_size bigint) DECLARE @buf_size int, @part_mode varchar(8) SET @buf_size = 1 -- Set to the begining of your max_memory range (KB) SET @part_mode = 'per_cpu' -- Set to the partition mode for the table you want to generate WHILE @buf_size <= 4096 -- Set to the end of your max_memory range (KB) BEGIN BEGIN TRY IF EXISTS (SELECT * from sys.server_event_sessions WHERE name = 'buffer_size_test') DROP EVENT SESSION buffer_size_test ON SERVER DECLARE @session nvarchar(max) SET @session = 'create event session buffer_size_test on server add event sql_statement_completed add target ring_buffer with (max_memory = ' + CAST(@buf_size as nvarchar(4)) + ' KB, memory_partition_mode = ' + @part_mode + ')' EXEC sp_executesql @session SET @session = 'alter event session buffer_size_test on server state = start' EXEC sp_executesql @session INSERT @buf_size_output (input_memory_kb, total_regular_buffers, regular_buffer_size, total_buffer_size) SELECT @buf_size, total_regular_buffers, regular_buffer_size, total_buffer_size FROM sys.dm_xe_sessions WHERE name = 'buffer_size_test' END TRY BEGIN CATCH INSERT @buf_size_output (input_memory_kb) SELECT @buf_size END CATCH SET @buf_size = @buf_size + 1 END DROP EVENT SESSION buffer_size_test ON SERVER SELECT MIN(input_memory_kb) start_memory_range_kb, MAX(input_memory_kb) end_memory_range_kb, total_regular_buffers, regular_buffer_size, total_buffer_size from @buf_size_output group by total_regular_buffers, regular_buffer_size, total_buffer_size Thanks to Jonathan for an interesting question and a chance to explore some of the details of Extended Event internals. - Mike

Read the article
Big Data’s Killer App…

- by jean-pierre.dijcks

Recently Keith spent some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

Read the article
Time Tracking on an Agile Team

- by Stephen.Walther

What’s the best way to handle time-tracking on an Agile team? Your gut reaction to this question might be to resist any type of time-tracking at all. After all, one of the principles of the Agile Manifesto is “Individuals and interactions over processes and tools”. Forcing the developers on your team to track the amount of time that they devote to completing stories or tasks might seem like useless bureaucratic red tape: an impediment to getting real work done. I completely understand this reaction. I’ve been required to use time-tracking software in the past to account for each hour of my workday. It made me feel like Fred Flintstone punching in at the quarry mine and not like a professional. Why You Really Do Need Time-Tracking There are, however, legitimate reasons to track time spent on stories even when you are a member of an Agile team. First, if you are working with an outside client, you might need to track the number of hours spent on different stories for the purposes of billing. There might be no way to avoid time-tracking if you want to get paid. Second, the Product Owner needs to know when the work on a story has gone over the original time estimated for the story. The Product Owner is concerned with Return On Investment. If the team has gone massively overtime on a story, then the Product Owner has a legitimate reason to halt work on the story and reconsider the story’s business value. Finally, you might want to track how much time your team spends on different types of stories or tasks. For example, if your team is spending 75% of their time doing testing then you might need to bring in more testers. Or, if 10% of your team’s time is expended performing a software build at the end of each iteration then it is time to consider better ways of automating the build process. Time-Tracking in SonicAgile For these reasons, we added time-tracking as a feature to SonicAgile which is our free Agile Project Management tool. We were heavily influenced by Jeff Sutherland (one of the founders of Scrum) in the way that we implemented time-tracking (see his article http://scrum.jeffsutherland.com/2007/03/time-tracking-is-anti-scrum-what-do-you.html). In SonicAgile, time-tracking is disabled by default. If you want to use this feature then the project owner must enable time-tracking in Project Settings. You can choose to estimate using either days or hours. If you are estimating at the level of stories then it makes more sense to choose days. Otherwise, if you are estimating at the level of tasks then it makes more sense to use hours. After you enable time-tracking then you can assign three estimates to a story: Original Estimate – This is the estimate that you enter when you first create a story. You don’t change this estimate. Time Spent – This is the amount of time that you have already devoted to the story. You update the time spent on each story during your daily standup meeting. Time Left – This is the amount of time remaining to complete the story. Again, you update the time left during your daily standup meeting. So when you first create a story, you enter an original estimate that becomes the time left. During each daily standup meeting, you update the time spent and time left for each story on the Kanban. If you had perfect predicative power, then the original estimate would always be the same as the sum of the time spent and the time left. For example, if you predict that a story will take 5 days to complete then on day 3, the story should have 3 days spent and 2 days left. Unfortunately, never in the history of mankind has anyone accurately predicted the exact amount of time that it takes to complete a story. For this reason, SonicAgile does not update the time spent and time left automatically. Each day, during the daily standup, your team should update the time spent and time left for each story. For example, the following table shows the history of the time estimates for a story that was originally estimated to take 3 days but, eventually, takes 5 days to complete: Day Original Estimate Time Spent Time Left Day 1 3 days 0 days 3 days Day 2 3 days 1 day 2 days Day 3 3 days 2 days 2 days Day 4 3 days 3 days 2 days Day 5 3 days 4 days 0 days In the table above, everything goes as predicted until you reach day 3. On day 3, the team realizes that the work will require an additional two days. The situation does not improve on day 4. All of the sudden, on day 5, all of the remaining work gets done. Real work often follows this pattern. There are long periods when nothing gets done punctuated by occasional and unpredictable bursts of progress. We designed SonicAgile to make it as easy as possible to track the time spent and time left on a story. Detecting when a Story Goes Over the Original Estimate Sometimes, stories take much longer than originally estimated. There’s a surprise. For example, you discover that a new software component is incompatible with existing software components. Or, you discover that you have to go through a month-long certification process to finish a story. In those cases, the Product Owner has a legitimate reason to halt work on a story and re-evaluate the business value of the story. For example, the Product Owner discovers that a story will require weeks to implement instead of days, then the story might not be worth the expense. SonicAgile displays a warning on both the Backlog and the Kanban when the time spent on a story goes over the original estimate. An icon of a clock is displayed. Time-Tracking and Tasks Another optional feature of SonicAgile is tasks. If you enable Tasks in Project Settings then you can break stories into one or more tasks. You can perform time-tracking at the level of a story or at the level of a task. If you don’t break a story into tasks then you can enter the time left and time spent for the story. As soon as you break a story into tasks, then you can no longer enter the time left and time spent at the level of the story. Instead, the time left and time spent for a story is rolled up from its tasks. On the Kanban, you can see how the time left and time spent for each task gets rolled up into each story. The progress bar for the story is rolled up from the progress bars for each task. The original estimate is never rolled up – even when you break a story into tasks. A story’s original estimate is entered separately from the original estimates of each of the story’s tasks. Summary Not every Agile team can avoid time-tracking. You might be forced to track time to get paid, to detect when you are spending too much time on a particular story, or to track the amount of time that you are devoting to different types of tasks. We designed time-tracking in SonicAgile to require the least amount of work to track the information that you need. Time-tracking is an optional feature. If you enable time-tracking then you can track the original estimate, time left, and time spent for each story and task. You can use time-tracking with SonicAgile for free. Register at http://SonicAgile.com.

Read the article
Cost Comparison Hard Disk Drive to Solid State Drive on Price per Gigabyte - dispelling a myth!

- by tonyrogerson

It is often said that Hard Disk Drive storage is significantly cheaper per GiByte than Solid State Devices – this is wholly inaccurate within the database space. People need to look at the cost of the complete solution and not just a single component part in isolation to what is really required to meet the business requirement. Buying a single Hitachi Ultrastar 600GB 3.5” SAS 15Krpm hard disk drive will cost approximately £239.60 (http://scan.co.uk, 22nd March 2012) compared to an OCZ 600GB Z-Drive R4 CM84 PCIe costing £2,316.54 (http://scan.co.uk, 22nd March 2012); I’ve not included FusionIO ioDrive because there is no public pricing available for it – something I never understand and personally when companies do this I immediately think what are they hiding, luckily in FusionIO’s case the product is proven though is expensive compared to OCZ enterprise offerings. On the face of it the single 15Krpm hard disk has a price per GB of £0.39, the SSD £3.86; this is what you will see in the press and this is what sales people will use in comparing the two technologies – do not be fooled by this bullshit people! What is the requirement? The requirement is the database will have a static size of 400GB kept static through archiving so growth and trim will balance the database size, the client requires resilience, there will be several hundred call centre staff querying the database where queries will read a small amount of data but there will be no hot spot in the data so the randomness will come across the entire 400GB of the database, estimates predict that the IOps required will be approximately 4,000IOps at peak times, because it’s a call centre system the IO latency is important and must remain below 5ms per IO. The balance between read and write is 70% read, 30% write. The requirement is now defined and we have three of the most important pieces of the puzzle – space required, estimated IOps and maximum latency per IO. Something to consider with regard SQL Server; write activity requires synchronous IO to the storage media specifically the transaction log; that means the write thread will wait until the IO is completed and hardened off until the thread can continue execution, the requirement has stated that 30% of the system activity will be write so we can expect a high amount of synchronous activity. The hardware solution needs to be defined; two possible solutions: hard disk or solid state based; the real question now is how many hard disks are required to achieve the IO throughput, the latency and resilience, ditto for the solid state. Hard Drive solution On a test on an HP DL380, P410i controller using IOMeter against a single 15Krpm 146GB SAS drive, the throughput given on a transfer size of 8KiB against a 40GiB file on a freshly formatted disk where the partition is the only partition on the disk thus the 40GiB file is on the outer edge of the drive so more sectors can be read before head movement is required: For 100% sequential IO at a queue depth of 16 with 8 worker threads 43,537 IOps at an average latency of 2.93ms (340 MiB/s), for 100% random IO at the same queue depth and worker threads 3,733 IOps at an average latency of 34.06ms (34 MiB/s). The same test was done on the same disk but the test file was 130GiB: For 100% sequential IO at a queue depth of 16 with 8 worker threads 43,537 IOps at an average latency of 2.93ms (340 MiB/s), for 100% random IO at the same queue depth and worker threads 528 IOps at an average latency of 217.49ms (4 MiB/s). From the result it is clear random performance gets worse as the disk fills up – I’m currently writing an article on short stroking which will cover this in detail. Given the work load is random in nature looking at the random performance of the single drive when only 40 GiB of the 146 GB is used gives near the IOps required but the latency is way out. Luckily I have tested 6 x 15Krpm 146GB SAS 15Krpm drives in a RAID 0 using the same test methodology, for the same test above on a 130 GiB for each drive added the performance boost is near linear, for each drive added throughput goes up by 5 MiB/sec, IOps by 700 IOps and latency reducing nearly 50% per drive added (172 ms, 94 ms, 65 ms, 47 ms, 37 ms, 30 ms). This is because the same 130GiB is spread out more as you add drives 130 / 1, 130 / 2, 130 / 3 etc. so implicit short stroking is occurring because there is less file on each drive so less head movement required. The best latency is still 30 ms but we have the IOps required now, but that’s on a 130GiB file and not the 400GiB we need. Some reality check here: a) the drive randomness is more likely to be 50/50 and not a full 100% but the above has highlighted the effect randomness has on the drive and the more a drive fills with data the worse the effect. For argument sake let us assume that for the given workload we need 8 disks to do the job, for resilience reasons we will need 16 because we need to RAID 1+0 them in order to get the throughput and the resilience, RAID 5 would degrade performance. Cost for hard drives: 16 x £239.60 = £3,833.60 For the hard drives we will need disk controllers and a separate external disk array because the likelihood is that the server itself won’t take the drives, a quick spec off DELL for a PowerVault MD1220 which gives the dual pathing with 16 disks 146GB 15Krpm 2.5” disks is priced at £7,438.00, note its probably more once we had two controller cards to sit in the server in, racking etc. Minimum cost taking the DELL quote as an example is therefore: {Cost of Hardware} / {Storage Required} £7,438.60 / 400 = £18.595 per GB £18.59 per GiB is a far cry from the £0.39 we had been told by the salesman and the myth. Yes, the storage array is composed of 16 x 146 disks in RAID 10 (therefore 8 usable) giving an effective usable storage availability of 1168GB but the actual storage requirement is only 400 and the extra disks have had to be purchased to get the IOps up. Solid State Drive solution A single card significantly exceeds the IOps and latency required, for resilience two will be required. ( £2,316.54 * 2 ) / 400 = £11.58 per GB With the SSD solution only two PCIe sockets are required, no external disk units, no additional controllers, no redundant controllers etc. Conclusion I hope by showing you an example that the myth that hard disk drives are cheaper per GiB than Solid State has now been dispelled - £11.58 per GB for SSD compared to £18.59 for Hard Disk. I’ve not even touched on the running costs, compare the costs of running 18 hard disks, that’s a lot of heat and power compared to two PCIe cards!Just a quick note: I've left a fair amount of information out due to this being a blog! If in doubt, email me :)I'll also deal with the myth that SSD's wear out at a later date as well - that's just way over done still, yes, 5 years ago, but now - no.

Read the article
When is a Seek not a Seek?

- by Paul White

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10. One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before. The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return. The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0). Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate. Adding the opaque modulo expression results in SQL Server guessing at the selectivity. As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement. For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models. Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054. The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution. We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1). It is no coincidence that 126 = 63 * 2, by the way. It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing. There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon. To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query. I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101. Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value. Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf). We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101). It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169). Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks. It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster. Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms. The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek? When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

Read the article
Augmenting your Social Efforts via Data as a Service (DaaS)

- by Mike Stiles

The following is the 3rd in a series of posts on the value of leveraging social data across your enterprise by Oracle VP Product Development Don Springer and Oracle Cloud Data and Insight Service Sr. Director Product Management Niraj Deo. In this post, we will discuss the approach and value of integrating additional “public” data via a cloud-based Data-as-as-Service platform (or DaaS) to augment your Socially Enabled Big Data Analytics and CX Management. Let’s assume you have a functional Social-CRM platform in place. You are now successfully and continuously listening and learning from your customers and key constituents in Social Media, you are identifying relevant posts and following up with direct engagement where warranted (both 1:1, 1:community, 1:all), and you are starting to integrate signals for communication into your appropriate Customer Experience (CX) Management systems as well as insights for analysis in your business intelligence application. What is the next step? Augmenting Social Data with other Public Data for More Advanced Analytics When we say advanced analytics, we are talking about understanding causality and correlation from a wide variety, volume and velocity of data to Key Performance Indicators (KPI) to achieve and optimize business value. And in some cases, to predict future performance to make appropriate course corrections and change the outcome to your advantage while you can. The data to acquire, process and analyze this is very nuanced: It can vary across structured, semi-structured, and unstructured data It can span across content, profile, and communities of profiles data It is increasingly public, curated and user generated The key is not just getting the data, but making it value-added data and using it to help discover the insights to connect to and improve your KPIs. As we spend time working with our larger customers on advanced analytics, we have seen a need arise for more business applications to have the ability to ingest and use “quality” curated, social, transactional reference data and corresponding insights. The challenge for the enterprise has been getting this data inline into an easily accessible system and providing the contextual integration of the underlying data enriched with insights to be exported into the enterprise’s business applications. The following diagram shows the requirements for this next generation data and insights service or (DaaS): Some quick points on these requirements: Public Data, which in this context is about Common Business Entities, such as - Customers, Suppliers, Partners, Competitors (all are organizations) Contacts, Consumers, Employees (all are people) Products, Brands This data can be broadly categorized incrementally as - Base Utility data (address, industry classification) Public Master Reference data (trade style, hierarchy) Social/Web data (News, Feeds, Graph) Transactional Data generated by enterprise process, workflows etc. This Data has traits of high-volume, variety, velocity etc., and the technology needed to efficiently integrate this data for your needs includes - Change management of Public Reference Data across all categories Applied Big Data to extract statics as well as real-time insights Knowledge Diagnostics and Data Mining As you consider how to deploy this solution, many of our customers will be using an online “cloud” service that provides quality data and insights uniformly to all their necessary applications. In addition, they are requesting a service that is: Agile and Easy to Use: Applications integrated with the service can obtain data on-demand, quickly and simply Cost-effective: Pre-integrated into applications so customers don’t have to Has High Data Quality: Single point access to reference data for data quality and linkages to transactional, curated and social data Supports Data Governance: Becomes more manageable and cost-effective since control of data privacy and compliance can be enforced in a centralized place Data-as-a-Service (DaaS) Just as the cloud has transformed and now offers a better path for how an enterprise manages its IT from their infrastructure, platform, and software (IaaS, PaaS, and SaaS), the next step is data (DaaS). Over the last 3 years, we have seen the market begin to offer a cloud-based data service and gain initial traction. On one side of the DaaS continuum, we see an “appliance” type of service that provides a single, reliable source of accurate business data plus social information about accounts, leads, contacts, etc. On the other side of the continuum we see more of an online market “exchange” approach where ISVs and Data Publishers can publish and sell premium datasets within the exchange, with the exchange providing a rich set of web interfaces to improve the ease of data integration. Why the difference? It depends on the provider’s philosophy on how fast the rate of commoditization of certain data types will occur. How do you decide the best approach? Our perspective, as shown in the diagram below, is that the enterprise should develop an elastic schema to support multi-domain applicability. This allows the enterprise to take the most flexible approach to harness the speed and breadth of public data to achieve value. The key tenet of the proposed approach is that an enterprise carefully federates common utility, master reference data end points, mobility considerations and content processing, so that they are pervasively available. One way you may already be familiar with this approach is in how you do Address Verification treatments for accounts, contacts etc. If you design and revise this service in such a way that it is also easily available to social analytic needs, you could extend this to launch geo-location based social use cases (marketing, sales etc.). Our fundamental belief is that value-added data achieved through enrichment with specialized algorithms, as well as applying business “know-how” to weight-factor KPIs based on innovative combinations across an ever-increasing variety, volume and velocity of data, will be where real value is achieved. Essentially, Data-as-a-Service becomes a single entry point for the ever-increasing richness and volume of public data, with enrichment and combined capabilities to extract and integrate the right data from the right sources with the right factoring at the right time for faster decision-making and action within your core business applications. As more data becomes available (and in many cases commoditized), this value-added data processing approach will provide you with ongoing competitive advantage. Let’s look at a quick example of creating a master reference relationship that could be used as an input for a variety of your already existing business applications. In phase 1, a simple master relationship is achieved between a company (e.g. General Motors) and a variety of car brands’ social insights. The reference data allows for easy sort, export and integration into a set of CRM use cases for analytics, sales and marketing CRM. In phase 2, as you create more data relationships (e.g. competitors, contacts, other brands) to have broader and deeper references (social profiles, social meta-data) for more use cases across CRM, HCM, SRM, etc. This is just the tip of the iceberg, as the amount of master reference relationships is constrained only by your imagination and the availability of quality curated data you have to work with. DaaS is just now emerging onto the marketplace as the next step in cloud transformation. For some of you, this may be the first you have heard about it. Let us know if you have questions, or perspectives. In the meantime, we will continue to share insights as we can.Photo: Erik Araujo, stock.xchng

Read the article
Spotlight Infinite Indexing issue (external data drive)

- by Manca Weeks

This is an external drive, formerly a boot drive which is now in use only to access music files (sibelius, audio, midi, live, logic etc.) without transferring the data into a new boot system, partly because of the issue I am about to describe, but mostly because the majority of the data is mainly there for archival purposes. The user is a composer and prominent musician and needs to be able to rehash the data at will. I have tried several things - here is a list: - make complete filesystem clone with antonio diaz's ddrescue - run Disk Warrior on copy, repair whatever errors occurred - wipe out all ACLs on entire drive - set all permissions to the same value - wide open 777 - remove any system data (applications, system files, including hidden files to the best of my knowledge) by selecting only non-system/app data and using Carbon Copy Cloner to put only the data of interest onto a newly formatted drive - transfer data to newly formatted drive folder by folder, resetting the spotlight index in between adding each to observe for issues (interesting here is that no issues occurred except for in Documents folder - when I transferred only the Documents folder to a newly formatted drive on its own - no trouble. It appears almost as thought it may not be the content but the quantity or specific combination of data that results in problems) - use DataRescue to transfer the data to yet another newly formatted drive to expose any missed hidden files Between each of the above steps I stopped Spotlight (search for anything beginning with md in Activity Monitor - All Processes and quitting it), deleted the .Spotlight-V100 directory from the affected drive. Restart Splotlight indexing by adding drive to Spotlight privacy list and removing it. In each case the same issue occurs - Spotlight begins indexing normally (or so it seems), then the index estimated time increases, usually to 4 hours remaining. This is where it gets stuck and continues to predict 4 hours remaining but never finishes. Sometimes I can't eject the drive and have to quit the md.. processes from Activity Monitor to be able to eject the drive without Force Eject. Once I disconnect the drive after the 4 hours remaining situation - if I reattach it, Spotlight forever estimates remaining time and never gets going again. So there it is. It is apparently not a filesystem issue, not a permissions issue and not tied to any particular piece of hardware or protocol (used USB and FW drives). I have tried this on several machines (3 to be precise) and in 10.5.8 and 10.6.5. Simply disabling Spotlight on this volume is not an option because the owner has no clue where things are as the data on the volume dates back to music projects and compositions from 2003 and before. He needs to be able to query for results. Anyone got any ideas? ---update 2-6-11 Since I have not received any responses except the one below which appears to misunderstand my point, I am updating this post hoping to get more responses. I have used the terminal command sudo opensnoop -p PID where PID is the mdworker process ID to try and determine what Spotlight is doing and hopefully find the files it's having trouble with. Here's what happens: After indexing for a few hours, mdworker is gone. It no longer shows up in Activity Monitor under "All Processes" and the Terminal window with the opensnoop result stops moving. I then proceeded to execute the same command on mds to see what it was doing and here's what I get, repeatedly: 501 57 mds 21 / 501 57 mds 21 /Volumes/Sno Leppard 501 57 mds 21 /Volumes/Tiger 501 57 mds 21 /Volumes/Leppard 501 57 mds 21 /Volumes/Disk Warrior 501 57 mds 21 /Volumes/ONM Data These represent all the volumes currently mounted in the system. All except ONM Data, which is the one I am trying to index, are excluded from SPotlight indexing at the moment. The sequence above repeats over and over, with slight variation, sometimes skipping one of the volumes. Questions - what happened to mdworker? What is mds doing? I will let this run until tomorrow morning and throughout the day and monitor for any changes. Any input would be very much appreciated. Even if you're not sure what the ultimate answer is, please alert me to anything you think I may be missing. Hopefully at some point we will figure this out... Thanks, M __final edit__ I finally resolved the issue and here is how I did it. I used the terminal command "sudo opensnoop -p PID" where the PID is the process id of the processes I was monitoring. I was looking at all instances of mds and mdworker running in the system. After the third time through indexing the same data set (see info above), I contacted Apple and got to their highest level of support - they were flabbergasted as well. They advised me to install yet another default 10.6.6 system and try again. The same pattern repeated - mds and mdworker(s) would start indexing and eventually the spotlight icon would say 6 hours remaining and all mdworkers were gone, mds at 90% or so of CPU. But I did finally figure out that the first time mdworker stopped like that, the last file it touched was always in the same folder. I excluded that folder from spotlight search and the rest of the data set indexed within about 2 hours with no strange behavior or failures. I copied that folder to another machine and Spotlight barfed immediately. Exclude that folder and all is well again. I have no clue what is causing this behavior, still, but I did find a functional solution to the problem. Anyone with a similar problem - run opensnoop on all instances of mds and mdworker and wait patiently for wdworker to exit. Look at the last file it touched and exclude the enclosing folder from being indexed. I was able to repeat the issue and solution on 2 different installs and 2 different copies of the data set. Hope this helps. If we find an actual cause of the folder being such a problem (it is called MICHAEL BRECKER RECORD SOLOS and contains almost 1 GB of audio related files - performer, live, SD2 - things like that), I will edit again to let you all know. Thanks for ay attempts to help, M

Read the article
Skipping scheduled self-tests and predicting drive EOL

- by Steve Madsen

For a few weeks now, smartd has been reporting that it is skipping some of its scheduled self-tests on the weekends: Apr 24 18:29:32 calvin smartd[4758]: Device: /dev/sda, skip scheduled Offline Immediate Test; 40% remaining of current Self-Test. Apr 24 18:29:33 calvin smartd[4758]: Device: /dev/sdb, skip scheduled Offline Immediate Test; 50% remaining of current Self-Test. The drives in this RAID-1 array are set to run an offline test four times a day, a short self-test at 2am every day, and a long self-test on Saturdays at 2am. For some reason, it looks like the long self-test is taking longer, causing the other scheduled tests to be skipped. First question: is this a sign of likely drive failure? Then today, smartd reported that a self-test failed. Here is the output of smartctl -a /dev/sdb: smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Model Family: Seagate Barracuda 7200.8 family Device Model: ST3250823AS Serial Number: 3ND1GNBC Firmware Version: 3.03 User Capacity: 250,059,350,016 bytes Device is: In smartctl database [for details use: -P show] ATA Version is: 7 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Sun Apr 25 13:15:34 2010 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 430) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 84) minutes. SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 047 039 006 Pre-fail Always - 168450357 3 Spin_Up_Time 0x0003 098 098 000 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 33 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 9 7 Seek_Error_Rate 0x000f 087 060 030 Pre-fail Always - 654745480 9 Power_On_Hours 0x0032 055 055 000 Old_age Always - 40141 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 51 194 Temperature_Celsius 0x0022 037 062 000 Old_age Always - 37 (0 17 0 0) 195 Hardware_ECC_Recovered 0x001a 047 039 000 Old_age Always - 168450357 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0 202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 40131 - # 2 Extended offline Completed: read failure 30% 40129 379795511 # 3 Short offline Completed without error 00% 40084 - # 4 Short offline Completed without error 00% 40060 - # 5 Short offline Completed without error 00% 40036 - # 6 Short offline Completed without error 00% 40013 - # 7 Short offline Completed without error 00% 39990 - # 8 Extended offline Completed without error 00% 39977 - # 9 Short offline Completed without error 00% 39919 - #10 Short offline Completed without error 00% 39895 - #11 Short offline Completed without error 00% 39872 - #12 Short offline Completed without error 00% 39848 - #13 Short offline Completed without error 00% 39824 - #14 Short offline Completed without error 00% 39801 - #15 Extended offline Completed without error 00% 39789 - #16 Short offline Completed without error 00% 39754 - #17 Short offline Completed without error 00% 39732 - #18 Short offline Completed without error 00% 39707 - #19 Short offline Completed without error 00% 39683 - #20 Short offline Completed without error 00% 39660 - #21 Short offline Completed without error 00% 39636 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Given that this drive is about 4.5 years old, I am probably tempting fate by keeping it in service. SMART doesn't seem to get much respect as a reliable way to predict drive failure. What else can I use to get an early indication of drive failure?

Read the article
Weighted round robins via TTL - possible?

- by Joe Hopfgartner

I currently use DNS round robin for load balancing, which works great. The records look like this (I have a ttl of 120 seconds) ;; ANSWER SECTION: orion.2x.to. 116 IN A 80.237.201.41 orion.2x.to. 116 IN A 87.230.54.12 orion.2x.to. 116 IN A 87.230.100.10 orion.2x.to. 116 IN A 87.230.51.65 I learned that not every ISP / device treats such a response the same way. For example some DNS servers rotate the addresses randomly or always cycle them through. Some just propagate the first entry, others try to determine which is best (regionally near) by looking at the ip address. However if the userbase is big enough (spreads over multiple ISPs etc) it balances pretty well. The discrepancies from highest to lowest loaded server hardly every exceeds 15%. However now I have the problem that I am introducing more servers into the systems, that not all have the same capacities. I currently only have 1gbps servers, but I want to work with 100mbit and also 10gbps servers too. So what I want is I want to introduce a server with 10 GBps with a weight of 100, a 1 gbps server with a weight of 10 and a 100 mbit server with a weight of 1. I used to add servers twice to bring more traffic to them (which worked nice. the bandwidth doubled almost.) But adding a 10gbit server 100 times to DNS is a bit rediculous. So I thought about using the TTL. If I give server A 240 seconds ttl and server B only 120 seconds (which is about about the minimum to use for round robin, as a lot of dns servers set to 120 if a lower ttl is specified.. so i have heard) I think something like this should occour in an ideal scenario: first 120 seconds 50% of requests get server A -> keep it for 240 seconds. 50% of requests get server B -> keep it for 120 seconds second 120 seconds 50% of requests still have server A cached -> keep it for another 120 seconds. 25% of requests get server A -> keep it for 240 seconds 25% of requests get server B -> keep it for 120 seconds third 120 seconds 25% will get server A (from the 50% of Server A that now expired) -> cache 240 sec 25% will get server B (from the 50% of Server A that now expired) -> cache 120 sec 25% will have server A cached for another 120 seconds 12.5% will get server B (from the 25% of server B that now expired) -> cache 120sec 12.5% will get server A (from the 25% of server B that now expired) -> cache 240 sec fourth 120 seconds 25% will have server A cached -> cache for another 120 secs 12.5% will get server A (from the 25% of b that now expired) -> cache 240 secs 12.5% will get server B (from the 25% of b that now expired) -> cache 120 secs 12.5% will get server A (from the 25% of a that now expired) -> cache 240 secs 12.5% will get server B (from the 25% of a that now expired) -> cache 120 secs 6.25% will get server A (from the 12.5% of b that now expired) -> cache 240 secs 6.25% will get server B (from the 12.5% of b that now expired) -> cache 120 secs 12.5% will have server A cached -> cache another 120 secs ... i think i lost something at this point but i think you get the idea.... As you can see this gets pretty complicated to predict and it will for sure not work out like this in practice. But it should definitely have an effect on the distribution! I know that weighted round robin exists and is just controlled by the root server. It just cycles through dns records when responding and returns dns records with a set propability that corresponds to the weighting. My DNS server does not support this, and my requirements are not that precise. If it doesnt weight perfectly its okay, but it should go into the right direction. I think using the TTL field could be a more elegant and easier solution - and it deosnt require a dns server that controls this dynamically, which saves resources - which is in my opinion the whole point of dns load balancing vs hardware load balancers. My question now is... are there any best prectices / methos / rules of thumb to weight round robin distribution using the TTL attribute of DNS records? Edit: The system is a forward proxy server system. The amount of Bandwidth (not requests) exceeds what one single server with ethernet can handle. So I need a balancing solution that distributes the bandwidth to several servers. Are there any alternative methods than using DNS? Of course I can use a load balancer with fibre channel etc, but the costs are rediciulous and it also increases only the width of the bottleneck and does not eliminate it. The only thing i can think of are anycast (is it anycast or multicast?) ip addresses, but I don't have the means to set up such a system.

Read the article
How to show form in front in C#

- by corlettk

Folks, Please does anyone know how to show a Form from an otherwise invisible application, and have it get the focus (i.e. appear on top of other windows)? I'm working in C# .NET 3.5. I suspect I've taken "completely the wrong approach"... I do not Application.Run(new TheForm ()) instead I (new TheForm()).ShowModal()... The Form is basically a modal dialogue, with a few check-boxes; a text-box, and OK and Cancel Buttons. The user ticks a checkbox and types in a description (or whatever) then presses OK, the form disappears and the process reads the user-input from the Form, Disposes it, and continues processing. This works, except when the form is show it doesn't get the focus, instead it appears behind the "host" application, until you click on it in the taskbar (or whatever). This is a most annoying behaviour, which I predict will cause many "support calls", and the existing VB6 version doesn't have this problem, so I'm going backwards in usability... and users won't accept that (and nor should they). So... I'm starting to think I need to rethink the whole shebang... I should show the form up front, as a "normal application" and attach the remainer of the processing to the OK-button-click event. It should work, But that will take time which I don't have (I'm already over time/budget)... so first I really need to try to make the current approach work... even by quick-and-dirty methods. So please does anyone know how to "force" a .NET 3.5 Form (by fair means or fowl) to get the focus? I'm thinking "magic" windows API calls (I know Twilight Zone: This only appears to be an issue at work, we're I'm using Visual Studio 2008 on Windows XP SP3... I've just failed to reproduce the problem with an SSCCE (see below) at home on Visual C# 2008 on Vista Ulimate... This works fine. Huh? WTF? Also, I'd swear that at work yesterday showed the form when I ran the EXE, but not when F5'ed (or Ctrl-F5'ed) straight from the IDE (which I just put up with)... At home the form shows fine either way. Totaly confusterpating! It may or may not be relevant, but Visual Studio crashed-and-burned this morning when the project was running in debug mode and editing the code "on the fly"... it got stuck what I presumed was an endless loop of error messages. The error message was something about "can't debug this project because it is not the current project, or something... So I just killed it off with process explorer. It started up again fine, and even offered to recover the "lost" file, an offer which I accepted. using System; using System.Windows.Forms; namespace ShowFormOnTop { static class Program { [STAThread] static void Main() { Application.EnableVisualStyles(); Application.SetCompatibleTextRenderingDefault(false); //Application.Run(new Form1()); Form1 frm = new Form1(); frm.ShowDialog(); } } } Background: I'm porting an existing VB6 implementation to .NET... It's a "plugin" for a "client" GIS application called MapInfo. The existing client "worked invisibly" and my instructions are "to keep the new version as close as possible to the old version", which works well enough (after years of patching); it's just written in an unsupported language, so we need to port it. About me: I'm pretty much a noob to C# and .NET generally, though I've got a bottoms wiping certificate, I have been a professional programmer for 10 years; So I sort of "know some stuff". Any insights would be most welcome... and Thank you all for taking the time to read this far. Consiseness isn't (apparently) my forte. Cheers. Keith.

Read the article
Does HTML 5 “Rich vs. Reach” a False Choice?

- by andrewbrust

The competition between the Web and proprietary rich platforms, including Windows, Mac OS, iPhone/iPad, Adobe’s Flash/AIR and Microsoft’s Silverlight, is not new. But with the emergence of HTML 5 and imminent support for it in the next release of the major Web browsers, the battle is heating up. And with the announcements made Wednesday at Google's I/O conference, it's getting kicked up yet another notch. The impact of this platform battle on companies in the media and advertising world, and the developers who serve them, is significant. The most prominent question is whether video and rich media online will shift towards pure HTML and away from plug-ins like Flash and Silverlight. In fact, certain features in HTML 5 make it suitable for development for line of business applications as well, further threatening those plug-in technologies. So what's the deal? Is this real or hype? To answer that question, I've done my own research into HTML 5's features and talked to several media-focused, New York area developers to get their opinions. I present my findings to you in this post. Before bearing down into HTML 5 specifics and practitioners’ quotes, let's set the context. To understand what HTML 5 can do, take a look at this video of Sports Illustrated’s HTML 5 prototype. This should start to get you bought into the idea that HTML 5 could be a game-changer. Next, if you happen to have installed the beta version of Google's Chrome 5 browser, take a look at the page linked to below, and in that page, click on any of the game thumbnails to see what's possible, without a plug-in, in this brave new world. (Note, although the instructions for each game tell you to press the A key to start, press the Z key instead.). Here's the link: http://www.kesiev.com/akihabara As an adjunct to what's enabled by HTML 5, consider the various transforms that are part of CSS 3. If you're running Safari as your browser, the following link will showcase this live; if not, you'll see a bitmap that will give you an idea of what's possible: http://webkit.org/blog/386/3d-transforms Are you starting to get the picture (literally)? What has up until now required browser plug-ins and other patches to HTML, most typically Flash, will soon be renderable, natively, in all major browsers. Moreover, it's looking likely that developers will be able to deliver such content and experiences in these browsers using one base of markup and script code (using straight JavaScript and/or jQuery), without resorting to browser-specific code and workarounds. If you're skeptical of this, I wouldn't blame you, especially with respect to Microsoft's Internet Explorer. However, i can tell you with confidence that even Microsoft is dedicated to full-on HTML 5 support in version 9 of that browser, which is currently under development. So what’s new in HTML 5, specifically, that makes sites like this possible? The specification documents go into deep detail, and there’s no sense in rehashing them here, but a summary is probably in order. Here is a non-authoritative, but useful, list of the major new feature areas in HTML 5: 2D drawing capabilities and 3D transforms. 2D drawing instructions can be embedded statically into a Web page; application interactivity and animation can be achieved through script. As mentioned above, 3D transforms are technically part of version 3 of the CSS (Cascading Style Sheets) spec, rather than HTML 5, but they can nonetheless be thought of as part of the bundle. They allow for rendering of 3D images and animations that, together with 2D drawing, make HTML-based games much more feasible than they are presently, as the links above demonstrate. Embedded audio and video. A media player can appear directly in a rendered Web page, using HTML markup and no plug-ins. Alternately, player controls can be hidden and the content can play automatically. Major enhancements to form-based input. This includes such things as specification of required fields, embedding of text “hints” into a control, limiting valid input on a field to dates, email addresses or a list of values. There’s more to this, but the gist is that line-of-business applications, with complicated input and data validation, are supported directly Offline caching, local storage and client-side SQL database. These facilities allow Web applications to function more like native apps, even if no internet connection is available. User-defined data. Data (or metadata – data about data) can easily be embedded statically and/or retrieved and updated with Javascript code. This avoids having to embed that data in a separate file, or within script code. Taken together, these features position HTML to compete with, and perhaps overtake, Adobe’s Flash/AIR (and Microsoft’s Silverlight) as a viable Web platform for media, RIAs (rich internet applications – apps that function more like desktop software than Web sites) and interactive Web content, including games. What do players in the media world think about this? From the embedded video above, we know what Sports Illustrated (and, therefore, Time Warner) think. Hulu, the major Internet site for broadcast TV content, is on record as saying HTML5 video does not pass muster with them, at least not yet. YouTube, on the other hand, already has an experimental HTML 5-based version of their site. TechCrunch has reported that NetFlix is flirting with HTML 5 too, especially as it pertains to embedded browsers in TV-based devices. And the New York Times’ Web site now embeds some video clips without resorting to Flash. They have to – otherwise iPhone, iPod Touch and iPad users couldn’t see them in the Mobile Safari browser. What do media-focused developers think about all this? I talked to several to get their opinions. Michael Pinto is CEO and Founder of Very Memorable Design whose primary focus has been to help marketing directors get traction online. The firm’s client roster includes the likes Time, Inc., Scholastic and PBS. Pinto predicts that “More and more microsites that were done entirely in Flash will be done more and more using jQuery. I can also see slideshows and video now being done without Flash. However if you needed to create a game or highly interactive activity Flash would still be the way to go for the web.” A dissenting view comes from Jesse Erlbaum, CEO of The Erlbaum Group, LLC, which serves numerous clients in the magazine publishing sector. When I asked Erlbaum whether he thought HTML 5 and jQuery/JavaScript would steal significant market share from Flash, he responded “Not at all! In particular, not for media and advertising customers! These sectors are not generally in the business of making highly functional applications, which is the one place where HTML5/jQuery/etc really shines.” Ironically, Pinto’s firm is a heavy user of Flash for its projects and Erlbaum’s develops atop the “LAMP” (Linux, Apache, MySQL and PHP/Perl) stack. For whatever reason, each firm seems to see the other’s toolset as a more viable choice. But both agree that the developer tool story around HTML 5 is deficient. Pinto explains “What’s lost with [HTML 5 and Javascript] techniques is that there isn’t a single widely favored easy-to-use tool of choice for authoring. So with Flash you can get up and running right away and not worry about what is different from one browser to the next.“ Erlbaum agrees, saying: “HTML5/Javascript lacks a sophisticated integrated development environment (IDE) which is an essential part of Flash. If what someone is trying to make is primarily animation, it's a waste of time…to do this in Javascript. It can be done much more easily in Flash, and with greater cross-browser compatibility and consistency due to the ubiquity of Flash.” Adobe (maker of Flash since its 2005 acquisition of Macromedia) likely agrees. And for better or worse, they’ve decided to address this shortcoming of HTML 5, even at risk of diminishing their Flash platfrom. Yesterday Adobe announced that their hugely popular Deamweaver Web design authoring tool would directly support HTML 5 and CSS 3 development. In fact, the Adobe Dreamweaver CS5 HTML5 Pack is downloadable now from Adobe Labs. Maybe Adobe is bowing to pressure from ardent Web professionals like Scott Kellum, Lead Designer at Channel V Media, a digital and offline branding firm, serving the media and marketing sectors, among others. Kellum told me that HTML 5 “…will definitely move people away from Flash. It has many of the same functionalities with faster load times and better accessibility. HTML5 will help Flash as well: with the new caching methods you can now even run Flash apps offline.” Although all three Web developers I interviewed would agree that Flash is still required for more sophisticated applications, Kellum seems to have put his finger on why HTML 5 may nonetheless dominate. In his view, much of the Web development out there has little need for high-end capabilities: “Most people want to add a little punch to a navigation bar or some video and now you can get the biggest bang for your buck with HTML5, CSS3 and Javascript.” I’ve already mentioned that Google’s ongoing I/O conference, at the Moscone West center in San Francisco, is driving the HTML 5 news cycle, big time. And Google made many announcements of their own, including the open sourcing of their VP8 video codec, new enterprise-oriented capabilities for its App Engine cloud offering, and the creation of the Chrome Web Store, which the company says will make it easier to find and “install” Web applications, in a fashion similar to the way users procure native apps on various mobile platforms. HTML 5 looks to be disruptive, especially to the media world. And even if the technology ends up disappointing, the chatter around it alone is causing big changes in the technology world. If the richness it promises delivers, then magazine publishers and non-text digital advertisers may indeed have a platform for creating compelling content that loads quickly, is standards-based and will render identically in (the newest versions of) all major Web browsers. Can this development in the digital arena save the titans of the print world? I can’t predict, but it’s going to be fun to watch, and the competitive innovation from all players in both industries will likely be immense.

Read the article
Guest Post: Using IronRuby and .NET to produce the ‘Hello World of WPF’

- by Eric Nelson

[You might want to also read other GuestPosts on my blog – or contribute one?] On the 26th and 27th of March (2010) myself and Edd Morgan of Microsoft will be popping along to the Scottish Ruby Conference. I dabble with Ruby and I am a huge fan whilst Edd is a “proper Ruby developer”. Hence I asked Edd if he was interested in creating a guest post or two for my blog on IronRuby. This is the second of those posts. If you should stumble across this post and happen to be attending the Scottish Ruby Conference, then please do keep a look out for myself and Edd. We would both love to chat about all things Ruby and IronRuby. And… we should have (if Amazon is kind) a few books on IronRuby with us at the conference which will need to find a good home. This is me and Edd and … the book: Order on Amazon: http://bit.ly/ironrubyunleashed Using IronRuby and .NET to produce the ‘Hello World of WPF’ In my previous post I introduced, to a minor extent, IronRuby. I expanded a little on the basics of by getting a Rails app up-and-running on this .NET implementation of the Ruby language — but there wasn't much to it! So now I would like to go from simply running a pre-existing project under IronRuby to developing a whole new application demonstrating the seamless interoperability between IronRuby and .NET. In particular, we'll be using WPF (Windows Presentation Foundation) — the component of the .NET Framework stack used to create rich media and graphical interfaces. Foundations of WPF To reiterate, WPF is the engine in the .NET Framework responsible for rendering rich user interfaces and other media. It's not the only collection of libraries in the framework with the power to do this — Windows Forms does the trick, too — but it is the most powerful and flexible. Put simply, WPF really excels when you need to employ eye candy. It's all about creating impact. Whether you're presenting a document, video, a data entry form, some kind of data visualisation (which I am most hopeful for, especially in terms of IronRuby - more on that later) or chaining all of the above with some flashy animations, you're likely to find that WPF gives you the most power when developing any of these for a Windows target. Let's demonstrate this with an example. I give you what I like to consider the 'hello, world' of WPF applications: the analogue clock. Today, over my lunch break, I created a WPF-based analogue clock using IronRuby... Any normal person would have just looked at their watch. - Twitter The Sample Application: Click here to see this sample in full on GitHub. Using Windows Presentation Foundation from IronRuby to create a Clock class Invoking the Clock class Gives you The above is by no means perfect (it was a lunch break), but I think it does the job of illustrating IronRuby's interoperability with WPF using a familiar data visualisation. I'm sure you'll want to dissect the code yourself, but allow me to step through the important bits. (By the way, feel free to run this through ir first to see what actually happens). Now we're using IronRuby - unlike my previous post where we took pure Ruby code and ran it through ir, the IronRuby interpreter, to demonstrate compatibility. The main thing of note is the very distinct parallels between .NET namespaces and Ruby modules, .NET classes and Ruby classes. I guess there's not much to say about it other than at this point, you may as well be working with a purely Ruby graphics-drawing library. You're instantiating .NET objects, but you're doing it with the standard Ruby .new method you know from Ruby as Object#new — although, the root object of all your IronRuby objects isn't actually Object, it's System.Object. You're calling methods on these objects (and classes, for example in the call to System.Windows.Controls.Canvas.SetZIndex()) using the underscored, lowercase convention established for the Ruby language. The integration is so seamless. The fact that you're using a dynamic language on top of .NET's CLR is completely abstracted from you, allowing you to just build your software. A Brief Note on Events Events are a big part of developing client applications in .NET as well as under every other environment I can think of. In case you aren't aware, event-driven programming is essentially the practice of telling your code to call a particular method, or other chunk of code (a delegate) when something happens at an unpredictable time. You can never predict when a user is going to click a button, move their mouse or perform any other kind of input, so the advent of the GUI is what necessitated event-driven programming. This is where one of my favourite aspects of the Ruby language, blocks, can really help us. In traditional C#, for instance, you may subscribe to an event (assign a block of code to execute when an event occurs) in one of two ways: by passing a reference to a named method, or by providing an anonymous code block. You'd be right for seeing the parallel here with Ruby's concept of blocks, Procs and lambdas. As demonstrated at the very end of this rather basic script, we are using .NET's System.Timers.Timer to (attempt to) update the clock every second (I know it's probably not the best way of doing this, but for example's sake). Note: Diverting a little from what I said above, the ticking of a clock is very predictable, yet we still use the event our Timer throws to do this updating as one of many ways to perform that task outside of the main thread. You'll see that all that's needed to assign a block of code to be triggered on an event is to provide that block to the method of the name of the event as it is known to the CLR. This drawback to this is that it only allows the delegation of one code block to each event. You may use the add method to subscribe multiple handlers to that event - pushing that to the end of a queue. Like so: def tick puts "tick tock" end timer.elapsed.add method(:tick) timer.elapsed.add proc { puts "tick tock" } tick_handler = lambda { puts "tick tock" } timer.elapsed.add(tick_handler) The ability to just provide a block of code as an event handler helps IronRuby towards that very important term I keep throwing around; low ceremony. Anonymous methods are, of course, available in other more conventional .NET languages such as C# and VB but, as usual, feel ever so much more elegant and natural in IronRuby. Note: Whether it's a named method or an anonymous chunk o' code, the block you delegate to the handling of an event can take arguments - commonly, a sender object and some args. Another Brief Note on Verbosity Personally, I don't mind verbose chaining of references in my code as long as it doesn't interfere with performance - as evidenced in the example above. While I love clean code, there's a certain feeling of safety that comes with the terse explicitness of long-winded addressing and the describing of objects as opposed to ambiguity (not unlike this sentence). However, when working with IronRuby, even I grow tired of typing System::Whatever::Something. Some people enjoy simply assuming namespaces and forgetting about them, regardless of the language they're using. Don't worry, IronRuby has you covered. It is completely possible to, with a call to include, bring the contents of a .NET-converted module into context of your IronRuby code - just as you would if you wanted to bring in an 'organic' Ruby module. To refactor the style of the above example, I could place the following at the top of my Clock class: class Clock include System::Windows::Shape include System::Windows::Media include System::Windows::Threading # and so on... And by doing so, reduce calls to System::Windows::Shapes::Ellipse.new to simply Ellipse.new or references to System::Windows::Threading::DispatcherPriority.Render to a friendlier DispatcherPriority.Render. Conclusion I hope by now you can understand better how IronRuby interoperates with .NET and how you can harness the power of the .NET framework with the dynamic nature and elegant idioms of the Ruby language. The manner and parlance of Ruby that makes it a joy to work with sets of data is, of course, present in IronRuby — couple that with WPF's capability to produce great graphics quickly and easily, and I hope you can visualise the possibilities of data visualisation using these two things. Using IronRuby and WPF together to create visual representations of data and infographics is very exciting to me. Although today, with this project, we're only presenting one simple piece of information - the time - the potential is much grander. My day-to-day job is centred around software development and UI design, specifically in the realm of healthcare, and if you were to pay a visit to our office you would behold, directly above my desk, a large plasma TV with a constantly rotating, animated slideshow of charts and infographics to help members of our team do their jobs. It's an app powered by WPF which never fails to spark some conversation with visitors whose gaze has been hooked. If only it was written in IronRuby, the pleasantly low ceremony and reduced pre-processing time for my brain would have helped greatly. Edd Morgan blog Related Links: Getting PhP and Ruby working on Windows Azure and SQL Azure

Read the article
The Java Specialist: An Interview with Java Champion Heinz Kabutz

- by Janice J. Heiss

Dr. Heinz Kabutz is well known for his Java Specialists’ Newsletter, initiated in November 2000, where he displays his acute grasp of the intricacies of the Java platform for an estimated 70,000 readers; for his work as a consultant; and for his workshops and trainings at his home on the Island of Crete where he has lived since 2006 -- where he is known to curl up on the beach with his laptop to hack away, in between dips in the Mediterranean. Kabutz was born of German parents and raised in Cape Town, South Africa, where he developed a love of programming in junior high school through his explorations on a ZX Spectrum computer. He received a B.S. from the University of Cape Town, and at 25, a Ph.D., both in computer science. He will be leading a two-hour hands-on lab session, HOL6500 – “Finding and Solving Java Deadlocks,” at this year’s JavaOne that will explore what causes deadlocks and how to solve them. Q: Tell us about your JavaOne plans.A: I am arriving on Sunday evening and have just one hands-on-lab to do on Monday morning. This is the first time that a non-Oracle team is doing a HOL at JavaOne under Oracle's stewardship and we are all a bit nervous about how it will turn out. Oracle has been immensely helpful in getting us set up. I have a great team helping me: Kirk Pepperdine, Dario Laverde, Benjamin Evans and Martijn Verburg from jClarity, Nathan Reynolds from Oracle, Henri Tremblay of OCTO Technology and Jeff Genender of Savoir Technologies. Monday will be hard work, but after that, I will hopefully get to network with fellow Java experts, attend interesting sessions and just enjoy San Francisco. Oh, and my kids have already given me a shopping list of things to get, like a GoPro Hero 2 dive housing for shooting those nice videos of Crete. (That's me at the beginning diving down.) Q: What sessions are you attending that we should know about?A: Sometimes the most unusual sessions are the best. I avoid the "big names". They often are spread too thin with all their sessions, which makes it difficult for them to deliver what I would consider deep content. I also avoid entertainers who might be good at presenting but who do not say that much.In 2010, I attended a session by Vladimir Yaroslavskiy where he talked about sorting. Although he struggled to speak English, what he had to say was spectacular. There was hardly anybody in the room, having not heard of Vladimir before. To me that was the highlight of 2010. Funnily enough, he was supposed to speak with Joshua Bloch, but if you remember, Google cancelled. If Bloch has been there, the room would have been packed to capacity.Q: Give us an update on the Java Specialists’ Newsletter.A: The Java Specialists' Newsletter continues being read by an elite audience around the world. The apostrophe in the name is significant. It is a newsletter for Java specialists. When I started it twelve years ago, I was trying to find non-obvious things in Java to write about. Things that would be interesting to an advanced audience.As an April Fool's joke, I told my readers in Issue 44 that subscribing would remain free, but that they would have to pay US$5 to US$7 depending on their geographical location. I received quite a few angry emails from that one. I would have not earned that much from unsubscriptions. Most readers stay for a very long time.After Oracle bought Sun, the Java community held its breath for about two years whilst Oracle was figuring out what to do with Java. For a while, we were quite concerned that there was not much progress shown by Oracle. My newsletter still continued, but it was quite difficult finding new things to write about. We have probably about 70,000 readers, which is quite a small number for a Java publication. However, our readers are the top in the Java industry. So I don't mind having "only" 70000 readers, as long as they are the top 0.7%.Java concurrency is a very important topic that programmers think they should know about, but often neglect to fully understand. I continued writing about that and made some interesting discoveries. For example, in Issue 165, I showed how we can get thread starvation with the ReadWriteLock. This was a bug in Java 5, which was corrected in Java 6, but perhaps a bit too much. Whereas we could get starvation of writers in Java 5, in Java 6 we could now get starvation of readers. All of these interesting findings make their way into my courseware to help companies avoid these pitfalls.Another interesting discovery was how polymorphism works in the Server HotSpot compiler in Issue 157 and Issue 158. HotSpot can inline methods from interfaces that have only one implementation class in the JVM. When a new subclass is instantiated and called for the first time, the JVM will undo the previous optimization and re-optimize differently.Here is a little memory puzzle for your readers: public class JavaMemoryPuzzle { private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6); public void f() { { byte[] data = new byte[dataSize]; } byte[] data2 = new byte[dataSize]; } public static void main(String[] args) { JavaMemoryPuzzle jmp = new JavaMemoryPuzzle(); jmp.f(); }}When you run this you will always get an OutOfMemoryError, even though the local variable data is no longer visible outside of the code block.So here comes the puzzle, that I'd like you to ponder a bit. If you very politely ask the VM to release memory, then you don't get an OutOfMemoryError: public class JavaMemoryPuzzlePolite { private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6); public void f() { { byte[] data = new byte[dataSize]; } for(int i=0; i<10; i++) { System.out.println("Please be so kind and release memory"); } byte[] data2 = new byte[dataSize]; } public static void main(String[] args) { JavaMemoryPuzzlePolite jmp = new JavaMemoryPuzzlePolite(); jmp.f(); System.out.println("No OutOfMemoryError"); }}Why does this work? When I published this in my newsletter, I received over 400 emails from excited readers around the world, most of whom sent me the wrong explanation. After the 300th wrong answer, my replies became unfortunately a bit curt. Have a look at Issue 174 for a detailed explanation, but before you do, put on your thinking caps and try to figure it out yourself. Q: What do you think Java developers should know that they currently do not know?A: They should definitely get to know more about concurrency. It is a tough subject that most programmers try to avoid. Unfortunately we do come in contact with it. And when we do, we need to know how to protect ourselves and how to solve tricky system errors.Knowing your IDE is also useful. Most IDEs have a ton of shortcuts, which can make you a lot more productive in moving code around. Another thing that is useful is being able to read GC logs. Kirk Pepperdine has a great talk at JavaOne that I can recommend if you want to learn more. It's this: CON5405 – “Are Your Garbage Collection Logs Speaking to You?” Q: What are you looking forward to in Java 8?A: I'm quite excited about lambdas, though I must confess that I have not studied them in detail yet. Maurice Naftalin's Lambda FAQ is quite a good start to document what you can do with them. I'm looking forward to finding all the interesting bugs that we will now get due to lambdas obscuring what is really going on underneath, just like we had with generics.I am quite impressed with what the team at Oracle did with OpenJDK's performance. A lot of the benchmarks now run faster.Hopefully Java 8 will come with JSR 310, the Date and Time API. It still boggles my mind that such an important API has been left out in the cold for so long.What I am not looking forward to is losing perm space. Even though some systems run out of perm space, at least the problem is contained and they usually manage to work around it. In most cases, this is due to a memory leak in that region of memory. Once they bundle perm space with the old generation, I predict that memory leaks in perm space will be harder to find. More contracts for us, but also more pain for our customers. Originally published on blogs.oracle.com/javaone.

Read the article
The Java Specialist: An Interview with Java Champion Heinz Kabutz

- by Janice J. Heiss

Dr. Heinz Kabutz is well known for his Java Specialists’ Newsletter, initiated in November 2000, where he displays his acute grasp of the intricacies of the Java platform for an estimated 70,000 readers; for his work as a consultant; and for his workshops and trainings at his home on the Island of Crete where he has lived since 2006 -- where he is known to curl up on the beach with his laptop to hack away, in between dips in the Mediterranean. Kabutz was born of German parents and raised in Cape Town, South Africa, where he developed a love of programming in junior high school through his explorations on a ZX Spectrum computer. He received a B.S. from the University of Cape Town, and at 25, a Ph.D., both in computer science. He will be leading a two-hour hands-on lab session, HOL6500 – “Finding and Solving Java Deadlocks,” at this year’s JavaOne that will explore what causes deadlocks and how to solve them. Q: Tell us about your JavaOne plans.A: I am arriving on Sunday evening and have just one hands-on-lab to do on Monday morning. This is the first time that a non-Oracle team is doing a HOL at JavaOne under Oracle's stewardship and we are all a bit nervous about how it will turn out. Oracle has been immensely helpful in getting us set up. I have a great team helping me: Kirk Pepperdine, Dario Laverde, Benjamin Evans and Martijn Verburg from jClarity, Nathan Reynolds from Oracle, Henri Tremblay of OCTO Technology and Jeff Genender of Savoir Technologies. Monday will be hard work, but after that, I will hopefully get to network with fellow Java experts, attend interesting sessions and just enjoy San Francisco. Oh, and my kids have already given me a shopping list of things to get, like a GoPro Hero 2 dive housing for shooting those nice videos of Crete. (That's me at the beginning diving down.) Q: What sessions are you attending that we should know about?A: Sometimes the most unusual sessions are the best. I avoid the "big names". They often are spread too thin with all their sessions, which makes it difficult for them to deliver what I would consider deep content. I also avoid entertainers who might be good at presenting but who do not say that much.In 2010, I attended a session by Vladimir Yaroslavskiy where he talked about sorting. Although he struggled to speak English, what he had to say was spectacular. There was hardly anybody in the room, having not heard of Vladimir before. To me that was the highlight of 2010. Funnily enough, he was supposed to speak with Joshua Bloch, but if you remember, Google cancelled. If Bloch has been there, the room would have been packed to capacity.Q: Give us an update on the Java Specialists’ Newsletter.A: The Java Specialists' Newsletter continues being read by an elite audience around the world. The apostrophe in the name is significant. It is a newsletter for Java specialists. When I started it twelve years ago, I was trying to find non-obvious things in Java to write about. Things that would be interesting to an advanced audience.As an April Fool's joke, I told my readers in Issue 44 that subscribing would remain free, but that they would have to pay US$5 to US$7 depending on their geographical location. I received quite a few angry emails from that one. I would have not earned that much from unsubscriptions. Most readers stay for a very long time.After Oracle bought Sun, the Java community held its breath for about two years whilst Oracle was figuring out what to do with Java. For a while, we were quite concerned that there was not much progress shown by Oracle. My newsletter still continued, but it was quite difficult finding new things to write about. We have probably about 70,000 readers, which is quite a small number for a Java publication. However, our readers are the top in the Java industry. So I don't mind having "only" 70000 readers, as long as they are the top 0.7%.Java concurrency is a very important topic that programmers think they should know about, but often neglect to fully understand. I continued writing about that and made some interesting discoveries. For example, in Issue 165, I showed how we can get thread starvation with the ReadWriteLock. This was a bug in Java 5, which was corrected in Java 6, but perhaps a bit too much. Whereas we could get starvation of writers in Java 5, in Java 6 we could now get starvation of readers. All of these interesting findings make their way into my courseware to help companies avoid these pitfalls.Another interesting discovery was how polymorphism works in the Server HotSpot compiler in Issue 157 and Issue 158. HotSpot can inline methods from interfaces that have only one implementation class in the JVM. When a new subclass is instantiated and called for the first time, the JVM will undo the previous optimization and re-optimize differently.Here is a little memory puzzle for your readers: public class JavaMemoryPuzzle { private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6); public void f() { { byte[] data = new byte[dataSize]; } byte[] data2 = new byte[dataSize]; } public static void main(String[] args) { JavaMemoryPuzzle jmp = new JavaMemoryPuzzle(); jmp.f(); }}When you run this you will always get an OutOfMemoryError, even though the local variable data is no longer visible outside of the code block.So here comes the puzzle, that I'd like you to ponder a bit. If you very politely ask the VM to release memory, then you don't get an OutOfMemoryError: public class JavaMemoryPuzzlePolite { private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6); public void f() { { byte[] data = new byte[dataSize]; } for(int i=0; i<10; i++) { System.out.println("Please be so kind and release memory"); } byte[] data2 = new byte[dataSize]; } public static void main(String[] args) { JavaMemoryPuzzlePolite jmp = new JavaMemoryPuzzlePolite(); jmp.f(); System.out.println("No OutOfMemoryError"); }}Why does this work? When I published this in my newsletter, I received over 400 emails from excited readers around the world, most of whom sent me the wrong explanation. After the 300th wrong answer, my replies became unfortunately a bit curt. Have a look at Issue 174 for a detailed explanation, but before you do, put on your thinking caps and try to figure it out yourself. Q: What do you think Java developers should know that they currently do not know?A: They should definitely get to know more about concurrency. It is a tough subject that most programmers try to avoid. Unfortunately we do come in contact with it. And when we do, we need to know how to protect ourselves and how to solve tricky system errors.Knowing your IDE is also useful. Most IDEs have a ton of shortcuts, which can make you a lot more productive in moving code around. Another thing that is useful is being able to read GC logs. Kirk Pepperdine has a great talk at JavaOne that I can recommend if you want to learn more. It's this: CON5405 – “Are Your Garbage Collection Logs Speaking to You?” Q: What are you looking forward to in Java 8?A: I'm quite excited about lambdas, though I must confess that I have not studied them in detail yet. Maurice Naftalin's Lambda FAQ is quite a good start to document what you can do with them. I'm looking forward to finding all the interesting bugs that we will now get due to lambdas obscuring what is really going on underneath, just like we had with generics.I am quite impressed with what the team at Oracle did with OpenJDK's performance. A lot of the benchmarks now run faster.Hopefully Java 8 will come with JSR 310, the Date and Time API. It still boggles my mind that such an important API has been left out in the cold for so long.What I am not looking forward to is losing perm space. Even though some systems run out of perm space, at least the problem is contained and they usually manage to work around it. In most cases, this is due to a memory leak in that region of memory. Once they bundle perm space with the old generation, I predict that memory leaks in perm space will be harder to find. More contracts for us, but also more pain for our customers.

Read the article
Adopting Technologies for the Sake of Technologies

- by shiju

Unlike other engineering industries, the software engineering industry is really lacking maturity. The lack of maturity can see in different aspects of entire software development life cycle. I think other engineering industries are well organised and structured with common, proven engineering practices. The software engineering industry is greatly a diverse industry with different operating systems, and variety of development platforms, programming languages, frameworks and tools. Now these days, people are going behind the hypes and intellectual thoughts without understanding their core business problems and adopting technologies and practices for the sake of technologies and practices and simply becoming a “poster child” of technologies and practices. Understanding the core business problem and providing best, solid solution with a platform neutral approach, will give you more business values and ROI, instead of blindly adopting technologies and tailor-made your applications for the sake of technologies and practices. People have been simply migrating their solutions in favour of new technologies and different versions of frameworks without any business need. The “Pepsi Challenge” in the Software Development Pepsi Challenge marketing campaign of the 1980s was a popular and very interesting marketing promotion in which people taste one cup of Pepsi and another cup with Coca Cola. In the taste test, more than 50% of people were preferred Pepsi over Coca Cola. The success story behind the Pepsi was more sweetness contains in the Pepsi cola. They have simply added more sugar and more people preferred more sweet flavour. You can’t simply identify the better one after sipping one cup of cola based on the sweetness which contains. These things have been happening in the software industry for choosing development frameworks and technologies. People have been simply choosing frameworks based on the initial sugary feeling without understanding its core strengths and weakness. The sugary framework might be more harmful when you develop real-world systems. There is not any silver bullet for solving all kind of problems and frameworks and tools do have strengths and weakness. So it would be better to understand their strength and weakness. And please keep in mind that you have to develop real apps to understand the real capabilities and weakness of a framework. Evaluating a technology based on few blog posts will harm your projects and these bloggers might be lacking real-world experience with the framework. The Problem with Align a Development Practice with Tools Recently I have observed a discussion in a group where one guy asked suggestions for practicing Continuous Delivery (CD) as part of the agile based application engineering. Then the discussion quickly went to using and choosing a Continuous Integration (CI) tool and different people suggested different Continuous Integration (CI) tools for simply practicing Continuous Delivery. If you have worked with core agile engineering practices, you could clearly know that the real essence of agile is neither choosing a tool nor choosing a process. By simply choosing CI tool from a particular vendor will not ensure that you are delivering an evolving software based on customer feedback. You have to understand the real essence of a engineering practice and choose a right tool for practicing it instead of simply focus on a particular tool for a practicing an development practice. If you want to adopt a practice, you need a solid understanding on it with its real essence where tools are just helping us for better automation. Adopting New Technologies for the Sake of Technologies The another problem is that developers have been a tendency to adopt new technologies and simply migrating their existing apps to new technologies. It is okay if your existing system is having problem with a technology stack or or maintainability challenge with existing solution, and moving to new technology for solving the current problems. We have been adopting new technologies for solving new challenges like solving the scalability challenges when the application or user bases is growing unpredictably. Please keep in mind that all new technologies will become old after working with it for few years. The below Facebook status update of Janakiraman, expresses the attitude of a typical customer. For an example, Node.js is becoming a hottest buzzword in the software industry and many developers are trying to adopt Node.js for their apps. The important thing is that Node.js is a minimalist framework that does some great things for some problems, but it’s not a silver bullet. I have been also working with Node.js which is good for some problems, but really bad for choosing it for all kind of problems. By adopting new technologies for new projects is good if we could get real business values from it because newer framework would solve some existing well known problems and provide better solutions where it can incorporate good solutions for the latest challenges . But adopting a new technology for the sake of new technology is really bad idea. Another example is JavaScript is getting lot of attention so that lot of developers are developing heavy JavaScript centric web apps. First, they will adopt a client-side JavaScript MV* framework from AngularJS, Ember, Backbone etc, and develop a Single Page App(SPA) where they are repeating the mistakes we did in the past with server-side. The mistakes we did in the server-side is transforming to client-side. The problem is that people are just adopting new technologies, but not improving their solutions. I predict that many Single Page App will suck in the future. We need a hybrid approach where we should be able to leverage both server-side and client-side for developing next-generation web apps. The another problem is that if you like a particular framework, use it for all kind of apps. In the past, I know some Silverlight passionate guys were tried to use that framework for all kind of apps including larger line of business apps. And these days developers are migrating their existing Silverlight apps in favour of HTML5 buzzword. So the real question is, what is the business values we are getting from these apps when we are developing it for the sake of a particular technology instead of business need. The another problem is that our solutions consultants are trying to provide unnecessary solutions for the sake of a particular technology or for a hype. For an example, Big Data solutions are great for solving the problem of three Vs : volume, velocity and variety. But trying to put this for every application will make problems. Let’s say, there is a small web site running with limited budget and saying that we need a recommendation engine for the web site with a Hadoop based solution with a 16 node cluster, would be really horrible. If we really need a Hadoop based solution, got for it, but trying to put this for all application would be a big disaster. It would be great if could understand the core business problems first, and later choose a right framework for providing solutions for the actual business problem, instead of trying to provide so many solutions. The Problem with Tied Up to a Platform Vendor Some organizations and teams are tied up with a particular platform vendor where they don’t want to use any product other than their preferred or existing platform vendor. They will accept any product provided by the vendor regardless of its capability. This will lets you some benefits regards with integration and collaboration of different products provided by the same vendor, but it will loose your opportunity to provide better solution for your business problems. For a real world sample scenario, lot of companies have been using SAP for their ERP solutions. When they are thinking about mobility or thinking about developing hybrid mobile apps, they can easily find out a framework from SAP. SAP provides a framework for HTML 5 based UI development named SAPUI5. If you are simply adopting that framework only based for the preference of existing platform vendor, you might be loose different opportunities for providing better solution. Initially you might enjoy the sugary feeling provided by the platform vendor, but you have to think about developing apps which should be capable for solving future challenges. I am not saying that any framework is not good and I believe that all frameworks are good over another one for solving at least one problem. My point is that we should not tied up with any specific platform vendor unless your organization is having resource availability problems. Being Polyglot for Providing Right Solutions The modern software engineering industry is greatly diverse with different tools and platforms. Lot of open source frameworks and new programming languages have been releasing to the developer community, where choosing the right platform without any biased opinion, is really a difficult task. But it would really great if we could develop an attitude with platform neutral mindset and being a polyglot developer for providing better solutions based on the actual business problems. IMHO, we should learn a new programming language and a new framework every year. This will improve the quality of our developer capabilities and also improve the quality of our primary programming language skills. Being polyglot for individual developers and organizational teams will give you greater opportunity to your developer experience and also for your applications. Organizations can analyse their business problem without tied with any technology and later they can provide solutions by choosing different platform and tools. Summary In this blog post, what I was trying to say that we should not tied up or biased with any development platform, technology, vendor or programming language and we should not adopt technologies and practices for the sake of technologies. If we are adopting a technology or a practice for the sake of it, we are simply becoming a “poster child” of the technology and practice. We should not become a poster child of other people’s intellectual thoughts and theories, instead of it we should become solutions developers and solutions consultants where we should be able to provide better solutions for the business problems. Being a polyglot developer is a good idea for improving your developer skills which lets you provide better solutions for the business problems. The most important thing is that we should become platform neutral developers where our passion should be for providing brilliant solutions. It would be great if we could provide minimalist, pragmatic business solutions. You can follow me on Twitter @shijucv

Read the article
BizTalk: History of one project architecture

- by Leonid Ganeline

"In the beginning God made heaven and earth. Then he started to integrate." At the very start was the requirement: integrate two working systems. Small digging up: It was one system. It was good but IT guys want to change it to the new one, much better, chipper, more flexible, and more progressive in technologies, more suitable for the future, for the faster world and hungry competitors. One thing. One small, little thing. We cannot turn off the old system (call it A, because it was the first), turn on the new one (call it B, because it is second but not the last one). The A has a hundreds users all across a country, they must study B. A still has a lot nice custom features, home-made features that cannot disappear. These features have to be moved to the B and it is a long process, months and months of redevelopment. So, the decision was simple. Let’s move not jump, let’s both systems working side-by-side several months. In this time we could teach the users and move all custom A’s special functionality to B. That automatically means both systems should work side-by-side all these months and use the same data. Data in A and B must be in sync. That’s how the integration projects get birth. Moreover, the specific of the user tasks requires the both systems must be in sync in real-time. Nightly synchronization is not working, absolutely. First draft The first draft seems simple. Both systems keep data in SQL databases. When data changes, the Create, Update, Delete operations performed on the data, and the sync process could be started. The obvious decision is to use triggers on tables. When we are talking about data, we are talking about several entities. For example, Orders and Items [in Orders]. We decided to use the BizTalk Server to synchronize systems. Why it was chosen is another story. Second draft Let’s take an example how it works in more details. 1. User creates a new entity in the A system. This fires an insert trigger on the entity table. Trigger has to pass the message “Entity created”. This message includes all attributes of the new entity, but I focused on the Id of this entity in the A system. Notation for this message is id.A. System A sends id.A to the BizTalk Server. 2. BizTalk transforms id.A to the format of the system B. This is easiest part and I will not focus on this kind of transformations in the following text. The message on the picture is still id.A but it is in slightly different format, that’s why it is changing in color. BizTalk sends id.A to the system B. 3. The system B creates the entity on its side. But it uses different id-s for entities, these id-s are id.B. System B saves id.A+id.B. System B sends the message id.A+id.B back to the BizTalk. 4. BizTalk sends the message id.A+id.B to the system A. 5. System A saves id.A+id.B. Why both id-s should be saved on both systems? It was one of the next requirements. Users of both systems have to know the systems are in sync or not in sync. Users working with the entity on the system A can see the id.B and use it to switch to the system B and work there with the copy of the same entity. The decision was to store the pairs of entity id-s on both sides. If there is only one id, the entities are not in sync yet (for the Create operation). Third draft Next problem was the reliability of the synchronization. The synchronizing process can be interrupted on each step, when message goes through the wires. It can be communication problem, timeout, temporary shutdown one of the systems, the second system cannot be synchronized by some internal reason. There were several potential problems that prevented from enclosing the whole synchronization process in one transaction. Decision was to restart the whole sync process if it was not finished (in case of the error). For this purpose was created an additional service. Let’s call it the Resync service. We still keep the id pairs in both systems, but only for the fast access not for the synchronization process. For the synchronizing these id-s now are kept in one main place, in the Resync service database. The Resync service keeps record as: · Id.A · Id.B · Entity.Type · Operation (Create, Update, Delete) · IsSyncStarted (true/false) · IsSyncFinished (true/false0 The example now looks like: 1. System A creates id.A. id.A is saved on the A. Id.A is sent to the BizTalk. 2. BizTalk sends id.A to the Resync and to the B. id.A is saved on the Resync. 3. System B creates id.B. id.A+id.B are saved on the B. id.A+id.B are sent to the BizTalk. 4. BizTalk sends id.A+id.B to the Resync and to the A. id.A+id.B are saved on the Resync. 5. id.A+id.B are saved on the B. Resync changes the IsSyncStarted and IsSyncFinished flags accordingly. The Resync service implements three main methods: · Save (id.A, Entity.Type, Operation) · Save (id.A, id.B, Entity.Type, Operation) · Resync () Two Save() are used to save id-s to the service storage. See in the above example, in 2 and 4 steps. What about the Resync()? It is the method that finishes the interrupted synchronization processes. If Save() is started by the trigger event, the Resync() is working as an independent process. It periodically scans the Resync storage to find out “unfinished” records. Then it restarts the synchronization processes. It tries to synchronize them several times then gives up. One more thing, both systems A and B must tolerate duplicates of one synchronizing process. Say on the step 3 the system B was not able to send id.A+id.B back. The Resync service must restart the synchronization process that will send the id.A to B second time. In this case system B must just send back again also created id.A+id.B pair without errors. That means “tolerate duplicates”. Fourth draft Next draft was created only because of the aesthetics. As it always happens, aesthetics gave significant performance gain to the whole system. First was the stupid question. Why do we need this additional service with special database? Can we just master the BizTalk to do something like this Resync() does? So the Resync orchestration is doing the same thing as the Resync service. It is started by the Id.A and finished by the id.A+id.B message. The first works as a Start message, the second works as a Finish message. Here is a diagram the whole process without errors. It is pretty straightforward. The Resync orchestration is waiting for the Finish message specific period of time then resubmits the Id.A message. It resubmits the Id.A message specific number of times then gives up and gets suspended. It can be resubmitted then it starts the whole process again: waiting [, resubmitting [, get suspended]], finishing. Tuning up The Resync orchestration resubmits the id.A message with special “Resubmitted” flag. The subscription filter on the Resync orchestration includes predicate as (Resubmit_Flag != “Resubmitted”). That means only the first Sync orchestration starts the Resync orchestration. Other Sync orchestration instantiated by the resubmitting can finish this Resync orchestration but cannot start another instance of the Resync Here is a diagram where system B was inaccessible for some period of time. The Resync orchestration resubmitted the id.A two times. Then system B got the response the id.A+id.B and this finished the Resync service execution. What is interesting about this, there were submitted several identical id.A messages and only one id.A+id.B message. Because of this, the system B and the Resync must tolerate the duplicate messages. We also told about this requirement for the system B. Now the same requirement is for the Resunc. Let’s assume the system B was very slow in the first response and the Resync service had time to resubmit two id.A messages. System B responded not, as it was in previous case, with one id.A+id.B but with two id.A+id.B messages. First of them finished the Resync execution for the id.A. What about the second id.A+id.B? Where it goes? So, we have to add one more internal requirement. The whole solution must tolerate many identical id.A+id.B messages. It is easy task with the BizTalk. I added the “SinkExtraMessages” subscriber (orchestration with one receive shape), that just get these messages and do nothing. Real design Real architecture is much more complex and interesting. In reality each system can submit several id.A almost simultaneously and completely unordered. There are not only the “Create entity” operation but the Update and Delete operations. And these operations relate each other. Say the Update operation after Delete means not the same as Update after Create. In reality there are entities related each other. Say the Order and Order Items. Change on one of it could start the series of the operations on another. Moreover, the system internals are the “black boxes” and we cannot predict the exact content and order of the operation series. It worth to say, I had to spend a time to manage the zombie message problems. The zombies are still here, but this is not a problem now. And this is another story. What is interesting in the last design? One orchestration works to help another to be more reliable. Why two orchestration design is more reliable, isn’t it something strange? The Synch orchestration takes all the message exchange between systems, here is the area where most of the errors could happen. The Resync orchestration sends and receives messages only within the BizTalk server. Is there another design? Sure. All Resync functionality could be implemented inside the Sync orchestration. Hey guys, some other ideas?

Read the article
OK - What now? How do we become a Social Business?

- by Michael Snow

We hope that those of you that attended yesterday's Webcast with Brian Solis enjoyed Brian's discussion with Christian Finn for our last Webcast of the season for the Oracle Social Business Thought Leaders Series. For those of you that may have missed the webcast or were stuck at a company holiday party - you'll be glad to hear that the webcast will be available On-Demand starting later today (12/14/12). And any of you who'd like to listen to a quick but informative podcast with Brian - can listen to that here. Some of you may still be left with questions about how to get from point A to point B and even more confused than when you started thinking about this new world of Digital Darwinism. The post below, grabbed from an abundance of great thought leadership prose on Brian's blog may help you frame the path you need to start walking sooner versus later to stay off of the endangered species list. As you explore your path forward, please keep Oracle in mind - we do offer a wide range of solutions to help your organization 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} optimize the engagement for your customers, employees and partners. The Path from a Social Brand to a Social Business Brian Solis Originally posted May 2, 2012 I’ve been a long-time supporter of MediaTemple’s (MT)Residence program along with Gary Vaynerchuk, Neil Patel, and many others whom I respect. I wanted to share my “7 questions to answer to become a social business” with you here.. Social Media is pervasive and is becoming the new normal in corporate marketing. Brands who get this right are starting to build their own media networks rich with customer connections numbering in the millions. Right now, Coca-Cola has over 34 million fans on Facebook, but they’re hardly alone. Disney follows just behind with 29 million fans, Starbucks boasts 25 million, and Oreo, Red Bull, and Converse play host to over 20 million fans. If we were to look at other networks such as Twitter and Youtube, we would see a recurring theme. People are connecting en masse with the businesses they support and new media represents the ability to cultivate consumer relationships in ways not possible with traditional earned or paid media. Sounds great right? This might sound abrupt, but the truth is that we’re hardly realizing the potential of what lies before us. Everything begins with understanding not just how other brands are marketing themselves in social media, but also seeing what they’re not doing and envisioning what’s possible. We’re already approaching the first of many crossroads that new media will present. Do we take the path of a social brand or that of a social business? What’s the difference? A social brand is just that, a business that is remodeling or retrofitting its existing marketing practices to new media. A social business is something altogether different as it embraces introspection and extrospection to reevaluate internal and external processes, systems, and opportunities to transform into a living, breathing entity that adapts to market conditions and opportunities. It’s a tough decision to make right now especially at a time when all we read about is how much success many businesses are finding without having to answer this very question. With all of the newfound success in social networks, the truth is that we’re only just beginning to learn what’s possible and that’s where you come in. When compared to the investment in time and resources across the board, social media represents only a small part of the mix. But with your help, that’s all about to change. The CMO Survey, an organization that disseminates the opinions of top marketers in order to predict the future of markets, recently published a report that gave credence to the fact that social media is taking off. One of the most profound takeaways from the report was this gem; “The “like button” [in Facebook] packs more customer-acquisition punch than other demand-generating activities.” With insights like this, it’s easy to see why the race to social is becoming heated. The report also highlighted exactly where social fits in the marketing mix today and as you can see, despite all of the hype, it’s not a dominant focus yet. As of August 2011, the percentage of overall marketing budgets dedicated to social media hovered at around 7%. However, in 2012 the investment in social media will climb to 10%. And, in five years, social media is expected to represent almost 18% of the total marketing budget. Think about that for a moment. In 2016, social media will only represent 18%? Queue the sound of a record scratching here. With businesses finding success in social networks, why are businesses failing to realize the true opportunity brought forth by the ability to listen to, connect with, and engage with customers? While there’s value in earning views, driving traffic, and building connections through the 3F’s (friends, fans and followers), success isn’t just defined simply by what really amounts to low-hanging fruit. The truth is that businesses cannot measure what it is they don’t know to value. As a result, innovation in new engagement initiatives is stifled because we’re applying dated or inflexible frameworks to new paradigms. Social media isn’t owned by marketing, but instead the entire organization. This changes everything and makes your role so much more important. It’s up to you to learn how to think outside of the proverbial social media box to see what others don’t, the ability to improve customers experiences through the evolution of a social brand into a social business. Doing so will translate customer insights from what they do and don’t share in social networks into better products, services, and processes. See, customers want something more from their favorite businesses than creative campaigns, viral content, and everyday dialogue in social networks. Customers want to be heard and they want to know that you’re listening. How businesses use social media must remind them that they’re more than just an audience, consumer, or a conduit to “trigger” a desired social effect. Herein lies both the challenge and opportunity of social media. It’s bigger than marketing. It’s also bigger than customer service. It’s about building relationships with customers that improve experiences and more importantly, teaches businesses how to re-imagine products and internal processes to better adapt to potential crises and seize new opportunities. When it comes down to it, Twitter, Facebook, Youtube, Foursquare, are all channels for listening, learning, and engaging. It’s what you do within each channel that builds a community around your brand. And, at the end of the day, the value of the community you build counts for everything. It’s important to understand that we cannot assume that these networks simply exist for people to lineup for our marketing messages or promotional campaigns. Nor can we assume that they’re reeling in anticipation for simple dialogue. They want value. They want recognition. They want access to exclusive information and offers. They need direction, answers and resolution. What we’re talking about here is the multidimensional makeup of consumers and how a one-sided approach to social media forces the needs for social media to expand beyond traditional marketing to socialize the various departments, lines of business, and functions to engage based on the nature of the situation or opportunity. In the same CMO study, it was revealed that marketers believe that social media has a long way to go toward integrating into the overall company strategy. On a scale of 1-7, with one being “not integrated at all” and seven being “very integrated,” 22% chose “one.” Critical functions such as service, HR, sales, R&D, product marketing and development, IR, CSR, etc. are either not engaged or are operating social media within a silo disconnected from other efforts or possibilities. The problem is that customers don’t view a company by silo, instead they see one company, one brand, and their experience in social media forms an impression that eventually contributes to their view of your brand. The first step here is to understand business priorities and objectives to assess how social media can be additive in achieving these goals. Additionally, surveying the landscape to determine other areas of interest as its specifically related to your business. • Are customers seeking help or direction? • Who are your most valuable customers and what are they sharing? • How can you use social media to acquire and retain customers? - What ideas are circulating and how can you harness user generated activity and content to innovate or adapt to better meet the needs of customers? - How can you broaden a single customer view to recognize the varying needs of customers and how your organization can organize around each circumstance? - What insights exist based on how consumers are interacting with one another? How can this intelligence inform marketing, service, products and other important business initiatives? - How can your business extend their current efforts to deliver better customer experiences and in turn more effectively unit internal collaboration and communication? Customer demands far exceed the capabilities of the marketing department. While creating a social brand is a necessary endeavor, building a social business is an investment in customer relevance now and over time. Beyond relevance, a social business fosters a culture of change that unites employees and customers and sets a foundation for meaningful and beneficial relationships. Innovation, communication, and creativity are the natural byproducts of engagement and transformation. As a social brand, we are competing for the moment. As a social business, we are competing the future in all that we do today.

Read the article
BizTalk and SQL: Alternatives to the SQL receive adapter. Using Msmq to receive SQL data

- by Leonid Ganeline

If we have to get data from the SQL database, the standard way is to use a receive port with SQL adapter. SQL receive adapter is a solicit-response adapter. It periodically polls the SQL database with queries. That’s only way it can work. Sometimes it is undesirable. With new WCF-SQL adapter we can use the lightweight approach but still with the same principle, the WCF-SQL adapter periodically solicits the database with queries to check for the new records. Imagine the situation when the new records can appear in very broad time limits, some - in a second interval, others - in the several minutes interval. Our requirement is to process the new records ASAP. That means the polling interval should be near the shortest interval between the new records, a second interval. As a result the most of the poll queries would return nothing and would load the database without good reason. If the database is working under heavy payload, it is very undesirable. Do we have other choices? Sure. We can change the polling to the “eventing”. The good news is the SQL server could issue the event in case of new records with triggers. Got a new record –the trigger event is fired. No new records – no the trigger events – no excessive load to the database. The bad news is the SQL Server doesn’t have intrinsic methods to send the event data outside. For example, we would rather use the adapters that do listen for the data and do not solicit. There are several such adapters-listeners as File, Ftp, SOAP, WCF, and Msmq. But the SQL Server doesn’t have methods to create and save files, to consume the Web-services, to create and send messages in the queue, does it? Can we use the File, FTP, Msmq, WCF adapters to get data from SQL code? Yes, we can. The SQL Server 2005 and 2008 have the possibility to use .NET code inside SQL code. See the SQL Integration. How it works for the Msmq, for example: · New record is created, trigger is fired · Trigger calls the CLR stored procedure and passes the message parameters to it · The CLR stored procedure creates message and sends it to the outgoing queue in the SQL Server computer. · Msmq service transfers message to the queue in the BizTalk Server computer. · WCF-NetMsmq adapter receives the message from this queue. For the File adapter the idea is the same, the CLR stored procedure creates and stores the file with message, and then the File adapter picks up this file. Using WCF-NetMsmq adapter to get data from SQL I am describing the full set of the deployment and development steps for the case with the WCF-NetMsmq adapter. Development: 1. Create the .NET code: project, class and method to create and send the message to the MSMQ queue. 2. Create the SQL code in triggers to call the .NET code. Installation and Deployment: 1. SQL Server: a. Register the CLR assembly with .NET (CLR) code b. Install the MSMQ Services 2. BizTalk Server: a. Install the MSMQ Services b. Create the MSMQ queue c. Create the WCF-NetMsmq receive port. The detailed description is below. Code .NET code … using System.Xml; using System.Xml.Linq; using System.Xml.Serialization; //namespace MyCompany.MySolution.MyProject – doesn’t work. The assembly name is MyCompany.MySolution.MyProject // I gave up with the compound namespace. Seems the CLR Integration cannot work with it L. Maybe I’m wrong. public class Event { static public XElement CreateMsg(int par1, int par2, int par3) { XNamespace ns = "http://schemas.microsoft.com/Sql/2008/05/TypedPolling/my_storedProc"; XElement xdoc = new XElement(ns + "TypedPolling", new XElement(ns + "TypedPollingResultSet0", new XElement(ns + "TypedPollingResultSet0", new XElement(ns + "par1", par1), new XElement(ns + "par2", par2), new XElement(ns + "par3", par3), ) ) ); return xdoc; } } //////////////////////////////////////////////////////////////////////// … using System.ServiceModel; using System.ServiceModel.Channels; using System.Transactions; using System.Data; using System.Data.Sql; using System.Data.SqlTypes; public class MsmqHelper { [Microsoft.SqlServer.Server.SqlProcedure] // msmqAddress as "net.msmq://localhost/private/myapp.myqueue"; public static void SendMsg(string msmqAddress, string action, int par1, int par2, int par3) { using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Suppress)) { NetMsmqBinding binding = new NetMsmqBinding(NetMsmqSecurityMode.None); binding.ExactlyOnce = true; EndpointAddress address = new EndpointAddress(msmqAddress); using (ChannelFactory<IOutputChannel> factory = new ChannelFactory<IOutputChannel>(binding, address)) { IOutputChannel channel = factory.CreateChannel(); try { XElement xe = Event.CreateMsg(par1, par2, par3); XmlReader xr = xe.CreateReader(); Message msg = Message.CreateMessage(MessageVersion.Default, action, xr); channel.Send(msg); //SqlContext.Pipe.Send(…); // to test } catch (Exception ex) { … } } scope.Complete(); } } SQL code in triggers -- sp_SendMsg was registered as a name of the MsmqHelper.SendMsg() EXEC sp_SendMsg'net.msmq://biztalk_server_name/private/myapp.myqueue', 'Create', @par1, @par2, @par3 Installation and Deployment On the SQL Server Registering the CLR assembly 1. Prerequisites: .NET 3.5 SP1 Framework. It could be the issue for the production SQL Server! 2. For more information, please, see the link http://nielsb.wordpress.com/sqlclrwcf/ 3. Copy files: >copy “\Windows\Microsoft.net\Framework\v3.0\Windows Communication Foundation\Microsoft.Transactions.Bridge.dll” “\Program Files\Reference Assemblies\Microsoft\Framework\v3.0 \Microsoft.Transactions.Bridge.dll” If your machine is a 64-bit, run two commands: >copy “\Windows\Microsoft.net\Framework\v3.0\Windows Communication Foundation\Microsoft.Transactions.Bridge.dll” “\Program Files (x86)\Reference Assemblies\Microsoft\Framework\v3.0 \Microsoft.Transactions.Bridge.dll” >copy “\Windows\Microsoft.net\Framework64\v3.0\Windows Communication Foundation\Microsoft.Transactions.Bridge.dll” “\Program Files\Reference Assemblies\Microsoft\Framework\v3.0 \Microsoft.Transactions.Bridge.dll” 4. Execute the SQL code to register the .NET assemblies: -- For x64 OS: CREATE ASSEMBLY SMdiagnostics AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\SMdiagnostics.dll' WITH permission_set = unsafe CREATE ASSEMBLY [System.Web] AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework64\v2.0.50727\System.Web.dll' WITH permission_set = unsafe CREATE ASSEMBLY [System.Messaging] AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Messaging.dll' WITH permission_set = unsafe CREATE ASSEMBLY [System.ServiceModel] AUTHORIZATION dbo FROM 'C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\v3.0\System.ServiceModel.dll' WITH permission_set = unsafe CREATE ASSEMBLY [System.Xml.Linq] AUTHORIZATION dbo FROM 'C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.5\System.Xml.Linq.dll' WITH permission_set = unsafe -- For x32 OS: --CREATE ASSEMBLY SMdiagnostics AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\SMdiagnostics.dll' WITH permission_set = unsafe --CREATE ASSEMBLY [System.Web] AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Web.dll' WITH permission_set = unsafe --CREATE ASSEMBLY [System.Messaging] AUTHORIZATION dbo FROM 'C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Messaging.dll' WITH permission_set = unsafe --CREATE ASSEMBLY [System.ServiceModel] AUTHORIZATION dbo FROM 'C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\System.ServiceModel.dll' WITH permission_set = unsafe 5. Register the assembly with the external stored procedure: CREATE ASSEMBLY [HelperClass] AUTHORIZATION dbo FROM ’<FilePath>MyCompany.MySolution.MyProject.dll' WITH permission_set = unsafe where the <FilePath> - the path of the file on this machine! 6. Create the external stored procedure CREATE PROCEDURE sp_SendMsg ( @msmqAddress nvarchar(100), @Action NVARCHAR(50), @par1 int, @par2 int, @par3 int ) AS EXTERNAL NAME HelperClear.MsmqHelper.SendMsg Installing the MSMQ Services 1. Check if the MSMQ service is NOT installed. To check: Start / Administrative Tools / Computer Management, on the left pane open the “Services and Applications”, search to the “Message Queuing”. If you cannot see it, follow next steps. 2. Start / Control Panel / Programs and Features 3. Click “Turn Windows Features on or off” 4. Click Features, click “Add Features” 5. Scroll down the feature list; open the “Message Queuing” / “Message Queuing Services”; and check the “Message Queuing Server” option 6. Click Next; Click Install; wait to the successful finish of the installation Creating the MSMQ queue We don’t need to create the queue on the “sender” side. On the BizTalk Server Installing the MSMQ Services The same is as for the SQL Server. Creating the MSMQ queue 1. Start / Administrative Tools / Computer Management, on the left pane open the “Services and Applications”, open the “Message Queuing”, and open the “Private Queues”. 2. Right-click the “Private Queues”; choose New; choose “Private Queue”. 3. Type the Queue name as ’myapp.myqueue'; check the “Transactional” option. Creating the WCF-NetMsmq receive port I will not go through this step in all details. It is straightforward. URI for this receive location should be 'net.msmq://localhost/private/myapp.myqueue'. Notes · The biggest problem is usually on the step the “Registering the CLR assembly”. It is hard to predict where are the assemblies from the assembly list, what version should be used, x86 or x64. It is pity of such “rude” integration of the SQL with .NET. · In couple cases the new WCF-NetMsmq port was not able to work with the queue. Try to replace the WCF- NetMsmq port with the WCF-Custom port with netMsmqBinding. It was working fine for me. · To test how messages go through the queue you can turn on the Journal /Enabled option for the queue. I used the QueueExplorer utility to look to the messages in Journal. The Computer Management can also show the messages but it shows only small part of the message body and in the weird format. The QueueExplorer can do the better job; it shows the whole body and Xml messages are in good color format.

Read the article
Using an alternate search platform in Commerce Server 2009

- by Lewis Benge

Although Microsoft Commerce Server 2009's architecture is built upon Microsoft SQL Server, and has the full power of the SQL Full Text Indexing Search Platform, there are time however when you may require a richer or alternate search platform. One of these scenarios if when you want to implement a faceted (refinement) search into your site, which provides dynamic refinements based on the search results dataset. Faceted search is becoming popular in most online retail environments as a way of providing an enhanced user experience when browsing a larger catalogue. This is powerful for two reasons, firstly with a traditional search it is down to a user to think of a search term suitable for the product they are trying to find. This typically will not return similar products or help in any way to refine a larger dataset. Faceted searches on the other hand provide a comprehensive list of product properties, grouped together by similarity to help the user narrow down the results returned, as the user progressively restricts the search criteria by selecting additional criteria to search again, these facets needs to continually refresh. The whole experience allows users to explore alternate brands, price-ranges, or find products they hadn't initially thought of or where looking for in a bid to enhance cross sell in the retail environment. The second advantage of this type of search from a business perspective is also to harvest the search result to start to profile your user. Even though anonymous users may routinely visit your site, and will not necessarily register or complete a transaction to build up marketing data- profiling, you can still achieve the same result by recording search facets used within the search sequence. Below is a faceted search scenario generated from eBay using the search term "server". By creating a search profile of clicking through Computer & Networking -> Servers -> Dell - > New and recording this information against my user profile you can start to predict with a lot more certainty what types of products I am interested in. This will allow you to apply shopping-cart analysis against your search data and provide great cross-sale or advertising opportunity, or personalise the user experience based on your prediction of what the user may be interested in. This type of search is extremely beneficial in e-Commerce environments but achieving it out of the box with Commerce Server and SQL Full Text indexing can be challenging. In many deployments it is often easier to use an alternate search platform such as Microsoft's FAST, Apache SOLR, or Endecca, however you still want these products to integrate natively into Commerce Server to ensure that up-to-date inventory information is presented, profile information is generated, and you provide a consistant API. To do so we make the most of the Commerce Server extensibilty points called operation sequence components. In this example I will be talking about Apache Solr hosted on Apache Tomcat, in this specific example I have used the SolrNet C# library to interface to the Java platform. Also I am not going to talk about Solr configuration of indexing – but in a production envionrment this would typically happen by using Powershell to call the Commerce Server management webservice to export your catalog as XML, apply an XSLT transform to the file to make it conform to SOLR and use a simple HTTP Post to send it to the search enginge for indexing. Essentially a sequance component is a step in a serial workflow used to call a data repository (which in most cases is usually the Commerce Server pipelines or databases) and map to and from a Commerce Entity object whilst enforcing any business rules. So the first step in the process is to add a new class library to your existing Commerce Server site. You will need to use a new library as Sequence Components will need to be strongly named to be deployed. Once you are inside of your new project, add a new class file and add a reference to the Microsoft.Commerce.Providers, Microsoft.Commerce.Contracts and the Microsoft.Commerce.Broker assemblies. Now make your new class derive from the base object Microsoft.Commerce.Providers.Components.OperationSequanceComponent and overide the ExecuteQueryMethod. Your screen will then look something similar ot this: As all we are doing on this component is conducting a search we are only interested in the ExecuteQuery method. This method accepts three arguments, queryOperation, operationCache, and response. The queryOperation will be the object in which we receive our search parameters, the cache allows access to the Commerce Server cache allowing us to store regulary accessed information, and the response object is the object which we will return the result of our search upon. Inside this method is simply where we are going to inject our logic for our third party search platform. As I am not going to explain the inner-workings of actually making a SOLR call, I'll simply provide the sample code here. I would highly recommend however looking at the SolrNet wiki as they have some great explinations of how the API works. What you will find however is that there are some further extensions required when attempting to integrate a custom search provider. Firstly you out of the box the CommerceQueryOperation you will receive into the method when conducting a search against a catalog is specifically geared towards a SQL Full Text Search with properties such as a Where clause. To make the operation you receive more relevant you will need to create another class, this time derived from Microsoft.Commerce.Contract.Messages.CommerceSearchCriteria and within this you need to detail the properties you will require to allow you to submit as parameters to the SOLR search API. My exmaple looks like this: [DataContract(Namespace = "http://schemas.microsoft.com/microsoft-multi-channel-commerce-foundation/types/2008/03")] public class CommerceCatalogSolrSearch : CommerceSearchCriteria { private Dictionary<string, string> _facetQueries; public CommerceCatalogSolrSearch() { _facetQueries = new Dictionary<String, String>(); } public Dictionary<String, String> FacetQueries { get { return _facetQueries; } set { _facetQueries = value; } } public String SearchPhrase{ get; set; } public int PageIndex { get; set; } public int PageSize { get; set; } public IEnumerable<String> Facets { get; set; } public string Sort { get; set; } public new int FirstItemIndex { get { return (PageIndex-1)*PageSize; } } public int LastItemIndex { get { return FirstItemIndex + PageSize; } } } To allow you to construct a CommerceQueryOperation call within the API you will also need to construct another class to derived from Microsoft.Commerce.Common.MessageBuilders.CommerceSearchCriteriaBuilder and is simply used to construct an instance of the CommerceQueryOperation you have just created and expose the properties you want set. My Message builder looks like this: public class CommerceCatalogSolrSearchBuilder : CommerceSearchCriteriaBuilder { private CommerceCatalogSolrSearch _solrSearch; public CommerceCatalogSolrSearchBuilder() { _solrSearch = new CommerceCatalogSolrSearch(); } public String SearchPhrase { get { return _solrSearch.SearchPhrase; } set { _solrSearch.SearchPhrase = value; } } public int PageIndex { get { return _solrSearch.PageIndex; } set { _solrSearch.PageIndex = value; } } public int PageSize { get { return _solrSearch.PageSize; } set { _solrSearch.PageSize = value; } } public Dictionary<String,String> FacetQueries { get { return _solrSearch.FacetQueries; } set { _solrSearch.FacetQueries = value; } } public String[] Facets { get { return _solrSearch.Facets.ToArray(); } set { _solrSearch.Facets = value; } } public override CommerceSearchCriteria ToSearchCriteria() { return _solrSearch; } } Once you have these two classes in place you can now safely cast the CommerceOperation you receive as an argument of the overidden ExecuteQuery method in the SequenceComponent to the CommerceCatalogSolrSearch operation you have just created, e.g. public CommerceCatalogSolrSearch TryGetSearchCriteria(CommerceOperation operation) { var searchCriteria = operation as CommerceQueryOperation; if (searchCriteria == null) throw new Exception("No search criteria present"); var local = (CommerceCatalogSolrSearch) searchCriteria.SearchCriteria; if (local == null) throw new Exception("Unexpected Search Criteria in Operation"); return local; } Now you have all of your search parameters present, you can go off an call the external search platform API. You will of-course get proprietry objects returned, so the next step in the process is to convert the results being returned back into CommerceEntities. You do this via another extensibility point within the Commerce Server API called translatators. Translators are another separate class, this time derived inheriting the interface Microsoft.Commerce.Providers.Translators.IToCommerceEntityTranslator . As you can imaginge this interface is specific for the conversion of the object TO a CommerceEntity, you will need to implement a separate interface if you also need to go in the opposite direction. If you implement the required method for the interace you will get a single translate method which has a source onkect, destination CommerceEntity, and a collection of properties as arguments. For simplicity sake in this example I have hard-coded the mappings, however best practice would dictate you map the objects using your metadatadefintions.xml file . Once complete your translator would look something like the following: public class SolrEntityTranslator : IToCommerceEntityTranslator { #region IToCommerceEntityTranslator Members public void Translate(object source, CommerceEntity destinationCommerceEntity, CommercePropertyCollection propertiesToReturn) { if (source.GetType().Equals(typeof (SearchProduct))) { var searchResult = (SearchProduct) source; destinationCommerceEntity.Id = searchResult.ProductId; destinationCommerceEntity.SetPropertyValue("DisplayName", searchResult.Title); destinationCommerceEntity.ModelName = "Product"; } } Once you have a translator in place you can then safely map the results of your search platform into Commerce Entities and attach them on to the CommerceResponse object in a fashion similar to this: foreach (SearchProduct result in matchingProducts) { var destinationEntity = new CommerceEntity(_returnModelName); Translator.ToCommerceEntity(result, destinationEntity, _queryOperation.Model.Properties); response.CommerceEntities.Add(destinationEntity); } In SOLR I actually have two objects being returned – a product, and a collection of facets so I have an additional translator for facet (which maps to a custom facet CommerceEntity) and my facet response from SOLR is passed into the Translator helper class seperatley. When all of this is pieced together you have sucessfully completed the extensiblity point coding. You would have created a new OperationSequanceComponent, a custom SearchCritiera object and message builder class, and translators to convert the objects into Commerce Entities. Now you simply need to configure them, and can start calling them in your code. Make sure you sign you assembly, compile it and identiy its signature. Next you need to put this a reference of your new assembly into the Channel.Config configuration file replacing that of the existing SQL Full Text component: You will also need to add your translators to the Translators node of your Channel.Config too: Lastly add any custom CommerceEntities you have developed to your MetaDataDefintions.xml file. Your configuration is now complete, and you should now be able to happily make a call to the Commerce Foundation API, which will act as a proxy to your third party search platform and return back CommerceEntities of your search results. If you require data to be enriched, or logged, or any other logic applied then simply add further sequence components into the OperationSequence (obviously keeping the search response first) to the node of your Channel.Config file. Now to call your code you simply request it as per any other CommerceQuery operation, but taking into account you may be receiving multiple types of CommerceEntity returned: public KeyValuePair<FacetCollection ,List<Product>> DoFacetedProductQuerySearch(string searchPhrase, string orderKey, string sortOrder, int recordIndex, int recordsPerPage, Dictionary<string, string> facetQueries, out int totalItemCount) { var products = new List<Product>(); var query = new CommerceQuery<CatalogEntity, CommerceCatalogSolrSearchBuilder>(); query.SearchCriteria.PageIndex = recordIndex; query.SearchCriteria.PageSize = recordsPerPage; query.SearchCriteria.SearchPhrase = searchPhrase; query.SearchCriteria.FacetQueries = facetQueries; totalItemCount = 0; CommerceResponse response = SiteContext.ProcessRequest(query.ToRequest()); var queryResponse = response.OperationResponses[0] as CommerceQueryOperationResponse; // No results. Return the empty list if (queryResponse != null && queryResponse.CommerceEntities.Count == 0) return new KeyValuePair<FacetCollection, List<Product>>(); totalItemCount = (int)queryResponse.TotalItemCount; // Prepare a multi-operation to retrieve the product variants var multiOperation = new CommerceMultiOperation(); //Add products to results foreach (Product product in queryResponse.CommerceEntities.Where(x => x.ModelName == "Product")) { var productQuery = new CommerceQuery<Product>(Product.ModelNameDefinition); productQuery.SearchCriteria.Model.Id = product.Id; productQuery.SearchCriteria.Model.CatalogId = product.CatalogId; var variantQuery = new CommerceQueryRelatedItem<Variant>(Product.RelationshipName.Variants); productQuery.RelatedOperations.Add(variantQuery); multiOperation.Add(productQuery); } CommerceResponse variantsResponse = SiteContext.ProcessRequest(multiOperation.ToRequest()); foreach (CommerceQueryOperationResponse queryOpResponse in variantsResponse.OperationResponses) { if (queryOpResponse.CommerceEntities.Count() > 0) products.Add(queryOpResponse.CommerceEntities[0]); } //Get facet collection FacetCollection facetCollection = queryResponse.CommerceEntities.Where(x => x.ModelName == "FacetCollection").FirstOrDefault(); return new KeyValuePair<FacetCollection, List<Product>>(facetCollection, products); } ..And that is it – simply a few classes and some configuration will allow you to extend the Commerce Server query operations to call a third party search platform, whilst still maintaing a unifed API in the remainder of your code. This logic stands for any extensibility within CommerceServer, which requires excution in a serial fashioon such as call to LOB systems or web service to validate or enrich data. Feel free to use this example on other applications, and if you have any questions please feel free to e-mail and I'll help out where I can!

Read the article
Is Social Media The Vital Skill You Aren’t Tracking?

- by HCM-Oracle

By Mark Bennett - Originally featured in Talent Management Excellence The ever-increasing presence of the workforce on social media presents opportunities as well as risks for organizations. While on the one hand, we read about social media embarrassments happening to organizations, on the other we see that social media activities by workers and candidates can enhance a company’s brand and provide insight into what individuals are, or can become, influencers in the social media sphere. HR can play a key role in helping organizations make the most value out of the activities and presence of workers and candidates, while at the same time also helping to manage the risks that come with the permanence and viral nature of social media. What is Missing from Understanding Our Workforce? “If only HP knew what HP knows, we would be three-times more productive.” Lew Platt, Former Chairman, President, CEO, Hewlett-Packard What Lew Platt recognized was that organizations only have a partial understanding of what their workforce is capable of. This lack of understanding impacts the company in several negative ways: 1. A particular skill that the company needs to access in one part of the organization might exist somewhere else, but there is no record that the skill exists, so the need is unfulfilled. 2. As market conditions change rapidly, the company needs to know strategic options, but some options are missed entirely because the company doesn’t know that sufficient capability already exists to enable those options. 3. Employees may miss out on opportunities to demonstrate how their hidden skills could create new value to the company. Why don’t companies have that more complete picture of their workforce capabilities – that is, not know what they know? One very good explanation is that companies put most of their efforts into rating their workforce according to the jobs and roles they are filling today. This is the essence of two important talent management processes: recruiting and performance appraisals. In recruiting, a set of requirements is put together for a job, either explicitly or indirectly through a job description. During the recruiting process, much of the attention is paid towards whether the candidate has the qualifications, the skills, the experience and the cultural fit to be successful in the role. This makes a lot of sense. In the performance appraisal process, an employee is measured on how well they performed the functions of their role and in an effort to help the employee do even better next time, they are also measured on proficiency in the competencies that are deemed to be key in doing that job. Again, the logic is impeccable. But in both these cases, two adages come to mind: 1. What gets measured is what gets managed. 2. You only see what you are looking for. In other words, the fact that the current roles the workforce are performing are the basis for measuring which capabilities the workforce has, makes them the only capabilities to be measured. What was initially meant to be a positive, i.e. identify what is needed to perform well and measure it, in order that it can be managed, comes with the unintended negative consequence of overshadowing the other capabilities the workforce has. This also comes with an employee engagement price, for the measurements and management of workforce capabilities is to typically focus on where the workforce comes up short. Again, it makes sense to do this, since improving a capability that appears to result in improved performance benefits, both the individual through improved performance ratings and the company through improved productivity. But this is based on the assumption that the capabilities identified and their required proficiencies are the only attributes of the individual that matter. Anything else the individual brings that results in high performance, while resulting in a desired performance outcome, often goes unrecognized or underappreciated at best. As social media begins to occupy a more important part in current and future roles in organizations, businesses must incorporate social media savvy and innovation into job descriptions and expectations. These new measures could provide insight into how well someone can use social media tools to influence communities and decision makers; keep abreast of trends in fast-moving industries; present a positive brand image for the organization around thought leadership, customer focus, social responsibility; and coordinate and collaborate with partners. These measures should demonstrate the “social capital” the individual has invested in and developed over time. Without this dimension, “short cut” methods may generate a narrow set of positive metrics that do not have real, long-lasting benefits to the organization. How Workforce Reputation Management Helps HR Harness Social Media With hundreds of petabytes of social media data flowing across Facebook, LinkedIn and Twitter, businesses are tapping technology solutions to effectively leverage social for HR. Workforce reputation management technology helps organizations discover, mobilize and retain talent by providing insight into the social reputation and influence of the workforce while also helping organizations monitor employee social media policy compliance and mitigate social media risk. There are three major ways that workforce reputation management technology can play a strategic role to support HR: 1. Improve Awareness and Decisions on Talent Many organizations measure the skills and competencies that they know they need today, but are unaware of what other skills and competencies their workforce has that could be essential tomorrow. How about whether your workforce has the reputation and influence to make their skills and competencies more effective? Many organizations don’t have insight into the social media “reach” their workforce has, which is becoming more critical to business performance. These features help organizations, managers, and employees improve many talent processes and decision making, including the following: Hiring and Assignments. People and teams with higher reputations are considered more valuable and effective workers. Someone with high reputation who refers a candidate also can have high credibility as a source for hires. Training and Development. Reputation trend analysis can impact program decisions regarding training offerings by showing how reputation and influence across the workforce changes in concert with training. Worker reputation impacts development plans and goal choices by helping the individual see which development efforts result in improved reputation and influence. Finding Hidden Talent. Managers can discover hidden talent and skills amongst employees based on a combination of social profile information and social media reputation. Employees can improve their personal brand and accelerate their career development. 2. Talent Search and Discovery The right technology helps organizations find information on people that might otherwise be hidden. By leveraging access to candidate and worker social profiles as well as their social relationships, workforce reputation management provides companies with a more complete picture of what their knowledge, skills, and attributes are and what they can in turn access. This more complete information helps to find the right talent both outside the organization as well as the right, perhaps previously hidden talent, within the organization to fill roles and staff projects, particularly those roles and projects that are required in reaction to fast-changing opportunities and circumstances. 3. Reputation Brings Credibility Workforce reputation management technology provides a clearer picture of how candidates and workers are viewed by their peers and communities across a wide range of social reputation and influence metrics. This information is less subject to individual bias and can impact critical decision-making. Knowing the individual’s reputation and influence enables the organization to predict how well their capabilities and behaviors will have a positive effect on desired business outcomes. Many roles that have the highest impact on overall business performance are dependent on the individual’s influence and reputation. In addition, reputation and influence measures offer a very tangible source of feedback for workers, providing them with insight that helps them develop themselves and their careers and see the effectiveness of those efforts by tracking changes over time in their reputation and influence. The following are some examples of the different reputation and influence measures of the workforce that Workforce Reputation Management could gather and analyze: Generosity – How often the user reposts other’s posts. Influence – How often the user’s material is reposted by others. Engagement – The ratio of recent posts with references (e.g. links to other posts) to the total number of posts. Activity – How frequently the user posts. (e.g. number per day) Impact – The size of the users’ social networks, which indicates their ability to reach unique followers, friends, or users. Clout – The number of references and citations of the user’s material in others’ posts. The Vital Ingredient of Workforce Reputation Management: Employee Participation “Nothing about me, without me.” Valerie Billingham, “Through the Patient’s Eyes”, Salzburg Seminar Session 356, 1998 Since data resides primarily in social media, a question arises: what manner is used to collect that data? While much of social media activity is publicly accessible (as many who wished otherwise have learned to their chagrin), the social norms of social media have developed to put some restrictions on what is acceptable behavior and by whom. Disregarding these norms risks a repercussion firestorm. One of the more recognized norms is that while individuals can follow and engage with other individual’s public social activity (e.g. Twitter updates) fairly freely, the more an organization does this unprompted and without getting permission from the individual beforehand, the more likely the organization risks a totally opposite outcome from the one desired. Instead, the organization must look for permission from the individual, which can be met with resistance. That resistance comes from not knowing how the information will be used, how it will be shared with others, and not receiving enough benefit in return for granting permission. As the quote above about patient concerns and rights succinctly states, no one likes not feeling in control of the information about themselves, or the uncertainty about where it will be used. This is well understood in consumer social media (i.e. permission-based marketing) and is applicable to workforce reputation management. However, asking permission leaves open the very real possibility that no one, or so few, will grant permission, resulting in a small set of data with little usefulness for the company. Connecting Individual Motivation to Organization Needs So what is it that makes an individual decide to grant an organization access to the data it wants? It is when the individual’s own motivations are in alignment with the organization’s objectives. In the case of workforce reputation management, when the individual is motivated by a desire for increased visibility and career growth opportunities to advertise their skills and level of influence and reputation, they are aligned with the organizations’ objectives; to fill resource needs or strategically build better awareness of what skills are present in the workforce, as well as levels of influence and reputation. Individuals can see the benefit of granting access permission to the company through multiple means. One is through simple social awareness; they begin to discover that peers who are getting more career opportunities are those who are signed up for workforce reputation management. Another is where companies take the message directly to the individual; we think you would benefit from signing up with our workforce reputation management solution. Another, more strategic approach is to make reputation management part of a larger Career Development effort by the company; providing a wide set of tools to help the workforce find ways to plan and take action to achieve their career aspirations in the organization. An effective mechanism, that facilitates connecting the visibility and career growth motivations of the workforce with the larger context of the organization’s business objectives, is to use game mechanics to help individuals transform their career goals into concrete, actionable steps, such as signing up for reputation management. This works in favor of companies looking to use workforce reputation because the workforce is more apt to see how it fits into achieving their overall career goals, as well as seeing how other participation brings additional benefits. Once an individual has signed up with reputation management, not only have they made themselves more visible within the organization and increased their career growth opportunities, they have also enabled a tool that they can use to better understand how their actions and behaviors impact their influence and reputation. Since they will be able to see their reputation and influence measurements change over time, they will gain better insight into how reputation and influence impacts their effectiveness in a role, as well as how their behaviors and skill levels in turn affect their influence and reputation. This insight can trigger much more directed, and effective, efforts by the individual to improve their ability to perform at a higher level and become more productive. The increased sense of autonomy the individual experiences, in linking the insight they gain to the actions and behavior changes they make, greatly enhances their engagement with their role as well as their career prospects within the company. Workforce reputation management takes the wide range of disparate data about the workforce being produced across various social media platforms and transforms it into accessible, relevant, and actionable information that helps the organization achieve its desired business objectives. Social media holds untapped insights about your talent, brand and business, and workforce reputation management can help unlock them. Imagine - if you could find the hidden secrets of your businesses, how much more productive and efficient would your organization be? Mark Bennett is a Director of Product Strategy at Oracle. Mark focuses on setting the strategic vision and direction for tools that help organizations understand, shape, and leverage the capabilities of their workforce to achieve business objectives, as well as help individuals work effectively to achieve their goals and navigate their own growth. His combination of a deep technical background in software design and development, coupled with a broad knowledge of business challenges and thinking in today’s globalized, rapidly changing, technology accelerated economy, has enabled him to identify and incorporate key innovations that are central to Oracle Fusion’s unique value proposition. Mark has over the course of his career been in charge of the design, development, and strategy of Talent Management products and the design and development of cutting edge software that is better equipped to handle the increasingly complex demands of users while also remaining easy to use. Follow him @mpbennett

Read the article
CodePlex Daily Summary for Wednesday, March 14, 2012

CodePlex Daily Summary for Wednesday, March 14, 2012Popular ReleasesAcDown????? - Anime&Comic Downloader: AcDown????? v3.9.2: ?? ●AcDown??????????、??、??????，????1M，????，????，?????????????????????????。???????????Acfun、????(Bilibili)、??、??、YouTube、??、???、??????、SF????、????????????。??????AcPlay?????，??????、????????????????。 ● AcDown???????????????????????????，???，???????????????????。 ● AcDown???????C#??，????.NET Framework 2.0??。?????"Acfun?????"。 ????32??64? Windows XP/Vista/7/8 ????????????? ??：????????Windows XP???,?????????.NET Framework 2.0???(x86)，?????"?????????"??? ??????????????，??????????： ??"AcDo...C.B.R. : Comic Book Reader: CBR 0.6: 20 Issue trackers are closed and a lot of bugs too Localize view is now MVVM and delete is working. Added the unused flag (take care that it goes to true only when displaying screen elements) Backstage - new input/output format choice control for the conversion Backstage - Add display, behaviour and register file type options in the extended options dialog Explorer list view has been transformed to a custom control. New group header, colunms order and size are saved Single insta...Windows Azure Toolkit for Windows 8: Windows Azure Toolkit for Windows 8 Consumer Prv: Windows Azure Toolkit for Windows 8 Consumer Preview - Preview Release v 1.2.0 Please download this for Windows Azure Toolkit for Windows 8 functionality on Windows 8 Consumer Preview. The core features of the toolkit include: Automated Install – Scripted install of all dependencies including Visual Studio 2010 Express and the Windows Azure SDK on Windows 8 Consumer Preview. Project Templates – Windows 8 Metro Style app project templates in Dev 11 in both XAML/C# and HTML5/JS with a suppor...CODE Framework: 4.0.20312.0: This version includes significant improvements in the WPF system (and the WPF MVVM/MVC system). This includes new styles for Metro controls and layouts. Improved color handling. It also includes an improved theme/style swapping engine down to active (open) views. There also are various other enhancements and small fixes throughout the entire framework.Visual Studio ALM Quick Reference Guidance: v3 - For Visual Studio 11: RELEASE README Welcome to the BETA release of the Quick Reference Guide preview As this is a BETA release and the quality bar for the final Release has not been achieved, we value your candid feedback and recommend that you do not use or deploy these BETA artifacts in a production environment. Quality-Bar Details Documentation has been reviewed by Visual Studio ALM Rangers Documentation has not been through an independent technical review Documentation ...AvalonDock: AvalonDock 2.0.0345: Welcome to early alpha release of AvalonDock 2.0 I've completely rewritten AvalonDock in order to take full advantage of the MVVM pattern. New version also boost a lot of new features: 1) Deep separation between model and layout. 2) Full WPF binding support thanks to unified logical tree between main docking manager, auto-hide windows and floating windows. 3) Support for Aero semi-maximized windows feature. 4) Support for multiple panes in the same floating windows. For a short list of new f...Google Books Downloader for Windows: Google Books Downloader-1.9.0.0.: Google Books DownloaderWindows Azure PowerShell Cmdlets: Windows Azure PowerShell Cmdlets 2.2.2: Changes Added Start Menu Item for Easy Startup Added Link to Getting Started Document Added Ability to Persist Subscription Data to Disk Fixed Get-Deployment to not throw on empty slot Simplified numerous default values for cmdlets Breaking Changes: -SubscriptionName is now mandatory in Set-Subscription. -DefaultStorageAccountName and -DefaultStorageAccountKey parameters were removed from Set-Subscription. Instead, when adding multiple accounts to a subscription, each one needs to be added ...IronPython: 2.7.2.1: On behalf of the IronPython team, I'm happy to announce the final release IronPython 2.7.2. This release includes everything from IronPython 54498 and 62475 as well. Like all IronPython 2.7-series releases, .NET 4 is required to install it. Installing this release will replace any existing IronPython 2.7-series installation. Unlike previous releases, the assemblies for all supported platforms are included in the installer as well as the zip package, in the "Platforms" directory. IronPython 2...Kooboo CMS: Kooboo CMS 3.2.0.0: Breaking changes: When upgrade from previous versions, MUST reset the all the content type templates, otherwise the content manager might get a compile error. New features Integrate with Windows azure. See: http://wiki.kooboo.com/?wiki=Kooboo CMS on Azure Complete solution to deploy on load balance servers. See: http://wiki.kooboo.com/?wiki=Kooboo CMS load balance Update Jquery and Jquery ui to the lastest version(Jquery 1.71, Jquery UI 1.8.16). Tree style text content editing. See:h...Home Access Plus+: v7.10: Don't forget to add your location to the list: http://www.nbdev.co.uk/projects/hap/locations.aspx Changes: Added: CompressJS controls to the Help Desk & Booking System (reduces page size) Fixed: Debug/Release mode detection in CompressJS control Added: Older Browsers will use an iframe and the old uploadh.aspx page (works better than the current implementation on older browsers) Added: Permalinks for my files, you can give out links that redirect to the correct location when you log i...SubExtractor: Release 1026: Fix: multi-colored bluray subs will no longer result in black blob for OCR Fix: dvds with no language specified will not cause exception in name creation of subtitle files Fix: Root directory Dvds will use volume label as their directory nameExtensions for Reactive Extensions (Rxx): Rxx 1.3: Please read the latest release notes for details about what's new. Related Work Items Content SummaryRxx provides the following features. See the Documentation for details. Many IObservable<T> extension methods and IEnumerable<T> extension methods. Many wrappers that convert asynchronous Framework Class Library APIs into observables. Many useful types such as ListSubject<T>, DictionarySubject<T>, CommandSubject, ViewModel, ObservableDynamicObject, Either<TLeft, TRight>, Maybe<T>, Scala...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.47: Properly output escaped characters in CSS identifiers throw an EOF error when parsing a CSS selector that doesn't end in a declaration block chased down a stack-overflow issue with really large JS sources. Needed to flatten out the AST tree for adjacent expression statements that the application merges into a single expression statement, or that already contain large, comma-separated expressions in the original source. fix issue #17569: tie together the -debug switch with the DEBUG defi...Player Framework by Microsoft: Player Framework for Windows 8 Metro (Preview): Player Framework for HTML/JavaScript and XAML/C# Metro Style Applications. Additional DownloadsIIS Smooth Streaming Client SDK for Windows 8WPF Application Framework (WAF): WAF for .NET 4.5 (Experimental): Version: 2.5.0.440 (Experimental): This is an experimental release! It can be used to investigate the new .NET Framework 4.5 features. The ideas shown in this release might come in a future release (after 2.5) of the WPF Application Framework (WAF). More information can be found in this dicussion post. Requirements .NET Framework 4.5 (The package contains a solution file for Visual Studio 11) The unit test projects require Visual Studio 11 Professional Changelog All: Upgrade all proje...SSH.NET Library: 2012.3.9: There are still few outstanding issues I wanted to include in this release but since its been a while and there are few new features already I decided to create a new release now. New Features Add SOCKS4, SOCKS5 and HTTP Proxy support when connecting to remote server. For silverlight only IP address can be used for server address when using proxy. Add dynamic port forwarding support using ForwardedPortDynamic class. Add new ShellStream class to work with SSH Shell. Add supports for mu...Test Case Import Utilities for Visual Studio 2010 and Visual Studio 11 Beta: V1.2 RTM: This release (V1.2 RTM) includes: Support for connecting to Hosted Team Foundation Server Preview. Support for connecting to Team Foundation Server 11 Beta. Fix to issue with read-only attribute being set for LinksMapping-ReportFile which may have led to problems when saving the report file. Fix to issue with “related links” not being set properly in certain conditions. Fix to ensure that tool works fine when the Excel file contained rich text data. Note: Data is still imported in pl...DotNetNuke® Community Edition CMS: 06.01.04: Major Highlights Fixed issue with loading the splash page skin in the login, privacy and terms of use pages Fixed issue when searching for words with special characters in them Fixed redirection issue when the user does not have permissions to access a resource Fixed issue when clearing the cache using the ClearHostCache() function Fixed issue when displaying the site structure in the link to page feature Fixed issue when inline editing the title of modules Fixed issue with ...Mayhem: Mayhem Developer Preview: This is the developer preview of Mayhem. Enjoy! If you are looking for the Phone applications, please visit this discussionNew ProjectsAD Test Tool: After trying to figure out replication and other user-based issues with a local domain system we built this simple tool to test what the directory was currently reflecting for each user who accessed the tool.BuildUp: MSBuild scripts for use with CI servers.CDRtoSQL: CDRtoSQLCloudStorage4Net: A set of C# wrapper used to access cloud storage services. Such aliyun, grandcloud, baiduyun etc. The project is created to generalize the interfaces of difference storeage platform. We expect programmers could use all these cloud storage in a uniform way. D.E.M.O.N MySQL Tweaker: D.E.M.O.N Studio MySQL Tweaker, a fast and easy-to-use Mysql tools. Add this to your project, install , start, and stop mysql service with a little coding work. Love MySQL , Love MySQL TweakerEarth2012: Earth2012 shows the possible 31 events of 2012 according to the research at www.heliwave.com. For each event, the Earth map shows the danger latitude and all danger and semi-danger cities around it. The user can adjust the starting date of the events once we see a 4-day event.fjyc documents: this is a documentation share for fjyc projectFolderDrive: FolderDrive is a small tool that turns your folders into drives. Imagine having your Dropbox folder behaving as the (almost real) harddrive. It's another handy solution in the ongoing quest for quick access to your favorite folders. The project is developed in C# using WPF - you'll need .NET framework 4 to use it.GralicNew: dHarness: Harness is Web Browser Automation Framework. implemented as COM (Common Object Library) Harness will automate the testing of Web applications by automatic operation the Internet Explorer. you can easily use from your favorite development language.IIS2SQL: Powershell script for IIS7.5 that will send IIS logs to an MSSQL database. Separate table is used to keep track of when the last update occurred so only new events are inserted. Writes to the Application log status messages.Inside Sales Api Client: InsideSales.com api client. Current valid operations are Login and AddLead only. Developed in C#, .NET Framework 4.jqGrid Helper Library: This library is intended for use with the jqGrid jQuery plugin. It provides classes for handling grid settings, building filter and order by expressions, and will build a JQGrid object and then serialize it to either JSON or XML. longSudoku: A simple sudoku game! Use WinForm development! Imitate Metro UI!MantisWare Proxy Switcher: This is a small application that runs in your task bar and gives you the capability to add a proxy list and switch between them quickly, either by selecting the appropriate proxy or using the shortcut keys. It's developed in C#.net, MSSQL Compact.Map Wisata: Map , Dotspatial , Ribbon, Visual Basic.Netmapabc?? API WP SDK: mapabc?? API WP SDKNetWinsock: Provides a simple, event-driven Windows Forms Component capable of serving as a TCP client or server. Utilizes an asynchronous/multithreaded pattern and raises data reception and error events on the UI thread for easy processing. Reminiscent of the VB6 Winsock control.Orchard Zencoder: Orchard module that enables usage of zencoder for encoding media files. http://zencoder.comPleXKeys Predictive Typing: PleXKeys is a predictive typing app for Microsoft Windows. It allows you to type faster and more accurately. With adaptive typing and instant spelling suggestions. Oh and Macros too! Automatic Word Prediction As you type, PleXKeys will predict the next word you are going to type. Then, by simply pressing the ENTER key, PleXKeys will insert the word for you, saving wasted keystrokes. After you have used PleXKeys enough for it to adapt to your typing style, typing entire sentences can b...Samples for my book "Getting Started with the Internet of Things": All examples for my book "Getting Started with the Internet of Things", the Gsiot.PachubeClient library, and the Gsiot.Server library are included in this project.SharePoint Hive - Projects & Articles: TBDSharePoint webparts for the community (Enterprise ready and free): SharePoint webparts for the community (Enterprise ready and free)SimpleCritters: An experiment with neural networks. This project is chiefly academic in nature. This project is concerned with evolving simple critter behavior using simulated neurons.Snip-A-Dillie-O Web: My venture into MVC3 / EF Code First / SQL CE. Snip-A-Dillie-O is just one piece of the bigger picture as a simple code snippet management toolTestMum: C#, Asp.net test projectVisDiskUse: Visualize how much disk space is being used by folders in your hard drive.Windows Phone App Accelerators (WPAA): Windows Phone App Accelerator aims to make creating a Windows Phone application even easier by creating a Visual Studio Template (or Templates) that do the bulk of the common work for you. This is NOT an app factory.

Read the article
Project Management Helps AmeriCares Deliver International Aid

- by Sylvie MacKenzie, PMP

Excerpt from PROFIT - ORACLE - by Alison Weiss Handle with Care Sound project management helps AmeriCares bring international aid to those in need. The stakes are always high for AmeriCares. On a mission to restore health and save lives during times of disaster, the nonprofit international relief and humanitarian aid organization delivers donated medicines, medical supplies, and humanitarian aid to people in the U.S. and around the globe. Founded in 1982 with the express mission of responding as quickly and efficiently as possible to help people in need, the Stamford, Connecticut-based AmeriCares has delivered more than US$10.5 billion in aid to 147 countries over the past three decades. Launch the Slideshow “It’s critically important to us that we steward all the donations and that the medical supplies and medicines get to people as quickly as possible with no loss,” says Kate Sears, senior vice president for finance and technology at AmeriCares. “Whether we’re shipping IV solutions to victims of cholera in Haiti or antibiotics to Somali famine victims, we need to get the medicines there sooner because it means more people will be helped and lives improved or even saved.” Ten years ago, the tracking systems used by AmeriCares associates were paper-based. In recent years, staff started using spreadsheets, but the tracking processes were not standardized between teams. “Every team was tracking completely different information,” says Megan McDermott, senior associate, Sub-Saharan Africa partnerships, at AmeriCares. “It was just a few key things. For example, we tracked the date a shipment was supposed to arrive and the date we got reports from our partner that a hospital received aid on their end.” While the data was accurate, much detail was being lost in the process. AmeriCares management knew it could do a better job of tracking this enterprise data and in 2011 took a significant step by implementing Oracle’s Primavera P6 Professional Project Management. “It’s a comprehensive solution that has helped us improve the monitoring and controlling processes. It has allowed us to do our distribution better,” says Sears. In addition, the implementation effort has been a change agent, helping AmeriCares leadership rethink project management across the entire organization. Initially, much of the focus was on standardizing processes, but staff members also learned the importance of thinking proactively to prevent possible problems and evaluating results to determine if goals and objectives are truly being met. Such data about process efficiency and overall results is critical not only to AmeriCares staff but also to the donors supporting the organization’s life-saving missions. Efficiency Saves Lives One of AmeriCares’ core operations is to gather product donations from the private sector, establish where the most-urgent needs are, and solicit monetary support to send the aid via ocean cargo or airlift to welfare- and health-oriented nongovernmental organizations, hospitals, health networks, and government ministries based in areas in need. In 2011 alone, AmeriCares sent more than 3,500 shipments to 95 countries in response to both ongoing humanitarian needs and more than two dozen emergencies, including deadly tornadoes and storms in the U.S. and the devastating tsunami in Japan. When it comes to nonprofits in general, donors want to know that the charitable organizations they support are using funds wisely. Typically, nonprofits are evaluated by donors in terms of efficiency, an area where AmeriCares has an excellent reputation: 98 percent of expenses go directly to supporting programs and less than 2 percent represent administrative and fundraising costs. Donors, however, should look at more than simple efficiency, says Peter York, senior partner and chief research and learning officer at TCC Group, a nonprofit consultancy headquartered in New York, New York. They should also look at whether organizations have the systems in place to sustain their missions and continue to thrive. An expert on nonprofit organizational management, York has spent years studying sustainable charitable organizations. He defines them as nonprofits that are able to achieve the ongoing financial support to stay relevant and continue doing core mission work. In his analysis of well over 2,500 larger nonprofits, York has found that many are not sustaining, and are actually scaling back in size. “One of the biggest challenges of nonprofit sustainability is the general public’s perception that every dollar donated has to go only to the delivery of service,” says York. “What our data shows is that there are some fundamental capacities that have to be there in order for organizations to sustain and grow.” York’s research highlights the importance of data-driven leadership at successful nonprofits. “You’ve got to have the tools, the systems, and the technologies to get objective information on what you do, the people you serve, and the results you’re achieving,” says York. “If leaders don’t have the knowledge and the data, they can’t make the strategic decisions about programs to take organizations to the next level.” Historically, AmeriCares associates have used time-tested and cost-effective strategies to ship and then track supplies from donation to delivery to their destinations in designated time frames. When disaster strikes, AmeriCares ships by air and generally pulls out all the stops to deliver the most urgently needed aid within the first few days and weeks. Then, as situations stabilize, AmeriCares turns to delivering sea containers for the postemergency and ongoing aid so often needed over the long term. According to McDermott, getting a shipment out the door is fairly complicated, requiring as many as five different AmeriCares teams collaborating together. The entire process can take months—from when products are received in the warehouse and deciding which recipients to allocate supplies to, to getting customs and governmental approvals in place, actually shipping products, and finally ensuring that the products are received in-country. Delivering that aid is no small affair. “Our volume exceeds half a billion dollars a year worth of donated medicines and medical supplies, so it’s a sizable logistical operation to bring these products in and get them out to the right place quickly to have the most impact,” says Sears. “We really pride ourselves on our controls and efficiencies.” Adding to that complexity is the fact that the longer it takes to deliver aid, the more dire the human need can be. Any time AmeriCares associates can shave off the complicated aid delivery process can translate into lives saved. “It’s really being able to track information consistently that will help us to see where are the bottlenecks and where can we work on improving our processes,” says McDermott. Setting a Standard Productivity and information management improvements were key objectives for AmeriCares when staff began the process of implementing Oracle’s Primavera solution. But before configuring the software, the staff needed to take the time to analyze the systems already in place. According to Greg Loop, manager of database systems at AmeriCares, the organization received guidance from several consultants, including Rich D’Addario, consulting project manager in the Primavera Global Business Unit at Oracle, who was instrumental in shepherding the critical requirements-gathering phase. D’Addario encouraged staff to begin documenting shipping processes by considering the order in which activities occur and which ones are dependent on others to get accomplished. This exercise helped everyone realize that to be more efficient, they needed to keep track of shipments in a more standard way. “The staff didn’t recognize formal project management methodology,” says D’Addario. “But they did understand what the most important things are and that if they go wrong, an entire project can go off course.” Before, if a boatload of supplies was being sent to Haiti and there was a problem somewhere, a lot of time was taken up finding out where the problem was—because staff was not tracking things in a standard way. As a result, even more time was needed to find possible solutions to the problem and alert recipients that the aid might be delayed. “For everyone to put on the project manager hat and standardize the way every single thing is done means that now the whole organization is on the same page as to what needs to occur from the time a hurricane hits Haiti and when a boat pulls in to unload supplies,” says D’Addario. With so much care taken to put a process foundation firmly in place, configuring the Primavera solution was actually quite simple. Specific templates were set up for different types of shipments, and dashboards were implemented to provide executives with clear overviews of every project in the system. AmeriCares’ Loop reports that system planning, refining, and testing, followed by writing up documentation and training, took approximately four months. The system went live in spring 2011 at AmeriCares’ Connecticut headquarters. While the nonprofit has an international presence, with warehouses in Europe and offices in Haiti, India, Japan, and Sri Lanka, most donated medicines come from U.S. entities and are shipped from the U.S. out to the rest of the world. In addition, all shipments are tracked from the U.S. office. AmeriCares doesn’t expect the Primavera system to take months off the shipping time, especially for sea containers. However, any time saved is still important because it will allow aid to be delivered to people more quickly at a lower overall cost. “If we can trim a day or two here or there, that can translate into lives that we’re saving, especially in emergency situations,” says Sears. A Cultural Change Beyond the measurable benefits that come with IT-driven process improvement, AmeriCares management is seeing a change in culture as a result of the Primavera project. One change has been treating every shipment of aid as a project, and everyone involved with facilitating shipments as a project manager. “This is a revolutionary concept for us,” says McDermott. “Before, we were used to thinking we were doing logistics—getting a container from point A to point B without looking at it as one project and really understanding what it meant to manage it.” AmeriCares staff is also happy to report that collaboration within the organization is much more efficient. When someone creates a shipment in the Primavera system, the same shared template is used, which means anyone can log in to the system to see the status of a shipment. Knowledgeable staff can access a shipment project to help troubleshoot a problem. Management can easily check the status of projects across the organization. “Dashboards are really useful,” says McDermott. “Instead of going into the details of each project, you can just see the high-level real-time information at a glance.” The new system is helping team members focus on proactively managing shipments rather than simply reacting when problems occur. For example, when a container is shipped, documents must be included for customs clearance. Now, the shipping template has built-in reminders to prompt team members to ask for copies of these documents from freight forwarders and to follow up with partners to discover if a shipment is on time. In the past, staff may not have worked on securing these documents until they’d been notified a shipment had arrived in-country. Another benefit of capturing and adopting best practices within the Primavera system is that staff training is easier. “Capturing the processes in documented steps and milestones allows us to teach new staff members how to do their jobs faster,” says Sears. “It provides them with the knowledge of their predecessors so they don’t have to keep reinventing the wheel.” With the Primavera system already generating positive results, management is eager to take advantage of advanced capabilities. Loop is working on integrating the company’s proprietary inventory management system with the Primavera system so that when logistics or warehousing operators input data, the information will automatically go into the Primavera system. In the past, this information had to be manually keyed into spreadsheets, often leading to errors. Mining Historical Data Another feature on the horizon for AmeriCares is utilizing Primavera P6 Professional Project Management reporting capabilities. As the system begins to include more historical data, management soon will be able to draw on this information to conduct analysis that has not been possible before and create customized reports. For example, at the beginning of the shipment process, staff will be able to use historical data to more accurately estimate how long the approval process should take for a particular country. This could help ensure that food and medicine with limited shelf lives do not get stuck in customs or used beyond their expiration dates. The historical data in the Primavera system will also help AmeriCares with better planning year to year. The nonprofit’s staff has always put together a plan at the beginning of the year, but this has been very challenging simply because it is impossible to predict disasters. Now, management will be able to look at historical data and see trends and statistics as they set current objectives and prepare for future need. In addition, this historical data will provide AmeriCares management with the ability to review year-end data and compare actual project results with goals set at the beginning of the year—to see if desired outcomes were achieved and if there are areas that need improvement. It’s this type of information that is so valuable to donors. And, according to York, project management software can play a critical role in generating the data to help nonprofits sustain and grow. “It is important to invest in systems to help replicate, expand, and deliver services,” says York. “Project management software can help because it encourages nonprofits to examine program or service changes and how to manage moving forward.” Sears believes that AmeriCares donors will support the return on investment the organization will achieve with the Primavera solution. “It won’t be financial returns, but rather how many more people we can help for a given dollar or how much more quickly we can respond to a need,” says Sears. “I think donors are receptive to such arguments.” And for AmeriCares, it is all about the future and increasing results. The project management environment currently may be quite simple, but IT staff plans to expand the complexity and functionality as the organization grows in its knowledge of project management and the goals it wants to achieve. “As we use the system over time, we’ll continue to refine our best practices and accumulate more data,” says Sears. “It will advance our ability to make better data-driven decisions.”

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article

Search Results

Search found 279 results on 12 pages for 'predict'.

Page 11/12 | < Previous Page | 7 8 9 10 11 12 | Next Page >

- by shiju

- by Brenna Johnson-Oracle

- by extended_events

- by jean-pierre.dijcks

- by Stephen.Walther

- by tonyrogerson

- by Paul White

- by Mike Stiles

- by Manca Weeks

- by Steve Madsen

- by Joe Hopfgartner

- by corlettk

- by andrewbrust

- by Eric Nelson

- by Janice J. Heiss

- by Janice J. Heiss

- by shiju

- by Leonid Ganeline

- by Michael Snow

- by Leonid Ganeline

- by Lewis Benge

- by HCM-Oracle

- by Sylvie MacKenzie, PMP

- by Paul White

< Previous Page | 7 8 9 10 11 12 | Next Page >