Search Results

Search found 10417 results on 417 pages for 'large'.

Page 395/417 | < Previous Page | 391 392 393 394 395 396 397 398 399 400 401 402 | Next Page >

Optimizing AES modes on Solaris for Intel Westmere

- by danx

Optimizing AES modes on Solaris for Intel Westmere Review AES is a strong method of symmetric (secret-key) encryption. It is a U.S. FIPS-approved cryptographic algorithm (FIPS 197) that operates on 16-byte blocks. AES has been available since 2001 and is widely used. However, AES by itself has a weakness. AES encryption isn't usually used by itself because identical blocks of plaintext are always encrypted into identical blocks of ciphertext. This encryption can be easily attacked with "dictionaries" of common blocks of text and allows one to more-easily discern the content of the unknown cryptotext. This mode of encryption is called "Electronic Code Book" (ECB), because one in theory can keep a "code book" of all known cryptotext and plaintext results to cipher and decipher AES. In practice, a complete "code book" is not practical, even in electronic form, but large dictionaries of common plaintext blocks is still possible. Here's a diagram of encrypting input data using AES ECB mode: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ AESKey-->(AES Encryption) AESKey-->(AES Encryption) | | | | \/ \/ CipherTextOutput CipherTextOutput Block 1 Block 2 What's the solution to the same cleartext input producing the same ciphertext output? The solution is to further process the encrypted or decrypted text in such a way that the same text produces different output. This usually involves an Initialization Vector (IV) and XORing the decrypted or encrypted text. As an example, I'll illustrate CBC mode encryption: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ IV >----->(XOR) +------------->(XOR) +---> . . . . | | | | | | | | \/ | \/ | AESKey-->(AES Encryption) | AESKey-->(AES Encryption) | | | | | | | | | \/ | \/ | CipherTextOutput ------+ CipherTextOutput -------+ Block 1 Block 2 The steps for CBC encryption are: Start with a 16-byte Initialization Vector (IV), choosen randomly. XOR the IV with the first block of input plaintext Encrypt the result with AES using a user-provided key. The result is the first 16-bytes of output cryptotext. Use the cryptotext (instead of the IV) of the previous block to XOR with the next input block of plaintext Another mode besides CBC is Counter Mode (CTR). As with CBC mode, it also starts with a 16-byte IV. However, for subsequent blocks, the IV is just incremented by one. Also, the IV ix XORed with the AES encryption result (not the plain text input). Here's an illustration: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ AESKey-->(AES Encryption) AESKey-->(AES Encryption) | | | | \/ \/ IV >----->(XOR) IV + 1 >---->(XOR) IV + 2 ---> . . . . | | | | \/ \/ CipherTextOutput CipherTextOutput Block 1 Block 2 Optimization Which of these modes can be parallelized? ECB encryption/decryption can be parallelized because it does more than plain AES encryption and decryption, as mentioned above. CBC encryption can't be parallelized because it depends on the output of the previous block. However, CBC decryption can be parallelized because all the encrypted blocks are known at the beginning. CTR encryption and decryption can be parallelized because the input to each block is known--it's just the IV incremented by one for each subsequent block. So, in summary, for ECB, CBC, and CTR modes, encryption and decryption can be parallelized with the exception of CBC encryption. How do we parallelize encryption? By interleaving. Usually when reading and writing data there are pipeline "stalls" (idle processor cycles) that result from waiting for memory to be loaded or stored to or from CPU registers. Since the software is written to encrypt/decrypt the next data block where pipeline stalls usually occurs, we can avoid stalls and crypt with fewer cycles. This software processes 4 blocks at a time, which ensures virtually no waiting ("stalling") for reading or writing data in memory. Other Optimizations Besides interleaving, other optimizations performed are Loading the entire key schedule into the 128-bit %xmm registers. This is done once for per 4-block of data (since 4 blocks of data is processed, when present). The following is loaded: the entire "key schedule" (user input key preprocessed for encryption and decryption). This takes 11, 13, or 15 registers, for AES-128, AES-192, and AES-256, respectively The input data is loaded into another %xmm register The same register contains the output result after encrypting/decrypting Using SSSE 4 instructions (AESNI). Besides the aesenc, aesenclast, aesdec, aesdeclast, aeskeygenassist, and aesimc AESNI instructions, Intel has several other instructions that operate on the 128-bit %xmm registers. Some common instructions for encryption are: pxor exclusive or (very useful), movdqu load/store a %xmm register from/to memory, pshufb shuffle bytes for byte swapping, pclmulqdq carry-less multiply for GCM mode Combining AES encryption/decryption with CBC or CTR modes processing. Instead of loading input data twice (once for AES encryption/decryption, and again for modes (CTR or CBC, for example) processing, the input data is loaded once as both AES and modes operations occur at in the same function Performance Everyone likes pretty color charts, so here they are. I ran these on Solaris 11 running on a Piketon Platform system with a 4-core Intel Clarkdale processor @3.20GHz. Clarkdale which is part of the Westmere processor architecture family. The "before" case is Solaris 11, unmodified. Keep in mind that the "before" case already has been optimized with hand-coded Intel AESNI assembly. The "after" case has combined AES-NI and mode instructions, interleaved 4 blocks at-a-time. « For the first table, lower is better (milliseconds). The first table shows the performance improvement using the Solaris encrypt(1) and decrypt(1) CLI commands. I encrypted and decrypted a 1/2 GByte file on /tmp (swap tmpfs). Encryption improved by about 40% and decryption improved by about 80%. AES-128 is slighty faster than AES-256, as expected. The second table shows more detail timings for CBC, CTR, and ECB modes for the 3 AES key sizes and different data lengths. » The results shown are the percentage improvement as shown by an internal PKCS#11 microbenchmark. And keep in mind the previous baseline code already had optimized AESNI assembly! The keysize (AES-128, 192, or 256) makes little difference in relative percentage improvement (although, of course, AES-128 is faster than AES-256). Larger data sizes show better improvement than 128-byte data. Availability This software is in Solaris 11 FCS. It is available in the 64-bit libcrypto library and the "aes" Solaris kernel module. You must be running hardware that supports AESNI (for example, Intel Westmere and Sandy Bridge, microprocessor architectures). The easiest way to determine if AES-NI is available is with the isainfo(1) command. For example, $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu No special configuration or setup is needed to take advantage of this software. Solaris libraries and kernel automatically determine if it's running on AESNI-capable machines and execute the correctly-tuned software for the current microprocessor. Summary Maximum throughput of AES cipher modes can be achieved by combining AES encryption with modes processing, interleaving encryption of 4 blocks at a time, and using Intel's wide 128-bit %xmm registers and instructions. References "Block cipher modes of operation", Wikipedia Good overview of AES modes (ECB, CBC, CTR, etc.) "Advanced Encryption Standard", Wikipedia "Current Modes" describes NIST-approved block cipher modes (ECB,CBC, CFB, OFB, CCM, GCM)

Read the article
T-SQL Tuesday #31 - Logging Tricks with CONTEXT_INFO

- by Most Valuable Yak (Rob Volk)

This month's T-SQL Tuesday is being hosted by Aaron Nelson [b | t], fellow Atlantan (the city in Georgia, not the famous sunken city, or the resort in the Bahamas) and covers the topic of logging (the recording of information, not the harvesting of trees) and maintains the fine T-SQL Tuesday tradition begun by Adam Machanic [b | t] (the SQL Server guru, not the guy who fixes cars, check the spelling again, there will be a quiz later). This is a trick I learned from Fernando Guerrero [b | t] waaaaaay back during the PASS Summit 2004 in sunny, hurricane-infested Orlando, during his session on Secret SQL Server (not sure if that's the correct title, and I haven't used parentheses in this paragraph yet). CONTEXT_INFO is a neat little feature that's existed since SQL Server 2000 and perhaps even earlier. It lets you assign data to the current session/connection, and maintains that data until you disconnect or change it. In addition to the CONTEXT_INFO() function, you can also query the context_info column in sys.dm_exec_sessions, or even sysprocesses if you're still running SQL Server 2000, if you need to see it for another session. While you're limited to 128 bytes, one big advantage that CONTEXT_INFO has is that it's independent of any transactions. If you've ever logged to a table in a transaction and then lost messages when it rolled back, you can understand how aggravating it can be. CONTEXT_INFO also survives across multiple SQL batches (GO separators) in the same connection, so for those of you who were going to suggest "just log to a table variable, they don't get rolled back": HA-HA, I GOT YOU! Since GO starts a new batch all variable declarations are lost. Here's a simple example I recently used at work. I had to test database mirroring configurations for disaster recovery scenarios and measure the network throughput. I also needed to log how long it took for the script to run and include the mirror settings for the database in question. I decided to use AdventureWorks as my database model, and Adam Machanic's Big Adventure script to provide a fairly large workload that's repeatable and easily scalable. My test would consist of several copies of AdventureWorks running the Big Adventure script while I mirrored the databases (or not). Since Adam's script contains several batches, I decided CONTEXT_INFO would have to be used. As it turns out, I only needed to grab the start time at the beginning, I could get the rest of the data at the end of the process. The code is pretty small: declare @time binary(128)=cast(getdate() as binary(8)) set context_info @time ... rest of Big Adventure code ... go use master; insert mirror_test(server,role,partner,db,state,safety,start,duration) select @@servername, mirroring_role_desc, mirroring_partner_instance, db_name(database_id), mirroring_state_desc, mirroring_safety_level_desc, cast(cast(context_info() as binary(8)) as datetime), datediff(s,cast(cast(context_info() as binary(8)) as datetime),getdate()) from sys.database_mirroring where db_name(database_id) like 'Adv%'; I declared @time as a binary(128) since CONTEXT_INFO is defined that way. I couldn't convert GETDATE() to binary(128) as it would pad the first 120 bytes as 0x00. To keep the CAST functions simple and avoid using SUBSTRING, I decided to CAST GETDATE() as binary(8) and let SQL Server do the implicit conversion. It's not the safest way perhaps, but it works on my machine. :) As I mentioned earlier, you can query system views for sessions and get their CONTEXT_INFO. With a little boilerplate code this can be used to monitor long-running procedures, in case you need to kill a process, or are just curious how long certain parts take. In this example, I added code to Adam's Big Adventure script to set CONTEXT_INFO messages at strategic places I want to monitor. (His code is in UPPERCASE as it was in the original, mine is all lowercase): declare @msg binary(128) set @msg=cast('Altering bigProduct.ProductID' as binary(128)) set context_info @msg go ALTER TABLE bigProduct ALTER COLUMN ProductID INT NOT NULL GO set context_info 0x0 go declare @msg1 binary(128) set @msg1=cast('Adding pk_bigProduct Constraint' as binary(128)) set context_info @msg1 go ALTER TABLE bigProduct ADD CONSTRAINT pk_bigProduct PRIMARY KEY (ProductID) GO set context_info 0x0 go declare @msg2 binary(128) set @msg2=cast('Altering bigTransactionHistory.TransactionID' as binary(128)) set context_info @msg2 go ALTER TABLE bigTransactionHistory ALTER COLUMN TransactionID INT NOT NULL GO set context_info 0x0 go declare @msg3 binary(128) set @msg3=cast('Adding pk_bigTransactionHistory Constraint' as binary(128)) set context_info @msg3 go ALTER TABLE bigTransactionHistory ADD CONSTRAINT pk_bigTransactionHistory PRIMARY KEY NONCLUSTERED(TransactionID) GO set context_info 0x0 go declare @msg4 binary(128) set @msg4=cast('Creating IX_ProductId_TransactionDate Index' as binary(128)) set context_info @msg4 go CREATE NONCLUSTERED INDEX IX_ProductId_TransactionDate ON bigTransactionHistory(ProductId,TransactionDate) INCLUDE(Quantity,ActualCost) GO set context_info 0x0 This doesn't include the entire script, only those portions that altered a table or created an index. One annoyance is that SET CONTEXT_INFO requires a literal or variable, you can't use an expression. And since GO starts a new batch I need to declare a variable in each one. And of course I have to use CAST because it won't implicitly convert varchar to binary. And even though context_info is a nullable column, you can't SET CONTEXT_INFO NULL, so I have to use SET CONTEXT_INFO 0x0 to clear the message after the statement completes. And if you're thinking of turning this into a UDF, you can't, although a stored procedure would work. So what does all this aggravation get you? As the code runs, if I want to see which stage the session is at, I can run the following (assuming SPID 51 is the one I want): select CAST(context_info as varchar(128)) from sys.dm_exec_sessions where session_id=51 Since SQL Server 2005 introduced the new system and dynamic management views (DMVs) there's not as much need for tagging a session with these kinds of messages. You can get the session start time and currently executing statement from them, and neatly presented if you use Adam's sp_whoisactive utility (and you absolutely should be using it). Of course you can always use xp_cmdshell, a CLR function, or some other tricks to log information outside of a SQL transaction. All the same, I've used this trick to monitor long-running reports at a previous job, and I still think CONTEXT_INFO is a great feature, especially if you're still using SQL Server 2000 or want to supplement your instrumentation. If you'd like an exercise, consider adding the system time to the messages in the last example, and an automated job to query and parse it from the system tables. That would let you track how long each statement ran without having to run Profiler. #TSQL2sDay

Read the article
What Makes a Good Design Critic? CHI 2010 Panel Review

- by jatin.thaker

Author: Daniel Schwartz, Senior Interaction Designer, Oracle Applications User Experience Oracle Applications UX Chief Evangelist Patanjali Venkatacharya organized and moderated an innovative and stimulating panel discussion titled "What Makes a Good Design Critic? Food Design vs. Product Design Criticism" at CHI 2010, the annual ACM Conference on Human Factors in Computing Systems. The panelists included Janice Rohn, VP of User Experience at Experian; Tami Hardeman, a food stylist; Ed Seiber, a restaurant architect and designer; John Kessler, a food critic and writer at the Atlanta Journal-Constitution; and Larry Powers, Chef de Cuisine at Shaun's restaurant in Atlanta, Georgia. Building off the momentum of his highly acclaimed panel at CHI 2009 on what interaction design can learn from food design (for which I was on the other side as a panelist), Venkatacharya brought together new people with different roles in the restaurant and software interaction design fields. The session was also quite delicious -- but more on that later. Criticism, as it applies to food and product or interaction design, was the tasty topic for this forum and showed that strong parallels exist between food and interaction design criticism. Figure 1. The panelists in discussion: (left to right) Janice Rohn, Ed Seiber, Tami Hardeman, and John Kessler. The panelists had great insights to share from their respective fields, and they enthusiastically discussed as if they were at a casual collegial dinner. John Kessler stated that he prefers to have one professional critic's opinion in general than a large sampling of customers, however, "Web sites like Yelp get users excited by the collective approach. People are attracted to things desired by so many." Janice Rohn added that this collective desire was especially true for users of consumer products. Ed Seiber remarked that while people looked to the popular view for their target tastes and product choices, "professional critics like John [Kessler] still hold a big weight on public opinion." Chef Powers indicated that chefs take in feedback from all sources, adding, "word of mouth is very powerful. We also look heavily at the sales of the dishes to see what's moving; what's selling and thus successful." Hearing this discussion validates our design work at Oracle in that we listen to our users (our diners) and industry feedback (our critics) to ensure an optimal user experience of our products. Rohn considers that restaurateur Danny Meyer's book, Setting the Table: The Transforming Power of Hospitality in Business, which is about creating successful restaurant experiences, has many applicable parallels to user experience design. Meyer actually argues that the customer is not always right, but that "they must always feel heard." Seiber agreed, but noted "customers are not designers," and while designers need to listen to customer feedback, it is the designer's job to synthesize it. Seiber feels it's the critic's job to point out when something is missing or not well-prioritized. In interaction design, our challenges are quite similar, if not parallel. Software tasks are like puzzles that are in search of a solution on how to be best completed. As a food stylist, Tami Hardeman has the demanding and challenging task of presenting food to be as delectable as can be. To present food in its best light requires a lot of creativity and insight into consumer tastes. It's no doubt then that this former fashion stylist came up with the ultimate catch phrase to capture the emotion that clients want to draw from their users: "craveability." The phrase was a hit with the audience and panelists alike. Sometime later in the discussion, Seiber remarked, "designers strive to apply craveability to products, and I do so for restaurants in my case." Craveabilty is also very applicable to interaction design. Creating straightforward and smooth workflows for users of Oracle Applications is a primary goal for my colleagues. We want our users to really enjoy working with our products where it makes them more efficient and better at their jobs. That's our "craveability." Patanjali Venkatacharya asked the panel, "if a design's "craveability" appeals to some cultures but not to others, then what is the impact to the food or product design process?" Rohn stated that "taste is part nature and part nurture" and that the design must take the full context of a product's usage into consideration. Kessler added, "good design is about understanding the context" that the experience necessitates. Seiber remarked how important seat comfort is for diners and how the quality of seating will add so much to the complete dining experience. Sometimes if these non-food factors are not well executed, they can also take away from an otherwise pleasant dining experience. Kessler recounted a time when he was dining at a restaurant that actually had very good food, but the photographs hanging on all the walls did not fit in with the overall décor and created a negative overall dining experience. While the tastiness of the food is critical to a restaurant's success, it is a captivating complete user experience, as in interaction design, which will keep customers coming back and ultimately making the restaurant a hit. Figure 2. Patanjali Venkatacharya enjoyed the Sardinian flatbread salad. As a surprise Chef Powers brought out a signature dish from Shaun's restaurant for all the panelists to sample and critique. The Sardinian flatbread dish showcased Atlanta's taste for fresh and local produce and cheese at its finest as a salad served on a crispy flavorful flat bread. Hardeman said it could be photographed from any angle, a high compliment coming from a food stylist. Seiber really enjoyed the colors that the dish brought together and thought it would be served very well in a casual restaurant on a summer's day. The panel really appreciated the taste and quality of the different components and how the rosemary brought all the flavors together. Seiber remarked that "a lot of effort goes into the appearance of simplicity." Rohn indicated that the same notion holds true with software user interface design. A tremendous amount of work goes into crafting straightforward interfaces, including user research, prototyping, design iterations, and usability studies. Design criticism for food and software interfaces clearly share many similarities. Both areas value expert opinions and user feedback. Both areas understand the importance of great design needing to work well in its context. Last but not least, both food and interaction design criticism value "craveability" and how having users excited about experiencing and enjoying the designs is an important goal. Now if we can just improve the taste of software user interfaces, people may choose to dine on their enterprise applications over a fresh organic salad.

Read the article
SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

- by pinaldave

I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
A Good Developer is So Hard to Find

- by James Michael Hare

Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones. The problem is, it seems like no matter how hard we try to find the perfect way to separate the chaff from the wheat, inevitably someone will get hired who falls far short of expectations or someone will get passed over for missing a piece of trivia or a tricky brain teaser that could have been an excellent team member. In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc) there often exists a much tighter bond between HR and the hiring development staff because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth. It seems that if you ask two different people what makes a good developer, often you will get three different opinions. With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process. Their view can often be that development is simply a skill that one learns and then once aquired, that developer can produce widgets as good as the next like workers on an assembly-line floor. On the other side, you have many developers that feel that software development is an art unto itself and that the ability to create the most pure design or know the most obscure of keywords or write the shortest-possible obfuscated piece of code is a good coder. So is it a skill? An Art? Or something entirely in between? Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point. This just isn't so. It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document. I've interviewed candidates who could answer obscure syntax and keyword questions and once they were hired could not code effectively at all. So development must be more than a skill. But on the other end, we have art. Is development an art? Is our end result to produce art? I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code most perform some stated function with accuracy and efficiency and maintainability. None of these three things have anything to do with art, per se. Art is beauty for its own sake and is a wonderful thing. But if you apply that same though to development it just doesn't hold. I've had developers tell me that all that matters is the end result and how you code it is entirely part of the art and I couldn't disagree more. Yes, the end result, the accuracy, is the prime criteria to be met. But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon. Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution. So development must be something less than art. In the end, I think I feel like development is a matter of craftsmanship. We use our tools and we use our skills and set about to construct something that satisfies a purpose and yet is also elegant and efficient. There is skill involved, and there is an art, but really it boils down to being able to craft code. Crafting code is far more than writing code. Anyone can write code if they know the syntax, but so few people can actually craft code that solves a purpose and craft it well. So this is what I want to find, I want to find code craftsman! But how? I used to ask coding-trivia questions a long time ago and many people still fall back on this. The thought is that if you ask the candidate some piece of coding trivia and they know the answer it must follow that they can craft good code. For example: What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const-instance of that class/struct? (answer: mutable) So what do we prove if a candidate can answer this? Only that they know what mutable means. One would hope that this would infer that they'd know how to use it, and more importantly when and if it should ever be used! But it rarely does! The problem with triva questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kind of tests because they are short and easy to defend if a legal issue arrises on hiring decisions. After all it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test. But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds. There are times I've hired candidates who knew every trivia question I could throw out them and couldn't craft. And then there are times I've interviewed candidates who failed all my trivia but who I took a chance on who were my best finds ever. So if not trivia, then what? Brain teasers? The thought is, these type of questions measure the thinking power of a candidate. The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google em) to recognize the "catch" or knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates. In these cases, I think testing someone with brain teasers only tests their ability to answer brain teasers, not the ability to craft code. So how do we measure someone's ability to craft code? Here's a novel idea: have them code! Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil and have them construct a piece of code. It just makes sense that if we're going to hire someone to code we should actually watch them code. When they're done, we can judge them on several criteria: Correctness - does the candidate's solution accurately solve the problem proposed? Accuracy - is the candidate's solution reasonably syntactically correct? Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job? Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability? Persona - are they eager and willing or aloof and egotistical? Will they work well within your team? It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.

Read the article
Packaging Swing apps with integrated JavaFX content

- by igor

JavaFX provides a lot of interesting capabilities for developing rich client applications in Java, but what if you are working on an existing Swing application and you want to take advantage of these new features? Maybe you want to use one or two controls like the LineChart or a MediaView. Maybe you want to embed a large Scene Graph as an initial step in porting your application to FX. A hybrid Swing/FX application might just be the answer. Developing a hybrid Swing + JavaFX application is not terribly difficult, but until recently the deployment of hybrid applications has not simple as a "pure" JavaFX application. The existing tools focused on packaging FX Applications, or Swing applications - they did not account for hybrid applications. But with JavaFX 2.2 the tools include support for this hybrid application use case. Solution In JavaFX 2.2 we extended the packaging ant tasks to greatly simplify deploying hybrid applications. You now use the same deployment approach as you would for pure JavaFX applications. Just bundle your main application jar with the fx:jar ant task and then generate html/jnlp files using fx:deploy. The only difference is setting toolkit attribute for the fx:application tag as shown below: <fx:application id="swingFXApp" mainClass="${main.class}" toolkit="swing"/> The value of ${main.class} in the example above is your application class which has a main method. It does not need to extend JavaFX Application class. The resulting package provides support for the same set of execution modes as a package for a JavaFX application, although the packages which are created are not identical to the packages created for a pure FX application. You will see two JNLP files generated in the case of a hybrid application - one for use from Swing applet and another for the webstart launch. Note that these improvements do not alter the set of features available to Swing applications. The packaging tools just make it easier to use the advanced features of JavaFX in your Swing application. The same limits still apply, for example a Swing application can not use JavaFX Preloaders and code changes are necessary to support HTML splash screens. Why should I use the JavaFX ant tasks for packaging my Swing application? While using FX packaging tool for a Swing application may seem like a mismatch at face value, there are some really good reasons to use this approach. The primary justification for our packaging tools is to simplify the creation of your application artifacts, and to reduce manual errors. Plus, no one should have to write JNLP by hand. Some specific benefits include: Your application jar will include a launcher program. This improves your standalone launch by: checking for the JavaFX runtime guiding the user through any necessary installations setting the system proxy for Java The ant tasks will generate JNLP and HTML files for your swing app: avoids learning unnecessary details about JNLP, and eliminates the error-prone hand editing of JNLP files simplifies using advanced features like embedding JNLP and signing jars as BLOBs to improve launch performance.you can also embed the signing certificate details to improve the user's experience allows the use of web page templates to inject the generated code directly into your actual web page instead of being forced to copy/paste the generated code snippets. What about native packing? Absolutely! The very same ant task can generate a native bundle for a Swing application with JavaFX content. Try running one of these sample native bundles for the "SwingInterop" FX example: exe and dmg. I also used another feature on these examples: a click-through license agreement for .exe installers and OS X DMG drag installers. Small Caveat This packaging procedure is optimized around using the JavaFX packaging tools for your entire Swing application. If you are trying to embed JavaFX content into existing project (with an existing build/packing process) then you may need to experiment in order to find the best way to integrate the JavaFX packaging steps into your existing build procedure. As long as you can use ant in your build process this should be a workable approach. It some cases solution could be less than ideal. For example, you need to use fx:jar to package your main jar file in order to produce a double-clickable jar or a native bundle. The jar will be created from scratch, but you may already be creating the main jar file with a custom manifest. This may lead to some redundant steps in your build process. Hopefully the benefits will outweigh the problems. This is an area of ongoing development for the team, and we will continue to refine and improve both the tools and the process. Please share your experiences and suggestions with us. You can comment here on the blog or file issues to JIRA. Sample code Here is the full ant code used to package SwingInterop. You can grab latest JavaFX samples and try it yourself: <target name="-post-jar"> <taskdef resource="com/sun/javafx/tools/ant/antlib.xml" uri="javafx:com.sun.javafx.tools.ant" classpath="${javafx.tools.ant.jar}"/>  <fx:application id="swingFXApp" mainClass="${main.class}" toolkit="swing"/>  <fx:jar destfile="${dist.jar}"> <fileset dir="${build.classes.dir}"/> <fx:application refid="swingFXApp" name="SwingInterop"/> <manifest> <attribute name="Implementation-Vendor" value="${application.vendor}"/> <attribute name="Implementation-Title" value="${application.title}"/> <attribute name="Implementation-Version" value="1.0"/> </manifest> </fx:jar>  <delete file="${build.dir}/test.keystore"/> <genkey alias="TestAlias" storepass="xyz123" keystore="${build.dir}/test.keystore" dname="CN=Samples, OU=JavaFX Dev, O=Oracle, C=US"/> <fx:signjar keystore="${build.dir}/test.keystore" alias="TestAlias" storepass="xyz123"> <fileset file="${dist.jar}"/> </fx:signjar>  <fx:deploy width="960" height="720" includeDT="true" nativeBundles="all" outdir="${basedir}/${dist.dir}" embedJNLP="true" outfile="${application.title}"> <fx:application refId="swingFXApp"/> <fx:resources> <fx:fileset dir="${basedir}/${dist.dir}" includes="SwingInterop.jar"/> </fx:resources> <fx:permissions/> <info title="Sample app: ${application.title}" vendor="${application.vendor}"/> </fx:deploy> </target>

Read the article
In Case You Weren’t There: Blogwell NYC

- by Mike Stiles

0 0 1 1009 5755 Vitrue 47 13 6751 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:"Times New Roman";} Your roving reporter roved out to another one of Socialmedia.org’s fantastic Blogwell events, this time in NYC. As Central Park and incredible weather beckoned, some of the biggest brand names in the world gathered to talk about how they’re incorporating social into marketing and CRM, as well as extending social across their entire organizations internally. Below we present a collection of the live tweets from many of the key sessions GE @generalelectricJon Lombardo, Leader of Social Media COE How GE builds and extends emotional connections with consumers around health and reaps the benefits of increased brand equity in the process. GE has a social platform around Healthyimagination to create better health for people. If you and a friend are trying to get healthy together, you’ll do better. Health is inherently. Get health challenges via Facebook and share with friends to achieve goals together. They’re creating an emotional connection around the health context. You don’t influence people at large. Your sphere of real influence is around 5-10 people. They find relevant conversations about health on Twitter and engage sounding like a friend, not a brand. Why would people share on behalf of a brand? Because you tapped into an activity and emotion they’re already having. To create better habits in health, GE gave away inexpensive, relevant gifts related to their goals. Create the context, give the relevant gift, get social acknowledgment for giving it. What you get when you get acknowledgment for your engagement and gift is user generated microcontent. GE got 12,000 unique users engaged and 1400 organic posts with the healthy gift campaign. The Dow Chemical Company @DowChemicalAbby Klanecky, Director of Digital & Social Media Learn how Dow Chemical is finding, training, and empowering their scientists to be their storytellers in social media. There are 1m jobs coming open in science. Only 200k are qualified for them. Dow Chemical wanted to use social to attract and talk to scientists. Dow Chemical decided to use real scientists as their storytellers. Scientists are incredibly passionate, the key ingredient of a great storyteller. Step 1 was getting scientists to focus on a few platforms, blog, Twitter, LinkedIn. Dow Chemical social flow is Core Digital Team - #CMs – ambassadors – advocates. The scientists were trained in social etiquette via practice scenarios. It’s not just about sales. It’s about growing influence and the business. Dow Chemical trained about 100 scientists, 55 are active and there’s a waiting list for the next sessions. In person social training produced faster results and better participation. Sometimes you have to tell pieces of the story instead of selling your execs on the whole vision. Social Media Ethics Briefing: Staying Out of TroubleAndy Sernovitz, CEO @SocialMediaOrg How do we get people to share our message for us? We have to have their trust. The difference between being honest and being sleazy is disclosure. Disclosure does not hurt the effectiveness of your marketing. No one will get mad if you tell them up front you’re a paid spokesperson for a company. It’s a legal requirement by the FTC, it’s the law, to disclose if you’re being paid for an endorsement. Require disclosure and truthfulness in all your social media outreach. Don’t lie to people. Monitor the conversation and correct misstatements. Create social media policies and training programs. If you want to stay safe, never pay cash for social media. Money changes everything. As soon as you pay, it’s not social media, it’s advertising. Disclosure, to the feds, means clear, conspicuous, and understandable to the average reader. This phrase will keep you in the clear, “I work for ___ and this is my personal opinion.” Who are you? Were you paid? Are you giving an honest opinion based on a real experience? You as a brand are responsible for what an agency or employee or contactor does in your behalf. SocialMedia.org makes available a Disclosure Best Practices Toolkit. Socialmedia.org/disclosure. The point is to not ethically mess up and taint social media as happened to e-mail. Not only is the FTC cracking down, so is Google and Facebook. Visa @VisaNewsLucas Mast, Senior Business Leader, Global Corporate Social Media Visa built a mobile studio for the Olympics for execs and athletes. They wanted to do postcard style real time coverage of Visa’s Olympics sponsorships, and on a shoestring. Challenges included Olympic rules, difficulty getting interviews, time zone trouble, and resourcing. Another problem was they got bogged down with their own internal approval processes. Despite all the restrictions, they created and published a variety of and fair amount of content. They amassed 1000+ views of videos posted to the Visa Communication YouTube channel. Less corporate content yields more interest from media outlets and bloggers. They did real world video demos of how their products work in the field vs. an exec doing a demo in a studio. Don’t make exec interview videos dull and corporate. Keep answers short, shoot it in an interesting place, do takes until they’re comfortable and natural. Not everything will work. Not everything will get a retweet. But like the lottery, you can’t win if you don’t play. Promoting content is as important as creating it. McGraw-Hill Companies @McGrawHillCosPatrick Durando, Senior Director of Global New Media McGraw-Hill has 26,000 employees. McGraw-Hill created a social intranet called Buzz. Intranets create operational efficiency, help product dev, facilitate crowdsourcing, and breaks down geo silos. Intranets help with talent development, acquisition, retention. They replaced the corporate directory with their own version of LinkedIn. The company intranet has really cut down on the use of email. Long email threats become organized, permanent social discussions. The intranet is particularly useful in HR for researching and getting answers surrounding benefits and policies. Using a profile on your company intranet can establish and promote your internal professional brand. If you’re going to make an intranet, it has to look great, work great, and employees are going have to want to go there. You can’t order them to like it.

Read the article
HTML5 Input type=date Formatting Issues

- by Rick Strahl

One of the nice features in HTML5 is the abililty to specify a specific input type for HTML text input boxes. There a host of very useful input types available including email, number, date, datetime, month, number, range, search, tel, time, url and week. For a more complete list you can check out the MDN reference. Date input types also support automatic validation which can be useful in some scenarios but maybe can get in the way at other times. One of the more common input types, and one that can most benefit of a custom UI for selection is of course date input. Almost every application could use a decent date representation and HTML5's date input type seems to push into the right direction. It'd be nice if you could just say:<form action="DateTest.html"> <label for="FromDate">Enter a Date:</label> <input type="date" id="FromDate" name="FromDate" value="11/08/2012" class="date" /> <hr /> <input type="submit" id="btnSubmit" name="btnSubmit" value="Save Date" class="smallbutton" /> </form> but if you'd expect to just work, you're likely to be pretty disappointed. Problem #1: Browser Support For starters there's browser support. Out of the major browsers only the latest versions of WebKit and Opera based browsers seem to support date input. Neither FireFox, nor any version of Internet Explorer (including the new touch enabled IE10 in Windows RT) support input type=date. Browser support is an issue, but it would be OK if it wasn't for problem #2. Problem #2: Date Formatting If you look at my date input from before:<input type="date" id="FromDate" name="FromDate" value="11/08/2012" class="date" /> You can see that my date is formatted in local date format (ie. en-us). Now when I run this sadly the form that comes up in Chrome (and also iOS mobile browsers) comes up like this: Chrome isn't recognizing my local date string. Instead it's expecting my date format to be provided in ISO 8601 format which is: 2012-11-08 So if I change the date input field to:<input type="date" id="FromDate" name="FromDate" value="2012-10-08" class="date" /> I correctly get the date field filled in: Also when I pick a date with the DatePicker the date value is also returned is also set to the ISO date format. Yet notice how the date is still formatted to the local date time format (ie. en-US format). So if I pick a new date: and then save, the value field is set back to: 2012-11-15 using the ISO format. The same is true for Opera and iOS browsers and I suspect any other WebKit style browser and their date pickers. So to summarize input type=date: Expects ISO 8601 format dates to display intial values Sets selected date values to ISO 8601 Now what? This would sort of make sense, if all browsers supported input type=date. It'd be easy because you could just format dates appropriately when you set the date value into the control by applying the appropriate culture formatting (ie. .ToString("yyyy-MM-dd") ). .NET is actually smart enough to pick up the date on the other end for modelbinding when ISO 8601 is used. For other environments this might be a bit more tricky. input type=date is clearly the way to go forward. Date controls implemented in HTML are going the way of the dodo, given the intricacies of mobile platforms and scaling for both desktop and mobile. I've been using jQuery UI Datepicker for ages but once going to mobile, that's no longer an option as the control doesn't scale down well for mobile apps (at least not without major re-styling). It also makes a lot of sense for the browser to provide this functionality - creating a consistent date input experience across apps only makes sense, which is why I find it baffling that neither FireFox nor IE 10 deign it necessary to support date input natively. The problem is that a large number of even the latest and greatest browsers don't support this. So now you're stuck with not knowing what date format you have to serve since neither the local format, nor the ISO format works in all cases. For my current app I just broke down and used the ISO format and so I'll live with the non-local date format. <input type="date" id="ToDate" name="ToDate" value="2012-11-08" class="date"/> Here's what this looks like on Chrome: Here's what it looks like on my iPhone: Both Chrome and the phone do this the way it should be. For the phone especially this demonstrates why we'd want this - the built-in date picker there certainly beats manually trying to edit the date using finger gymnastics, and it's one of the easiest ways to pick a date I can think of (ie. easier to use than your typical date picker). Finally here's what the date looks like in FireFox: Certainly this is not the ideal date format, but it's clear enough I suppose. If users enter a date in local US format and that works as well (but won't work for other locales). It'll have to do. Over time one can only hope that other browsers will finally decide to implement this functionality natively to provide a unique experience. Until then, incomplete solutions it is. Related Posts Html 5 Input Types - How useful is this really going to be?© Rick Strahl, West Wind Technologies, 2005-2012Posted in HTML5 HTML Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Microsoft and the open source community

- by Charles Young

For the last decade, I have repeatedly, in my imitable Microsoft fan boy style, offered an alternative view to commonly held beliefs about Microsoft's stance on open source licensing. In earlier times, leading figures in Microsoft were very vocal in resisting the idea that commercial licensing is outmoded or morally reprehensible. Many people interpreted this as all-out corporate opposition to open source licensing. I never read it that way. It is true that I've met individual employees of Microsoft who are antagonistic towards FOSS (free and open source software), but I've met more who are supportive or at least neutral on the subject. In any case, individual attitudes of employees don't necessarily reflect a corporate stance. The strongest opposition I've encountered has actually come from outside the company. It's not a charitable thought, but I sometimes wonder if there are people in the .NET community who are opposed to FOSS simply because they believe, erroneously, that Microsoft is opposed. Here, for what it is worth, are the points I've repeated endlessly over the years and which have often been received with quizzical scepticism. a) A decade ago, Microsoft's big problem was not FOSS per se, or even with copyleft. The thing which really kept them awake at night was the fear that one day, someone might find, deep in the heart of the Windows code base, some code that should not be there and which was published under GPL. The likelihood of this ever happening has long since faded away, but there was a time when MS was running scared. I suspect this is why they held out for a while from making Windows source code open to inspection. Nowadays, as an MVP, I am positively encouraged to ask to see Windows source. b) Microsoft has never opposed the open source community. They have had problems with specific people and organisations in the FOSS community. Back in the 1990s, Richard Stallman gave time and energy to a successful campaign to launch antitrust proceedings against Microsoft. In more recent times, the negative attitude of certain people to Microsoft's submission of two FOSS licences to the OSI (both of which have long since been accepted), and the mad scramble to try to find any argument, however tenuous, to block their submission was not, let us say, edifying. c) Microsoft has never, to my knowledge, written off the FOSS model. They certainly don't agree that more traditional forms of licensing are inappropriate or immoral, and they've always been prepared to say so. One reason why it was so hard to convince people that Microsoft is not rabidly antagonistic towards FOSS licensing is that so many people think they have no involvement in open source. A decade ago, there was virtually no evidence of any such involvement. However, that was a long time ago. Quietly over the years, Microsoft has got on with the job of working out how to make use of FOSS licensing and how to support the FOSS community. For example, as well as making increasingly extensive use of Github, they run an important FOSS forge (CodePlex) on which they, themselves, host many hundreds of distinct projects. The total count may even be in the thousands now. I suspect there is a limit of about 500 records on CodePlex searches because, for the past few years, whenever I search for Microsoft-specific projects on CodePlex, I always get approx. 500 hits. Admittedly, a large volume of the stuff they publish under FOSS licences amounts to code samples, but many of those 'samples' have grown into useful and fully featured frameworks, libraries and tools. All this is leading up to the observation that yesterday's announcement by Scott Guthrie marks a significant milestone and should not go unnoticed. If you missed it, let me summarise. From the first release of .NET, Microsoft has offered a web development framework called ASP.NET. The core libraries are included in the .NET framework which is released free of charge, but which is not open source. However, in recent years, the number of libraries that constitute ASP.NET have grown considerably. Today, most professional ASP.NET web development exploits the ASP.NET MVC framework. This, together with several other important parts of the ASP.NET technology stack, is released on CodePlex under the Apache 2.0 licence. Hence, today, a huge swathe of web development on the .NET/Azure platform relies four-square on the use of FOSS frameworks and libraries. Yesterday, Scott Guthrie announced the next stage of ASP.NET's journey towards FOSS nirvana. This involves extending ASP.NET's FOSS stack to include Web API and the MVC Razor view engine which is rapidly becoming the de facto 'standard' for building web pages in ASP.NET. However, perhaps the more important announcement is that the ASP.NET team will now accept and review contributions from the community. Scott points out that this model is already in place elsewhere in Microsoft, and specifically draws attention to development of the Windows Azure SDKs. These SDKs are central to Azure development. The .NET and Java SDKs are published under Apache 2.0 on Github and Microsoft is open to community contributions. Accepting contributions is a more profound move than simply releasing code under FOSS licensing. It means that Microsoft is wholeheartedly moving towards a full-blooded open source approach for future evolution of some of their central and most widely used .NET and Azure frameworks and libraries. In conjunction with Scott's announcement, Microsoft has also released Git support for CodePlex (at long last!) and, perhaps more importantly, announced significant new investment in their own FOSS forge. Here at Solidsoft we have several reasons to be very interested in Scott's announcement. I'll draw attention to one of them. Earlier this year we wrote the initial version of a new UK Government web application called CloudStore. CloudStore provides a way for local and central government to discover and purchase applications and services. We wrote the web site using ASP.NET MVC which is FOSS. However, this point has been lost on the ladies and gentlemen of the press and, I suspect, on some of the decision makers on the government side. They announced a few weeks ago that future versions of CloudStore will move to a FOSS framework, clearly oblivious of the fact that it is already built on a FOSS framework. We are, it is fair to say, mildly irked by the uninformed and badly out-of-date assumption that “if it is Microsoft, it can't be FOSS”. Old prejudices live on.

Read the article
Metro: Grouping Items in a ListView Control

- by Stephen.Walther

The purpose of this blog entry is to explain how you can group list items when displaying the items in a WinJS ListView control. In particular, you learn how to group a list of products by product category. Displaying a grouped list of items in a ListView control requires completing the following steps: Create a Grouped data source from a List data source Create a Grouped Header Template Declare the ListView control so it groups the list items Creating the Grouped Data Source Normally, you bind a ListView control to a WinJS.Binding.List object. If you want to render list items in groups, then you need to bind the ListView to a grouped data source instead. The following code – contained in a file named products.js — illustrates how you can create a standard WinJS.Binding.List object from a JavaScript array and then return a grouped data source from the WinJS.Binding.List object by calling its createGrouped() method: (function () { "use strict"; // Create List data source var products = new WinJS.Binding.List([ { name: "Milk", price: 2.44, category: "Beverages" }, { name: "Oranges", price: 1.99, category: "Fruit" }, { name: "Wine", price: 8.55, category: "Beverages" }, { name: "Apples", price: 2.44, category: "Fruit" }, { name: "Steak", price: 1.99, category: "Other" }, { name: "Eggs", price: 2.44, category: "Other" }, { name: "Mushrooms", price: 1.99, category: "Other" }, { name: "Yogurt", price: 2.44, category: "Other" }, { name: "Soup", price: 1.99, category: "Other" }, { name: "Cereal", price: 2.44, category: "Other" }, { name: "Pepsi", price: 1.99, category: "Beverages" } ]); // Create grouped data source var groupedProducts = products.createGrouped( function (dataItem) { return dataItem.category; }, function (dataItem) { return { title: dataItem.category }; }, function (group1, group2) { return group1.charCodeAt(0) - group2.charCodeAt(0); } ); // Expose the grouped data source WinJS.Namespace.define("ListViewDemos", { products: groupedProducts }); })(); Notice that the createGrouped() method requires three functions as arguments: groupKey – This function associates each list item with a group. The function accepts a data item and returns a key which represents a group. In the code above, we return the value of the category property for each product. groupData – This function returns the data item displayed by the group header template. For example, in the code above, the function returns a title for the group which is displayed in the group header template. groupSorter – This function determines the order in which the groups are displayed. The code above displays the groups in alphabetical order: Beverages, Fruit, Other. Creating the Group Header Template Whenever you create a ListView control, you need to create an item template which you use to control how each list item is rendered. When grouping items in a ListView control, you also need to create a group header template. The group header template is used to render the header for each group of list items. Here’s the markup for both the item template and the group header template: <div id="productTemplate" data-win-control="WinJS.Binding.Template"> <div class="product"> <span data-win-bind="innerText:name"></span> <span data-win-bind="innerText:price"></span> </div> </div> <div id="productGroupHeaderTemplate" data-win-control="WinJS.Binding.Template"> <div class="productGroupHeader"> <h1 data-win-bind="innerText: title"></h1> </div> </div> You should declare the two templates in the same file as you declare the ListView control – for example, the default.html file. Declaring the ListView Control The final step is to declare the ListView control. Here’s the required markup: <div data-win-control="WinJS.UI.ListView" data-win-options="{ itemDataSource:ListViewDemos.products.dataSource, itemTemplate:select('#productTemplate'), groupDataSource:ListViewDemos.products.groups.dataSource, groupHeaderTemplate:select('#productGroupHeaderTemplate'), layout: {type: WinJS.UI.GridLayout} }"> </div> In the markup above, six properties of the ListView control are set when the control is declared. First the itemDataSource and itemTemplate are specified. Nothing new here. Next, the group data source and group header template are specified. Notice that the group data source is represented by the ListViewDemos.products.groups.dataSource property of the grouped data source. Finally, notice that the layout of the ListView is changed to Grid Layout. You are required to use Grid Layout (instead of the default List Layout) when displaying grouped items in a ListView. Here’s the entire contents of the default.html page: <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>ListViewDemos</title>  <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script>  <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> <script src="/js/products.js" type="text/javascript"></script> <style type="text/css"> .product { width: 200px; height: 100px; border: white solid 1px; font-size: x-large; } </style> </head> <body> <div id="productTemplate" data-win-control="WinJS.Binding.Template"> <div class="product"> <span data-win-bind="innerText:name"></span> <span data-win-bind="innerText:price"></span> </div> </div> <div id="productGroupHeaderTemplate" data-win-control="WinJS.Binding.Template"> <div class="productGroupHeader"> <h1 data-win-bind="innerText: title"></h1> </div> </div> <div data-win-control="WinJS.UI.ListView" data-win-options="{ itemDataSource:ListViewDemos.products.dataSource, itemTemplate:select('#productTemplate'), groupDataSource:ListViewDemos.products.groups.dataSource, groupHeaderTemplate:select('#productGroupHeaderTemplate'), layout: {type: WinJS.UI.GridLayout} }"> </div> </body> </html> Notice that the default.html page includes a reference to the products.js file: <script src=”/js/products.js” type=”text/javascript”></script> The default.html page also contains the declarations of the item template, group header template, and ListView control. Summary The goal of this blog entry was to explain how you can group items in a ListView control. You learned how to create a grouped data source, a group header template, and declare a ListView so that it groups its list items.

Read the article
SSIS: Deploying OLAP cubes using C# script tasks and AMO

- by DrJohn

As part of the continuing series on Building dynamic OLAP data marts on-the-fly, this blog entry will focus on how to automate the deployment of OLAP cubes using SQL Server Integration Services (SSIS) and Analysis Services Management Objects (AMO). OLAP cube deployment is usually done using the Analysis Services Deployment Wizard. However, this option was dismissed for a variety of reasons. Firstly, invoking external processes from SSIS is fraught with problems as (a) it is not always possible to ensure SSIS waits for the external program to terminate; (b) we cannot log the outcome properly and (c) it is not always possible to control the server's configuration to ensure the executable works correctly. Another reason for rejecting the Deployment Wizard is that it requires the 'answers' to be written into four XML files. These XML files record the three things we need to change: the name of the server, the name of the OLAP database and the connection string to the data mart. Although it would be reasonably straight forward to change the content of the XML files programmatically, this adds another set of complication and level of obscurity to the overall process. When I first investigated the possibility of using C# to deploy a cube, I was surprised to find that there are no other blog entries about the topic. I can only assume everyone else is happy with the Deployment Wizard! SSIS "forgets" assembly references If you build your script task from scratch, you will have to remember how to overcome one of the major annoyances of working with SSIS script tasks: the forgetful nature of SSIS when it comes to assembly references. Basically, you can go through the process of adding an assembly reference using the Add Reference dialog, but when you close the script window, SSIS "forgets" the assembly reference so the script will not compile. After repeating the operation several times, you will find that SSIS only remembers the assembly reference when you specifically press the Save All icon in the script window. This problem is not unique to the AMO assembly and has certainly been a "feature" since SQL Server 2005, so I am not amazed it is still present in SQL Server 2008 R2! Sample Package So let's take a look at the sample SSIS package I have provided which can be downloaded from here: DeployOlapCubeExample.zip Below is a screenshot after a successful run. Connection Managers The package has three connection managers: AsDatabaseDefinitionFile is a file connection manager pointing to the .asdatabase file you wish to deploy. Note that this can be found in the bin directory of you OLAP database project once you have clicked the "Build" button in Visual Studio TargetOlapServerCS is an Analysis Services connection manager which identifies both the deployment server and the target database name. SourceDataMart is an OLEDB connection manager pointing to the data mart which is to act as the source of data for your cube. This will be used to replace the connection string found in your .asdatabase file Once you have configured the connection managers, the sample should run and deploy your OLAP database in a few seconds. Of course, in a production environment, these connection managers would be associated with package configurations or set at runtime. When you run the sample, you should see that the script logs its activity to the output screen (see screenshot above). If you configure logging for the package, then these messages will also appear in your SSIS logging. Sample Code Walkthrough Next let's walk through the code. The first step is to parse the connection string provided by the TargetOlapServerCS connection manager and obtain the name of both the target OLAP server and also the name of the OLAP database. Note that the target database does not have to exist to be referenced in an AS connection manager, so I am using this as a convenient way to define both properties. We now connect to the server and check for the existence of the OLAP database. If it exists, we drop the database so we can re-deploy. svr.Connect(olapServerName); if (svr.Connected) { // Drop the OLAP database if it already exists Database db = svr.Databases.FindByName(olapDatabaseName); if (db != null) { db.Drop(); } // rest of script } Next we start building the XMLA command that will actually perform the deployment. Basically this is a small chuck of XML which we need to wrap around the large .asdatabase file generated by the Visual Studio build process. // Start generating the main part of the XMLA command XmlDocument xmlaCommand = new XmlDocument(); xmlaCommand.LoadXml(string.Format("<Batch Transaction='false' xmlns='http://schemas.microsoft.com/analysisservices/2003/engine'><Alter AllowCreate='true' ObjectExpansion='ExpandFull'><Object><DatabaseID>{0}</DatabaseID></Object><ObjectDefinition/></Alter></Batch>", olapDatabaseName)); Next we need to merge two XML files which we can do by simply using setting the InnerXml property of the ObjectDefinition node as follows: // load OLAP Database definition from .asdatabase file identified by connection manager XmlDocument olapCubeDef = new XmlDocument(); olapCubeDef.Load(Dts.Connections["AsDatabaseDefinitionFile"].ConnectionString); // merge the two XML files by obtain a reference to the ObjectDefinition node oaRootNode.InnerXml = olapCubeDef.InnerXml; One hurdle I had to overcome was removing detritus from the .asdabase file left by the Visual Studio build. Through an iterative process, I found I needed to remove several nodes as they caused the deployment to fail. The XMLA error message read "Cannot set read-only node: CreatedTimestamp" or similar. In comparing the XMLA generated with by the Deployment Wizard with that generated by my code, these read-only nodes were missing, so clearly I just needed to strip them out. This was easily achieved using XPath to find the relevant XML nodes, of which I show one example below: foreach (XmlNode node in rootNode.SelectNodes("//ns1:CreatedTimestamp", nsManager)) { node.ParentNode.RemoveChild(node); } Now we need to change the database name in both the ID and Name nodes using code such as: XmlNode databaseID = xmlaCommand.SelectSingleNode("//ns1:Database/ns1:ID", nsManager); if (databaseID != null) databaseID.InnerText = olapDatabaseName; Finally we need to change the connection string to point at the relevant data mart. Again this is easily achieved using XPath to search for the relevant nodes and then replace the content of the node with the new name or connection string. XmlNode connectionStringNode = xmlaCommand.SelectSingleNode("//ns1:DataSources/ns1:DataSource/ns1:ConnectionString", nsManager); if (connectionStringNode != null) { connectionStringNode.InnerText = Dts.Connections["SourceDataMart"].ConnectionString; } Finally we need to perform the deployment using the Execute XMLA command and check the returned XmlaResultCollection for errors before setting the Dts.TaskResult. XmlaResultCollection oResults = svr.Execute(xmlaCommand.InnerXml); // check for errors during deployment foreach (Microsoft.AnalysisServices.XmlaResult oResult in oResults) { foreach (Microsoft.AnalysisServices.XmlaMessage oMessage in oResult.Messages) { if ((oMessage.GetType().Name == "XmlaError")) { FireError(oMessage.Description); HadError = true; } } } If you are not familiar with XML programming, all this may all seem a bit daunting, but perceiver as the sample code is pretty short. If you would like the script to process the OLAP database, simply uncomment the lines in the vicinity of Process method. Of course, you can extend the script to perform your own custom processing and to even synchronize the database to a front-end server. Personally, I like to keep the deployment and processing separate as the code can become overly complex for support staff.If you want to know more, come see my session at the forthcoming SQLBits conference.

Read the article
More Fun With Math

- by PointsToShare

More Fun with Math The runaway student – three different ways of solving one problem Here is a problem I read in a Russian site: A student is running away. He is moving at 1 mph. Pursuing him are a lion, a tiger and his math teacher. The lion is 40 miles behind and moving at 6 mph. The tiger is 28 miles behind and moving at 4 mph. His math teacher is 30 miles behind and moving at 5 mph. Who will catch him first? Analysis Obviously we have a set of three problems. They are all basically the same, but the details are different. The problems are of the same class. Here is a little excursion into computer science. One of the things we strive to do is to create solutions for classes of problems rather than individual problems. In your daily routine, you call it re-usability. Not all classes of problems have such solutions. If a class has a general (re-usable) solution, it is called computable. Otherwise it is unsolvable. Within unsolvable classes, we may still solve individual (some but not all) problems, albeit with different approaches to each. Luckily the vast majority of our daily problems are computable, and the 3 problems of our runaway student belong to a computable class. So, let’s solve for the catch-up time by the math teacher, after all she is the most frightening. She might even make the poor runaway solve this very problem – perish the thought! Method 1 – numerical analysis. At 30 miles and 5 mph, it’ll take her 6 hours to come to where the student was to begin with. But by then the student has advanced by 6 miles. 6 miles require 6/5 hours, but by then the student advanced by another 6/5 of a mile as well. And so on and so forth. So what are we to do? One way is to write code and iterate it until we have solved it. But this is an infinite process so we’ll end up with an infinite loop. So what to do? We’ll use the principles of numerical analysis. Any calculator – your computer included – has a limited number of digits. A double floating point number is good for about 14 digits. Nothing can be computed at a greater accuracy than that. This means that we will not iterate ad infinidum, but rather to the point where 2 consecutive iterations yield the same result. When we do financial computations, we don’t even have to go that far. We stop at the 10th of a penny. It behooves us here to stop at a 10th of a second (100 milliseconds) and this will how we will avoid an infinite loop. Interestingly this alludes to the Zeno paradoxes of motion – in particular “Achilles and the Tortoise”. Zeno says exactly the same. To catch the tortoise, Achilles must always first come to where the tortoise was, but the tortoise keeps moving – hence Achilles will never catch the tortoise and our math teacher (or lion, or tiger) will never catch the student, or the policeman the thief. Here is my resolution to the paradox. The distance and time in each step are smaller and smaller, so the student will be caught. The only thing that is infinite is the iterative solution. The race is a convergent geometric process so the steps are diminishing, but each step in the solution takes the same amount of effort and time so with an infinite number of steps, we’ll spend an eternity solving it. This BTW is an original thought that I have never seen before. But I digress. Let’s simply write the code to solve the problem. To make sure that it runs everywhere, I’ll do it in JavaScript. function LongCatchUpTime(D, PV, FV) // D is Distance; PV is Pursuers Velocity; FV is Fugitive’ Velocity { var t = 0; var T = 0; var d = parseFloat(D); var pv = parseFloat (PV); var fv = parseFloat (FV); t = d / pv; while (t > 0.000001) //a 10th of a second is 1/36,000 of an hour, I used 1/100,000 { T = T + t; d = t * fv; t = d / pv; } return T; } By and large, the higher the Pursuer’s velocity relative to the fugitive, the faster the calculation. Solving this with the 10th of a second limit yields: 7.499999232000001 Method 2 – Geometric Series. Each step in the iteration above is smaller than the next. As you saw, we stopped iterating when the last step was small enough, small enough not to really matter. When we have a sequence of numbers in which the ratio of each number to its predecessor is fixed we call the sequence geometric. When we are looking at the sum of sequence, we call the sequence of sums series. Now let’s look at our student and teacher. The teacher runs 5 times faster than the student, so with each iteration the distance between them shrinks to a fifth of what it was before. This is a fixed ratio so we deal with a geometric series. We normally designate this ratio as q and when q is less than 1 (0 < q < 1) the sum of + … + is – 1) / (q – 1). When q is less than 1, it is easier to use ) / (1 - q). Now, the steps are 6 hours then 6/5 hours then 6/5*5 and so on, so q = 1/5. And the whole series is multiplied by 6. Also because q is less than 1 , 1/ diminishes to 0. So the sum is just / (1 - q). or 1/ (1 – 1/5) = 1 / (4/5) = 5/4. This times 6 yields 7.5 hours. We can now continue with some algebra and take it back to a simpler formula. This is arduous and I am not going to do it here. Instead let’s do some simpler algebra. Method 3 – Simple Algebra. If the time to capture the fugitive is T and the fugitive travels at 1 mph, then by the time the pursuer catches him he travelled additional T miles. Time is distance divided by speed, so…. (D + T)/V = T thus D + T = VT and D = VT – T = (V – 1)T and T = D/(V – 1) This “strangely” coincides with the solution we just got from the geometric sequence. This is simpler ad faster. Here is the corresponding code. function ShortCatchUpTime(D, PV, FV) { var d = parseFloat(D); var pv = parseFloat (PV); var fv = parseFloat (FV); return d / (pv - fv); } The code above, for both the iterative solution and the algebraic solution are actually for a larger class of problems. In our original problem the student’s velocity (speed) is 1 mph. In the code it may be anything as long as it is less than the pursuer’s velocity. As long as PV > FV, the pursuer will catch up. Here is the really general formula: T = D / (PV – FV) Finally, let’s run the program for each of the pursuers. It could not be worse. I know he’d rather be eaten alive than suffering through yet another math lesson. See the code run? Select “Catch Up Time” in www.mgsltns.com/games.htm The host is running on Unix, so the link is case sensitive. That’s All Folks

Read the article
SortedDictionary and SortedList

- by Simon Cooper

Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

Read the article
Emit Knowledge - social network for knowledge sharing

- by hajan

Emit Knowledge, as the words refer - it's a social network for emitting / sharing knowledge from users by users. Those who can benefit the most out of this network is perhaps all of YOU who have something to share with others and contribute to the knowledge world. I've been closely communicating with the core team of this very, very interesting, brand new social network (with specific purpose!) about the concept, idea and the vision they have for their product and I can say with a lot of confidence that this network has real potential to become something from which we will all benefit. I won't speak much about that and would prefer to give you link and try it yourself - http://www.emitknowledge.com Mainly, through the past few months I've been testing this network and it is getting improved all the time. The user experience is great, you can easily find out what you need and it follows some known patterns that are common for all social networks. They have some real good ideas and plans that are already under development for the next updates of their product. You can do micro blogging or you can do regular normal blogging… it’s up to you, and the way it works, it is seamless. Here is a short Question and Answers (QA) interview I made with the lead of the team, Marijan Nikolovski: 1. Can you please explain us briefly, what is Emit Knowledge? Emit Knowledge is a brand new knowledge based social network, delivering quality content from users to users. We believe that people’s knowledge, experience and professional thoughts compose quality content, worth sharing among millions around the world. Therefore, we created the platform that matches people’s need to share and gain knowledge in the most suitable and comfortable way. Easy to work with, Emit Knowledge lets you to smoothly craft and emit knowledge around the globe. 2. How 'old' is Emit Knowledge? In hamster’s years we are almost five years old start-up :). Just kidding. We’ve released our public beta about three months ago. Our official release date is 27 of June 2012. 3. How did you come up with this idea? Everything started from a simple idea to solve a complex problem. We’ve seen that the social web has become polluted with data and is on the right track to lose its base principles – socialization and common cause. That was our start point. We’ve gathered the team, drew some sketches and started to mind map the idea. After several idea refactoring’s Emit Knowledge was born. 4. Is there any competition out there in the market? Currently we don't have any competitors that share the same cause. What makes our platform different is the ideology that our product promotes and the functionalities that our platform offers for easy socialization based on interests and knowledge sharing. 5. What are the main technologies used to build Emit Knowledge? Emit Knowledge was built on a heterogeneous pallet of technologies. Currently, we have four of separation: UI – Built on ASP.NET MVC3 and Knockout.js; Messaging infrastructure – Build on top of RabbitMQ; Background services – Our in-house solution for job distribution, orchestration and processing; Data storage – Build on top of MongoDB; What are the main reasons you've chosen ASP.NET MVC? Since all of our team members are .NET engineers, the decision was very natural. ASP.NET MVC is the only Microsoft web stack that sticks to the HTTP behavioral standards. It is easy to work with, have a tiny learning curve and everyone who is familiar with the HTTP will understand its architecture and convention without any difficulties. 6. What are the main reasons for choosing ASP.NET MVC? Since all of our team members are .NET engineers, the decision was very natural. ASP.NET MVC is the only Microsoft web stack that sticks to the HTTP behavioral standards. It is easy to work with, have a tiny learning curve and everyone who is familiar with the HTTP will understand its architecture and convention without any difficulties. 7. Did you use some of the latest Microsoft technologies? If yes, which ones? Yes, we like to rock the cutting edge tech house. Currently we are using Microsoft’s latest technologies like ASP.NET MVC, Web API (work in progress) and the best for the last; we are utilizing Windows Azure IaaS to the bone. 8. Can you please tell us shortly, what would be the benefit of regular bloggers in other blogging platforms to join Emit Knowledge? Well, unless you are some of the smoking ace gurus whose blogs are followed by a large number of users, our platform offers knowledge based segregated community equipped with tools that will enable both current and future users to expand their relations and to self-promote in the community based on their activity and knowledge sharing. 10. I see you are working very intensively and there is already integration with some third-party services to make the process of sharing and emitting knowledge easier, which services did you integrate until now and what do you plan do to next? We have “reemit” functionality for internal sharing and we also support external services like: Twitter; LinkedIn; Facebook; For the regular bloggers we have an extra cream, Windows Live Writer support for easy blog posts emitting. 11. What should we expect next? Currently, we are working on a new fancy community feature. This means that we are going to support user groups to be formed. So for all existing communities and user groups out there, wait us a little bit, we are coming for rescue :). One of the top next features they are developing is the Community Feature. It means, if you have your own User Group, Community Group or any other Group on which you and your users are mostly blogging or sharing (emitting) knowledge in various ways, Emit Knowledge as a platform will help you have everything you need to promote your group, make new followers and host all the necessary stuff that you have had need of. I would invite you to try the network and start sharing knowledge in a way that will help you gather new followers and spread your knowledge faster, easier and in a more efficient way! Let’s Emit Knowledge!

Read the article
SQL SERVER – Weekly Series – Memory Lane – #050

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Executing Remote Stored Procedure – Calling Stored Procedure on Linked Server In this example we see two different methods of how to call Stored Procedures remotely. Connection Property of SQL Server Management Studio SSMS A very simple example of the how to build connection properties for SQL Server with the help of SSMS. Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE SQL Server has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic. T-SQL Script to Add Clustered Primary Key Jr. DBA asked me three times in a day, how to create Clustered Primary Key. I gave him following sample example. That was the last time he asked “How to create Clustered Primary Key to table?” 2008 2008 – TRIM() Function – User Defined Function SQL Server does not have functions which can trim leading or trailing spaces of any string at the same time. SQL does have LTRIM() and RTRIM() which can trim leading and trailing spaces respectively. SQL Server 2008 also does not have TRIM() function. User can easily use LTRIM() and RTRIM() together and simulate TRIM() functionality. http://www.youtube.com/watch?v=1-hhApy6MHM 2009 Earlier I have written two different articles on the subject Remove Bookmark Lookup. This article is as part 3 of original article. Please read the first two articles here before continuing reading this article. Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2 Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3 Interesting Observation – Query Hint – FORCE ORDER SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him. Queries Waiting for Memory Allocation to Execute In one of the recent projects, I was asked to create a report of queries that are waiting for memory allocation. The reason was that we were doubtful regarding whether the memory was sufficient for the application. The following query can be useful in similar cases. Queries that do not have to wait on a memory grant will not appear in the result set of following query. 2010 Quickest Way to Identify Blocking Query and Resolution – Dirty Solution As the title suggests, this is quite a dirty solution; it’s not as elegant as you expect. However, it works totally fine. Simple Explanation of Data Type Precedence While I was working on creating a question for SQL SERVER – SQL Quiz – The View, The Table and The Clustered Index Confusion, I had actually created yet another question along with this question. However, I felt that the one which is posted on the SQL Quiz is much better than this one because what makes that more challenging question is that it has a multiple answer. Encrypted Stored Procedure and Activity Monitor I recently had received questionable if any stored procedure is encrypted can we see its definition in Activity Monitor.Answer is - No. Let us do a quick test. Let us create following Stored Procedure and then launch the Activity Monitor and check the text. Indexed View always Use Index on Table A single table can have maximum 249 non clustered indexes and 1 clustered index. In SQL Server 2008, a single table can have maximum 999 non clustered indexes and 1 clustered index. It is widely believed that a table can have only 1 clustered index, and this belief is true. I have some questions for all of you. Let us assume that I am creating view from the table itself and then create a clustered index on it. In my view, I am selecting the complete table itself. 2011 Detecting Database Case Sensitive Property using fn_helpcollations() I received a question on how to determine the case sensitivity of the database. The quick answer to this is to identify the collation of the database and check the properties of the collation. I have previously written how one can identify database collation. Once you have figured out the collation of the database, you can put that in the WHERE condition of the following T-SQL and then check the case sensitivity from the description. Server Side Paging in SQL Server CE (Compact Edition) SQL Server Denali is coming up with new T-SQL of Paging. I have written about the same earlier.SQL SERVER – Server Side Paging in SQL Server Denali – A Better Alternative, SQL SERVER – Server Side Paging in SQL Server Denali Performance Comparison, SQL SERVER – Server Side Paging in SQL Server Denali – Part2 What is very interesting is that SQL Server CE 4.0 have the same feature introduced. Here is the quick example of the same. To run the script in the example, you will have to do installWebmatrix 4.0 and download sample database. Once done you can run following script. Why I am Going to Attend PASS Summit Unite 2011 The four-day event will be marked by a lot of learning, sharing, and networking, which will help me increase both my knowledge and contacts. Every year, PASS Summit provides me a golden opportunity to build my network as well as to identify and meet potential customers or employees. 2012 Manage Help Settings – CTRL + ALT + F1 This is very interesting read as my daughter once accidently came across a screen in SQL Server Management Studio. It took me 2-3 minutes to figure out how she has created the same screen. Recover the Accidentally Renamed Table “I accidentally renamed table in my SSMS. I was scrolling very fast and I made mistakes. It was either because I double clicked or clicked on F2 (shortcut key for renaming). However, I have made the mistake and now I have no idea how to fix this. If you have renamed the table, I think you pretty much is out of luck. Here are few things which you can do which can give you an idea about what your table name can be if you are lucky. Identify Numbers of Non Clustered Index on Tables for Entire Database Here is the script which will give you numbers of non clustered indexes on any table in entire database. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #029 – Video Here is the complete complete script which I have used in the SQL in Sixty Seconds Video. Thanks Harsh for important Tip in the comment. http://www.youtube.com/watch?v=3kDHC_Tjrns Advanced Data Quality Services with Melissa Data – Azure Data Market For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
J2EE Applications, SPARC T4, Solaris Containers, and Resource Pools

- by user12620111

I've obtained a substantial performance improvement on a SPARC T4-2 Server running a J2EE Application Server Cluster by deploying the cluster members into Oracle Solaris Containers and binding those containers to cores of the SPARC T4 Processor. This is not a surprising result, in fact, it is consistent with other results that are available on the Internet. See the "references", below, for some examples. Nonetheless, here is a summary of my configuration and results. (1.0) Before deploying a J2EE Application Server Cluster into a virtualized environment, many decisions need to be made. I'm not claiming that all of the decisions that I have a made will work well for every environment. In fact, I'm not even claiming that all of the decisions are the best possible for my environment. I'm only claiming that of the small sample of configurations that I've tested, this is the one that is working best for me. Here are some of the decisions that needed to be made: (1.1) Which virtualization option? There are several virtualization options and isolation levels that are available. Options include: Hard partitions: Dynamic Domains on Sun SPARC Enterprise M-Series Servers Hypervisor based virtualization such as Oracle VM Server for SPARC (LDOMs) on SPARC T-Series Servers OS Virtualization using Oracle Solaris Containers Resource management tools in the Oracle Solaris OS to control the amount of resources an application receives, such as CPU cycles, physical memory, and network bandwidth. Oracle Solaris Containers provide the right level of isolation and flexibility for my environment. To borrow some words from my friends in marketing, "The SPARC T4 processor leverages the unique, no-cost virtualization capabilities of Oracle Solaris Zones" (1.2) How to associate Oracle Solaris Containers with resources? There are several options available to associate containers with resources, including (a) resource pool association (b) dedicated-cpu resources and (c) capped-cpu resources. I chose to create resource pools and associate them with the containers because I wanted explicit control over the cores and virtual processors. (1.3) Cluster Topology? Is it best to deploy (a) multiple application servers on one node, (b) one application server on multiple nodes, or (c) multiple application servers on multiple nodes? After a few quick tests, it appears that one application server per Oracle Solaris Container is a good solution. (1.4) Number of cluster members to deploy? I chose to deploy four big 64-bit application servers. I would like go back a test many 32-bit application servers, but that is left for another day. (2.0) Configuration tested. (2.1) I was using a SPARC T4-2 Server which has 2 CPU and 128 virtual processors. To understand the physical layout of the hardware on Solaris 10, I used the OpenSolaris psrinfo perl script available at http://hub.opensolaris.org/bin/download/Community+Group+performance/files/psrinfo.pl: test# ./psrinfo.pl -pv The physical processor has 8 cores and 64 virtual processors (0-63) The core has 8 virtual processors (0-7) The core has 8 virtual processors (8-15) The core has 8 virtual processors (16-23) The core has 8 virtual processors (24-31) The core has 8 virtual processors (32-39) The core has 8 virtual processors (40-47) The core has 8 virtual processors (48-55) The core has 8 virtual processors (56-63) SPARC-T4 (chipid 0, clock 2848 MHz) The physical processor has 8 cores and 64 virtual processors (64-127) The core has 8 virtual processors (64-71) The core has 8 virtual processors (72-79) The core has 8 virtual processors (80-87) The core has 8 virtual processors (88-95) The core has 8 virtual processors (96-103) The core has 8 virtual processors (104-111) The core has 8 virtual processors (112-119) The core has 8 virtual processors (120-127) SPARC-T4 (chipid 1, clock 2848 MHz) (2.2) The "before" test: without processor binding. I started with a 4-member cluster deployed into 4 Oracle Solaris Containers. Each container used a unique gigabit Ethernet port for HTTP traffic. The containers shared a 10 gigabit Ethernet port for JDBC traffic. (2.3) The "after" test: with processor binding. I ran one application server in the Global Zone and another application server in each of the three non-global zones (NGZ): (3.0) Configuration steps. The following steps need to be repeated for all three Oracle Solaris Containers. (3.1) Stop AppServers from the BUI. (3.2) Stop the NGZ. test# ssh test-z2 init 5 (3.3) Enable resource pools: test# svcadm enable pools (3.4) Create the resource pool: test# poolcfg -dc 'create pool pool-test-z2' (3.5) Create the processor set: test# poolcfg -dc 'create pset pset-test-z2' (3.6) Specify the maximum number of CPU's that may be addd to the processor set: test# poolcfg -dc 'modify pset pset-test-z2 (uint pset.max=32)' (3.7) bash syntax to add Virtual CPUs to the processor set: test# (( i = 64 )); while (( i < 96 )); do poolcfg -dc "transfer to pset pset-test-z2 (cpu $i)"; (( i = i + 1 )) ; done (3.8) Associate the resource pool with the processor set: test# poolcfg -dc 'associate pool pool-test-z2 (pset pset-test-z2)' (3.9) Tell the zone to use the resource pool that has been created: test# zonecfg -z test-z1 set pool=pool-test-z2 (3.10) Boot the Oracle Solaris Container test# zoneadm -z test-z2 boot (3.11) Save the configuration to /etc/pooladm.conf test# pooladm -s (4.0) Results. Using the resource pools improves both throughput and response time: (5.0) References: System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones Capitalizing on large numbers of processors with WebSphere Portal on Solaris WebSphere Application Server and T5440 (Dileep Kumar's Weblog) http://www.brendangregg.com/zones.html Reuters Market Data System, RMDS 6 Multiple Instances (Consolidated), Performance Test Results in Solaris, Containers/Zones Environment on Sun Blade X6270 by Amjad Khan, 2009.

Read the article
Premature-Optimization and Performance Anxiety

- by James Michael Hare

While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about. After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations. Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks. In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process. At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above. I'm not talking about valid performance decisions. There are decisions one should make when designing and writing an application that are valid performance decisions. Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search). But these in my mind are macro optimizations. The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking. This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it. I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer. I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered. In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance. This is why in many cases such early code-bases were very hard to maintain. I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead. Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications. Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made. But I would strongly advice against such optimization because of it's cost. Many folks that are performing an optimization think it's always a win-win. That they're simply adding speed to the application, what could possibly be wrong with that? What they don't realize is the cost of their choice. For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application. It will become so fragile over time that maintenance will become a nightmare. I've seen such applications in places I have worked. There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance. Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation). To me this was almost a joke. Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety. The only time twice is really that much slower is when once was too slow to begin with. Think about it. 2 minutes is slow as a response time because 1 minute is slow. But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial! The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations). To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can. I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious. But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization. I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true. If you're a beginner, resist the urge to optimize at all costs. And if you are an expert, delay that decision. As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient. Chances are it will be network, database, or disk hits that will be your slow-down, not your code. As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact. Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

Read the article
More Fun with C# Iterators and Generators

- by James Michael Hare

In my last post, I talked quite a bit about iterators and how they can be really powerful tools for filtering a list of items down to a subset of items. This had both pros and cons over returning a full collection, which, in summary, were: Pros: If traversal is only partial, does not have to visit rest of collection. If evaluation method is costly, only incurs that cost on elements visited. Adds little to no garbage collection pressure. Cons: Very slight performance impact if you know caller will always consume all items in collection. And as we saw in the last post, that con for the cost was very, very small and only really became evident on very tight loops consuming very large lists completely. One of the key items to note, though, is the garbage! In the traditional (return a new collection) method, if you have a 1,000,000 element collection, and wish to transform or filter it in some way, you have to allocate space for that copy of the collection. That is, say you have a collection of 1,000,000 items and you want to double every item in the collection. Well, that means you have to allocate a collection to hold those 1,000,000 items to return, which is a lot especially if you are just going to use it once and toss it. Iterators, though, don't have this problem. Each time you visit the node, it would return the doubled value of the node (in this example) and not allocate a second collection of 1,000,000 doubled items. Do you see the distinction? In both cases, we're consuming 1,000,000 items. But in one case we pass back each doubled item which is just an int (for example's sake) on the stack and in the other case, we allocate a list containing 1,000,000 items which then must be garbage collected. So iterators in C# are pretty cool, eh? Well, here's one more thing a C# iterator can do that a traditional "return a new collection" transformation can't! It can return **unbounded** collections! I know, I know, that smells a lot like an infinite loop, eh? Yes and no. Basically, you're relying on the caller to put the bounds on the list, and as long as the caller doesn't you keep going. Consider this example: public static class Fibonacci { // returns the infinite fibonacci sequence public static IEnumerable<int> Sequence() { int iteration = 0; int first = 1; int second = 1; int current = 0; while (true) { if (iteration++ < 2) { current = 1; } else { current = first + second; second = first; first = current; } yield return current; } } } Whoa, you say! Yes, that's an infinite loop! What the heck is going on there? Yes, that was intentional. Would it be better to have a fibonacci sequence that returns only a specific number of items? Perhaps, but that wouldn't give you the power to defer the execution to the caller. The beauty of this function is it is as infinite as the sequence itself! The fibonacci sequence is unbounded, and so is this method. It will continue to return fibonacci numbers for as long as you ask for them. Now that's not something you can do with a traditional method that would return a collection of ints representing each number. In that case you would eventually run out of memory as you got to higher and higher numbers. This method, though, never runs out of memory. Now, that said, you do have to know when you use it that it is an infinite collection and bound it appropriately. Fortunately, Linq provides a lot of these extension methods for you! Let's say you only want the first 10 fibonacci numbers: foreach(var fib in Fibonacci.Sequence().Take(10)) { Console.WriteLine(fib); } Or let's say you only want the fibonacci numbers that are less than 100: foreach(var fib in Fibonacci.Sequence().TakeWhile(f => f < 100)) { Console.WriteLine(fib); } So, you see, one of the nice things about iterators is their power to work with virtually any size (even infinite) collections without adding the garbage collection overhead of making new collections. You can also do fun things like this to make a more "fluent" interface for for loops: // A set of integer generator extension methods public static class IntExtensions { // Begins counting to inifity, use To() to range this. public static IEnumerable<int> Every(this int start) { // deliberately avoiding condition because keeps going // to infinity for as long as values are pulled. for (var i = start; ; ++i) { yield return i; } } // Begins counting to infinity by the given step value, use To() to public static IEnumerable<int> Every(this int start, int byEvery) { // deliberately avoiding condition because keeps going // to infinity for as long as values are pulled. for (var i = start; ; i += byEvery) { yield return i; } } // Begins counting to inifity, use To() to range this. public static IEnumerable<int> To(this int start, int end) { for (var i = start; i <= end; ++i) { yield return i; } } // Ranges the count by specifying the upper range of the count. public static IEnumerable<int> To(this IEnumerable<int> collection, int end) { return collection.TakeWhile(item => item <= end); } } Note that there are two versions of each method. One that starts with an int and one that starts with an IEnumerable<int>. This is to allow more power in chaining from either an existing collection or from an int. This lets you do things like: // count from 1 to 30 foreach(var i in 1.To(30)) { Console.WriteLine(i); } // count from 1 to 10 by 2s foreach(var i in 0.Every(2).To(10)) { Console.WriteLine(i); } // or, if you want an infinite sequence counting by 5s until something inside breaks you out... foreach(var i in 0.Every(5)) { if (someCondition) { break; } ... } Yes, those are kinda play functions and not particularly useful, but they show some of the power of generators and extension methods to form a fluid interface. So what do you think? What are some of your favorite generators and iterators?

Read the article
What Makes a Good Design Critic? CHI 2010 Panel Review

- by Applications User Experience

Author: Daniel Schwartz, Senior Interaction Designer, Oracle Applications User Experience Oracle Applications UX Chief Evangelist Patanjali Venkatacharya organized and moderated an innovative and stimulating panel discussion titled "What Makes a Good Design Critic? Food Design vs. Product Design Criticism" at CHI 2010, the annual ACM Conference on Human Factors in Computing Systems. The panelists included Janice Rohn, VP of User Experience at Experian; Tami Hardeman, a food stylist; Ed Seiber, a restaurant architect and designer; Jonathan Kessler, a food critic and writer at the Atlanta Journal-Constitution; and Larry Powers, Chef de Cuisine at Shaun's restaurant in Atlanta, Georgia. Building off the momentum of his highly acclaimed panel at CHI 2009 on what interaction design can learn from food design (for which I was on the other side as a panelist), Venkatacharya brought together new people with different roles in the restaurant and software interaction design fields. The session was also quite delicious -- but more on that later. Criticism, as it applies to food and product or interaction design, was the tasty topic for this forum and showed that strong parallels exist between food and interaction design criticism. Figure 1. The panelists in discussion: (left to right) Janice Rohn, Ed Seiber, Tami Hardeman, and Jonathan Kessler. The panelists had great insights to share from their respective fields, and they enthusiastically discussed as if they were at a casual collegial dinner. Jonathan Kessler stated that he prefers to have one professional critic's opinion in general than a large sampling of customers, however, "Web sites like Yelp get users excited by the collective approach. People are attracted to things desired by so many." Janice Rohn added that this collective desire was especially true for users of consumer products. Ed Seiber remarked that while people looked to the popular view for their target tastes and product choices, "professional critics like John [Kessler] still hold a big weight on public opinion." Chef Powers indicated that chefs take in feedback from all sources, adding, "word of mouth is very powerful. We also look heavily at the sales of the dishes to see what's moving; what's selling and thus successful." Hearing this discussion validates our design work at Oracle in that we listen to our users (our diners) and industry feedback (our critics) to ensure an optimal user experience of our products. Rohn considers that restaurateur Danny Meyer's book, Setting the Table: The Transforming Power of Hospitality in Business, which is about creating successful restaurant experiences, has many applicable parallels to user experience design. Meyer actually argues that the customer is not always right, but that "they must always feel heard." Seiber agreed, but noted "customers are not designers," and while designers need to listen to customer feedback, it is the designer's job to synthesize it. Seiber feels it's the critic's job to point out when something is missing or not well-prioritized. In interaction design, our challenges are quite similar, if not parallel. Software tasks are like puzzles that are in search of a solution on how to be best completed. As a food stylist, Tami Hardeman has the demanding and challenging task of presenting food to be as delectable as can be. To present food in its best light requires a lot of creativity and insight into consumer tastes. It's no doubt then that this former fashion stylist came up with the ultimate catch phrase to capture the emotion that clients want to draw from their users: "craveability." The phrase was a hit with the audience and panelists alike. Sometime later in the discussion, Seiber remarked, "designers strive to apply craveability to products, and I do so for restaurants in my case." Craveabilty is also very applicable to interaction design. Creating straightforward and smooth workflows for users of Oracle Applications is a primary goal for my colleagues. We want our users to really enjoy working with our products where it makes them more efficient and better at their jobs. That's our "craveability." Patanjali Venkatacharya asked the panel, "if a design's "craveability" appeals to some cultures but not to others, then what is the impact to the food or product design process?" Rohn stated that "taste is part nature and part nurture" and that the design must take the full context of a product's usage into consideration. Kessler added, "good design is about understanding the context" that the experience necessitates. Seiber remarked how important seat comfort is for diners and how the quality of seating will add so much to the complete dining experience. Sometimes if these non-food factors are not well executed, they can also take away from an otherwise pleasant dining experience. Kessler recounted a time when he was dining at a restaurant that actually had very good food, but the photographs hanging on all the walls did not fit in with the overall décor and created a negative overall dining experience. While the tastiness of the food is critical to a restaurant's success, it is a captivating complete user experience, as in interaction design, which will keep customers coming back and ultimately making the restaurant a hit. Figure 2. Patnajali Venkatacharya enjoyed the Sardian flatbread salad. As a surprise Chef Powers brought out a signature dish from Shaun's restaurant for all the panelists to sample and critique. The Sardinian flatbread dish showcased Atlanta's taste for fresh and local produce and cheese at its finest as a salad served on a crispy flavorful flat bread. Hardeman said it could be photographed from any angle, a high compliment coming from a food stylist. Seiber really enjoyed the colors that the dish brought together and thought it would be served very well in a casual restaurant on a summer's day. The panel really appreciated the taste and quality of the different components and how the rosemary brought all the flavors together. Seiber remarked that "a lot of effort goes into the appearance of simplicity." Rohn indicated that the same notion holds true with software user interface design. A tremendous amount of work goes into crafting straightforward interfaces, including user research, prototyping, design iterations, and usability studies. Design criticism for food and software interfaces clearly share many similarities. Both areas value expert opinions and user feedback. Both areas understand the importance of great design needing to work well in its context. Last but not least, both food and interaction design criticism value "craveability" and how having users excited about experiencing and enjoying the designs is an important goal. Now if we can just improve the taste of software user interfaces, people may choose to dine on their enterprise applications over a fresh organic salad.

Read the article
Getting your bearings and defining the project objective

- by johndoucette

I wrote this two years ago and thought it was worth posting… Some may think this is a daunting task and some may even say “what a waste of time” and want to open MS Project and start typing out tasks because someone asked for an estimate and a task list. Hell, maybe you even use Excel and pump out a spreadsheet with some real scientific formula for guessing how long it will take to code a bunch of classes. However, this short exercise will provide the basis for the entire project, whether small or large and be a great friend when communicating to anyone on your team or even your client. I call this the Project Brief. If you find yourself going beyond a single page, then you must decompose the sections and summarize your findings so there is a complete and clear picture of the project you are working on in a relatively short statement. Here is a great quote from the PMBOK (Project Management Body of Knowledge) relative to what a project is; A project is a temporary endeavor undertaken to create a unique product, service or result. With this in mind, the project brief should encompass the entirety (objective) of the endeavor in its explanation and what it will take (goals) to create the product, service or result (deliverables). Normally the process of identifying the project objective is done during the first stage of a project called the Project Kickoff, but you can perform this very important step anytime to help you get a bearing. There are many more parts to helping a project stay on course, but this is usually the foundation where it can be grounded on. Through a series of 3 exercises, you should be able to come up with the objective, goals and deliverables on your project. Follow these steps, and in no time (about ½ hour), you will have the foundation of your project plan. (See examples below) Exercise 1 – Objectives Begin with the end in mind. Think about your project in business terms with a couple things to help you understand the objective; Reference the business benefit in terms of cost, speed and / or quality, Provide a higher level of what the outcome will look like (future sense) It should be non-measurable, that’s what the goals are all about The output should be a single paragraph with three sentences and take 10 minutes to write. *Typically, agreement must be reached on the objectives of the project before you would proceed to the next steps of the project. Exercise 2 – Goals A project goal is a statement that answers questions about who, what, why, where and when. A good project goal statement; Answers the five “W” questions for the project Is measurable in each of its parts Is published and agreed on by all the owners This helps the Project Manager receive confirmation on defining the project target. Using the established project objective done in the first exercise, think about the things it will take to get the job done. Think about tangible activities which are the top level tasks in a typical Work Breakdown Structure (WBS). The overall goal statement plus all the deliverables (next exercise) can be seen as the project team’s contract with the project owners. Write 3 - 5 goals in about 10 minutes. You should not write the words “Who, what, why, where and when, but merely be able to answer the questions when you read a goal. Exercise 3 – Deliverables Every project creates some type of output and these outputs are called deliverables. There are two classes of deliverables; Internal – produced for project team members to meet their goals External – produced for project owners to meet their expectations The list you enter here provides a checklist for the team’s delivery and/or is a statement of all the expectations of the project owners. Here are some typical project deliverables; Product and product documentation End product/system Requirements/feature documents Installation guides Demo/prototype System design documents User guides/help files Plans Project plan Training plan Conversion/installation/delivery plan Test plans Documentation plan Communication plan Reports and general documentation Progress reports System acceptance tests Outstanding bug list Procedures Risk and issue logs Project history Deliverables should go with each of the goals. Have 3-5 deliverables for each goal. When you are done, you will have established a great foundation for the clarity of your project. This exercise can take some time, but with practice, you should be able to whip this one out in 10 minutes as well, especially if you are intimate with an ongoing project. Samples Objective [Client] is implementing a series of MOSS sites to support external public (Internet), internal employee (Intranet) and an external secure (password protected Internet) applications. This project will focus on the public-facing web site and will provide [Client] with architectural recommendations based on the current design being done by their design partner [Partner] and the internal Content Team. In addition, it will provide [Client] with a development plan and confidence they need to deploy a world class public Internet website. Goals 1. [Consultant] will provide technical guidance and set project team expectations for the implementation of the MOSS Internet site based on provided features/functions within three weeks. 2. [Consultant] will understand phase 2 secure password-protected Internet site design and provide recommendations. Deliverables 1.1 Public Internet (unsecure) Architectural Recommendation Plan 1.2 Physical Site construction Work Breakdown Structure and plan (Time, cost and resources needed) 2.1 Two Factor authentication recommendation document Objective [Client] is currently using an application developed by [Consultant] many years ago called "XXX". This application, although functional, does not meet their new updated business requirements and contains a few defects which [Client] has developed work-around processes. [Client] would like to have a "new and improved" system to support their membership management needs by expanding membership and subscription capabilities, provide accounting integration with internal (GL) and external (VeriSign) systems, and implement hooks to the current CRM solution. This effort will take place through a series of phases, beginning with envisioning. Goals 1. Through discussions with users, [Consultant] will discover current issues/bugs which need to be resolved which must meet the current functionality requirements within three weeks. 2. [Consultant] will gather requirements from the users about what is "needed" vs. "what they have" for enhancements and provide a high level document supporting their needs. 3. [Consultant] will meet with the team members through a series of meetings and help define the overall project plan to deliver a new and improved solution. Deliverables 1.1 Prioritized list of Current application issues/bugs that need to be resolved 1.2 Provide a resolution plan on the issues/bugs identified in the current application 1.3 Risk Assessment Document 2.1 Deliver a Requirements Document showing high-level [Client] needs for the new XXX application. · New feature functionality not in the application today · Existing functionality that will remain in the new functionality 2.2 Reporting Requirements Document 3.1 A Project Plan showing the deliverables and cost for the next (second) phase of this project. 3.2 A Statement of Work for the next (second) phase of this project. 3.3 An Estimate of any work that would need to follow the second phase.

Read the article
T-SQL Tuesday #31 - Logging Tricks with CONTEXT_INFO

- by Most Valuable Yak (Rob Volk)

This month's T-SQL Tuesday is being hosted by Aaron Nelson [b | t], fellow Atlantan (the city in Georgia, not the famous sunken city, or the resort in the Bahamas) and covers the topic of logging (the recording of information, not the harvesting of trees) and maintains the fine T-SQL Tuesday tradition begun by Adam Machanic [b | t] (the SQL Server guru, not the guy who fixes cars, check the spelling again, there will be a quiz later). This is a trick I learned from Fernando Guerrero [b | t] waaaaaay back during the PASS Summit 2004 in sunny, hurricane-infested Orlando, during his session on Secret SQL Server (not sure if that's the correct title, and I haven't used parentheses in this paragraph yet). CONTEXT_INFO is a neat little feature that's existed since SQL Server 2000 and perhaps even earlier. It lets you assign data to the current session/connection, and maintains that data until you disconnect or change it. In addition to the CONTEXT_INFO() function, you can also query the context_info column in sys.dm_exec_sessions, or even sysprocesses if you're still running SQL Server 2000, if you need to see it for another session. While you're limited to 128 bytes, one big advantage that CONTEXT_INFO has is that it's independent of any transactions. If you've ever logged to a table in a transaction and then lost messages when it rolled back, you can understand how aggravating it can be. CONTEXT_INFO also survives across multiple SQL batches (GO separators) in the same connection, so for those of you who were going to suggest "just log to a table variable, they don't get rolled back": HA-HA, I GOT YOU! Since GO starts a new batch all variable declarations are lost. Here's a simple example I recently used at work. I had to test database mirroring configurations for disaster recovery scenarios and measure the network throughput. I also needed to log how long it took for the script to run and include the mirror settings for the database in question. I decided to use AdventureWorks as my database model, and Adam Machanic's Big Adventure script to provide a fairly large workload that's repeatable and easily scalable. My test would consist of several copies of AdventureWorks running the Big Adventure script while I mirrored the databases (or not). Since Adam's script contains several batches, I decided CONTEXT_INFO would have to be used. As it turns out, I only needed to grab the start time at the beginning, I could get the rest of the data at the end of the process. The code is pretty small: declare @time binary(128)=cast(getdate() as binary(8)) set context_info @time ... rest of Big Adventure code ... go use master; insert mirror_test(server,role,partner,db,state,safety,start,duration) select @@servername, mirroring_role_desc, mirroring_partner_instance, db_name(database_id), mirroring_state_desc, mirroring_safety_level_desc, cast(cast(context_info() as binary(8)) as datetime), datediff(s,cast(cast(context_info() as binary(8)) as datetime),getdate()) from sys.database_mirroring where db_name(database_id) like 'Adv%'; I declared @time as a binary(128) since CONTEXT_INFO is defined that way. I couldn't convert GETDATE() to binary(128) as it would pad the first 120 bytes as 0x00. To keep the CAST functions simple and avoid using SUBSTRING, I decided to CAST GETDATE() as binary(8) and let SQL Server do the implicit conversion. It's not the safest way perhaps, but it works on my machine. :) As I mentioned earlier, you can query system views for sessions and get their CONTEXT_INFO. With a little boilerplate code this can be used to monitor long-running procedures, in case you need to kill a process, or are just curious how long certain parts take. In this example, I added code to Adam's Big Adventure script to set CONTEXT_INFO messages at strategic places I want to monitor. (His code is in UPPERCASE as it was in the original, mine is all lowercase): declare @msg binary(128) set @msg=cast('Altering bigProduct.ProductID' as binary(128)) set context_info @msg go ALTER TABLE bigProduct ALTER COLUMN ProductID INT NOT NULL GO set context_info 0x0 go declare @msg1 binary(128) set @msg1=cast('Adding pk_bigProduct Constraint' as binary(128)) set context_info @msg1 go ALTER TABLE bigProduct ADD CONSTRAINT pk_bigProduct PRIMARY KEY (ProductID) GO set context_info 0x0 go declare @msg2 binary(128) set @msg2=cast('Altering bigTransactionHistory.TransactionID' as binary(128)) set context_info @msg2 go ALTER TABLE bigTransactionHistory ALTER COLUMN TransactionID INT NOT NULL GO set context_info 0x0 go declare @msg3 binary(128) set @msg3=cast('Adding pk_bigTransactionHistory Constraint' as binary(128)) set context_info @msg3 go ALTER TABLE bigTransactionHistory ADD CONSTRAINT pk_bigTransactionHistory PRIMARY KEY NONCLUSTERED(TransactionID) GO set context_info 0x0 go declare @msg4 binary(128) set @msg4=cast('Creating IX_ProductId_TransactionDate Index' as binary(128)) set context_info @msg4 go CREATE NONCLUSTERED INDEX IX_ProductId_TransactionDate ON bigTransactionHistory(ProductId,TransactionDate) INCLUDE(Quantity,ActualCost) GO set context_info 0x0 This doesn't include the entire script, only those portions that altered a table or created an index. One annoyance is that SET CONTEXT_INFO requires a literal or variable, you can't use an expression. And since GO starts a new batch I need to declare a variable in each one. And of course I have to use CAST because it won't implicitly convert varchar to binary. And even though context_info is a nullable column, you can't SET CONTEXT_INFO NULL, so I have to use SET CONTEXT_INFO 0x0 to clear the message after the statement completes. And if you're thinking of turning this into a UDF, you can't, although a stored procedure would work. So what does all this aggravation get you? As the code runs, if I want to see which stage the session is at, I can run the following (assuming SPID 51 is the one I want): select CAST(context_info as varchar(128)) from sys.dm_exec_sessions where session_id=51 Since SQL Server 2005 introduced the new system and dynamic management views (DMVs) there's not as much need for tagging a session with these kinds of messages. You can get the session start time and currently executing statement from them, and neatly presented if you use Adam's sp_whoisactive utility (and you absolutely should be using it). Of course you can always use xp_cmdshell, a CLR function, or some other tricks to log information outside of a SQL transaction. All the same, I've used this trick to monitor long-running reports at a previous job, and I still think CONTEXT_INFO is a great feature, especially if you're still using SQL Server 2000 or want to supplement your instrumentation. If you'd like an exercise, consider adding the system time to the messages in the last example, and an automated job to query and parse it from the system tables. That would let you track how long each statement ran without having to run Profiler. #TSQL2sDay

Read the article
Nashorn, the rhino in the room

- by costlow

Nashorn is a new runtime within JDK 8 that allows developers to run code written in JavaScript and call back and forth with Java. One advantage to the Nashorn scripting engine is that is allows for quick prototyping of functionality or basic shell scripts that use Java libraries. The previous JavaScript runtime, named Rhino, was introduced in JDK 6 (released 2006, end of public updates Feb 2013). Keeping tradition amongst the global developer community, "Nashorn" is the German word for rhino. The Java platform and runtime is an intentional home to many languages beyond the Java language itself. OpenJDK’s Da Vinci Machine helps coordinate work amongst language developers and tool designers and has helped different languages by introducing the Invoke Dynamic instruction in Java 7 (2011), which resulted in two major benefits: speeding up execution of dynamic code, and providing the groundwork for Java 8’s lambda executions. Many of these improvements are discussed at the JVM Language Summit, where language and tool designers get together to discuss experiences and issues related to building these complex components. There are a number of benefits to running JavaScript applications on JDK 8’s Nashorn technology beyond writing scripts quickly: Interoperability with Java and JavaScript libraries. Scripts do not need to be compiled. Fast execution and multi-threading of JavaScript running in Java’s JRE. The ability to remotely debug applications using an IDE like NetBeans, Eclipse, or IntelliJ (instructions on the Nashorn blog). Automatic integration with Java monitoring tools, such as performance, health, and SIEM. In the remainder of this blog post, I will explain how to use Nashorn and the benefit from those features. Nashorn execution environment The Nashorn scripting engine is included in all versions of Java SE 8, both the JDK and the JRE. Unlike Java code, scripts written in nashorn are interpreted and do not need to be compiled before execution. Developers and users can access it in two ways: Users running JavaScript applications can call the binary directly:jre8/bin/jjs This mechanism can also be used in shell scripts by specifying a shebang like #!/usr/bin/jjs Developers can use the API and obtain a ScriptEngine through:ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn"); When using a ScriptEngine, please understand that they execute code. Avoid running untrusted scripts or passing in untrusted/unvalidated inputs. During compilation, consider isolating access to the ScriptEngine and using Type Annotations to only allow @Untainted String arguments. One noteworthy difference between JavaScript executed in or outside of a web browser is that certain objects will not be available. For example when run outside a browser, there is no access to a document object or DOM tree. Other than that, all syntax, semantics, and capabilities are present. Examples of Java and JavaScript The Nashorn script engine allows developers of all experience levels the ability to write and run code that takes advantage of both languages. The specific dialect is ECMAScript 5.1 as identified by the User Guide and its standards definition through ECMA international. In addition to the example below, Benjamin Winterberg has a very well written Java 8 Nashorn Tutorial that provides a large number of code samples in both languages. Basic Operations A basic Hello World application written to run on Nashorn would look like this: #!/usr/bin/jjs print("Hello World"); The first line is a standard script indication, so that Linux or Unix systems can run the script through Nashorn. On Windows where scripts are not as common, you would run the script like: jjs helloWorld.js. Receiving Arguments In order to receive program arguments your jjs invocation needs to use the -scripting flag and a double-dash to separate which arguments are for jjs and which are for the script itself:jjs -scripting print.js -- "This will print" #!/usr/bin/jjs var whatYouSaid = $ARG.length==0 ? "You did not say anything" : $ARG[0] print(whatYouSaid); Interoperability with Java libraries (including 3rd party dependencies) Another goal of Nashorn was to allow for quick scriptable prototypes, allowing access into Java types and any libraries. Resources operate in the context of the script (either in-line with the script or as separate threads) so if you open network sockets and your script terminates, those sockets will be released and available for your next run. Your code can access Java types the same as regular Java classes. The “import statements” are written somewhat differently to accommodate for language. There is a choice of two styles: For standard classes, just name the class: var ServerSocket = java.net.ServerSocket For arrays or other items, use Java.type: var ByteArray = Java.type("byte[]")You could technically do this for all. The same technique will allow your script to use Java types from any library or 3rd party component and quickly prototype items. Building a user interface One major difference between JavaScript inside and outside of a web browser is the availability of a DOM object for rendering views. When run outside of the browser, JavaScript has full control to construct the entire user interface with pre-fabricated UI controls, charts, or components. The example below is a variation from the Nashorn and JavaFX guide to show how items work together. Nashorn has a -fx flag to make the user interface components available. With the example script below, just specify: jjs -fx -scripting fx.js -- "My title" #!/usr/bin/jjs -fx var Button = javafx.scene.control.Button; var StackPane = javafx.scene.layout.StackPane; var Scene = javafx.scene.Scene; var clickCounter=0; $STAGE.title = $ARG.length>0 ? $ARG[0] : "You didn't provide a title"; var button = new Button(); button.text = "Say 'Hello World'"; button.onAction = myFunctionForButtonClicking; var root = new StackPane(); root.children.add(button); $STAGE.scene = new Scene(root, 300, 250); $STAGE.show(); function myFunctionForButtonClicking(){ var text = "Click Counter: " + clickCounter; button.setText(text); clickCounter++; print(text); } For a more advanced post on using Nashorn to build a high-performing UI, see JavaFX with Nashorn Canvas example. Interoperable with frameworks like Node, Backbone, or Facebook React The major benefit of any language is the interoperability gained by people and systems that can read, write, and use it for interactions. Because Nashorn is built for the ECMAScript specification, developers familiar with JavaScript frameworks can write their code and then have system administrators deploy and monitor the applications the same as any other Java application. A number of projects are also running Node applications on Nashorn through Project Avatar and the supported modules. In addition to the previously mentioned Nashorn tutorial, Benjamin has also written a post about Using Backbone.js with Nashorn. To show the multi-language power of the Java Runtime, there is another interesting example that unites Facebook React and Clojure on JDK 8’s Nashorn. Summary Nashorn provides a simple and fast way of executing JavaScript applications and bridging between the best of each language. By making the full range of Java libraries to JavaScript applications, and the quick prototyping style of JavaScript to Java applications, developers are free to work as they see fit. Software Architects and System Administrators can take advantage of one runtime and leverage any work that they have done to tune, monitor, and certify their systems. Additional information is available within: The Nashorn Users’ Guide Java Magazine’s article "Next Generation JavaScript Engine for the JVM." The Nashorn team’s primary blog or a very helpful collection of Nashorn links.

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
tile_static, tile_barrier, and tiled matrix multiplication with C++ AMP

- by Daniel Moth

We ended the previous post with a mechanical transformation of the C++ AMP matrix multiplication example to the tiled model and in the process introduced tiled_index and tiled_grid. This is part 2. tile_static memory You all know that in regular CPU code, static variables have the same value regardless of which thread accesses the static variable. This is in contrast with non-static local variables, where each thread has its own copy. Back to C++ AMP, the same rules apply and each thread has its own value for local variables in your lambda, whereas all threads see the same global memory, which is the data they have access to via the array and array_view. In addition, on an accelerator like the GPU, there is a programmable cache, a third kind of memory type if you'd like to think of it that way (some call it shared memory, others call it scratchpad memory). Variables stored in that memory share the same value for every thread in the same tile. So, when you use the tiled model, you can have variables where each thread in the same tile sees the same value for that variable, that threads from other tiles do not. The new storage class for local variables introduced for this purpose is called tile_static. You can only use tile_static in restrict(direct3d) functions, and only when explicitly using the tiled model. What this looks like in code should be no surprise, but here is a snippet to confirm your mental image, using a good old regular C array // each tile of threads has its own copy of locA, // shared among the threads of the tile tile_static float locA[16][16]; Note that tile_static variables are scoped and have the lifetime of the tile, and they cannot have constructors or destructors. tile_barrier In amp.h one of the types introduced is tile_barrier. You cannot construct this object yourself (although if you had one, you could use a copy constructor to create another one). So how do you get one of these? You get it, from a tiled_index object. Beyond the 4 properties returning index objects, tiled_index has another property, barrier, that returns a tile_barrier object. The tile_barrier class exposes a single member, the method wait. 15: // Given a tiled_index object named t_idx 16: t_idx.barrier.wait(); 17: // more code …in the code above, all threads in the tile will reach line 16 before a single one progresses to line 17. Note that all threads must be able to reach the barrier, i.e. if you had branchy code in such a way which meant that there is a chance that not all threads could reach line 16, then the code above would be illegal. Tiled Matrix Multiplication Example – part 2 So now that we added to our understanding the concepts of tile_static and tile_barrier, let me obfuscate rewrite the matrix multiplication code so that it takes advantage of tiling. Before you start reading this, I suggest you get a cup of your favorite non-alcoholic beverage to enjoy while you try to fully understand the code. 01: void MatrixMultiplyTiled(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: static const int TS = 16; 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M,N,vC); 07: parallel_for_each(c.grid.tile< TS, TS >(), 08: [=] (tiled_index< TS, TS> t_idx) restrict(direct3d) 09: { 10: int row = t_idx.local[0]; int col = t_idx.local[1]; 11: float sum = 0.0f; 12: for (int i = 0; i < W; i += TS) { 13: tile_static float locA[TS][TS], locB[TS][TS]; 14: locA[row][col] = a(t_idx.global[0], col + i); 15: locB[row][col] = b(row + i, t_idx.global[1]); 16: t_idx.barrier.wait(); 17: for (int k = 0; k < TS; k++) 18: sum += locA[row][k] * locB[k][col]; 19: t_idx.barrier.wait(); 20: } 21: c[t_idx.global] = sum; 22: }); 23: } Notice that all the code up to line 9 is the same as per the changes we made in part 1 of tiling introduction. If you squint, the body of the lambda itself preserves the original algorithm on lines 10, 11, and 17, 18, and 21. The difference being that those lines use new indexing and the tile_static arrays; the tile_static arrays are declared and initialized on the brand new lines 13-15. On those lines we copy from the global memory represented by the array_view objects (a and b), to the tile_static vanilla arrays (locA and locB) – we are copying enough to fit a tile. Because in the code that follows on line 18 we expect the data for this tile to be in the tile_static storage, we need to synchronize the threads within each tile with a barrier, which we do on line 16 (to avoid accessing uninitialized memory on line 18). We also need to synchronize the threads within a tile on line 19, again to avoid the race between lines 14, 15 (retrieving the next set of data for each tile and overwriting the previous set) and line 18 (not being done processing the previous set of data). Luckily, as part of the awesome C++ AMP debugger in Visual Studio there is an option that helps you find such races, but that is a story for another blog post another time. May I suggest reading the next section, and then coming back to re-read and walk through this code with pen and paper to really grok what is going on, if you haven't already? Cool. Why would I introduce this tiling complexity into my code? Funny you should ask that, I was just about to tell you. There is only one reason we tiled our extent, had to deal with finding a good tile size, ensure the number of threads we schedule are correctly divisible with the tile size, had to use a tiled_index instead of a normal index, and had to understand tile_barrier and to figure out where we need to use it, and double the size of our lambda in terms of lines of code: the reason is to be able to use tile_static memory. Why do we want to use tile_static memory? Because accessing tile_static memory is around 10 times faster than accessing the global memory on an accelerator like the GPU, e.g. in the code above, if you can get 150GB/second accessing data from the array_view a, you can get 1500GB/second accessing the tile_static array locA. And since by definition you are dealing with really large data sets, the savings really pay off. We have seen tiled implementations being twice as fast as their non-tiled counterparts. Now, some algorithms will not have performance benefits from tiling (and in fact may deteriorate), e.g. algorithms that require you to go only once to global memory will not benefit from tiling, since with tiling you already have to fetch the data once from global memory! Other algorithms may benefit, but you may decide that you are happy with your code being 150 times faster than the serial-version you had, and you do not need to invest to make it 250 times faster. Also algorithms with more than 3 dimensions, which C++ AMP supports in the non-tiled model, cannot be tiled. Also note that in future releases, we may invest in making the non-tiled model, which already uses tiling under the covers, go the extra step and use tile_static memory on your behalf, but it is obviously way to early to commit to anything like that, and we certainly don't do any of that today. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
The SPARC SuperCluster

- by Karoly Vegh

Oracle has been providing a lead in the Engineered Systems business for quite a while now, in accordance with the motto "Hardware and Software Engineered to Work Together." Indeed it is hard to find a better definition of these systems. Allow me to summarize the idea. It is: Build a compute platform optimized to run your technologies Develop application aware, intelligently caching storage components Take an impressively fast network technology interconnecting it with the compute nodes Tune the application to scale with the nodes to yet unseen performance Reduce the amount of data moving via compression Provide this all in a pre-integrated single product with a single-pane management interface All these ideas have been around in IT for quite some time now. The real Oracle advantage is adding the last one to put these all together. Oracle has built quite a portfolio of Engineered Systems, to run its technologies - and run those like they never ran before. In this post I'll focus on one of them that serves as a consolidation demigod, a multi-purpose engineered system. As you probably have guessed, I am talking about the SPARC SuperCluster. It has many great features inherited from its predecessors, and it adds several new ones. Allow me to pick out and elaborate about some of the most interesting ones from a technological point of view. I. It is the SPARC SuperCluster T4-4. That is, as compute nodes, it includes SPARC T4-4 servers that we learned to appreciate and respect for their features: The SPARC T4 CPUs: Each CPU has 8 cores, each core runs 8 threads. The SPARC T4-4 servers have 4 sockets. That is, a single compute node can in parallel, simultaneously execute 256 threads. Now, a full-rack SPARC SuperCluster has 4 of these servers on board. Remember the keyword demigod. While retaining the forerunner SPARC T3's exceptional throughput, the SPARC T4 CPUs raise the bar with single performance too - a humble 5x better one than their ancestors. actually, the SPARC T4 CPU cores run in both single-threaded and multi-threaded mode, and switch between these two on-the-fly, fulfilling not only single-threaded OR multi-threaded applications' needs, but even mixed requirements (like in database workloads!). Data security, anyone? Every SPARC T4 CPU core has a built-in encryption engine, that is, encryption algorithms cast into silicon. A PCI controller right on the chip for customers who need I/O performance. Built-in, no-cost Virtualization: Oracle VM for SPARC (the former LDoms or Logical Domains) is not a server-emulation virtualization technology but rather a serverpartitioning one, the hypervisor runs in the server firmware, and all the VMs' HW resources (I/O, CPU, memory) are accessed natively, without performance overhead. This enables customers to run a number of Solaris 10 and Solaris 11 VMs separated, independent of each other within a physical server II. For Database performance, it includes Exadata Storage Cells - one of the main reasons why the Exadata Database Machine performs at diabolic speed. What makes them important? They provide DB backend storage for your Oracle Databases to run on the SPARC SuperCluster, that is what they are built and tuned for DB performance. These storage cells are SQL-aware. That is, if a SPARC T4 database compute node executes a query, it doesn't simply request tons of raw datablocks from the storage, filters the received data, and throws away most of it where the statement doesn't apply, but provides the SQL query to the storage node too. The storage cell software speaks SQL, that is, it is able to prefilter and through that transfer only the relevant data. With this, the traffic between database nodes and storage cells is reduced immensely. Less I/O is a good thing - as they say, all the CPUs of the world do one thing just as fast as any other - and that is waiting for I/O. They don't only pre-filter, but also provide data preprocessing features - e.g. if a DB-node requests an aggregate of data, they can calculate it, and handover only the results, not the whole set. Again, less data to transfer. They support the magical HCC, (Hybrid Columnar Compression). That is, data can be stored in a precompressed form on the storage. Less data to transfer. Of course one can't simply rely on disks for performance, there is Flash Storage included there for caching. III. The low latency, high-speed backbone network: InfiniBand, that interconnects all the members with: Real High Speed: 40 Gbit/s. Full Duplex, of course. Oh, and a really low latency. RDMA. Remote Direct Memory Access. This technology allows the DB nodes to do exactly that. Remotely, directly placing SQL commands into the Memory of the storage cells. Dodging all the network-stack bottlenecks, avoiding overhead, placing requests directly into the process queue. You can also run IP over InfiniBand if you please - that's the way the compute nodes can communicate with each other. IV. Including a general-purpose storage too: the ZFSSA, which is a unified storage, providing NAS and SAN access too, with the following features: NFS over RDMA over InfiniBand. Nothing is faster network-filesystem-wise. All the ZFS features onboard, hybrid storage pools, compression, deduplication, snapshot, replication, NFS and CIFS shares Storageheads in a HA-Cluster configuration providing availability of the data DTrace Live Analytics in a web-based Administration UI Being a general purpose application data storage for your non-database applications running on the SPARC SuperCluster over whichever protocol they prefer, easily replicating, snapshotting, cloning data for them. There's a lot of great technology included in Oracle's SPARC SuperCluster, we have talked its interior through. As for external scalability: you can start with a half- of full- rack SPARC SuperCluster, and scale out to several racks - that is, stacking not separate full-rack SPARC SuperClusters, but extending always one large instance of the size of several full-racks. Yes, over InfiniBand network. Add racks as you grow. What technologies shall run on it? SPARC SuperCluster is a general purpose scaleout consolidation/cloud environment. You can run Oracle Databases with RAC scaling, or Oracle Weblogic (end enjoy the SPARC T4's advantages to run Java). Remember, Oracle technologies have been integrated with the Oracle Engineered Systems - this is the Oracle on Oracle advantage. But you can run other software environments such as SAP if you please too. Run any application that runs on Oracle Solaris 10 or Solaris 11. Separate them in Virtual Machines, or even Oracle Solaris Zones, monitor and manage those from a central UI. Here the key takeaways once again: The SPARC SuperCluster: Is a pre-integrated Engineered System Contains SPARC T4-4 servers with built-in virtualization, cryptography, dynamic threading Contains the Exadata storage cells that intelligently offload the burden of the DB-nodes Contains a highly available ZFS Storage Appliance, that provides SAN/NAS storage in a unified way Combines all these elements over a high-speed, low-latency backbone network implemented with InfiniBand Can grow from a single half-rack to several full-rack size Supports the consolidation of hundreds of applications To summarize: All these technologies are great by themselves, but the real value is like in every other Oracle Engineered System: Integration. All these technologies are tuned to perform together. Together they are way more than the sum of all - and a careful and actually very time consuming integration process is necessary to orchestrate all these for performance. The SPARC SuperCluster's goal is to enable infrastructure operations and offer a pre-integrated solution that can be architected and delivered in hours instead of months of evaluations and tests. The tedious and most importantly time and resource consuming part of the work - testing and evaluating - has been done. Now go, provide services. -- charlie

Read the article

Search Results

Search found 10417 results on 417 pages for 'large'.

Page 395/417 | < Previous Page | 391 392 393 394 395 396 397 398 399 400 401 402 | Next Page >

- by danx

- by Most Valuable Yak (Rob Volk)

- by jatin.thaker

- by pinaldave

- by James Michael Hare

- by igor

- by Mike Stiles

- by Rick Strahl

- by Charles Young

- by Stephen.Walther

- by DrJohn

- by PointsToShare

- by Simon Cooper

- by hajan

- by Pinal Dave

- by user12620111

- by James Michael Hare

- by James Michael Hare

- by Applications User Experience

- by johndoucette

- by Most Valuable Yak (Rob Volk)

- by costlow

- by Reed

- by Daniel Moth

- by Karoly Vegh

< Previous Page | 391 392 393 394 395 396 397 398 399 400 401 402 | Next Page >