Search Results

Search found 18096 results on 724 pages for 'let me be'.

Page 25/724 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32  | Next Page >

  • N-gram split function for string similarity comparison

    - by Michael
    As part of excersise to better understand F# which I am currently learning , I wrote function to split given string into n-grams. 1) I would like to receive feedback about my function : can this be written simpler or in more efficient way? 2) My overall goal is to write function that returns string similarity (on 0.0 .. 1.0 scale) based on n-gram similarity; Does this approach works well for short strings comparisons , or can this method reliably be used to compare large strings (like articles for example). 3) I am aware of the fact that n-gram comparisons ignore context of two strings. What method would you suggest to accomplish my goal? //s:string - target string to split into n-grams //n:int - n-gram size to split string into let ngram_split (s:string, n:int) = let ngram_count = s.Length - (s.Length % n) let ngram_list = List.init ngram_count (fun i -> if( i + n >= s.Length ) then s.Substring(i,s.Length - i) + String.init ((i + n) - s.Length) (fun i -> "#") else s.Substring(i,n) ) let ngram_array_unique = ngram_list |> Seq.ofList |> Seq.distinct |> Array.ofSeq //produce tuples of ngrams (ngram string,how much occurrences in original string) Seq.init ngram_array_unique.Length (fun i -> (ngram_array_unique.[i], ngram_list |> List.filter(fun item -> item = ngram_array_unique.[i]) |> List.length) )

    Read the article

  • Project Euler #14 and memoization in Clojure

    - by dbyrne
    As a neophyte clojurian, it was recommended to me that I go through the Project Euler problems as a way to learn the language. Its definitely a great way to improve your skills and gain confidence. I just finished up my answer to problem #14. It works fine, but to get it running efficiently I had to implement some memoization. I couldn't use the prepackaged memoize function because of the way my code was structured, and I think it was a good experience to roll my own anyways. My question is if there is a good way to encapsulate my cache within the function itself, or if I have to define an external cache like I have done. Also, any tips to make my code more idiomatic would be appreciated. (use 'clojure.test) (def mem (atom {})) (with-test (defn chain-length ([x] (chain-length x x 0)) ([start-val x c] (if-let [e (last(find @mem x))] (let [ret (+ c e)] (swap! mem assoc start-val ret) ret) (if (<= x 1) (let [ret (+ c 1)] (swap! mem assoc start-val ret) ret) (if (even? x) (recur start-val (/ x 2) (+ c 1)) (recur start-val (+ 1 (* x 3)) (+ c 1))))))) (is (= 10 (chain-length 13)))) (with-test (defn longest-chain ([] (longest-chain 2 0 0)) ([c max start-num] (if (>= c 1000000) start-num (let [l (chain-length c)] (if (> l max) (recur (+ 1 c) l c) (recur (+ 1 c) max start-num)))))) (is (= 837799 (longest-chain))))

    Read the article

  • F# Higher-order property accessors

    - by Nathan Sanders
    I just upgraded my prototyping tuple to a record. Someday it may become a real class. In the meantime, I want to translate code like this: type Example = int * int let examples = [(1,2); (3,4); (5,6)] let field1s = Seq.map (fst >> printfn "%d") examples to this: type Example = { Field1 : int Field2 : int Description : string } let examples = [{Field1 = 1; Field2 = 2; Description = "foo"} {Field1 = 3; Field2 = 4; Description = "bar"} {Field1 = 5; Field2 = 6; Description = "baz"}] let field1s = Seq.map Description examples The problem is that I expected to get a function Description : Example -> string when I declared the Example record, but I don't. I've poked around a little and tried properties on classes, but that doesn't work either. Am I just missing something in the documentation or will I have to write higher-order accessors manually? (That's the workaround I'm using now.)

    Read the article

  • f# pattern matching with types

    - by philbrowndotcom
    I'm trying to recursively print out all an objects properties and sub-type properties etc. My object model is as follows... type suggestedFooWidget = { value: float ; hasIncreasedSinceLastPeriod: bool ; } type firmIdentifier = { firmId: int ; firmName: string ; } type authorIdentifier = { authorId: int ; authorName: string ; firm: firmIdentifier ; } type denormalizedSuggestedFooWidgets = { id: int ; ticker: string ; direction: string ; author: authorIdentifier ; totalAbsoluteWidget: suggestedFooWidget ; totalSectorWidget: suggestedFooWidget ; totalExchangeWidget: suggestedFooWidget ; todaysAbsoluteWidget: suggestedFooWidget ; msdAbsoluteWidget: suggestedFooWidget ; msdSectorWidget: suggestedFooWidget ; msdExchangeWidget: suggestedFooWidget ; } And my recursion is based on the following pattern matching... let rec printObj (o : obj) (sb : StringBuilder) (depth : int) let props = o.GetType().GetProperties() let enumer = props.GetEnumerator() while enumer.MoveNext() do let currObj = (enumer.Current : obj) ignore <| match currObj with | :? string as s -> sb.Append(s.ToString()) | :? bool as c -> sb.Append(c.ToString()) | :? int as i -> sb.Append(i.ToString()) | :? float as i -> sb.Append(i.ToString()) | _ -> printObj currObj sb (depth + 1) sb In the debugger I see that currObj is of type string, int, float, etc but it always jumps to the defualt case at the bottom. Any idea why this is happening?

    Read the article

  • Best practices to deal with "slightly different" branches of source code

    - by jedi_coder
    This question is rather agnostic than related to a certain version control program. Assume there is a source code tree under certain distributed version control. Let's call it A. At some point somebody else clones it and gets its own copy. Let's call it B. I'll call A and B branches, even if some version control tools have different definitions for branches (some might call A and B repositories). Let's assume that branch A is the "main" branch. In the context of distributed version control this only means that branch A is modified much more actively and the owner of branch B periodically syncs (pulls) new updates from branch A. Let's consider that a certain source file in branch B contains a class (again, it's also language agnostic). The owner of branch B considers that some class methods are more appropriate and groups them together by moving them inside the class body. Functionally nothing has changed - this is a very trivial refactoring of the code. But the change gets reflected in diffs. Now, assuming that this change from branch B will never get merged into branch A, the owner of branch B will always get this difference when pulling from branch A and merging into his own workspace. Even if there's only one such trivial change, the owner of branch B needs to resolve conflicts every time when pulling from branch A. As long as branches A and B are modified independently, more and more conflicts like this appear. What is the workaround for this situation? Which workflow should the owner of branch B follow to minimize the effort for periodically syncing with branch A?

    Read the article

  • Integration test for H1 text failing when it should be passing (Rspec and Capybara)

    - by rebec
    Below you can see my test for what happens when a user tries to access the page to edit a profile photo that they don't own. I've used the same test on the NEW action, where it worked fine, but it surprised me by failing when I copied it down to the EDIT action tests. I've used save_and_open_page to test what Capybara's seeing (as you can see below); the resulting page definitely has an h1 with the specified text in it. No spelling errors, and case is all the same as in the test. I've tried using both have_css and have_selector. Both fail. I'm still learning Ruby, Rails, Rspec and especially Capybara (was using webrat previously and recently switched over), and wonder if I'm misconceiving of something that's making me expect this to pass when it doesn't. Any thoughts? Thanks! describe "EDIT" do let(:user) { FactoryGirl.create(:user) } let(:different_user) { FactoryGirl.create(:user) } let(:admin_user) { FactoryGirl.create(:user, role: "admin") } let(:profile_photo1) { FactoryGirl.create(:profile_photo, user: user) } subject { page } context "when signed in as any member" do before { login different_user visit edit_user_profile_photo_path(:id => profile_photo1, :user_id => profile_photo1.user_id) save_and_open_page } # It should deny access with an Unauthorised/Forbidden message. it { should have_css('h1', text: "Unauthorised/Forbidden.") } end

    Read the article

  • Ways to prevent multiple email notification with php&mysql

    - by Louis Loudog Trottier
    My 1st post, but i got many great answers and tips from stackoverflow so far. This one was a close call- How does facebook, gmail send the real time notification? but not exactly, so let brainstorm this together. I have CMS system with mail notification when a change is made on the site. Everything wotk very well but i want to prevent multiple notification if somene make another quick change to, let say, fix a typo. Using php mail(), obviously. I've tough of 2 ways, one simple, and one.... let just say, pretty heavy... cough. the 3rd one was inspired by Implementing Email Notification but really doesn't look appealing to me to send bunch of email at once. Use a timestamp to check if another change was made in the 'let say' last 5 minutes. Record the last change, and compare it to the new one. Could be usefull for backup at the same time since i'll have to save the change somewhere, but text can be long and making an sql search would be painfull. Wouldn't it? Use cron to send changes every x minutes... convince me if you ythink it is a suitable solution. Any ideas, comment or suggestion of your own? Looking forward for your inputs, and since i now registered, i'll do my best to help around. Cheers, all llt

    Read the article

  • C# 4: The Curious ConcurrentDictionary

    - by James Michael Hare
    In my previous post (here) I did a comparison of the new ConcurrentQueue versus the old standard of a System.Collections.Generic Queue with simple locking.  The results were exactly what I would have hoped, that the ConcurrentQueue was faster with multi-threading for most all situations.  In addition, concurrent collections have the added benefit that you can enumerate them even if they're being modified. So I set out to see what the improvements would be for the ConcurrentDictionary, would it have the same performance benefits as the ConcurrentQueue did?  Well, after running some tests and multiple tweaks and tunes, I have good and bad news. But first, let's look at the tests.  Obviously there's many things we can do with a dictionary.  One of the most notable uses, of course, in a multi-threaded environment is for a small, local in-memory cache.  So I set about to do a very simple simulation of a cache where I would create a test class that I'll just call an Accessor.  This accessor will attempt to look up a key in the dictionary, and if the key exists, it stops (i.e. a cache "hit").  However, if the lookup fails, it will then try to add the key and value to the dictionary (i.e. a cache "miss").  So here's the Accessor that will run the tests: 1: internal class Accessor 2: { 3: public int Hits { get; set; } 4: public int Misses { get; set; } 5: public Func<int, string> GetDelegate { get; set; } 6: public Action<int, string> AddDelegate { get; set; } 7: public int Iterations { get; set; } 8: public int MaxRange { get; set; } 9: public int Seed { get; set; } 10:  11: public void Access() 12: { 13: var randomGenerator = new Random(Seed); 14:  15: for (int i=0; i<Iterations; i++) 16: { 17: // give a wide spread so will have some duplicates and some unique 18: var target = randomGenerator.Next(1, MaxRange); 19:  20: // attempt to grab the item from the cache 21: var result = GetDelegate(target); 22:  23: // if the item doesn't exist, add it 24: if(result == null) 25: { 26: AddDelegate(target, target.ToString()); 27: Misses++; 28: } 29: else 30: { 31: Hits++; 32: } 33: } 34: } 35: } Note that so I could test different implementations, I defined a GetDelegate and AddDelegate that will call the appropriate dictionary methods to add or retrieve items in the cache using various techniques. So let's examine the three techniques I decided to test: Dictionary with mutex - Just your standard generic Dictionary with a simple lock construct on an internal object. Dictionary with ReaderWriterLockSlim - Same Dictionary, but now using a lock designed to let multiple readers access simultaneously and then locked when a writer needs access. ConcurrentDictionary - The new ConcurrentDictionary from System.Collections.Concurrent that is supposed to be optimized to allow multiple threads to access safely. So the approach to each of these is also fairly straight-forward.  Let's look at the GetDelegate and AddDelegate implementations for the Dictionary with mutex lock: 1: var addDelegate = (key,val) => 2: { 3: lock (_mutex) 4: { 5: _dictionary[key] = val; 6: } 7: }; 8: var getDelegate = (key) => 9: { 10: lock (_mutex) 11: { 12: string val; 13: return _dictionary.TryGetValue(key, out val) ? val : null; 14: } 15: }; Nothing new or fancy here, just your basic lock on a private object and then query/insert into the Dictionary. Now, for the Dictionary with ReadWriteLockSlim it's a little more complex: 1: var addDelegate = (key,val) => 2: { 3: _readerWriterLock.EnterWriteLock(); 4: _dictionary[key] = val; 5: _readerWriterLock.ExitWriteLock(); 6: }; 7: var getDelegate = (key) => 8: { 9: string val; 10: _readerWriterLock.EnterReadLock(); 11: if(!_dictionary.TryGetValue(key, out val)) 12: { 13: val = null; 14: } 15: _readerWriterLock.ExitReadLock(); 16: return val; 17: }; And finally, the ConcurrentDictionary, which since it does all it's own concurrency control, is remarkably elegant and simple: 1: var addDelegate = (key,val) => 2: { 3: _concurrentDictionary[key] = val; 4: }; 5: var getDelegate = (key) => 6: { 7: string s; 8: return _concurrentDictionary.TryGetValue(key, out s) ? s : null; 9: };                    Then, I set up a test harness that would simply ask the user for the number of concurrent Accessors to attempt to Access the cache (as specified in Accessor.Access() above) and then let them fly and see how long it took them all to complete.  Each of these tests was run with 10,000,000 cache accesses divided among the available Accessor instances.  All times are in milliseconds. 1: Dictionary with Mutex Locking 2: --------------------------------------------------- 3: Accessors Mostly Misses Mostly Hits 4: 1 7916 3285 5: 10 8293 3481 6: 100 8799 3532 7: 1000 8815 3584 8:  9:  10: Dictionary with ReaderWriterLockSlim Locking 11: --------------------------------------------------- 12: Accessors Mostly Misses Mostly Hits 13: 1 8445 3624 14: 10 11002 4119 15: 100 11076 3992 16: 1000 14794 4861 17:  18:  19: Concurrent Dictionary 20: --------------------------------------------------- 21: Accessors Mostly Misses Mostly Hits 22: 1 17443 3726 23: 10 14181 1897 24: 100 15141 1994 25: 1000 17209 2128 The first test I did across the board is the Mostly Misses category.  The mostly misses (more adds because data requested was not in the dictionary) shows an interesting trend.  In both cases the Dictionary with the simple mutex lock is much faster, and the ConcurrentDictionary is the slowest solution.  But this got me thinking, and a little research seemed to confirm it, maybe the ConcurrentDictionary is more optimized to concurrent "gets" than "adds".  So since the ratio of misses to hits were 2 to 1, I decided to reverse that and see the results. So I tweaked the data so that the number of keys were much smaller than the number of iterations to give me about a 2 to 1 ration of hits to misses (twice as likely to already find the item in the cache than to need to add it).  And yes, indeed here we see that the ConcurrentDictionary is indeed faster than the standard Dictionary here.  I have a strong feeling that as the ration of hits-to-misses gets higher and higher these number gets even better as well.  This makes sense since the ConcurrentDictionary is read-optimized. Also note that I tried the tests with capacity and concurrency hints on the ConcurrentDictionary but saw very little improvement, I think this is largely because on the 10,000,000 hit test it quickly ramped up to the correct capacity and concurrency and thus the impact was limited to the first few milliseconds of the run. So what does this tell us?  Well, as in all things, ConcurrentDictionary is not a panacea.  It won't solve all your woes and it shouldn't be the only Dictionary you ever use.  So when should we use each? Use System.Collections.Generic.Dictionary when: You need a single-threaded Dictionary (no locking needed). You need a multi-threaded Dictionary that is loaded only once at creation and never modified (no locking needed). You need a multi-threaded Dictionary to store items where writes are far more prevalent than reads (locking needed). And use System.Collections.Concurrent.ConcurrentDictionary when: You need a multi-threaded Dictionary where the writes are far more prevalent than reads. You need to be able to iterate over the collection without locking it even if its being modified. Both Dictionaries have their strong suits, I have a feeling this is just one where you need to know from design what you hope to use it for and make your decision based on that criteria.

    Read the article

  • What's up with LDoms: Part 4 - Virtual Networking Explained

    - by Stefan Hinker
    I'm back from my summer break (and some pressing business that kept me away from this), ready to continue with Oracle VM Server for SPARC ;-) In this article, we'll have a closer look at virtual networking.  Basic connectivity as we've seen it in the first, simple example, is easy enough.  But there are numerous options for the virtual switches and virtual network ports, which we will discuss in more detail now.   In this section, we will concentrate on virtual networking - the capabilities of virtual switches and virtual network ports - only.  Other options involving hardware assignment or redundancy will be covered in separate sections later on. There are two basic components involved in virtual networking for LDoms: Virtual switches and virtual network devices.  The virtual switch should be seen just like a real ethernet switch.  It "runs" in the service domain and moves ethernet packets back and forth.  A virtual network device is plumbed in the guest domain.  It corresponds to a physical network device in the real world.  There, you'd be plugging a cable into the network port, and plug the other end of that cable into a switch.  In the virtual world, you do the same:  You create a virtual network device for your guest and connect it to a virtual switch in a service domain.  The result works just like in the physical world, the network device sends and receives ethernet packets, and the switch does all those things ethernet switches tend to do. If you look at the reference manual of Oracle VM Server for SPARC, there are numerous options for virtual switches and network devices.  Don't be confused, it's rather straight forward, really.  Let's start with the simple case, and work our way to some more sophisticated options later on.  In many cases, you'll want to have several guests that communicate with the outside world on the same ethernet segment.  In the real world, you'd connect each of these systems to the same ethernet switch.  So, let's do the same thing in the virtual world: root@sun # ldm add-vsw net-dev=nxge2 admin-vsw primary root@sun # ldm add-vnet admin-net admin-vsw mars root@sun # ldm add-vnet admin-net admin-vsw venus We've just created a virtual switch called "admin-vsw" and connected it to the physical device nxge2.  In the physical world, we'd have powered up our ethernet switch and installed a cable between it and our big enterprise datacenter switch.  We then created a virtual network interface for each one of the two guest systems "mars" and "venus" and connected both to that virtual switch.  They can now communicate with each other and with any system reachable via nxge2.  If primary were running Solaris 10, communication with the guests would not be possible.  This is different with Solaris 11, please see the Admin Guide for details.  Note that I've given both the vswitch and the vnet devices some sensible names, something I always recommend. Unless told otherwise, the LDoms Manager software will automatically assign MAC addresses to all network elements that need one.  It will also make sure that these MAC addresses are unique and reuse MAC addresses to play nice with all those friendly DHCP servers out there.  However, if we want to do this manually, we can also do that.  (One reason might be firewall rules that work on MAC addresses.)  So let's give mars a manually assigned MAC address: root@sun # ldm set-vnet mac-addr=0:14:4f:f9:c4:13 admin-net mars Within the guest, these virtual network devices have their own device driver.  In Solaris 10, they'd appear as "vnet0".  Solaris 11 would apply it's usual vanity naming scheme.  We can configure these interfaces just like any normal interface, give it an IP-address and configure sophisticated routing rules, just like on bare metal.  In many cases, using Jumbo Frames helps increase throughput performance.  By default, these interfaces will run with the standard ethernet MTU of 1500 bytes.  To change this,  it is usually sufficient to set the desired MTU for the virtual switch.  This will automatically set the same MTU for all vnet devices attached to that switch.  Let's change the MTU size of our admin-vsw from the example above: root@sun # ldm set-vsw mtu=9000 admin-vsw primary Note that that you can set the MTU to any value between 1500 and 16000.  Of course, whatever you set needs to be supported by the physical network, too. Another very common area of network configuration is VLAN tagging. This can be a little confusing - my advise here is to be very clear on what you want, and perhaps draw a little diagram the first few times.  As always, keeping a configuration simple will help avoid errors of all kind.  Nevertheless, VLAN tagging is very usefull to consolidate different networks onto one physical cable.  And as such, this concept needs to be carried over into the virtual world.  Enough of the introduction, here's a little diagram to help in explaining how VLANs work in LDoms: Let's remember that any VLANs not explicitly tagged have the default VLAN ID of 1. In this example, we have a vswitch connected to a physical network that carries untagged traffic (VLAN ID 1) as well as VLANs 11, 22, 33 and 44.  There might also be other VLANs on the wire, but the vswitch will ignore all those packets.  We also have two vnet devices, one for mars and one for venus.  Venus will see traffic from VLANs 33 and 44 only.  For VLAN 44, venus will need to configure a tagged interface "vnet44000".  For VLAN 33, the vswitch will untag all incoming traffic for venus, so that venus will see this as "normal" or untagged ethernet traffic.  This is very useful to simplify guest configuration and also allows venus to perform Jumpstart or AI installations over this network even if the Jumpstart or AI server is connected via VLAN 33.  Mars, on the other hand, has full access to untagged traffic from the outside world, and also to VLANs 11,22 and 33, but not 44.  On the command line, we'd do this like this: root@sun # ldm add-vsw net-dev=nxge2 pvid=1 vid=11,22,33,44 admin-vsw primary root@sun # ldm add-vnet admin-net pvid=1 vid=11,22,33 admin-vsw mars root@sun # ldm add-vnet admin-net pvid=33 vid=44 admin-vsw venus Finally, I'd like to point to a neat little option that will make your live easier in all those cases where configurations tend to change over the live of a guest system.  It's the "id=<somenumber>" option available for both vswitches and vnet devices.  Normally, Solaris in the guest would enumerate network devices sequentially.  However, it has ways of remembering this initial numbering.  This is good in the physical world.  In the virtual world, whenever you unbind (aka power off and disassemble) a guest system, remove and/or add network devices and bind the system again, chances are this numbering will change.  Configuration confusion will follow suit.  To avoid this, nail down the initial numbering by assigning each vnet device it's device-id explicitly: root@sun # ldm add-vnet admin-net id=1 admin-vsw venus Please consult the Admin Guide for details on this, and how to decipher these network ids from Solaris running in the guest. Thanks for reading this far.  Links for further reading are essentially only the Admin Guide and Reference Manual and can be found above.  I hope this is useful and, as always, I welcome any comments.

    Read the article

  • SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS

    - by pinaldave
    Data Quality Services is a very important concept of SQL Server. I have recently started to explore the same and I am really learning some good concepts. Here are two very important blog posts which one should go over before continuing this blog post. Installing Data Quality Services (DQS) on SQL Server 2012 Connecting Error to Data Quality Services (DQS) on SQL Server 2012 This article is introduction to Data Quality Services for beginners. We will be using an Excel file Click on the image to enlarge the it. In the first article we learned to install DQS. In this article we will see how we can learn about building Knowledge Base and using it to help us identify the quality of the data as well help correct the bad quality of the data. Here are the two very important steps we will be learning in this tutorial. Building a New Knowledge Base  Creating a New Data Quality Project Let us start the building the Knowledge Base. Click on New Knowledge Base. In our project we will be using the Excel as a knowledge base. Here is the Excel which we will be using. There are two columns. One is Colors and another is Shade. They are independent columns and not related to each other. The point which I am trying to show is that in Column A there are unique data and in Column B there are duplicate records. Clicking on New Knowledge Base will bring up the following screen. Enter the name of the new knowledge base. Clicking NEXT will bring up following screen where it will allow to select the EXCE file and it will also let users select the source column. I have selected Colors and Shade both as a source column. Creating a domain is very important. Here you can create a unique domain or domain which is compositely build from Colors and Shade. As this is the first example, I will create unique domain – for Colors I will create domain Colors and for Shade I will create domain Shade. Here is the screen which will demonstrate how the screen will look after creating domains. Clicking NEXT it will bring you to following screen where you can do the data discovery. Clicking on the START will start the processing of the source data provided. Pre-processed data will show various information related to the source data. In our case it shows that Colors column have unique data whereas Shade have non-unique data and unique data rows are only two. In the next screen you can actually add more rows as well see the frequency of the data as the values are listed unique. Clicking next will publish the knowledge base which is just created. Now the knowledge base is created. We will try to take any random data and attempt to do DQS implementation over it. I am using another excel sheet here for simplicity purpose. In reality you can easily use SQL Server table for the same. Click on New Data Quality Project to see start DQS Project. In the next screen it will ask which knowledge base to use. We will be using our Colors knowledge base which we have recently created. In the Colors knowledge base we had two columns – 1) Colors and 2) Shade. In our case we will be using both of the mappings here. User can select one or multiple column mapping over here. Now the most important phase of the complete project. Click on Start and it will make the cleaning process and shows various results. In our case there were two columns to be processed and it completed the task with necessary information. It demonstrated that in Colors columns it has not corrected any value by itself but in Shade value there is a suggestion it has. We can train the DQS to correct values but let us keep that subject for future blog posts. Now click next and keep the domain Colors selected left side. It will demonstrate that there are two incorrect columns which it needs to be corrected. Here is the place where once corrected value will be auto-corrected in future. I manually corrected the value here and clicked on Approve radio buttons. As soon as I click on Approve buttons the rows will be disappeared from this tab and will move to Corrected Tab. If I had rejected tab it would have moved the rows to Invalid tab as well. In this screen you can see how the corrected 2 rows are demonstrated. You can click on Correct tab and see previously validated 6 rows which passed the DQS process. Now let us click on the Shade domain on the left side of the screen. This domain shows very interesting details as there DQS system guessed the correct answer as Dark with the confidence level of 77%. It is quite a high confidence level and manual observation also demonstrate that Dark is the correct answer. I clicked on Approve and the row moved to corrected tab. On the next screen DQS shows the summary of all the activities. It also demonstrates how the correction of the quality of the data was performed. The user can explore their data to a SQL Server Table, CSV file or Excel. The user also has an option to either explore data and all the associated cleansing info or data only. I will select Data only for demonstration purpose. Clicking explore will generate the files. Let us open the generated file. It will look as following and it looks pretty complete and corrected. Well, we have successfully completed DQS Process. The process is indeed very easy. I suggest you try this out yourself and you will find it very easy to learn. In future we will go over advanced concepts. Are you using this feature on your production server? If yes, would you please leave a comment with your environment and business need. It will be indeed interesting to see where it is implemented. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Business Intelligence, Data Warehousing, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Data Quality Services, DQS

    Read the article

  • Exploring the Excel Services REST API

    - by jamiet
    Over the last few years Analysis Services guru Chris Webb and I have been on something of a crusade to enable better access to data that is locked up in countless Excel workbooks that litter the hard drives of enterprise PCs. The most prominent manifestation of that crusade up to now has been a forum thread that Chris began on Microsoft Answers entitled Excel Web App API? Chris began that thread with: I was wondering whether there was an API for the Excel Web App? Specifically, I was wondering if it was possible (or if it will be possible in the future) to expose data in a spreadsheet in the Excel Web App as an OData feed, in the way that it is possible with Excel Services? Up to recently the last 10 words of that paragraph "in the way that it is possible with Excel Services" had completely washed over me however a comment on my recent blog post Thoughts on ExcelMashup.com (and a rant) by Josh Booker in which Josh said: Excel Services is a service application built for sharepoint 2010 which exposes a REST API for excel documents. We're looking forward to pros like you giving it a try now that Office365 makes sharepoint more easily accessible.  Can't wait for your future blog about using REST API to load data from Excel on Offce 365 in SSIS. made me think that perhaps the Excel Services REST API is something I should be looking into and indeed that is what I have been doing over the past few days. And you know what? I'm rather impressed with some of what Excel Services' REST API has to offer. Unfortunately Excel Services' REST API also has one debilitating aspect that renders this blog post much less useful than it otherwise would be; namely that it is not publicly available from the Excel Web App on SkyDrive. Therefore all I can do in this blog post is show you screenshots of what the REST API provides in Sharepoint rather than linking you directly to those REST resources; that's a great shame because one of the benefits of a REST API is that it is easily and ubiquitously demonstrable from a web browser. Instead I am hosting a workbook on Sharepoint in Office 365 because that does include Excel Services' REST API but, again, all I can do is show you screenshots. N.B. If anyone out there knows how to make Office-365-hosted spreadsheets publicly-accessible (i.e. without requiring a username/password) please do let me know (because knowing which forum on which to ask the question is an exercise in futility). In order to demonstrate Excel Services' REST API I needed some decent data and for that I used the World Tourism Organization Statistics Database and Yearbook - United Nations World Tourism Organization dataset hosted on Azure Datamarket (its free, by the way); this dataset "provides comprehensive information on international tourism worldwide and offers a selection of the latest available statistics on international tourist arrivals, tourism receipts and expenditure" and you can explore the data for yourself here. If you want to play along at home by viewing the data as it exists in Excel then it can be viewed here. Let's dive in.   The root of Excel Services' REST API is the model resource which resides at: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model Note that this is true for every workbook hosted in a Sharepoint document library - each Excel workbook is a RESTful resource. (Update: Mark Stacey on Twitter tells me that "It's turned off by default in onpremise Sharepoint (1 tickbox to turn on though)". Thanks Mark!) The data is provided as an ATOM feed but I have Firefox's feed reading ability turned on so you don't see the underlying XML goo. As you can see there are four top level resources, Ranges, Charts, Tables and PivotTables; exploring one of those resources is where things get interesting. Let's take a look at the Tables Resource: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables Our workbook contains only one table, called ‘Table1’ (to reiterate, you can explore this table yourself here). Viewing that table via the REST API is pretty easy, we simply append the name of the table onto our previous URI: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1') As you can see, that quite simply gives us a representation of the data in that table. What you cannot see from this screenshot is that this is pure HTML that is being served up; that is all well and good but actually we can do more interesting things. If we specify that the data should be returned not as HTML but as: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1')?$format=image then that data comes back as a pure image and can be used in any web page where you would ordinarily use images. This is the thing that I really like about Excel Services’ REST API – we can embed an image in any web page but instead of being a copy of the data, that image is actually live – if the underlying data in the workbook were to change then hitting refresh will show a new image. Pretty cool, no? The same is true of any Charts or Pivot Tables in your workbook - those can be embedded as images too and if the underlying data changes, boom, the image in your web page changes too. There is a lot of data in the workbook so the image returned by that previous URI is too large to show here so instead let’s take a look at a different resource, this time a range: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15') That URI returns cells A1 to C15 from a worksheet called “Data”: And if we ask for that as an image again: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15')?$format=image Were this image resource not behind a username/password then this would be a live image of the data in the workbook as opposed to one that I had to copy and upload elsewhere. Nonetheless I hope this little wrinkle doesn't detract from the inate value of what I am trying to articulate here; that an existing image in a web page can be changed on-the-fly simply by inserting some data into an Excel workbook. I for one think that that is very cool indeed! I think that's enough in the way of demo for now as this shows what is possible using Excel Services' REST API. Of course, not all features work quite how I would like and here is a bulleted list of some of my more negative feedback: The URIs are pig-ugly. Are "_vti_bin" & "ExcelRest.aspx" really necessary as part of the URI? Would this not be better: http://server/Documents/TourismExpenditureInMillionsOfUSD.xlsx/Model/Tables(‘Table1’) That URI provides the necessary addressability and is a lot easier to remember. Discoverability of these resources is not easy, we essentially have to handcrank a URI ourselves. Take the example of embedding a chart into a blog post - would it not be better if I could browse first through the document library to an Excel workbook and THEN through the workbook to the chart/range/table that I am interested in? Call it a wizard if you like. That would be really cool and would, I am sure, promote this feature and cut down on the copy-and-paste disease that the REST API is meant to alleviate. The resources that I demonstrated can be returned as feeds as well as images or HTML simply by changing the format parameter to ?$format=atom however for some inexplicable reason they don't return OData and no-one on the Excel Services team can tell me why (believe me, I have asked). $format is an OData parameter however other useful parameters such as $top and $filter are not supported. It would be nice if they were. Although I haven't demonstrated it here Excel Services' REST API does provide a makeshift way of altering the data by changing the value of specific cells however what it does not allow you to do is add new data into the workbook. Google Docs allows this and was one of the motivating factors for Chris Webb's forum post that I linked to above. None of this works for Excel workbooks hosted on SkyDrive This blog post is as long as it needs to be for a short introduction so I'll stop now. If you want to know more than I recommend checking out a few links: Excel Services REST API documentation on MSDNSo what does REST on Excel Services look like??? by Shahar PrishExcel Services in SharePoint 2010 REST API Syntax by Christian Stich. Any thoughts? Let's hear them in the comments section below! @Jamiet 

    Read the article

  • C#: Optional Parameters - Pros and Pitfalls

    - by James Michael Hare
    When Microsoft rolled out Visual Studio 2010 with C# 4, I was very excited to learn how I could apply all the new features and enhancements to help make me and my team more productive developers. Default parameters have been around forever in C++, and were intentionally omitted in Java in favor of using overloading to satisfy that need as it was though that having too many default parameters could introduce code safety issues.  To some extent I can understand that move, as I’ve been bitten by default parameter pitfalls before, but at the same time I feel like Java threw out the baby with the bathwater in that move and I’m glad to see C# now has them. This post briefly discusses the pros and pitfalls of using default parameters.  I’m avoiding saying cons, because I really don’t believe using default parameters is a negative thing, I just think there are things you must watch for and guard against to avoid abuses that can cause code safety issues. Pro: Default Parameters Can Simplify Code Let’s start out with positives.  Consider how much cleaner it is to reduce all the overloads in methods or constructors that simply exist to give the semblance of optional parameters.  For example, we could have a Message class defined which allows for all possible initializations of a Message: 1: public class Message 2: { 3: // can either cascade these like this or duplicate the defaults (which can introduce risk) 4: public Message() 5: : this(string.Empty) 6: { 7: } 8:  9: public Message(string text) 10: : this(text, null) 11: { 12: } 13:  14: public Message(string text, IDictionary<string, string> properties) 15: : this(text, properties, -1) 16: { 17: } 18:  19: public Message(string text, IDictionary<string, string> properties, long timeToLive) 20: { 21: // ... 22: } 23: }   Now consider the same code with default parameters: 1: public class Message 2: { 3: // can either cascade these like this or duplicate the defaults (which can introduce risk) 4: public Message(string text = "", IDictionary<string, string> properties = null, long timeToLive = -1) 5: { 6: // ... 7: } 8: }   Much more clean and concise and no repetitive coding!  In addition, in the past if you wanted to be able to cleanly supply timeToLive and accept the default on text and properties above, you would need to either create another overload, or pass in the defaults explicitly.  With named parameters, though, we can do this easily: 1: var msg = new Message(timeToLive: 100);   Pro: Named Parameters can Improve Readability I must say one of my favorite things with the default parameters addition in C# is the named parameters.  It lets code be a lot easier to understand visually with no comments.  Think how many times you’ve run across a TimeSpan declaration with 4 arguments and wondered if they were passing in days/hours/minutes/seconds or hours/minutes/seconds/milliseconds.  A novice running through your code may wonder what it is.  Named arguments can help resolve the visual ambiguity: 1: // is this days/hours/minutes/seconds (no) or hours/minutes/seconds/milliseconds (yes) 2: var ts = new TimeSpan(1, 2, 3, 4); 3:  4: // this however is visually very explicit 5: var ts = new TimeSpan(days: 1, hours: 2, minutes: 3, seconds: 4);   Or think of the times you’ve run across something passing a Boolean literal and wondered what it was: 1: // what is false here? 2: var sub = CreateSubscriber(hostname, port, false); 3:  4: // aha! Much more visibly clear 5: var sub = CreateSubscriber(hostname, port, isBuffered: false);   Pitfall: Don't Insert new Default Parameters In Between Existing Defaults Now let’s consider a two potential pitfalls.  The first is really an abuse.  It’s not really a fault of the default parameters themselves, but a fault in the use of them.  Let’s consider that Message constructor again with defaults.  Let’s say you want to add a messagePriority to the message and you think this is more important than a timeToLive value, so you decide to put messagePriority before it in the default, this gives you: 1: public class Message 2: { 3: public Message(string text = "", IDictionary<string, string> properties = null, int priority = 5, long timeToLive = -1) 4: { 5: // ... 6: } 7: }   Oh boy have we set ourselves up for failure!  Why?  Think of all the code out there that could already be using the library that already specified the timeToLive, such as this possible call: 1: var msg = new Message(“An error occurred”, myProperties, 1000);   Before this specified a message with a TTL of 1000, now it specifies a message with a priority of 1000 and a time to live of -1 (infinite).  All of this with NO compiler errors or warnings. So the rule to take away is if you are adding new default parameters to a method that’s currently in use, make sure you add them to the end of the list or create a brand new method or overload. Pitfall: Beware of Default Parameters in Inheritance and Interface Implementation Now, the second potential pitfalls has to do with inheritance and interface implementation.  I’ll illustrate with a puzzle: 1: public interface ITag 2: { 3: void WriteTag(string tagName = "ITag"); 4: } 5:  6: public class BaseTag : ITag 7: { 8: public virtual void WriteTag(string tagName = "BaseTag") { Console.WriteLine(tagName); } 9: } 10:  11: public class SubTag : BaseTag 12: { 13: public override void WriteTag(string tagName = "SubTag") { Console.WriteLine(tagName); } 14: } 15:  16: public static class Program 17: { 18: public static void Main() 19: { 20: SubTag subTag = new SubTag(); 21: BaseTag subByBaseTag = subTag; 22: ITag subByInterfaceTag = subTag; 23:  24: // what happens here? 25: subTag.WriteTag(); 26: subByBaseTag.WriteTag(); 27: subByInterfaceTag.WriteTag(); 28: } 29: }   What happens?  Well, even though the object in each case is SubTag whose tag is “SubTag”, you will get: 1: SubTag 2: BaseTag 3: ITag   Why?  Because default parameter are resolved at compile time, not runtime!  This means that the default does not belong to the object being called, but by the reference type it’s being called through.  Since the SubTag instance is being called through an ITag reference, it will use the default specified in ITag. So the moral of the story here is to be very careful how you specify defaults in interfaces or inheritance hierarchies.  I would suggest avoiding repeating them, and instead concentrating on the layer of classes or interfaces you must likely expect your caller to be calling from. For example, if you have a messaging factory that returns an IMessage which can be either an MsmqMessage or JmsMessage, it only makes since to put the defaults at the IMessage level since chances are your user will be using the interface only. So let’s sum up.  In general, I really love default and named parameters in C# 4.0.  I think they’re a great tool to help make your code easier to read and maintain when used correctly. On the plus side, default parameters: Reduce redundant overloading for the sake of providing optional calling structures. Improve readability by being able to name an ambiguous argument. But remember to make sure you: Do not insert new default parameters in the middle of an existing set of default parameters, this may cause unpredictable behavior that may not necessarily throw a syntax error – add to end of list or create new method. Be extremely careful how you use default parameters in inheritance hierarchies and interfaces – choose the most appropriate level to add the defaults based on expected usage. Technorati Tags: C#,.NET,Software,Default Parameters

    Read the article

  • Exploring the Excel Services REST API

    - by jamiet
    Over the last few years Analysis Services guru Chris Webb and I have been on something of a crusade to enable better access to data that is locked up in countless Excel workbooks that litter the hard drives of enterprise PCs. The most prominent manifestation of that crusade up to now has been a forum thread that Chris began on Microsoft Answers entitled Excel Web App API? Chris began that thread with: I was wondering whether there was an API for the Excel Web App? Specifically, I was wondering if it was possible (or if it will be possible in the future) to expose data in a spreadsheet in the Excel Web App as an OData feed, in the way that it is possible with Excel Services? Up to recently the last 10 words of that paragraph "in the way that it is possible with Excel Services" had completely washed over me however a comment on my recent blog post Thoughts on ExcelMashup.com (and a rant) by Josh Booker in which Josh said: Excel Services is a service application built for sharepoint 2010 which exposes a REST API for excel documents. We're looking forward to pros like you giving it a try now that Office365 makes sharepoint more easily accessible.  Can't wait for your future blog about using REST API to load data from Excel on Offce 365 in SSIS. made me think that perhaps the Excel Services REST API is something I should be looking into and indeed that is what I have been doing over the past few days. And you know what? I'm rather impressed with some of what Excel Services' REST API has to offer. Unfortunately Excel Services' REST API also has one debilitating aspect that renders this blog post much less useful than it otherwise would be; namely that it is not publicly available from the Excel Web App on SkyDrive. Therefore all I can do in this blog post is show you screenshots of what the REST API provides in Sharepoint rather than linking you directly to those REST resources; that's a great shame because one of the benefits of a REST API is that it is easily and ubiquitously demonstrable from a web browser. Instead I am hosting a workbook on Sharepoint in Office 365 because that does include Excel Services' REST API but, again, all I can do is show you screenshots. N.B. If anyone out there knows how to make Office-365-hosted spreadsheets publicly-accessible (i.e. without requiring a username/password) please do let me know (because knowing which forum on which to ask the question is an exercise in futility). In order to demonstrate Excel Services' REST API I needed some decent data and for that I used the World Tourism Organization Statistics Database and Yearbook - United Nations World Tourism Organization dataset hosted on Azure Datamarket (its free, by the way); this dataset "provides comprehensive information on international tourism worldwide and offers a selection of the latest available statistics on international tourist arrivals, tourism receipts and expenditure" and you can explore the data for yourself here. If you want to play along at home by viewing the data as it exists in Excel then it can be viewed here. Let's dive in.   The root of Excel Services' REST API is the model resource which resides at: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model Note that this is true for every workbook hosted in a Sharepoint document library - each Excel workbook is a RESTful resource. (Update: Mark Stacey on Twitter tells me that "It's turned off by default in onpremise Sharepoint (1 tickbox to turn on though)". Thanks Mark!) The data is provided as an ATOM feed but I have Firefox's feed reading ability turned on so you don't see the underlying XML goo. As you can see there are four top level resources, Ranges, Charts, Tables and PivotTables; exploring one of those resources is where things get interesting. Let's take a look at the Tables Resource: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables Our workbook contains only one table, called ‘Table1’ (to reiterate, you can explore this table yourself here). Viewing that table via the REST API is pretty easy, we simply append the name of the table onto our previous URI: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1') As you can see, that quite simply gives us a representation of the data in that table. What you cannot see from this screenshot is that this is pure HTML that is being served up; that is all well and good but actually we can do more interesting things. If we specify that the data should be returned not as HTML but as: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1')?$format=image then that data comes back as a pure image and can be used in any web page where you would ordinarily use images. This is the thing that I really like about Excel Services’ REST API – we can embed an image in any web page but instead of being a copy of the data, that image is actually live – if the underlying data in the workbook were to change then hitting refresh will show a new image. Pretty cool, no? The same is true of any Charts or Pivot Tables in your workbook - those can be embedded as images too and if the underlying data changes, boom, the image in your web page changes too. There is a lot of data in the workbook so the image returned by that previous URI is too large to show here so instead let’s take a look at a different resource, this time a range: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15') That URI returns cells A1 to C15 from a worksheet called “Data”: And if we ask for that as an image again: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15')?$format=image Were this image resource not behind a username/password then this would be a live image of the data in the workbook as opposed to one that I had to copy and upload elsewhere. Nonetheless I hope this little wrinkle doesn't detract from the inate value of what I am trying to articulate here; that an existing image in a web page can be changed on-the-fly simply by inserting some data into an Excel workbook. I for one think that that is very cool indeed! I think that's enough in the way of demo for now as this shows what is possible using Excel Services' REST API. Of course, not all features work quite how I would like and here is a bulleted list of some of my more negative feedback: The URIs are pig-ugly. Are "_vti_bin" & "ExcelRest.aspx" really necessary as part of the URI? Would this not be better: http://server/Documents/TourismExpenditureInMillionsOfUSD.xlsx/Model/Tables(‘Table1’) That URI provides the necessary addressability and is a lot easier to remember. Discoverability of these resources is not easy, we essentially have to handcrank a URI ourselves. Take the example of embedding a chart into a blog post - would it not be better if I could browse first through the document library to an Excel workbook and THEN through the workbook to the chart/range/table that I am interested in? Call it a wizard if you like. That would be really cool and would, I am sure, promote this feature and cut down on the copy-and-paste disease that the REST API is meant to alleviate. The resources that I demonstrated can be returned as feeds as well as images or HTML simply by changing the format parameter to ?$format=atom however for some inexplicable reason they don't return OData and no-one on the Excel Services team can tell me why (believe me, I have asked). $format is an OData parameter however other useful parameters such as $top and $filter are not supported. It would be nice if they were. Although I haven't demonstrated it here Excel Services' REST API does provide a makeshift way of altering the data by changing the value of specific cells however what it does not allow you to do is add new data into the workbook. Google Docs allows this and was one of the motivating factors for Chris Webb's forum post that I linked to above. None of this works for Excel workbooks hosted on SkyDrive This blog post is as long as it needs to be for a short introduction so I'll stop now. If you want to know more than I recommend checking out a few links: Excel Services REST API documentation on MSDNSo what does REST on Excel Services look like??? by Shahar PrishExcel Services in SharePoint 2010 REST API Syntax by Christian Stich. Any thoughts? Let's hear them in the comments section below! @Jamiet 

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #035

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Row Overflow Data Explanation  In SQL Server 2005 one table row can contain more than one varchar(8000) fields. One more thing, the exclusions has exclusions also the limit of each individual column max width of 8000 bytes does not apply to varchar(max), nvarchar(max), varbinary(max), text, image or xml data type columns. Comparison Index Fragmentation, Index De-Fragmentation, Index Rebuild – SQL SERVER 2000 and SQL SERVER 2005 An old but like a gold article. Talks about lots of concepts related to Index and the difference from earlier version to the newer version. I strongly suggest that everyone should read this article just to understand how SQL Server has moved forward with the technology. Improvements in TempDB SQL Server 2005 had come up with quite a lots of improvements and this blog post describes them and explains the same. If you ask me what is my the most favorite article from early career. I must point out to this article as when I wrote this one I personally have learned a lot of new things. Recompile All The Stored Procedure on Specific TableI prefer to recompile all the stored procedure on the table, which has faced mass insert or update. sp_recompiles marks stored procedures to recompile when they execute next time. This blog post explains the same with the help of a script.  2008 SQLAuthority Download – SQL Server Cheatsheet You can download and print this cheat sheet and use it for your personal reference. If you have any suggestions, please let me know and I will see if I can update this SQL Server cheat sheet. Difference Between DBMS and RDBMS What is the difference between DBMS and RDBMS? DBMS – Data Base Management System RDBMS – Relational Data Base Management System or Relational DBMS High Availability – Hot Add Memory Hot Add CPU and Hot Add Memory are extremely interesting features of the SQL Server, however, personally I have not witness them heavily used. These features also have few restriction as well. I blogged about them in detail. 2009 Delete Duplicate Rows I have demonstrated in this blog post how one can identify and delete duplicate rows. Interesting Observation of Logon Trigger On All Servers – Solution The question I put forth in my previous article was – In single login why the trigger fires multiple times; it should be fired only once. I received numerous answers in thread as well as in my MVP private news group. Now, let us discuss the answer for the same. The answer is – It happens because multiple SQL Server services are running as well as intellisense is turned on. Blog post demonstrates how we can do the same with the help of SQL scripts. Management Studio New Features I have selected my favorite 5 features and blogged about it. IntelliSense for Query Editing Multi Server Query Query Editor Regions Object Explorer Enhancements Activity Monitors Maximum Number of Index per Table One of the questions I asked in my user group was – What is the maximum number of Index per table? I received lots of answers to this question but only two answers are correct. Let us now take a look at them in this blog post. 2010 Default Statistics on Column – Automatic Statistics on Column The truth is, Statistics can be in a table even though there is no Index in it. If you have the auto- create and/or auto-update Statistics feature turned on for SQL Server database, Statistics will be automatically created on the Column based on a few conditions. Please read my previously posted article, SQL SERVER – When are Statistics Updated – What triggers Statistics to Update, for the specific conditions when Statistics is updated. 2011 T-SQL Scripts to Find Maximum between Two Numbers In this blog post there are two different scripts listed which demonstrates way to find the maximum number between two numbers. I need your help, which one of the script do you think is the most accurate way to find maximum number? Find Details for Statistics of Whole Database – DMV – T-SQL Script I was recently asked is there a single script which can provide all the necessary details about statistics for any database. This question made me write following script. I was initially planning to use sp_helpstats command but I remembered that this is marked to be deprecated in future. 2012 Introduction to Function SIGN SIGN Function is very fundamental function. It will return the value 1, -1 or 0. If your value is negative it will return you negative -1 and if it is positive it will return you positive +1. Let us start with a simple small example. Template Browser – A Very Important and Useful Feature of SSMS Templates are like a quick cheat sheet or quick reference. Templates are available to create objects like databases, tables, views, indexes, stored procedures, triggers, statistics, and functions. Templates are also available for Analysis Services as well. The template scripts contain parameters to help you customize the code. You can Replace Template Parameters dialog box to insert values into the script. An invalid floating point operation occurred If you run any of the above functions they will give you an error related to invalid floating point. Honestly there is no workaround except passing the function appropriate values. SQRT of a negative number will give you result in real numbers which is not supported at this point of time as well LOG of a negative number is not possible (because logarithm is the inverse function of an exponential function and the exponential function is NEVER negative). Validating Spatial Object with IsValidDetailed Function SQL Server 2012 has introduced the new function IsValidDetailed(). This function has made my life very easy. In simple words, this function will check if the spatial object passed is valid or not. If it is valid it will give information that it is valid. If the spatial object is not valid it will return the answer that it is not valid and the reason for the same. This makes it very easy to debug the issue and make the necessary correction. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • C#: LINQ vs foreach - Round 1.

    - by James Michael Hare
    So I was reading Peter Kellner's blog entry on Resharper 5.0 and its LINQ refactoring and thought that was very cool.  But that raised a point I had always been curious about in my head -- which is a better choice: manual foreach loops or LINQ?    The answer is not really clear-cut.  There are two sides to any code cost arguments: performance and maintainability.  The first of these is obvious and quantifiable.  Given any two pieces of code that perform the same function, you can run them side-by-side and see which piece of code performs better.   Unfortunately, this is not always a good measure.  Well written assembly language outperforms well written C++ code, but you lose a lot in maintainability which creates a big techncial debt load that is hard to offset as the application ages.  In contrast, higher level constructs make the code more brief and easier to understand, hence reducing technical cost.   Now, obviously in this case we're not talking two separate languages, we're comparing doing something manually in the language versus using a higher-order set of IEnumerable extensions that are in the System.Linq library.   Well, before we discuss any further, let's look at some sample code and the numbers.  First, let's take a look at the for loop and the LINQ expression.  This is just a simple find comparison:       // find implemented via LINQ     public static bool FindViaLinq(IEnumerable<int> list, int target)     {         return list.Any(item => item == target);     }         // find implemented via standard iteration     public static bool FindViaIteration(IEnumerable<int> list, int target)     {         foreach (var i in list)         {             if (i == target)             {                 return true;             }         }           return false;     }   Okay, looking at this from a maintainability point of view, the Linq expression is definitely more concise (8 lines down to 1) and is very readable in intention.  You don't have to actually analyze the behavior of the loop to determine what it's doing.   So let's take a look at performance metrics from 100,000 iterations of these methods on a List<int> of varying sizes filled with random data.  For this test, we fill a target array with 100,000 random integers and then run the exact same pseudo-random targets through both searches.                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     Any         10       26          0.00046             30.00%     Iteration   10       20          0.00023             -     Any         100      116         0.00201             18.37%     Iteration   100      98          0.00118             -     Any         1000     1058        0.01853             16.78%     Iteration   1000     906         0.01155             -     Any         10,000   10,383      0.18189             17.41%     Iteration   10,000   8843        0.11362             -     Any         100,000  104,004     1.8297              18.27%     Iteration   100,000  87,941      1.13163             -   The LINQ expression is running about 17% slower for average size collections and worse for smaller collections.  Presumably, this is due to the overhead of the state machine used to track the iterators for the yield returns in the LINQ expressions, which seems about right in a tight loop such as this.   So what about other LINQ expressions?  After all, Any() is one of the more trivial ones.  I decided to try the TakeWhile() algorithm using a Count() to get the position stopped like the sample Pete was using in his blog that Resharper refactored for him into LINQ:       // Linq form     public static int GetTargetPosition1(IEnumerable<int> list, int target)     {         return list.TakeWhile(item => item != target).Count();     }       // traditionally iterative form     public static int GetTargetPosition2(IEnumerable<int> list, int target)     {         int count = 0;           foreach (var i in list)         {             if(i == target)             {                 break;             }               ++count;         }           return count;     }   Once again, the LINQ expression is much shorter, easier to read, and should be easier to maintain over time, reducing the cost of technical debt.  So I ran these through the same test data:                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile   10       41          0.00041             128%     Iteration   10       18          0.00018             -     TakeWhile   100      171         0.00171             88%     Iteration   100      91          0.00091             -     TakeWhile   1000     1604        0.01604             94%     Iteration   1000     825         0.00825             -     TakeWhile   10,000   15765       0.15765             92%     Iteration   10,000   8204        0.08204             -     TakeWhile   100,000  156950      1.5695              92%     Iteration   100,000  81635       0.81635             -     Wow!  I expected some overhead due to the state machines iterators produce, but 90% slower?  That seems a little heavy to me.  So then I thought, well, what if TakeWhile() is not the right tool for the job?  The problem is TakeWhile returns each item for processing using yield return, whereas our for-loop really doesn't care about the item beyond using it as a stop condition to evaluate. So what if that back and forth with the iterator state machine is the problem?  Well, we can quickly create an (albeit ugly) lambda that uses the Any() along with a count in a closure (if a LINQ guru knows a better way PLEASE let me know!), after all , this is more consistent with what we're trying to do, we're trying to find the first occurence of an item and halt once we find it, we just happen to be counting on the way.  This mostly matches Any().       // a new method that uses linq but evaluates the count in a closure.     public static int TakeWhileViaLinq2(IEnumerable<int> list, int target)     {         int count = 0;         list.Any(item =>             {                 if(item == target)                 {                     return true;                 }                   ++count;                 return false;             });         return count;     }     Now how does this one compare?                         List<T> On 100,000 Iterations     Method         Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile      10       41          0.00041             128%     Any w/Closure  10       23          0.00023             28%     Iteration      10       18          0.00018             -     TakeWhile      100      171         0.00171             88%     Any w/Closure  100      116         0.00116             27%     Iteration      100      91          0.00091             -     TakeWhile      1000     1604        0.01604             94%     Any w/Closure  1000     1101        0.01101             33%     Iteration      1000     825         0.00825             -     TakeWhile      10,000   15765       0.15765             92%     Any w/Closure  10,000   10802       0.10802             32%     Iteration      10,000   8204        0.08204             -     TakeWhile      100,000  156950      1.5695              92%     Any w/Closure  100,000  108378      1.08378             33%     Iteration      100,000  81635       0.81635             -     Much better!  It seems that the overhead of TakeAny() returning each item and updating the state in the state machine is drastically reduced by using Any() since Any() iterates forward until it finds the value we're looking for -- for the task we're attempting to do.   So the lesson there is, make sure when you use a LINQ expression you're choosing the best expression for the job, because if you're doing more work than you really need, you'll have a slower algorithm.  But this is true of any choice of algorithm or collection in general.     Even with the Any() with the count in the closure it is still about 30% slower, but let's consider that angle carefully.  For a list of 100,000 items, it was the difference between 1.01 ms and 0.82 ms roughly in a List<T>.  That's really not that bad at all in the grand scheme of things.  Even running at 90% slower with TakeWhile(), for the vast majority of my projects, an extra millisecond to save potential errors in the long term and improve maintainability is a small price to pay.  And if your typical list is 1000 items or less we're talking only microseconds worth of difference.   It's like they say: 90% of your performance bottlenecks are in 2% of your code, so over-optimizing almost never pays off.  So personally, I'll take the LINQ expression wherever I can because they will be easier to read and maintain (thus reducing technical debt) and I can rely on Microsoft's development to have coded and unit tested those algorithm fully for me instead of relying on a developer to code the loop logic correctly.   If something's 90% slower, yes, it's worth keeping in mind, but it's really not until you start get magnitudes-of-order slower (10x, 100x, 1000x) that alarm bells should really go off.  And if I ever do need that last millisecond of performance?  Well then I'll optimize JUST THAT problem spot.  To me it's worth it for the readability, speed-to-market, and maintainability.

    Read the article

  • More Fun With Math

    - by PointsToShare
    More Fun with Math   The runaway student – three different ways of solving one problem Here is a problem I read in a Russian site: A student is running away. He is moving at 1 mph. Pursuing him are a lion, a tiger and his math teacher. The lion is 40 miles behind and moving at 6 mph. The tiger is 28 miles behind and moving at 4 mph. His math teacher is 30 miles behind and moving at 5 mph. Who will catch him first? Analysis Obviously we have a set of three problems. They are all basically the same, but the details are different. The problems are of the same class. Here is a little excursion into computer science. One of the things we strive to do is to create solutions for classes of problems rather than individual problems. In your daily routine, you call it re-usability. Not all classes of problems have such solutions. If a class has a general (re-usable) solution, it is called computable. Otherwise it is unsolvable. Within unsolvable classes, we may still solve individual (some but not all) problems, albeit with different approaches to each. Luckily the vast majority of our daily problems are computable, and the 3 problems of our runaway student belong to a computable class. So, let’s solve for the catch-up time by the math teacher, after all she is the most frightening. She might even make the poor runaway solve this very problem – perish the thought! Method 1 – numerical analysis. At 30 miles and 5 mph, it’ll take her 6 hours to come to where the student was to begin with. But by then the student has advanced by 6 miles. 6 miles require 6/5 hours, but by then the student advanced by another 6/5 of a mile as well. And so on and so forth. So what are we to do? One way is to write code and iterate it until we have solved it. But this is an infinite process so we’ll end up with an infinite loop. So what to do? We’ll use the principles of numerical analysis. Any calculator – your computer included – has a limited number of digits. A double floating point number is good for about 14 digits. Nothing can be computed at a greater accuracy than that. This means that we will not iterate ad infinidum, but rather to the point where 2 consecutive iterations yield the same result. When we do financial computations, we don’t even have to go that far. We stop at the 10th of a penny.  It behooves us here to stop at a 10th of a second (100 milliseconds) and this will how we will avoid an infinite loop. Interestingly this alludes to the Zeno paradoxes of motion – in particular “Achilles and the Tortoise”. Zeno says exactly the same. To catch the tortoise, Achilles must always first come to where the tortoise was, but the tortoise keeps moving – hence Achilles will never catch the tortoise and our math teacher (or lion, or tiger) will never catch the student, or the policeman the thief. Here is my resolution to the paradox. The distance and time in each step are smaller and smaller, so the student will be caught. The only thing that is infinite is the iterative solution. The race is a convergent geometric process so the steps are diminishing, but each step in the solution takes the same amount of effort and time so with an infinite number of steps, we’ll spend an eternity solving it.  This BTW is an original thought that I have never seen before. But I digress. Let’s simply write the code to solve the problem. To make sure that it runs everywhere, I’ll do it in JavaScript. function LongCatchUpTime(D, PV, FV) // D is Distance; PV is Pursuers Velocity; FV is Fugitive’ Velocity {     var t = 0;     var T = 0;     var d = parseFloat(D);     var pv = parseFloat (PV);     var fv = parseFloat (FV);     t = d / pv;     while (t > 0.000001) //a 10th of a second is 1/36,000 of an hour, I used 1/100,000     {         T = T + t;         d = t * fv;         t = d / pv;     }     return T;     } By and large, the higher the Pursuer’s velocity relative to the fugitive, the faster the calculation. Solving this with the 10th of a second limit yields: 7.499999232000001 Method 2 – Geometric Series. Each step in the iteration above is smaller than the next. As you saw, we stopped iterating when the last step was small enough, small enough not to really matter.  When we have a sequence of numbers in which the ratio of each number to its predecessor is fixed we call the sequence geometric. When we are looking at the sum of sequence, we call the sequence of sums series.  Now let’s look at our student and teacher. The teacher runs 5 times faster than the student, so with each iteration the distance between them shrinks to a fifth of what it was before. This is a fixed ratio so we deal with a geometric series.  We normally designate this ratio as q and when q is less than 1 (0 < q < 1) the sum of  + … +  is  – 1) / (q – 1). When q is less than 1, it is easier to use ) / (1 - q). Now, the steps are 6 hours then 6/5 hours then 6/5*5 and so on, so q = 1/5. And the whole series is multiplied by 6. Also because q is less than 1 , 1/  diminishes to 0. So the sum is just  / (1 - q). or 1/ (1 – 1/5) = 1 / (4/5) = 5/4. This times 6 yields 7.5 hours. We can now continue with some algebra and take it back to a simpler formula. This is arduous and I am not going to do it here. Instead let’s do some simpler algebra. Method 3 – Simple Algebra. If the time to capture the fugitive is T and the fugitive travels at 1 mph, then by the time the pursuer catches him he travelled additional T miles. Time is distance divided by speed, so…. (D + T)/V = T  thus D + T = VT  and D = VT – T = (V – 1)T  and T = D/(V – 1) This “strangely” coincides with the solution we just got from the geometric sequence. This is simpler ad faster. Here is the corresponding code. function ShortCatchUpTime(D, PV, FV) {     var d = parseFloat(D);     var pv = parseFloat (PV);     var fv = parseFloat (FV);     return d / (pv - fv); } The code above, for both the iterative solution and the algebraic solution are actually for a larger class of problems.  In our original problem the student’s velocity (speed) is 1 mph. In the code it may be anything as long as it is less than the pursuer’s velocity. As long as PV > FV, the pursuer will catch up. Here is the really general formula: T = D / (PV – FV) Finally, let’s run the program for each of the pursuers.  It could not be worse. I know he’d rather be eaten alive than suffering through yet another math lesson. See the code run? Select  “Catch Up Time” in www.mgsltns.com/games.htm The host is running on Unix, so the link is case sensitive. That’s All Folks

    Read the article

  • Advanced Continuous Delivery to Azure from TFS, Part 1: Good Enough Is Not Great

    - by jasont
    The folks over on the TFS / Visual Studio team have been working hard at releasing a steady stream of new features for their new hosted Team Foundation Service in the cloud. One of the most significant features released was simple continuous delivery of your solution into your Azure deployments. The original announcement from Brian Harry can be found here. Team Foundation Service is a great platform for .Net developers who are used to working with TFS on-premises. I’ve been using it since it became available at the //BUILD conference in 2011, and when I recently came to work at Stackify, it was one of the first changes I made. Managing work items is much easier than the tool we were using previously, although there are some limitations (more on that in another blog post). However, when continuous deployment was made available, it blew my mind. It was the killer feature I didn’t know I needed. Not to say that I wasn’t previously an advocate for continuous delivery; just that it was always a pain to set up and configure. Having it hosted - and a one-click setup – well, that’s just the best thing since sliced bread. It made perfect sense: my source code is in the cloud, and my deployment is in the cloud. Great! I can queue up a build from my iPad or phone and just let it go! I quickly tore through the quick setup and saw it all work… sort of. This will be the first in a three part series on how to take the building block of Team Foundation Service continuous delivery and build a CD model that will actually work for any team deploying something more advanced than a “Hello World” example. Part 1: Good Enough Is Not Great Part 2: A Model That Works: Branching and Multiple Deployment Environments Part 3: Other Considerations: SQL, Custom Tasks, Etc Good Enough Is Not Great There. I’ve said it. I certainly hope no one on the TFS team is offended, but it’s the truth. Let’s take a look under the hood and understand how it works, and then why it’s not enough to handle real world CD as-is. How it works. (note that I’ve skipped a couple of steps; I already have my accounts set up and something deployed to Azure) The first step is to establish some oAuth magic between your Azure management portal and your TFS Instance. You do this via the management portal. Once it’s done, you have a new build process template in your TFS instance. (Image lifted from the documentation) From here, you’ll get the usual prompts for security, allowing access, etc. But you’ll also get to pick which Solution in your source control to build. Here’s what the bulk of the build definition looks like. All I’ve had to do is add in the solution to build (notice that mine is from a specific branch – Release – more on that later) and I’ve changed the configuration. I trigger the build, and voila! I have an Azure deployment a few minutes later. The beauty of this is that it’s all in the cloud and I’m not waiting for my machine to compile and upload the package. (I also had to enable the build definition first – by default it is created in disabled state, probably a good thing since it will trigger on every.single.checkin by default.) I get to see a history of deployments from the Azure portal, and can link into TFS to see the associated changesets and work items. You’ll notice also that this build definition also automatically put my code in the Staging slot of my Azure deployment – more on this soon. For now, I can VIP swap and be in production. (P.S. I hate VIP swap and “production” and “staging” in Azure. More on that later too.) That’s it. That’s the default out-of-box experience. Easy, right? But it’s full of room for improvement, so let’s get into that….   The Problems Nothing is perfect (except my code – it’s always perfect), and neither is Continuous Deployment without a bit of work to help it fit your dev team’s process. So what are the issues? Issue 1: Staging vs QA vs Prod vs whatever other environments your team may have. This, for me, is the big hairy one. Remember how this automatically deployed to staging rather than prod for us? There are a couple of issues with this model: If I want to deliver to prod, it requires intervention on my part after deployment (via a VIP swap). If I truly want to promote between environments (i.e. Nightly Build –> Stable QA –> Production) I likely have configuration changes between each environment such as database connection strings and this process (and the VIP swap) doesn’t account for this. Yet. Issue 2: Branching and delivering on every check-in. As I mentioned above, I have set this up to target a specific branch – Release – of my code. For the purposes of this example, I have adopted the “basic” branching strategy as defined by the ALM Rangers. This basically establishes a “Main” trunk where you branch off Dev and Release branches. Granted, the Release branch is usually the only thing you will deploy to production, but you certainly don’t want to roll to production automatically when you merge to the Release branch and check-in (unless you like the thrill of it, and in that case, I like your style, cowboy….). Rather, you have nightly build and QA environments, or if you’ve adopted the feature-branch model you have environments for those. Those are the environments you want to continuously deploy to. But that takes us back to Issue 1: we currently have a 1:1 solution to Azure deployment target. Issue 3: SQL and other custom tasks. Let’s be honest and address the elephant in the room: I need to get some sleep because I see an elephant in the room. But seriously, I can’t think of an application I have touched in the last 10 years that doesn’t need to consider SQL changes when deploying code and upgrading an environment. Microsoft seems perfectly content to ignore this elephant for now: yes, they’ve added Data Tier Applications. But let’s be honest with ourselves again: no one really uses it, and it’s not suitable for anything more complex than a Hello World sample project database. Why? Because it doesn’t fit well into a great source control story. Developers make stored procedure and table changes all day long while coding complex applications, and if someone forgets to go update the DACPAC before the automated deployment, you have a broken build until it’s completed. Developers – not just DBAs – also like to work with SQL in SQL tools, not in Visual Studio. I’m really picking on SQL because that’s generally the biggest concern that I hear. But we need to account for any custom tasks as well in the build process.   The Solutions… ? We’ve taken a look at how this all works, and addressed the shortcomings. In my next post (which I promise will be very, very soon), I will detail how I’ve overcome these shortcomings and used this foundation to create a mature, flexible model for deploying my app – any version, any time, to any environment.

    Read the article

  • Pet Peeves with the Windows Phone 7 Marketplace

    - by Bil Simser
    Have you ever noticed how something things just gnaw at your very being. This is the case with the WP7 marketplace, the Zune software, and the things that drive me batshit crazy with a side of fries. To go. I wanted to share. XBox Live is Not the Centre of the Universe Okay, it’s fine that the Zune software has an XBox live tag for games so can see them clearly but do we really need to have it shoved down our throats. On every click? Click on Games in the marketplace: The first thing that it defaults to on the filters on the right is XBox Live: Okay. Fine. However if you change it (say to Paid) then click onto a title when you come back from that title is the filter still set to Paid? No. It’s back to XBox Live again. Really? Give us a break. If you change to any filter on any other genre then click on the selected title, it doesn’t revert back to anything. It stays on the selection you picked. Let’s be fair here. The Games genre should behave just like every other one. If I pick Paid then when I come back to the list please remember that. Double Dipping On the subject of XBox Live titles, Microsoft (and developers who have an agreement with Microsoft to produce Live titles, which generally rules out indie game developers) is double dipping with regards to exposure of their titles. Here’s the Puzzle and Trivia Game section on the Marketplace for XBox Live titles: And here’s the same category filtered on Paid titles: See the problem? Two indie titles while the rest are XBox Live ones. So while XBL has it’s filter, they also get to showcase their wares in the Paid and Free filters as well. If you’re going to have an XBox Live filter then use it and stop pushing down indie titles until they’re off the screen (on some genres this is already the case). Free and Paid titles should be just that and not include XBox Live ones. If you’re really stoked that people can’t find the Free XBox Live titles vs. the paid ones, then create a Free XBox Live filter and a Paid XBox Live filter. I don’t think we would mind much. Whose Trial is it Anyways? You might notice apps in the marketplace with titles like “My Fart App Professional Lite” or “Silicon Lamb Spleen Builder Free”. When you submit and app to the marketplace it can either be free or paid. If it’s a paid app you also have the option to submit it with Trial capabilities. It’s up to you to decide what you offer in the trial version but trial versions can be purchased from within the app so after someone trys out your app (for free) and wants to unlock the Super Secret Obama Spy Ring Level, they can just go to the marketplace from your app (if you built that functionality in) and upgrade to the paid version. However it creates a rift of sorts when it comes to visibility. Some developers go the route of the paid app with a trial version, others decide to submit *two* apps instead of one. One app is the “Free” or “Lite” verions and the other is the paid version. Why go to the hassle of submitting two apps when you can just create a trial version in the same app? Again, visibility. There’s no way to tell Paid apps with Trial versions and ones without (it’s an option, you don’t have to provide trial versions, although I think it’s a good idea). However there is a way to see the Free apps from the Paid ones so some submit the two apps and have the Free version have links to buy the paid one (again through the Marketplace tasks in the API). What we as developers need for visibility is a new filter. Trial. That’s it. It would simply filter on Paid apps that have trial capabilities and surface up those apps just like the free ones. If Microsoft added this filter to the marketplace, it would eliminate the need for people to submit their “Free” and “Lite” versions and make it easier for the developer not to have to maintain two systems. I mean, is it really that hard? Can’t be any more difficult than the XBox Live Filter that’s already there. Location is Everything The last thing on my bucket list is about location. When I launch Zune I’m running in my native location setting, Canada. What’s great is that I navigate to the Travel Tools section where I have one of my apps and behold the splendour that I see: There are my apps in the number 1 and number 4 slot for top selling in that category. I show it to my wife to make up for the sleepless nights writing this stuff and we dance around and celebrate. Then I change my location on my operation system to United States and re-launch Zune. WTF? My flight app has slipped to the 10th spot (I’m only showing 4 across here out of the 7 in Zune) and my border check app that was #1 is now in the 32nd spot! End of celebration. Not only is relevance being looked at here, I value the comments people make on may apps as do most developers. I want to respond to them and show them that I’m listening. The next version of my border app will provide multiple camera angles. However when I’m running in my native Canada location, I only see two reviews. Changing over to United States I see fourteen! While there are tools out there to provide with you a unified view, I shouldn’t have to rely on them. My own Zune desktop software should allow me to see everything. I realize that some developers will submit an app and only target it for some locations and that’s their choice. However I shouldn’t have to jump through hoops to see what apps are ahead of mine, or see people comments and ratings. Another proposal. Either unify the marketplace (i.e. when I’m looking at it show me everything combined) or let me choose a filter. I think the first option might be difficult as you’re trying to average out top selling apps across all markets and have to deal with some apps that have been omitted from some markets. Although I think you could come up with a set of use cases that would handle that, maybe that’s too much work. At the very least, let us developers view the markets in a drop down or something from within the Zune desktop. Having to shut down Zune, change our location, and re-launch Zune to see other perspectives is just too onerous. A Call to Action These are just one mans opinion. Do you agree? Disagree? Feel hungry for a bacon sandwich? Let everyone know via the comments below. Perhaps someone from Microsoft will be reading and take some of these ideas under advisement. Maybe not, but at least let’s get the word out that we really want to see some change. Egypt can do it, why not WP7 developers!

    Read the article

  • The SSIS tuning tip that everyone misses

    - by Rob Farley
    I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

    Read the article

  • Azure Diagnostics: The Bad, The Ugly, and a Better Way

    - by jasont
    If you’re a .Net web developer today, no doubt you’ve enjoyed watching Windows Azure grow up over the past couple of years. The platform has scaled, stabilized (mostly), and added on a slew of great (and sometimes overdue) features. What was once just an endpoint to host a solution, developers today have tremendous flexibility and options in the platform. Organizations are building new solutions and offerings on the platform, and others have, or are in the process of, migrating existing applications out of their own data centers into the Azure cloud. Whether new application development or migrating legacy, every development shop and IT organization needs to monitor their applications in the cloud, the same as they do on premises. Azure Diagnostics has some capabilities, but what I constantly hear from users is that it’s either (a) not enough, or (b) too cumbersome to set up. Today, Stackify is happy to announce that we fully support Azure deployments, just the same as your on-premises deployments. Let’s take a look below and compare and contrast the options. Azure Diagnostics Let’s crack open the Windows Azure documentation on Azure Diagnostics and see just how easy it is to use. The high level steps are:   Step 1: Import the Diagnostics Oh, I’ve already deployed my app without the diagnostics module. Guess I can’t do anything until I do this and re-deploy. Step 2: Configure the Diagnostics (and multiple sub-steps) Do I want it all? Or just pieces of it? Whoops, forgot to include a specific performance counter, I guess I’ll have to deploy again. Wait a minute… I have to specifically code these performance counters into my role’s OnStart() method, compile and deploy again? And query and consume it myself? Step 3: (Optional) Permanently store diagnostic data Lucky for me, Azure storage has gotten pretty cheap. But how often should I move the data into storage? I want to see real-time data, so I guess that’s out now as well. Step 4: (Optional) View stored diagnostic data Optional? Of course I want to see it. Conveniently, Microsoft recommends 3 tools to do this with. Un-conveniently, none of these are web based and they all just give you access to raw data, and very little charting or real-time intelligence. Just….. data. Nevermind that one product seems to have gotten stale since a recent acquisition, and doesn’t even have screenshots!   So, let’s summarize: lots of diagnostics data is available, but think realistically. Think Dev Ops. What happens when you are in the middle of a major production performance issue and you don’t have the diagnostics you need? You are redeploying an application (and thankfully you have a great branching strategy, so you feel perfectly safe just willy-nilly launching code into prod, don’t you?) to get data, then shipping it to storage, and then digging through that data to find a needle in a haystack. Would you like to be able to troubleshoot a performance issue in the middle of the night, or on a weekend, from your iPad or home computer’s web browser? Forget it: the best you get is this spark line in the Azure portal. If it’s real pointy, you probably have an issue; but since there is no alert based on a threshold your customers have likely already let you know. And high CPU, Memory, I/O, or Network doesn’t tell you anything about where the problem is. The Better Way – Stackify Stackify supports application and server monitoring in real time, all through a great web interface. All of the things that Azure Diagnostics provides, Stackify provides for your on-premises deployments, and you don’t need to know ahead of time that you’ll need it. It’s always there, it’s always on. Azure deployments are essentially no different than on-premises. It’s a Windows Server (or Linux) in the cloud. It’s behind a different firewall than your corporate servers. That’s it. Stackify can provide the same powerful tools to your Azure deployments in two simple steps. Step 1 Add a startup task to your web or worker role and deploy. If you can’t deploy and need it right now, no worries! Remote Desktop to the Azure instance and you can execute a Powershell script to download / install Stackify.   Step 2 Log in to your account at www.stackify.com and begin monitoring as much as you want, as often as you want and see the results instantly. WMI? It’s there Event Viewer? You’ve got it. File System Access? Yes, please! Would love to make sure my web.config is correct.   IIS / App Pool Info? Yep. You can even restart it. Running Services? All of them. Start and Stop them to your heart’s content. SQL Database access? You bet’cha. Alerts and Notification? Of course! You should know before your customers let you know. … and so much more.   Conclusion Microsoft has shown, consistently, that they love developers, developers, developers. What every developer needs to realize from this is that they’ve given you a canvas, which is exactly what Azure is. It’s great infrastructure that is readily available, easy to manage, and fairly cost effective. However, the tooling is your responsibility. What you get, at best, is bare bones. App and server diagnostics should be available when you need them. While we, as developers, try to plan for and think of everything ahead of time, there will come times where we need to get data that just isn’t available. And having to go through a lot of cumbersome steps to get that data, and then have to find a friendlier way to consume it…. well, that just doesn’t make a lot of sense to me. I’d rather spend my time writing and developing features and completing bug fixes for my applications, than to be writing code to monitor and diagnose.

    Read the article

  • How I Work: Staying Productive Whilst Traveling

    - by BuckWoody
    I travel a lot. Not like some folks that are gone every week, mind you, although in the last month I’ve been to: Cambridge, UK; Anchorage, AK; San Jose, CA; Copenhagen, DK, Boston, MA; and I’m currently en-route to Anaheim, CA.  While this many places in a month is a bit unusual for me, I would say I travel frequently. I’ve travelled most of my 28+ years in IT, and at one time was a consultant traveling weekly.   With that much time away from my primary work location, I have to find ways to stay productive. Some might say “just rest – take a nap!” – but I’m not able to do that. For one thing, I’m a very light sleeper and I’ve never slept on a plane - even a 30+ hour trip to New Zealand in Business Class - so that just isn’t option. I also am not always in the plane, of course. There’s the hotel, the taxi/bus/train, the airport and then all that over again when I arrive. Since my regular jobs have many demands, I have to get work done.   Note: No, I’m not always focused on work. I need downtime just like everyone else. Sometimes I just think, watch a movie or listen to tunes – and I give myself permission to do that anytime – sometimes the whole trip. I have too fewheartbeats left in life to only focus on work – it’s just not that important, and neither am I. Some of these tasks are letters to friends and family, or other personal things. What I’m talking about here is a plan, not some task list I have to follow. When I get to the location I’m traveling to, I always build in as much time as I can to ensure I enjoy those sights and the people I’m with. I would find traveling to be a waste if not for that.   The Unrealistic Expectation As I would evaluate the trip I was taking – say a 6-8 hour flight – I would expect to get 10-12 hours of work done. After all, there’s the time at the airport, the taxi and so on, and then of course the time in the air with all of the room, power, internet and everything else I needed to get my work done. I would pile up tasks at home, pack my bags, and head happily to the magical land of the TSA.   Right. On return from the trip, I had accomplished little, had more e-mails and other work that had piled up, and I was tired, hungry, and unorganized. This had to change. So, I decided to do three things: Segment my work Set realistic expectations Plan accordingly  Segmenting By Available Resources The first task was to decide what kind of work I could do in each location – if any. I found that I was dependent on a few things to get work done, such as power, the Internet, and a place to sit down. Before I fly, I take some time at home to get all of the work I’d like to accomplish while away segmented into these areas, and print that out on paper, which goes in my suit-coat pocket along with a mechanical pencil. I print my tickets, and I’m all set for the adventure ahead. Then I simply do each kind of work whenever I’m in that situation. No power There are certain times when I don’t have power available. But not only that, I might not even be able to use most of my electronics. So I now schedule as many phone calls as I can for the taxi/bus/train ride and the airports as I can. I have a paper notebook (Moleskine, of course) and a pencil and I print out any notes or numbers I need prior to the trip. Once I’m airborne or at the airport, I work on my laptop. I check and respond to e-mails, create slides, write code, do architecture, whatever I can.  If I can’t use any electronics, or once the power runs out, I schedule time for reading. I can read at the airport or anywhere, actually, even in-flight or any other transport. I “read with a pencil”, meaning I take a lot of notes, which I liketo put in OneNote, but since in most cases I don’t have power, I use the Moleskine to do that. Speaking of which, sometimes as I’m thinking I come up with new topics, ideas, blog posts, or things to teach in my classes. Once again I take out the notebook and write it down. All of these notes get a check-mark when I get back to the office and transfer the writing to OneNote. I’ve tried those “smart pens” and so on to automate this, but it just never works out. Pencil and paper are just fine. As I mentioned, sometime I just need to think. I’ll do nothing, and let my mind wander, thinking of nothing in particular, or some math problem or science question I’m interested in. My only issue with this is that I communicate tothink, and I don’t want to drive people crazy by being that guy that won’t shut up, so I think in a different way. Power, but no Internet or Phone If I have power but no Internet or phone, I focus on the laptop and the tablet as before, and I also recharge my other gadgets. Power, Internet, Phone and a Place to Work At first I thought that when I arrived at the hotel or event I could get the same amount of work done that I do at the office. Not so. There’s simply too many distractions, things you need, or other issues that allow this. Of course, Ican work on any device, read, think, write or whatever, but I am simply not as productive as I am in my home office. So I plan for about 25-50% as much work getting done in this environment as I think I could really do. I’ve done some measurements, and this holds out to be true almost every time. The key is that I re-set my expectations (and my co-worker’s expectations as well) that this is the case. I use the Out-Of-Office notices to let people know that I’m just not going to be 100% at this time – it’s hard for everyone, but it’s more honest and realistic, and I’d rather they know that – and that I realize that – than to let them think I’m totally available. Because I’m not – I’m traveling. I don’t tend to put too much detail, because after all I don’t necessarily want to let people know when I’m not home :) but I do think it’s important to let people that depend on my know that I’ll get back with them later. I hope this helps you think through your own methodology of staying productive when you travel. Or perhaps you just go offline, and don’t worry about any of this – good for you! That’s completely valid as well.   (Oh, and yes, I wrote this at 35K feet, on Alaska Airlines on a trip. :)  Practice what you preach, Buck.)

    Read the article

  • SOA Implementation Challenges

    Why do companies think that if they put up a web service that they are doing Service-Oriented Architecture (SOA)? Unfortunately, the IT and business world love to run on the latest hype or buzz words of which very few even understand the meaning. One of the largest issues companies have today as they consider going down the path of SOA, is the lack of knowledge regarding the architectural style and the over usage of the term SOA. So how do we solve this issue?I am sure most of you are thinking by now that you know what SOA is because you developed a few web services.  Isn’t that SOA, right? No, that is not SOA, but instead Just Another Web Service (JAWS). For us to better understand what SOA is let’s look at a few definitions.Douglas K. Bary defines service-oriented architecture as a collection of services. These services are enabled to communicate with each other in order to pass data or coordinating some activity with other services.If you look at this definition closely you will notice that Bary states that services communicate with each other. Let us compare this statement with my first statement regarding companies that claim to be doing SOA when they have just a collection of web services. In order for these web services to for an SOA application they need to be interdependent on one another forming some sort of architectural hierarchy. Just because a company has a few web services does not mean that they are all interconnected.SearchSOA from TechTarget.com states that SOA defines how two computing entities work collectively to enable one entity to perform a unit of work on behalf of another. Once again, just because a company has a few web services does not guarantee that they are even working together let alone if they are performing work for each other.SearchSOA also points out service interactions should be self-contained and loosely-coupled so that all interactions operate independent of each other.Of all the definitions regarding SOA Thomas Erl’s seems to shed the most light on this concept. He states that “SOA establishes an architectural model that aims to enhance the efficiency, agility, and productivity of an enterprise by positioning services as the primary means through which solution logic is represented in support of the realization of the strategic goals associated with service-oriented computing.” (Erl, 2011) Once again this definition proves that a collection of web services does not mean that a company is doing SOA. However, it does mean that a company has a collection of web services, and that is it.In order for a company to start to go down the path of SOA, they must take  a hard look at their existing business process while abstracting away any technology so that they can define what is they really want to accomplish. Once a company has done this, they can begin to factor out common sub business process like credit card process, user authentication or system notifications in to small components that can be built independent of each other and then reassembled to form new and dynamic services that are loosely coupled and agile in that they can change as a business grows.Another key pitfall of companies doing SOA is the fact that they let vendors drive their architecture. Why do companies do this? Vendors’ do not hold your company’s success as their top priority; in fact they hold their own success as their top priority by selling you as much stuff as you are willing to buy. In my experience companies tend to strive for the maximum amount of benefits with a minimal amount of cost. Does anyone else see any conflicts between this and the driving force behind vendors.Mike Kavis recommends in an article written in CIO.com that companies need to figure out what they need before they talk to a vendor or at least have some idea of what they need. It is important to thoroughly evaluate each vendor and watch them perform a live demo of their system so that you as the company fully understand what kind of product or service the vendor is actually offering. In addition, do research on each vendor that you are considering, check out blog posts, online reviews, and any information you can find on the vendor through various search engines.Finally he recommends companies to verify any recommendations supplied by a vendor. From personal experience this is very important. I can remember when the company I worked for purchased a $200,000 add-on to their phone system that never actually worked as it was intended. In fact, just after my departure from the company started the process of attempting to get their money back from the vendor. This potentially could have been avoided if the company had done the research before selecting this vendor to ensure that their product and vendor would live up to their claims. I know that some SOA vendor offer free training regarding SOA because they know that there are a lot of misconceptions about the topic. Superficially this is a great thing for companies to take part in especially if the company is starting to implement SOA architecture and are still unsure about some topics or are looking for some guidance regarding the topic. However beware that some companies will focus on their product line only regarding the training. As an example, InfoWorld.com claims that companies providing deep seminars disguised as training, focusing more about ESBs and SOA governance technology, and less on how to approach and solve the architectural issues of the attendees.In short, it is important to remember that we as software professionals are responsible for guiding a business’s technology sections should be well informed and fully understand any new concepts that may be considered for implementation. As I have demonstrated already a company that has a few web services does not mean that they are doing SOA.  Additionally, we must not let the new buzz word of the day drive our technology, but instead our technology decisions should be driven from research and proven experience. Finally, it is important to rely on vendors when necessary, however, always take what they say with a grain of salt while cross checking any claims that they may make because we have to live with the aftermath of a system after the vendors are gone.   References: Barry, D. K. (2011). Service-oriented architecture (SOA) definition. Retrieved 12 12, 2011, from Service-Architecture.com: http://www.service-architecture.com/web-services/articles/service-oriented_architecture_soa_definition.html Connell, B. (2003, 9). service-oriented architecture (SOA). Retrieved 12 12, 2011, from SearchSOA: http://searchsoa.techtarget.com/definition/service-oriented-architecture Erl, T. (2011, 12 12). Service-Oriented Architecture. Retrieved 12 12, 2011, from WhatIsSOA: http://www.whatissoa.com/p10.php InfoWorld. (2008, 6 1). Should you get your SOA knowledge from SOA vendors? . Retrieved 12 12, 2011, from InfoWorld.com: http://www.infoworld.com/d/architecture/should-you-get-your-soa-knowledge-soa-vendors-453 Kavis, M. (2008, 6 18). Top 10 Reasons Why People are Making SOA Fail. Retrieved 12 13, 2011, from CIO.com: http://www.cio.com/article/438413/Top_10_Reasons_Why_People_are_Making_SOA_Fail?page=5&taxonomyId=3016  

    Read the article

  • Silverlight 4 Twitter Client &ndash; Part 3

    - by Max
    Finally Silverlight 4 RC is released and also that Windows 7 Phone Series will rely heavily on Silverlight platform for apps platform. its a really good news for Silverlight developers and designers. More information on this here. You can use SL 4 RC with VS 2010. SL 4 RC does not come with VS 2010, you need to download it separately and install it. So for the next part, be ready with VS 2010 and SL4 RC, we will start using them and not With this momentum, let us go to the next part of our twitter client tutorial. This tutorial will cover setting your status in Twitter and also retrieving your 1) As everything in Silverlight is asynchronous, we need to have some visual representation showing that something is going on in the background. So what I did was to create a progress bar with indeterminate animation. The XAML is here below. <ProgressBar Maximum="100" Width="300" Height="50" Margin="20" Visibility="Collapsed" IsIndeterminate="True" Name="progressBar1" VerticalAlignment="Center" HorizontalAlignment="Center" /> 2) I will be toggling this progress bar to show the background work. So I thought of writing this small method, which I use to toggle the visibility of this progress bar. Just pass a bool to this method and this will toggle it based on its current visibility status. public void toggleProgressBar(bool Option){ if (Option) { if (progressBar1.Visibility == System.Windows.Visibility.Collapsed) progressBar1.Visibility = System.Windows.Visibility.Visible; } else { if (progressBar1.Visibility == System.Windows.Visibility.Visible) progressBar1.Visibility = System.Windows.Visibility.Collapsed; }} 3) Now let us create a grid to hold a textbox and a update button. The XAML will look like something below <Grid HorizontalAlignment="Center"> <Grid.RowDefinitions> <RowDefinition Height="50"></RowDefinition> </Grid.RowDefinitions> <Grid.ColumnDefinitions> <ColumnDefinition Width="400"></ColumnDefinition> <ColumnDefinition Width="200"></ColumnDefinition> </Grid.ColumnDefinitions> <TextBox Name="TwitterStatus" Width="380" Height="50"></TextBox> <Button Name="UpdateStatus" Content="Update" Grid.Row="1" Grid.Column="2" Width="200" Height="50" Click="UpdateStatus_Click"></Button></Grid> 4) The click handler for this update button will be again using the Web Client to post values. Posting values using Web Client. The code is: private void UpdateStatus_Click(object sender, RoutedEventArgs e){ toggleProgressBar(true); string statusupdate = "status=" + TwitterStatus.Text; WebRequest.RegisterPrefix("https://", System.Net.Browser.WebRequestCreator.ClientHttp);  WebClient myService = new WebClient(); myService.AllowReadStreamBuffering = true; myService.UseDefaultCredentials = false; myService.Credentials = new NetworkCredential(GlobalVariable.getUserName(), GlobalVariable.getPassword());  myService.UploadStringCompleted += new UploadStringCompletedEventHandler(myService_UploadStringCompleted); myService.UploadStringAsync(new Uri("https://twitter.com/statuses/update.xml"), statusupdate);  this.Dispatcher.BeginInvoke(() => ClearTextBoxValue());} 5) In the above code, we have a event handler which will be fired on this request is completed – !! Remember SL is Asynch !! So in the myService_UploadStringCompleted, we will just toggle the progress bar and change some status text to say that its done. The code for this will be StatusMessage is just another textblock conveniently positioned in the page.  void myService_UploadStringCompleted(object sender, UploadStringCompletedEventArgs e){ if (e.Error != null) { StatusMessage.Text = "Status Update Failed: " + e.Error.Message.ToString(); } else { toggleProgressBar(false); TwitterCredentialsSubmit(); }} 6) Now let us look at fetching the friends updates of the logged in user and displaying it in a datagrid. So just define a data grid and set its autogenerate columns as true. 7) Let us first create a data structure for use with fetching the friends timeline. The code is something like below: namespace MaxTwitter.Classes{ public class Status { public Status() {} public string ID { get; set; } public string Text { get; set; } public string Source { get; set; } public string UserID { get; set; } public string UserName { get; set; } }} You can add as many fields as you want, for the list of fields, have a look at here. It will ask for your Twitter username and password, just provide them and this will display the xml file. Go through them pick and choose your desired fields and include in your Data Structure. 8) Now the web client request for this is similar to the one we saw in step 4. Just change the uri in the last but one step to https://twitter.com/statuses/friends_timeline.xml Be sure to change the event handler to something else and within that we will use XLINQ to fetch the required details for us. Now let us how this event handler fetches details. public void parseXML(string text){ XDocument xdoc; if(text.Length> 0) xdoc = XDocument.Parse(text); else xdoc = XDocument.Parse(@"I USED MY OWN LOCAL COPY OF XML FILE HERE FOR OFFLINE TESTING"); statusList = new List<Status>(); statusList = (from status in xdoc.Descendants("status") select new Status { ID = status.Element("id").Value, Text = status.Element("text").Value, Source = status.Element("source").Value, UserID = status.Element("user").Element("id").Value, UserName = status.Element("user").Element("screen_name").Value, }).ToList(); //MessageBox.Show(text); //this.Dispatcher.BeginInvoke(() => CallDatabindMethod(StatusCollection)); //MessageBox.Show(statusList.Count.ToString()); DataGridStatus.ItemsSource = statusList; StatusMessage.Text = "Datagrid refreshed."; toggleProgressBar(false);} in the event handler, we call this method with e.Result.ToString() Parsing XML files using LINQ is super cool, I love it.   I am stopping it here for  this post. Will post the completed files in next post, as I’ve worked on a few more features in this page and don’t want to confuse you. See you soon in my next post where will play with Twitter lists. Have a nice day! Technorati Tags: Silverlight,LINQ,XLINQ,Twitter API,Twitter,Network Credentials

    Read the article

  • Node.js Adventure - When Node Flying in Wind

    - by Shaun
    In the first post of this series I mentioned some popular modules in the community, such as underscore, async, etc.. I also listed a module named “Wind (zh-CN)”, which is created by one of my friend, Jeff Zhao (zh-CN). Now I would like to use a separated post to introduce this module since I feel it brings a new async programming style in not only Node.js but JavaScript world. If you know or heard about the new feature in C# 5.0 called “async and await”, or you learnt F#, you will find the “Wind” brings the similar async programming experience in JavaScript. By using “Wind”, we can write async code that looks like the sync code. The callbacks, async stats and exceptions will be handled by “Wind” automatically and transparently.   What’s the Problem: Dense “Callback” Phobia Let’s firstly back to my second post in this series. As I mentioned in that post, when we wanted to read some records from SQL Server we need to open the database connection, and then execute the query. In Node.js all IO operation are designed as async callback pattern which means when the operation was done, it will invoke a function which was taken from the last parameter. For example the database connection opening code would be like this. 1: sql.open(connectionString, function(error, conn) { 2: if(error) { 3: // some error handling code 4: } 5: else { 6: // connection opened successfully 7: } 8: }); And then if we need to query the database the code would be like this. It nested in the previous function. 1: sql.open(connectionString, function(error, conn) { 2: if(error) { 3: // some error handling code 4: } 5: else { 6: // connection opened successfully 7: conn.queryRaw(command, function(error, results) { 8: if(error) { 9: // failed to execute this command 10: } 11: else { 12: // records retrieved successfully 13: } 14: }; 15: } 16: }); Assuming if we need to copy some data from this database to another then we need to open another connection and execute the command within the function under the query function. 1: sql.open(connectionString, function(error, conn) { 2: if(error) { 3: // some error handling code 4: } 5: else { 6: // connection opened successfully 7: conn.queryRaw(command, function(error, results) { 8: if(error) { 9: // failed to execute this command 10: } 11: else { 12: // records retrieved successfully 13: target.open(targetConnectionString, function(error, t_conn) { 14: if(error) { 15: // connect failed 16: } 17: else { 18: t_conn.queryRaw(copy_command, function(error, results) { 19: if(error) { 20: // copy failed 21: } 22: else { 23: // and then, what do you want to do now... 24: } 25: }; 26: } 27: }; 28: } 29: }; 30: } 31: }); This is just an example. In the real project the logic would be more complicated. This means our application might be messed up and the business process will be fragged by many callback functions. I would like call this “Dense Callback Phobia”. This might be a challenge how to make code straightforward and easy to read, something like below. 1: try 2: { 3: // open source connection 4: var s_conn = sqlConnect(s_connectionString); 5: // retrieve data 6: var results = sqlExecuteCommand(s_conn, s_command); 7: 8: // open target connection 9: var t_conn = sqlConnect(t_connectionString); 10: // prepare the copy command 11: var t_command = getCopyCommand(results); 12: // execute the copy command 13: sqlExecuteCommand(s_conn, t_command); 14: } 15: catch (ex) 16: { 17: // error handling 18: }   What’s the Problem: Sync-styled Async Programming Similar as the previous problem, the callback-styled async programming model makes the upcoming operation as a part of the current operation, and mixed with the error handling code. So it’s very hard to understand what on earth this code will do. And since Node.js utilizes non-blocking IO mode, we cannot invoke those operations one by one, as they will be executed concurrently. For example, in this post when I tried to copy the records from Windows Azure SQL Database (a.k.a. WASD) to Windows Azure Table Storage, if I just insert the data into table storage one by one and then print the “Finished” message, I will see the message shown before the data had been copied. This is because all operations were executed at the same time. In order to make the copy operation and print operation executed synchronously I introduced a module named “async” and the code was changed as below. 1: async.forEach(results.rows, 2: function (row, callback) { 3: var resource = { 4: "PartitionKey": row[1], 5: "RowKey": row[0], 6: "Value": row[2] 7: }; 8: client.insertEntity(tableName, resource, function (error) { 9: if (error) { 10: callback(error); 11: } 12: else { 13: console.log("entity inserted."); 14: callback(null); 15: } 16: }); 17: }, 18: function (error) { 19: if (error) { 20: error["target"] = "insertEntity"; 21: res.send(500, error); 22: } 23: else { 24: console.log("all done."); 25: res.send(200, "Done!"); 26: } 27: }); It ensured that the “Finished” message will be printed when all table entities had been inserted. But it cannot promise that the records will be inserted in sequence. It might be another challenge to make the code looks like in sync-style? 1: try 2: { 3: forEach(row in rows) { 4: var entity = { /* ... */ }; 5: tableClient.insert(tableName, entity); 6: } 7:  8: console.log("Finished"); 9: } 10: catch (ex) { 11: console.log(ex); 12: }   How “Wind” Helps “Wind” is a JavaScript library which provides the control flow with plain JavaScript for asynchronous programming (and more) without additional pre-compiling steps. It’s available in NPM so that we can install it through “npm install wind”. Now let’s create a very simple Node.js application as the example. This application will take some website URLs from the command arguments and tried to retrieve the body length and print them in console. Then at the end print “Finish”. I’m going to use “request” module to make the HTTP call simple so I also need to install by the command “npm install request”. The code would be like this. 1: var request = require("request"); 2:  3: // get the urls from arguments, the first two arguments are `node.exe` and `fetch.js` 4: var args = process.argv.splice(2); 5:  6: // main function 7: var main = function() { 8: for(var i = 0; i < args.length; i++) { 9: // get the url 10: var url = args[i]; 11: // send the http request and try to get the response and body 12: request(url, function(error, response, body) { 13: if(!error && response.statusCode == 200) { 14: // log the url and the body length 15: console.log( 16: "%s: %d.", 17: response.request.uri.href, 18: body.length); 19: } 20: else { 21: // log error 22: console.log(error); 23: } 24: }); 25: } 26: 27: // finished 28: console.log("Finished"); 29: }; 30:  31: // execute the main function 32: main(); Let’s execute this application. (I made them in multi-lines for better reading.) 1: node fetch.js 2: "http://www.igt.com/us-en.aspx" 3: "http://www.igt.com/us-en/games.aspx" 4: "http://www.igt.com/us-en/cabinets.aspx" 5: "http://www.igt.com/us-en/systems.aspx" 6: "http://www.igt.com/us-en/interactive.aspx" 7: "http://www.igt.com/us-en/social-gaming.aspx" 8: "http://www.igt.com/support.aspx" Below is the output. As you can see the finish message was printed at the beginning, and the pages’ length retrieved in a different order than we specified. This is because in this code the request command, console logging command are executed asynchronously and concurrently. Now let’s introduce “Wind” to make them executed in order, which means it will request the websites one by one, and print the message at the end.   First of all we need to import the “Wind” package and make sure the there’s only one global variant named “Wind”, and ensure it’s “Wind” instead of “wind”. 1: var Wind = require("wind");   Next, we need to tell “Wind” which code will be executed asynchronously so that “Wind” can control the execution process. In this case the “request” operation executed asynchronously so we will create a “Task” by using a build-in helps function in “Wind” named Wind.Async.Task.create. 1: var requestBodyLengthAsync = function(url) { 2: return Wind.Async.Task.create(function(t) { 3: request(url, function(error, response, body) { 4: if(error || response.statusCode != 200) { 5: t.complete("failure", error); 6: } 7: else { 8: var data = 9: { 10: uri: response.request.uri.href, 11: length: body.length 12: }; 13: t.complete("success", data); 14: } 15: }); 16: }); 17: }; The code above created a “Task” from the original request calling code. In “Wind” a “Task” means an operation will be finished in some time in the future. A “Task” can be started by invoke its start() method, but no one knows when it actually will be finished. The Wind.Async.Task.create helped us to create a task. The only parameter is a function where we can put the actual operation in, and then notify the task object it’s finished successfully or failed by using the complete() method. In the code above I invoked the request method. If it retrieved the response successfully I set the status of this task as “success” with the URL and body length. If it failed I set this task as “failure” and pass the error out.   Next, we will change the main() function. In “Wind” if we want a function can be controlled by Wind we need to mark it as “async”. This should be done by using the code below. 1: var main = eval(Wind.compile("async", function() { 2: })); When the application is running, Wind will detect “eval(Wind.compile(“async”, function” and generate an anonymous code from the body of this original function. Then the application will run the anonymous code instead of the original one. In our example the main function will be like this. 1: var main = eval(Wind.compile("async", function() { 2: for(var i = 0; i < args.length; i++) { 3: try 4: { 5: var result = $await(requestBodyLengthAsync(args[i])); 6: console.log( 7: "%s: %d.", 8: result.uri, 9: result.length); 10: } 11: catch (ex) { 12: console.log(ex); 13: } 14: } 15: 16: console.log("Finished"); 17: })); As you can see, when I tried to request the URL I use a new command named “$await”. It tells Wind, the operation next to $await will be executed asynchronously, and the main thread should be paused until it finished (or failed). So in this case, my application will be pause when the first response was received, and then print its body length, then try the next one. At the end, print the finish message.   Finally, execute the main function. The full code would be like this. 1: var request = require("request"); 2: var Wind = require("wind"); 3:  4: var args = process.argv.splice(2); 5:  6: var requestBodyLengthAsync = function(url) { 7: return Wind.Async.Task.create(function(t) { 8: request(url, function(error, response, body) { 9: if(error || response.statusCode != 200) { 10: t.complete("failure", error); 11: } 12: else { 13: var data = 14: { 15: uri: response.request.uri.href, 16: length: body.length 17: }; 18: t.complete("success", data); 19: } 20: }); 21: }); 22: }; 23:  24: var main = eval(Wind.compile("async", function() { 25: for(var i = 0; i < args.length; i++) { 26: try 27: { 28: var result = $await(requestBodyLengthAsync(args[i])); 29: console.log( 30: "%s: %d.", 31: result.uri, 32: result.length); 33: } 34: catch (ex) { 35: console.log(ex); 36: } 37: } 38: 39: console.log("Finished"); 40: })); 41:  42: main().start();   Run our new application. At the beginning we will see the compiled and generated code by Wind. Then we can see the pages were requested one by one, and at the end the finish message was printed. Below is the code Wind generated for us. As you can see the original code, the output code were shown. 1: // Original: 2: function () { 3: for(var i = 0; i < args.length; i++) { 4: try 5: { 6: var result = $await(requestBodyLengthAsync(args[i])); 7: console.log( 8: "%s: %d.", 9: result.uri, 10: result.length); 11: } 12: catch (ex) { 13: console.log(ex); 14: } 15: } 16: 17: console.log("Finished"); 18: } 19:  20: // Compiled: 21: /* async << function () { */ (function () { 22: var _builder_$0 = Wind.builders["async"]; 23: return _builder_$0.Start(this, 24: _builder_$0.Combine( 25: _builder_$0.Delay(function () { 26: /* var i = 0; */ var i = 0; 27: /* for ( */ return _builder_$0.For(function () { 28: /* ; i < args.length */ return i < args.length; 29: }, function () { 30: /* ; i ++) { */ i ++; 31: }, 32: /* try { */ _builder_$0.Try( 33: _builder_$0.Delay(function () { 34: /* var result = $await(requestBodyLengthAsync(args[i])); */ return _builder_$0.Bind(requestBodyLengthAsync(args[i]), function (result) { 35: /* console.log("%s: %d.", result.uri, result.length); */ console.log("%s: %d.", result.uri, result.length); 36: return _builder_$0.Normal(); 37: }); 38: }), 39: /* } catch (ex) { */ function (ex) { 40: /* console.log(ex); */ console.log(ex); 41: return _builder_$0.Normal(); 42: /* } */ }, 43: null 44: ) 45: /* } */ ); 46: }), 47: _builder_$0.Delay(function () { 48: /* console.log("Finished"); */ console.log("Finished"); 49: return _builder_$0.Normal(); 50: }) 51: ) 52: ); 53: /* } */ })   How Wind Works Someone may raise a big concern when you find I utilized “eval” in my code. Someone may assume that Wind utilizes “eval” to execute some code dynamically while “eval” is very low performance. But I would say, Wind does NOT use “eval” to run the code. It only use “eval” as a flag to know which code should be compiled at runtime. When the code was firstly been executed, Wind will check and find “eval(Wind.compile(“async”, function”. So that it knows this function should be compiled. Then it utilized parse-js to analyze the inner JavaScript and generated the anonymous code in memory. Then it rewrite the original code so that when the application was running it will use the anonymous one instead of the original one. Since the code generation was done at the beginning of the application was started, in the future no matter how long our application runs and how many times the async function was invoked, it will use the generated code, no need to generate again. So there’s no significant performance hurt when using Wind.   Wind in My Previous Demo Let’s adopt Wind into one of my previous demonstration and to see how it helps us to make our code simple, straightforward and easy to read and understand. In this post when I implemented the functionality that copied the records from my WASD to table storage, the logic would be like this. 1, Open database connection. 2, Execute a query to select all records from the table. 3, Recreate the table in Windows Azure table storage. 4, Create entities from each of the records retrieved previously, and then insert them into table storage. 5, Finally, show message as the HTTP response. But as the image below, since there are so many callbacks and async operations, it’s very hard to understand my logic from the code. Now let’s use Wind to rewrite our code. First of all, of course, we need the Wind package. Then we need to include the package files into project and mark them as “Copy always”. Add the Wind package into the source code. Pay attention to the variant name, you must use “Wind” instead of “wind”. 1: var express = require("express"); 2: var async = require("async"); 3: var sql = require("node-sqlserver"); 4: var azure = require("azure"); 5: var Wind = require("wind"); Now we need to create some async functions by using Wind. All async functions should be wrapped so that it can be controlled by Wind which are open database, retrieve records, recreate table (delete and create) and insert entity in table. Below are these new functions. All of them are created by using Wind.Async.Task.create. 1: sql.openAsync = function (connectionString) { 2: return Wind.Async.Task.create(function (t) { 3: sql.open(connectionString, function (error, conn) { 4: if (error) { 5: t.complete("failure", error); 6: } 7: else { 8: t.complete("success", conn); 9: } 10: }); 11: }); 12: }; 13:  14: sql.queryAsync = function (conn, query) { 15: return Wind.Async.Task.create(function (t) { 16: conn.queryRaw(query, function (error, results) { 17: if (error) { 18: t.complete("failure", error); 19: } 20: else { 21: t.complete("success", results); 22: } 23: }); 24: }); 25: }; 26:  27: azure.recreateTableAsync = function (tableName) { 28: return Wind.Async.Task.create(function (t) { 29: client.deleteTable(tableName, function (error, successful, response) { 30: console.log("delete table finished"); 31: client.createTableIfNotExists(tableName, function (error, successful, response) { 32: console.log("create table finished"); 33: if (error) { 34: t.complete("failure", error); 35: } 36: else { 37: t.complete("success", null); 38: } 39: }); 40: }); 41: }); 42: }; 43:  44: azure.insertEntityAsync = function (tableName, entity) { 45: return Wind.Async.Task.create(function (t) { 46: client.insertEntity(tableName, entity, function (error, entity, response) { 47: if (error) { 48: t.complete("failure", error); 49: } 50: else { 51: t.complete("success", null); 52: } 53: }); 54: }); 55: }; Then in order to use these functions we will create a new function which contains all steps for data copying. 1: var copyRecords = eval(Wind.compile("async", function (req, res) { 2: try { 3: } 4: catch (ex) { 5: console.log(ex); 6: res.send(500, "Internal error."); 7: } 8: })); Let’s execute steps one by one with the “$await” keyword introduced by Wind so that it will be invoked in sequence. First is to open the database connection. 1: var copyRecords = eval(Wind.compile("async", function (req, res) { 2: try { 3: // connect to the windows azure sql database 4: var conn = $await(sql.openAsync(connectionString)); 5: console.log("connection opened"); 6: } 7: catch (ex) { 8: console.log(ex); 9: res.send(500, "Internal error."); 10: } 11: })); Then retrieve all records from the database connection. 1: var copyRecords = eval(Wind.compile("async", function (req, res) { 2: try { 3: // connect to the windows azure sql database 4: var conn = $await(sql.openAsync(connectionString)); 5: console.log("connection opened"); 6: // retrieve all records from database 7: var results = $await(sql.queryAsync(conn, "SELECT * FROM [Resource]")); 8: console.log("records selected. count = %d", results.rows.length); 9: } 10: catch (ex) { 11: console.log(ex); 12: res.send(500, "Internal error."); 13: } 14: })); After recreated the table, we need to create the entities and insert them into table storage. 1: var copyRecords = eval(Wind.compile("async", function (req, res) { 2: try { 3: // connect to the windows azure sql database 4: var conn = $await(sql.openAsync(connectionString)); 5: console.log("connection opened"); 6: // retrieve all records from database 7: var results = $await(sql.queryAsync(conn, "SELECT * FROM [Resource]")); 8: console.log("records selected. count = %d", results.rows.length); 9: if (results.rows.length > 0) { 10: // recreate the table 11: $await(azure.recreateTableAsync(tableName)); 12: console.log("table created"); 13: // insert records in table storage one by one 14: for (var i = 0; i < results.rows.length; i++) { 15: var entity = { 16: "PartitionKey": results.rows[i][1], 17: "RowKey": results.rows[i][0], 18: "Value": results.rows[i][2] 19: }; 20: $await(azure.insertEntityAsync(tableName, entity)); 21: console.log("entity inserted"); 22: } 23: } 24: } 25: catch (ex) { 26: console.log(ex); 27: res.send(500, "Internal error."); 28: } 29: })); Finally, send response back to the browser. 1: var copyRecords = eval(Wind.compile("async", function (req, res) { 2: try { 3: // connect to the windows azure sql database 4: var conn = $await(sql.openAsync(connectionString)); 5: console.log("connection opened"); 6: // retrieve all records from database 7: var results = $await(sql.queryAsync(conn, "SELECT * FROM [Resource]")); 8: console.log("records selected. count = %d", results.rows.length); 9: if (results.rows.length > 0) { 10: // recreate the table 11: $await(azure.recreateTableAsync(tableName)); 12: console.log("table created"); 13: // insert records in table storage one by one 14: for (var i = 0; i < results.rows.length; i++) { 15: var entity = { 16: "PartitionKey": results.rows[i][1], 17: "RowKey": results.rows[i][0], 18: "Value": results.rows[i][2] 19: }; 20: $await(azure.insertEntityAsync(tableName, entity)); 21: console.log("entity inserted"); 22: } 23: // send response 24: console.log("all done"); 25: res.send(200, "All done!"); 26: } 27: } 28: catch (ex) { 29: console.log(ex); 30: res.send(500, "Internal error."); 31: } 32: })); If we compared with the previous code we will find now it became more readable and much easy to understand. It’s very easy to know what this function does even though without any comments. When user go to URL “/was/copyRecords” we will execute the function above. The code would be like this. 1: app.get("/was/copyRecords", function (req, res) { 2: copyRecords(req, res).start(); 3: }); And below is the logs printed in local compute emulator console. As we can see the functions executed one by one and then finally the response back to me browser.   Scaffold Functions in Wind Wind provides not only the async flow control and compile functions, but many scaffold methods as well. We can build our async code more easily by using them. I’m going to introduce some basic scaffold functions here. In the code above I created some functions which wrapped from the original async function such as open database, create table, etc.. All of them are very similar, created a task by using Wind.Async.Task.create, return error or result object through Task.complete function. In fact, Wind provides some functions for us to create task object from the original async functions. If the original async function only has a callback parameter, we can use Wind.Async.Binding.fromCallback method to get the task object directly. For example the code below returned the task object which wrapped the file exist check function. 1: var Wind = require("wind"); 2: var fs = require("fs"); 3:  4: fs.existsAsync = Wind.Async.Binding.fromCallback(fs.exists); In Node.js a very popular async function pattern is that, the first parameter in the callback function represent the error object, and the other parameters is the return values. In this case we can use another build-in function in Wind named Wind.Async.Binding.fromStandard. For example, the open database function can be created from the code below. 1: sql.openAsync = Wind.Async.Binding.fromStandard(sql.open); 2:  3: /* 4: sql.openAsync = function (connectionString) { 5: return Wind.Async.Task.create(function (t) { 6: sql.open(connectionString, function (error, conn) { 7: if (error) { 8: t.complete("failure", error); 9: } 10: else { 11: t.complete("success", conn); 12: } 13: }); 14: }); 15: }; 16: */ When I was testing the scaffold functions under Wind.Async.Binding I found for some functions, such as the Azure SDK insert entity function, cannot be processed correctly. So I personally suggest writing the wrapped method manually.   Another scaffold method in Wind is the parallel tasks coordination. In this example, the steps of open database, retrieve records and recreated table should be invoked one by one, but it can be executed in parallel when copying data from database to table storage. In Wind there’s a scaffold function named Task.whenAll which can be used here. Task.whenAll accepts a list of tasks and creates a new task. It will be returned only when all tasks had been completed, or any errors occurred. For example in the code below I used the Task.whenAll to make all copy operation executed at the same time. 1: var copyRecordsInParallel = eval(Wind.compile("async", function (req, res) { 2: try { 3: // connect to the windows azure sql database 4: var conn = $await(sql.openAsync(connectionString)); 5: console.log("connection opened"); 6: // retrieve all records from database 7: var results = $await(sql.queryAsync(conn, "SELECT * FROM [Resource]")); 8: console.log("records selected. count = %d", results.rows.length); 9: if (results.rows.length > 0) { 10: // recreate the table 11: $await(azure.recreateTableAsync(tableName)); 12: console.log("table created"); 13: // insert records in table storage in parallal 14: var tasks = new Array(results.rows.length); 15: for (var i = 0; i < results.rows.length; i++) { 16: var entity = { 17: "PartitionKey": results.rows[i][1], 18: "RowKey": results.rows[i][0], 19: "Value": results.rows[i][2] 20: }; 21: tasks[i] = azure.insertEntityAsync(tableName, entity); 22: } 23: $await(Wind.Async.Task.whenAll(tasks)); 24: // send response 25: console.log("all done"); 26: res.send(200, "All done!"); 27: } 28: } 29: catch (ex) { 30: console.log(ex); 31: res.send(500, "Internal error."); 32: } 33: })); 34:  35: app.get("/was/copyRecordsInParallel", function (req, res) { 36: copyRecordsInParallel(req, res).start(); 37: });   Besides the task creation and coordination, Wind supports the cancellation solution so that we can send the cancellation signal to the tasks. It also includes exception solution which means any exceptions will be reported to the caller function.   Summary In this post I introduced a Node.js module named Wind, which created by my friend Jeff Zhao. As you can see, different from other async library and framework, adopted the idea from F# and C#, Wind utilizes runtime code generation technology to make it more easily to write async, callback-based functions in a sync-style way. By using Wind there will be almost no callback, and the code will be very easy to understand. Currently Wind is still under developed and improved. There might be some problems but the author, Jeff, should be very happy and enthusiastic to learn your problems, feedback, suggestion and comments. You can contact Jeff by - Email: [email protected] - Group: https://groups.google.com/d/forum/windjs - GitHub: https://github.com/JeffreyZhao/wind/issues   Source code can be download here.   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • Introducing Data Annotations Extensions

    - by srkirkland
    Validation of user input is integral to building a modern web application, and ASP.NET MVC offers us a way to enforce business rules on both the client and server using Model Validation.  The recent release of ASP.NET MVC 3 has improved these offerings on the client side by introducing an unobtrusive validation library built on top of jquery.validation.  Out of the box MVC comes with support for Data Annotations (that is, System.ComponentModel.DataAnnotations) and can be extended to support other frameworks.  Data Annotations Validation is becoming more popular and is being baked in to many other Microsoft offerings, including Entity Framework, though with MVC it only contains four validators: Range, Required, StringLength and Regular Expression.  The Data Annotations Extensions project attempts to augment these validators with additional attributes while maintaining the clean integration Data Annotations provides. A Quick Word About Data Annotations Extensions The Data Annotations Extensions project can be found at http://dataannotationsextensions.org/, and currently provides 11 additional validation attributes (ex: Email, EqualTo, Min/Max) on top of Data Annotations’ original 4.  You can find a current list of the validation attributes on the afore mentioned website. The core library provides server-side validation attributes that can be used in any .NET 4.0 project (no MVC dependency). There is also an easily pluggable client-side validation library which can be used in ASP.NET MVC 3 projects using unobtrusive jquery validation (only MVC3 included javascript files are required). On to the Preview Let’s say you had the following “Customer” domain model (or view model, depending on your project structure) in an MVC 3 project: public class Customer { public string Email { get; set; } public int Age { get; set; } public string ProfilePictureLocation { get; set; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } When it comes time to create/edit this Customer, you will probably have a CustomerController and a simple form that just uses one of the Html.EditorFor() methods that the ASP.NET MVC tooling generates for you (or you can write yourself).  It should look something like this: With no validation, the customer can enter nonsense for an email address, and then can even report their age as a negative number!  With the built-in Data Annotations validation, I could do a bit better by adding a Range to the age, adding a RegularExpression for email (yuck!), and adding some required attributes.  However, I’d still be able to report my age as 10.75 years old, and my profile picture could still be any string.  Let’s use Data Annotations along with this project, Data Annotations Extensions, and see what we can get: public class Customer { [Email] [Required] public string Email { get; set; }   [Integer] [Min(1, ErrorMessage="Unless you are benjamin button you are lying.")] [Required] public int Age { get; set; }   [FileExtensions("png|jpg|jpeg|gif")] public string ProfilePictureLocation { get; set; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Now let’s try to put in some invalid values and see what happens: That is very nice validation, all done on the client side (will also be validated on the server).  Also, the Customer class validation attributes are very easy to read and understand. Another bonus: Since Data Annotations Extensions can integrate with MVC 3’s unobtrusive validation, no additional scripts are required! Now that we’ve seen our target, let’s take a look at how to get there within a new MVC 3 project. Adding Data Annotations Extensions To Your Project First we will File->New Project and create an ASP.NET MVC 3 project.  I am going to use Razor for these examples, but any view engine can be used in practice.  Now go into the NuGet Extension Manager (right click on references and select add Library Package Reference) and search for “DataAnnotationsExtensions.”  You should see the following two packages: The first package is for server-side validation scenarios, but since we are using MVC 3 and would like comprehensive sever and client validation support, click on the DataAnnotationsExtensions.MVC3 project and then click Install.  This will install the Data Annotations Extensions server and client validation DLLs along with David Ebbo’s web activator (which enables the validation attributes to be registered with MVC 3). Now that Data Annotations Extensions is installed you have all you need to start doing advanced model validation.  If you are already using Data Annotations in your project, just making use of the additional validation attributes will provide client and server validation automatically.  However, assuming you are starting with a blank project I’ll walk you through setting up a controller and model to test with. Creating Your Model In the Models folder, create a new User.cs file with a User class that you can use as a model.  To start with, I’ll use the following class: public class User { public string Email { get; set; } public string Password { get; set; } public string PasswordConfirm { get; set; } public string HomePage { get; set; } public int Age { get; set; } } Next, create a simple controller with at least a Create method, and then a matching Create view (note, you can do all of this via the MVC built-in tooling).  Your files will look something like this: UserController.cs: public class UserController : Controller { public ActionResult Create() { return View(new User()); }   [HttpPost] public ActionResult Create(User user) { if (!ModelState.IsValid) { return View(user); }   return Content("User valid!"); } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Create.cshtml: @model NuGetValidationTester.Models.User   @{ ViewBag.Title = "Create"; }   <h2>Create</h2>   <script src="@Url.Content("~/Scripts/jquery.validate.min.js")" type="text/javascript"></script> <script src="@Url.Content("~/Scripts/jquery.validate.unobtrusive.min.js")" type="text/javascript"></script>   @using (Html.BeginForm()) { @Html.ValidationSummary(true) <fieldset> <legend>User</legend> @Html.EditorForModel() <p> <input type="submit" value="Create" /> </p> </fieldset> } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } In the Create.cshtml view, note that we are referencing jquery validation and jquery unobtrusive (jquery is referenced in the layout page).  These MVC 3 included scripts are the only ones you need to enjoy both the basic Data Annotations validation as well as the validation additions available in Data Annotations Extensions.  These references are added by default when you use the MVC 3 “Add View” dialog on a modification template type. Now when we go to /User/Create we should see a form for editing a User Since we haven’t yet added any validation attributes, this form is valid as shown (including no password, email and an age of 0).  With the built-in Data Annotations attributes we can make some of the fields required, and we could use a range validator of maybe 1 to 110 on Age (of course we don’t want to leave out supercentenarians) but let’s go further and validate our input comprehensively using Data Annotations Extensions.  The new and improved User.cs model class. { [Required] [Email] public string Email { get; set; }   [Required] public string Password { get; set; }   [Required] [EqualTo("Password")] public string PasswordConfirm { get; set; }   [Url] public string HomePage { get; set; }   [Integer] [Min(1)] public int Age { get; set; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Now let’s re-run our form and try to use some invalid values: All of the validation errors you see above occurred on the client, without ever even hitting submit.  The validation is also checked on the server, which is a good practice since client validation is easily bypassed. That’s all you need to do to start a new project and include Data Annotations Extensions, and of course you can integrate it into an existing project just as easily. Nitpickers Corner ASP.NET MVC 3 futures defines four new data annotations attributes which this project has as well: CreditCard, Email, Url and EqualTo.  Unfortunately referencing MVC 3 futures necessitates taking an dependency on MVC 3 in your model layer, which may be unadvisable in a multi-tiered project.  Data Annotations Extensions keeps the server and client side libraries separate so using the project’s validation attributes don’t require you to take any additional dependencies in your model layer which still allowing for the rich client validation experience if you are using MVC 3. Custom Error Message and Globalization: Since the Data Annotations Extensions are build on top of Data Annotations, you have the ability to define your own static error messages and even to use resource files for very customizable error messages. Available Validators: Please see the project site at http://dataannotationsextensions.org/ for an up-to-date list of the new validators included in this project.  As of this post, the following validators are available: CreditCard Date Digits Email EqualTo FileExtensions Integer Max Min Numeric Url Conclusion Hopefully I’ve illustrated how easy it is to add server and client validation to your MVC 3 projects, and how to easily you can extend the available validation options to meet real world needs. The Data Annotations Extensions project is fully open source under the BSD license.  Any feedback would be greatly appreciated.  More information than you require, along with links to the source code, is available at http://dataannotationsextensions.org/. Enjoy!

    Read the article

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32  | Next Page >