Search Results

Search found 464 results on 19 pages for 'the consequences of cheat'.

Page 19/19 | < Previous Page | 15 16 17 18 19

You should NOT be writing jQuery in SharePoint if…

- by Mark Rackley

Yes… another one of these posts. What can I say? I’m a pot stirrer.. a rabble rouser *rabble rabble* jQuery in SharePoint seems to be a fairly polarizing issue with one side thinking it is the most awesome thing since Princess Leia as the slave girl in Return of the Jedi and the other half thinking it is the worst idea since Mannequin 2: On the Move. The correct answer is OF COURSE “it depends”. But what are those deciding factors that make jQuery an awesome fit or leave a bad taste in your mouth? Let’s see if I can drive the discussion here with some polarizing comments of my own… I know some of you are getting ready to leave your comments even now before reading the rest of the blog, which is great! Iron sharpens iron… These discussions hopefully open us up to understanding the entire process better and think about things in a different way. You should not be writing jQuery in SharePoint if you are not a developer… Let’s start off with my most polarizing and rant filled portion of the blog post. If you don’t know what you are doing or you don’t have a background that helps you understand the implications of what you are writing then you should not be writing jQuery in SharePoint! I truly believe that one of the biggest reasons for the jQuery haters is because of all the bad jQuery out there. If you don’t know what you are doing you can do some NASTY things! One of the best stories I’ve heard about this is from my good friend John Ferringer (@ferringer). John tells this story during our Mythbusters session we do together. One of his clients was undergoing a Denial of Service attack and they couldn’t figure out what was going on! After much searching they found that some genius jQuery developer wrote some code for an image rotator, but did not take into account what happens when there are no images to load! The code just kept hitting the servers over and over and over again which prevented anything else from getting done! Now, I’m NOT saying that I have not done the same sort of thing in the past or am immune from such mistakes. My point is that if you don’t know what you are doing, there are very REAL consequences that can have a major impact on your organization AND they will be hard to track down. Think how happy your boss will be after you copy and pasted some jQuery from a blog without understanding what it does, it brings down the farm, AND it takes them 3 days to track it back to you. :/ Good times will not be had. Like it or not JavaScript/jQuery is a programming language. While you .NET people sit on your high horses because your code is compiled and “runs faster” (also debatable), the rest of us will be actually getting work done and delivering solutions while you are trying to figure out why your widget won’t deploy. I can pick at that scab because I write .NET code too and speak from experience. I can do both, and do both well. So, I am not speaking from ignorance here. In JavaScript/jQuery you have variables, loops, conditionals, functions, arrays, events, and built in methods. If you are not a developer you just aren’t going to take advantage of all of that and use it correctly. Ahhh.. but there is hope! There is a lot of jQuery resources out there to help you learn and learn well! There are many experts on the subject that will gladly tell you when you are smoking crack. I just this minute saw a tweet from @cquick with a link to: “jQuery Fundamentals”. I just glanced through it and this may be a great primer for you aspiring jQuery devs. Take advantage of all the resources and become a developer! Hey, it will look awesome on your resume right? You should not be writing jQuery in SharePoint if it depends too much on client resources for a good user experience I’ve said it once and I’ll say it over and over until you understand. jQuery is executed on the client’s computer. Got it? If you are looping through hundreds of rows of data, searching through an enormous DOM, or performing many calculations it is going to take some time! AND if your user happens to be sitting on some old PC somewhere that they picked up at a garage sale their experience will be that much worse! If you can’t give the user a good experience they will not use the site. So, if jQuery is causing the user to have a bad experience, don’t use it. I sometimes go as far to say that you should NOT go to jQuery as a first option for external facing web sites because you have ZERO control over what the end user’s computer will be. You just can’t guarantee an awesome user experience all of the time. Ahhh… but you have no choice? (where have I heard that before?). Well… if you really have no choice, here are some tips to help improve the experience: Avoid screen scraping This is not 1999 and SharePoint is not an old green screen from a mainframe… so why are you treating it like it is? Screen scraping is time consuming and client intensive. Take advantage of tools like SPServices to do your data retrieval when possible. Fine tune your DOM searches A lot of time can be eaten up just searching the DOM and ignoring table rows that you don’t need. Write better jQuery to only loop through tables rows that you need, or only access specific elements you need. Take advantage of Element ID’s to return the one element you are looking for instead of looping through all the DOM over and over again. Write better jQuery Remember this is development. Think about how you can write cleaner, faster jQuery. This directly relates to the previous point of improving your DOM searches, but also when using arrays, variables and loops. Do you REALLY need to loop through that array 3 times? How can you knock it down to 2 times or even 1? When you have lots of calculations and data that you are manipulating every operation adds up. Think about how you can streamline it. Back in the old days before RAM was abundant, Cores were plentiful and dinosaurs roamed the earth, us developers had to take performance into account in everything we did. It’s a lost art that really needs to be used here. You should not be writing jQuery in SharePoint if you are sending a lot of data over the wire… Developer: “Awesome… you can easily call SharePoint’s web services to retrieve and write data using SPServices!” Administrator: “Crap! you can easily call SharePoint’s web services to retrieve and write data using SPServices!” SPServices may indeed be the best thing that happened to SharePoint since the invention of SharePoint Saturdays by Godfather Lotter… BUT you HAVE to use it wisely! (I REFUSE to make the Spiderman reference). If you do not know what you are doing your code will bring back EVERY field and EVERY row from a list and push that over the internet with all that lovely XML wrapped around it. That can be a HUGE amount of data and will GREATLY impact performance! Calling several web service methods at the same time can cause the same problem and can negatively impact your SharePoint servers. These problems, thankfully, are not difficult to rectify if you are careful: Limit list data retrieved Use CAML to reduce the number of rows returned and limit the fields returned using ViewFields. You should definitely be doing this regardless. If you aren’t I hope your admin thumps you upside the head. Batch large list updates You may or may not have noticed that if you try to do large updates (hundreds of rows) that the performance is either completely abysmal or it fails over half the time. You can greatly improve performance and avoid timeouts by breaking up your updates into several smaller updates. I don’t know if there is a magic number for best performance, it really depends on how much data you are sending back more than the number of rows. However, I have found that 200 rows generally works well. Play around and find the right number for your situation. Delay Web Service calls when possible One of the cool things about jQuery and SPServices is that you can delay queries to the server until they are actually needed instead of doing them all at once. This can lead to performance improvements over DataViewWebParts and even .NET code in the right situations. So, don’t load the data until it’s needed. In some instances you may not need to retrieve the data at all, so why retrieve it ALL the time? You should not be writing jQuery in SharePoint if there is a better solution… jQuery is NOT the silver bullet in SharePoint, it is not the answer to every question, it is just another tool in the developers toolkit. I urge all developers to know what options exist out there and choose the right one! Sometimes it will be jQuery, sometimes it will be .NET, sometimes it will be XSL, and sometimes it will be some other choice… So, when is there a better solution to jQuery? When you can’t get away from performance problems Sometimes jQuery will just give you horrible performance regardless of what you do because of unavoidable obstacles. In these situations you are going to have to figure out an alternative. Can I do it with a DVWP or do I have to crack open Visual Studio? When you need to do something that jQuery can’t do There are lots of things you can’t do in jQuery like elevate privileges, event handlers, workflows, or interact with back end systems that have no web service interface. It just can’t do everything. When it can be done faster and more efficiently another way Why are you spending time to write jQuery to do a DataViewWebPart that would take 5 minutes? Or why are you trying to implement complicated logic that would be simple to do in .NET? If your answer is that you don’t have the option, okay. BUT if you do have the option don’t reinvent the wheel! Take advantage of the other tools. The answer is not always jQuery… sorry… the kool-aid tastes good, but sweet tea is pretty awesome too. You should not be using jQuery in SharePoint if you are a moron… Let’s finish up the blog on a high note… Yes.. it’s true, I sometimes type things just to get a reaction… guess this section title might be a good example, but it feels good sometimes just to type the words that a lot of us think… So.. don’t be that guy! Another good buddy of mine that works for Microsoft told me. “I loved jQuery in SharePoint…. until I had to support it.”. He went on to explain that some user was making several web service calls on a page using jQuery and then was calling Microsoft and COMPLAINING because the page took so long to load… DUH! What do you expect to happen when you are pushing that much data over the wire and are making that many web service calls at once!! It’s one thing to write that kind of code and accept it’s just going to take a while, it’s COMPLETELY another issue to do that and then complain when it’s not lightning fast! Someone’s gene pool needs some chlorine. So, I think this is a nice summary of the blog… DON’T be that guy… don’t be a moron. How can you stop yourself from being a moron? Ah.. glad you asked, here are some tips: Think Is jQuery the right solution to my problem? Is there a better approach? What are the implications and pitfalls of using jQuery in this situation? Search What are others doing? Does someone have a better solution? Is there a third party library that does the same thing I need? Plan Write good jQuery. Limit calculations and data sent over the wire and don’t reinvent the wheel when possible. Test Okay, it works well on your machine. Try it on others ESPECIALLY if this is for an external site. Test with empty data. Test with hundreds of rows of data. Test as many scenarios as possible. Monitor those server resources to see the impact there as well. Ask the experts As smart as you are, there are people smarter than you. Even the experts talk to each other to make sure they aren't doing something stupid. And for the MOST part they are pretty nice guys. Marc Anderson and Christophe Humbert are two guys who regularly keep me in line. Make sure you aren’t doing something stupid. Repeat So, when you think you have the best solution possible, repeat the steps above just to be safe. Conclusion jQuery is an awesome tool and has come in handy on many occasions. I’m even teaching a 1/2 day SharePoint & jQuery workshop at the upcoming SPTechCon in Boston if you want to berate me in person. However, it’s only as awesome as the developer behind the keyboard. It IS development and has its pitfalls. Knowledge and experience are invaluable to giving the user the best experience possible. Let’s face it, in the end, no matter our opinions, prejudices, or ego providing our clients, customers, and users with the best solution possible is what counts. Period… end of sentence…

Read the article
VS 2012 Code Review – Before Check In OR After Check In?

- by Tarun Arora

“Is Code Review Important and Effective?” There is a consensus across the industry that code review is an effective and practical way to collar code inconsistency and possible defects early in the software development life cycle. Among others some of the advantages of code reviews are, Bugs are found faster Forces developers to write readable code (code that can be read without explanation or introduction!) Optimization methods/tricks/productive programs spread faster Programmers as specialists "evolve" faster It's fun “Code review is systematic examination (often known as peer review) of computer source code. It is intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers' skills. Reviews are done in various forms such as pair programming, informal walkthroughs, and formal inspections.” Wikipedia No where does the definition mention whether its better to review code before the code has been committed to version control or after the commit has been performed. No matter which side you favour, Visual Studio 2012 allows you to request for a code review both before check in and also request for a review after check in. Let’s weigh the pros and cons of the approaches independently. Code Review Before Check In or Code Review After Check In? Approach 1 – Code Review before Check in Developer completes the code and feels the code quality is appropriate for check in to TFS. The developer raises a code review request to have a second pair of eyes validate if the code abides to the recommended best practices, will not result in any defects due to common coding mistakes and whether any optimizations can be made to improve the code quality. Image 1 – code review before check in Pros Everything that gets committed to source control is reviewed. Minimizes the chances of smelly code making its way into the code base. Decreases the cost of fixing bugs, remember, the earlier you find them, the lesser the pain in fixing them. Cons Development Code Freeze – Since the changes aren’t in the source control yet. Further development can only be done off-line. The changes have not been through a CI build, hard to say whether the code abides to all build quality standards. Inconsistent! Cumbersome to track the actual code review process. Not every change to the code base is worth reviewing, a lot of effort is invested for very little gain. Approach 2 – Code Review after Check in Developer checks in, random code reviews are performed on the checked in code. Image 2 – Code review after check in Pros The code has already passed the CI build and run through any code analysis plug ins you may have running on the build server. Instruct the developer to ensure ZERO fx cop, style cop and static code analysis before check in. Code is cleaner and smell free even before the code review. No Offline development, developers can continue to develop against the source control. Cons Bad code can easily make its way into the code base. Since the review take place much later in the cycle, the cost of fixing issues can prove to be much higher. Approach 3 – Hybrid Approach The community advocates a more hybrid approach, a blend of tooling and human accountability quotient. Image 3 – Hybrid Approach 1. Code review high impact check ins. It is not possible to review everything, by setting up code review check in policies you can end up slowing your team. More over, the code that you are reviewing before check in hasn't even been through a green CI build either. 2. Tooling. Let the tooling work for you. By running static analysis, fx cop, style cop and other plug ins on the build agent, you can identify the real issues that in my opinion can't possibly be identified using human reviews. Configure the tooling to report back top 10 issues every day. Mandate the manual code review of individuals who keep making it to this list of shame more often. 3. During Merge. I would prefer eliminating some of the other code issues during merge from Main branch to the release branch. In a scrum project this is still easier because cheery picking the merges is a possibility and the size of code being reviewed is still limited. Let the tooling work for you, if some one breaks the CI build often, put them on a gated check in build course until you see improvement. If some one appears on the top 10 list of shame generated via the build then ensure that all their code is reviewed till you see improvement. At the end of the day, the goal is to ensure that the code being delivered is top quality. By enforcing a code review before any check in, you force the developer to work offline or stay put till the review is complete. What do the experts say? So I asked a few expects what they thought of “Code Review quality gate before Checking in code?" Terje Sandstrom | Microsoft ALM MVP You mean a review quality gate BEFORE checking in code????? That would mean a lot of code staying either local or in shelvesets, and not even been through a CI build, and a green CI build being the main criteria for going further, f.e. to the review state. I would not like code laying around with no checkin’s. Having a requirement that code is checked in small pieces, 4-8 hours work max, and AT LEAST daily checkins, a manual code review comes second down the lane. I would expect review quality gates to happen before merging back to main, or before merging to release. But that would all be on checked-in code. Branching is absolutely one way to ease the pain. Another way we are using is automatic quality builds, running metrics, coverage, static code analysis. Unfortunately it takes some time, would be great to be on CI’s – but…., so it’s done scheduled every night. Based on this we get, among other stuff, top 10 lists of suspicious code, which is then subjected to reviews. If a person seems to be very popular on these top 10 lists, we subject every check in from that person to a review for a period. That normally helps. None of the clients I have can afford to have every checkin reviewed, so we need to find ways around it. I don’t disagree with the nicety of having all the code reviewed, but I find it hard to find those resources in today’s enterprises. David V. Corbin | Visual Studio ALM Ranger I tend to agree with both sides. I hate having code that is not checked in, but at the same time hate having “bad” code in the repository. I have found that branching is one approach to solving this dilemma. Code is checked into the private/feature branch before the review, but is not merged over to the “official” branch until after the review. I advocate both, depending on circumstance (especially team dynamics) - The “pre-checkin” is usually for elements that may impact the project as a whole. Think of it as another “gate” along with passing unit tests. - The “post-checkin” may very well not be at the changeset level, but correlates to a review at the “user story” level. Again, this depends on team dynamics in play…. Robert MacLean | Microsoft ALM MVP I do not think there is no right answer for the industry as a whole. In short the question is why do you do reviews? Your question implies risk mitigation, so in low risk areas you can get away with it after check in while in high risk you need to do it before check in. An example is those new to a team or juniors need it much earlier (maybe that is before checkin, maybe that is soon after) than seniors who have shipped twenty sprints on the team. Abhimanyu Singhal | Visual Studio ALM Ranger Depends on per scenario basis. We recommend post check-in reviews when: 1. We don't want to block other checks and processes on manual code reviews. Manual reviews take time, and some pieces may not require manual reviews at all. 2. We need to trace all changes and track history. 3. We have a code promotion strategy/process in place. For risk mitigation, post checkin code can be promoted to Accepted branches. Or can be rejected. Pre Checkin Reviews are used when 1. There is a high risk factor associated 2. Reviewers are generally (most of times) have immediate availability. 3. Team does not have strict tracking needs. Simply speaking, no single process fits all scenarios. You need to select what works best for your team/project. Thomas Schissler | Visual Studio ALM Ranger This is an interesting discussion, I’m right now discussing details about executing code reviews with my teams. I see and understand the aspects you brought in, but there is another side as well, I’d like to point out. 1.) If you do reviews per check in this is not very practical as a hard rule because this will disturb the flow of the team very often or it will lead to reduce the checkin frequency of the devs which I would not accept. 2.) If you do later reviews, for example if you review PBIs, it is not easy to find out which code you should review. Either you review all changesets associate with the PBI, but then you might review code which has been changed with a later checkin and the dev maybe has already fixed the issue. Or you review the diff of the latest changeset of the PBI with the first but then you might also review changes of other PBIs. Jakob Leander | Sr. Director, Avanade In my experience, manual code review: 1. Does not get done and at the very least does not get redone after changes (regardless of intentions at start of project) 2. When a project actually do it, they often do not do it right away = errors pile up 3. Requires a lot of time discussing/defining the standard and for the team to learn it However code review is very important since e.g. even small memory leaks in a high volume web solution have big consequences In the last years I have advocated following approach for code review - Architects up front do “at least one best practice example” of each type of component and tell the team. Copy from this one. This should include error handling, logging, security etc. - Dev lead on project continuously browse code to validate that the best practices are used. Especially that patterns etc. are not broken. You can do this formally after each sprint/iteration if you want. Once this is validated it is unlikely to “go bad” even during later code changes Agree with customer to rely on static code analysis from Visual Studio as the one and only coding standard. This has HUUGE benefits - You can easily tweak to reach the level you desire together with customer - It is easy to measure for both developers/management - It is 100% consistent across code base - It gets validated all the time so you never end up getting hammered by a customer review in the end - It is easy to tell the developer that you do not want code back unless it has zero errors = minimize communication You need to track this at least during nightly builds and make sure team sees total # issues. Do not allow #issues it to grow uncontrolled. On the project I run I require code analysis to have run on code before checkin (checkin rule). This means - You have to have clean compile (or CA wont run) so this is extra benefit = very few broken builds - You can change a few of the rules to compile as errors instead of warnings. I often do this for “missing dispose” issues which you REALLY do not want in your app Tip: Place your custom CA rules files as part of solution. That way it works when you do branching etc. (path to CA file is relative in VS) Some may argue that CA is not as good as manual inspection. But since manual inspection in reality suffers from the 3 issues in start it is IMO a MUCH better (and much cheaper) approach from helicopter perspective Tirthankar Dutta | Director, Avanade I think code review should be run both before and after check ins. There are some code metrics that are meant to be run on the entire codebase … Also, especially on multi-site projects, one should strive to architect in a way that lets men manage the framework while boys write the repetitive code… scales very well with the need to review less by containment and imposing architectural restrictions to emphasise the design. Bruno Capuano | Microsoft ALM MVP For code reviews (means peer reviews) in distributed team I use http://www.vsanywhere.com/default.aspx David Jobling | Global Sr. Director, Avanade Peer review is the only way to scale and its a great practice for all in the team to learn to perform and accept. In my experience you soon learn who's code to watch more than others and tune the attention. Mikkel Toudal Kristiansen | Manager, Avanade If you have several branches in your code base, you will need to merge often. This requires manual merging, when a file has been changed in both branches. It offers a good opportunity to actually review to changed code. So my advice is: Merging between branches should be done as often as possible, it should be done by a senior developer, and he/she should perform a full code review of the code being merged. As for detecting architectural smells and code smells creeping into the code base, one really good third party tools exist: Ndepend (http://www.ndepend.com/, for static code analysis of the current state of the code base). You could also consider adding StyleCop to the solution. Jesse Houwing | Visual Studio ALM Ranger I gave a presentation on this subject on the TechDays conference in NL last year. See my presentation and slides here (talk in Dutch, but English presentation): http://blog.jessehouwing.nl/2012/03/did-you-miss-my-techdaysnl-talk-on-code.html I’d like to add a few more points: - Before/After checking is mostly a trust issue. If you have a team that does diligent peer reviews and regularly talk/sit together or peer review, there’s no need to enforce a before-checkin policy. The peer peer-programming and regular feedback during development can take care of most of the review requirements as long as the team isn’t under stress. - Under stress, enforce pre-checkin reviews, it might sound strange, if you’re already under time or budgetary constraints, but it is under such conditions most real issues start to be created or pile up. - Use tools to catch most common errors, Code Analysis/FxCop was already mentioned. HP Fortify, Resharper, Coderush etc can help you there. There are also a lot of 3rd party rules you can add to Code Analysis. I’ve written a few myself (http://fccopcontrib.codeplex.com) and various teams from Microsoft have added their own rules (MSOCAF for SharePoint, WSSF for WCF). For common errors that keep cropping up, see if you can define a rule. It’s much easier. But more importantly make sure you have a good help page explaining *WHY* it's wrong. If you have small feature or developer branches/shelvesets, you might want to review pre-merge. It’s still better to do peer reviews and peer programming, but the most important thing is that bad quality code doesn’t make it into the important branch. So my philosophy: - Use tooling as much as possible. - Make sure the team understands the tooling and the importance of the things it flags. It’s too easy to just click suppress all to ignore the warnings. - Under stress, tighten process, it’s under stress that the problems of late reviews will really surface - Most importantly if you do reviews do them as early as possible, but never later than needed. In other words, pre-checkin/post checking doesn’t really matter, as long as the review is done before the code is released. It’ll just be much more expensive to fix any review outcomes the later you find them. --- I would love to hear what you think!

Read the article
Unable to connect to Samba printer

- by user127236

I have a headless Ubuntu 12.04 server for files and printers. It shares files via Samba just fine. However, the HP PSC-750xi connected to the server via USB is not accessible from my Ubuntu 12.04 laptop. I can browse for it in the Printing control panel, but any attempt to authenticate my ID to the printer with my user credentials results in the error "This print share is not accessible". I have included the Samba smb.conf file below. Any help appreciated. Thanks... JGB # # Sample configuration file for the Samba suite for Debian GNU/Linux. # # # This is the main Samba configuration file. You should read the # smb.conf(5) manual page in order to understand the options listed # here. Samba has a huge number of configurable options most of which # are not shown in this example # # Some options that are often worth tuning have been included as # commented-out examples in this file. # - When such options are commented with ";", the proposed setting # differs from the default Samba behaviour # - When commented with "#", the proposed setting is the default # behaviour of Samba but the option is considered important # enough to be mentioned here # # NOTE: Whenever you modify this file you should run the command # "testparm" to check that you have not made any basic syntactic # errors. # A well-established practice is to name the original file # "smb.conf.master" and create the "real" config file with # testparm -s smb.conf.master >smb.conf # This minimizes the size of the really used smb.conf file # which, according to the Samba Team, impacts performance # However, use this with caution if your smb.conf file contains nested # "include" statements. See Debian bug #483187 for a case # where using a master file is not a good idea. # #======================= Global Settings ======================= [global] log file = /var/log/samba/log.%m passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . obey pam restrictions = yes map to guest = bad user encrypt passwords = true passwd program = /usr/bin/passwd %u passdb backend = tdbsam dns proxy = no writeable = yes server string = %h server (Samba, Ubuntu) unix password sync = yes workgroup = WORKGROUP syslog = 0 panic action = /usr/share/samba/panic-action %d usershare allow guests = yes max log size = 1000 pam password change = yes ## Browsing/Identification ### # Change this to the workgroup/NT-domain name your Samba server will part of # server string is the equivalent of the NT Description field # Windows Internet Name Serving Support Section: # WINS Support - Tells the NMBD component of Samba to enable its WINS Server # wins support = no # WINS Server - Tells the NMBD components of Samba to be a WINS Client # Note: Samba can be either a WINS Server, or a WINS Client, but NOT both ; wins server = w.x.y.z # This will prevent nmbd to search for NetBIOS names through DNS. # What naming service and in what order should we use to resolve host names # to IP addresses ; name resolve order = lmhosts host wins bcast #### Networking #### # The specific set of interfaces / networks to bind to # This can be either the interface name or an IP address/netmask; # interface names are normally preferred ; interfaces = 127.0.0.0/8 eth0 # Only bind to the named interfaces and/or networks; you must use the # 'interfaces' option above to use this. # It is recommended that you enable this feature if your Samba machine is # not protected by a firewall or is a firewall itself. However, this # option cannot handle dynamic or non-broadcast interfaces correctly. ; bind interfaces only = yes #### Debugging/Accounting #### # This tells Samba to use a separate log file for each machine # that connects # Cap the size of the individual log files (in KiB). # If you want Samba to only log through syslog then set the following # parameter to 'yes'. # syslog only = no # We want Samba to log a minimum amount of information to syslog. Everything # should go to /var/log/samba/log.{smbd,nmbd} instead. If you want to log # through syslog you should set the following parameter to something higher. # Do something sensible when Samba crashes: mail the admin a backtrace ####### Authentication ####### # "security = user" is always a good idea. This will require a Unix account # in this server for every user accessing the server. See # /usr/share/doc/samba-doc/htmldocs/Samba3-HOWTO/ServerType.html # in the samba-doc package for details. # security = user # You may wish to use password encryption. See the section on # 'encrypt passwords' in the smb.conf(5) manpage before enabling. # If you are using encrypted passwords, Samba will need to know what # password database type you are using. # This boolean parameter controls whether Samba attempts to sync the Unix # password with the SMB password when the encrypted SMB password in the # passdb is changed. # For Unix password sync to work on a Debian GNU/Linux system, the following # parameters must be set (thanks to Ian Kahan <<[email protected]> for # sending the correct chat script for the passwd program in Debian Sarge). # This boolean controls whether PAM will be used for password changes # when requested by an SMB client instead of the program listed in # 'passwd program'. The default is 'no'. # This option controls how unsuccessful authentication attempts are mapped # to anonymous connections ########## Domains ########### # Is this machine able to authenticate users. Both PDC and BDC # must have this setting enabled. If you are the BDC you must # change the 'domain master' setting to no # ; domain logons = yes # # The following setting only takes effect if 'domain logons' is set # It specifies the location of the user's profile directory # from the client point of view) # The following required a [profiles] share to be setup on the # samba server (see below) ; logon path = \\%N\profiles\%U # Another common choice is storing the profile in the user's home directory # (this is Samba's default) # logon path = \\%N\%U\profile # The following setting only takes effect if 'domain logons' is set # It specifies the location of a user's home directory (from the client # point of view) ; logon drive = H: # logon home = \\%N\%U # The following setting only takes effect if 'domain logons' is set # It specifies the script to run during logon. The script must be stored # in the [netlogon] share # NOTE: Must be store in 'DOS' file format convention ; logon script = logon.cmd # This allows Unix users to be created on the domain controller via the SAMR # RPC pipe. The example command creates a user account with a disabled Unix # password; please adapt to your needs ; add user script = /usr/sbin/adduser --quiet --disabled-password --gecos "" %u # This allows machine accounts to be created on the domain controller via the # SAMR RPC pipe. # The following assumes a "machines" group exists on the system ; add machine script = /usr/sbin/useradd -g machines -c "%u machine account" -d /var/lib/samba -s /bin/false %u # This allows Unix groups to be created on the domain controller via the SAMR # RPC pipe. ; add group script = /usr/sbin/addgroup --force-badname %g ########## Printing ########## # If you want to automatically load your printer list rather # than setting them up individually then you'll need this # load printers = yes # lpr(ng) printing. You may wish to override the location of the # printcap file ; printing = bsd ; printcap name = /etc/printcap # CUPS printing. See also the cupsaddsmb(8) manpage in the # cupsys-client package. ; printing = cups ; printcap name = cups ############ Misc ############ # Using the following line enables you to customise your configuration # on a per machine basis. The %m gets replaced with the netbios name # of the machine that is connecting ; include = /home/samba/etc/smb.conf.%m # Most people will find that this option gives better performance. # See smb.conf(5) and /usr/share/doc/samba-doc/htmldocs/Samba3-HOWTO/speed.html # for details # You may want to add the following on a Linux system: # SO_RCVBUF=8192 SO_SNDBUF=8192 # socket options = TCP_NODELAY # The following parameter is useful only if you have the linpopup package # installed. The samba maintainer and the linpopup maintainer are # working to ease installation and configuration of linpopup and samba. ; message command = /bin/sh -c '/usr/bin/linpopup "%f" "%m" %s; rm %s' & # Domain Master specifies Samba to be the Domain Master Browser. If this # machine will be configured as a BDC (a secondary logon server), you # must set this to 'no'; otherwise, the default behavior is recommended. # domain master = auto # Some defaults for winbind (make sure you're not using the ranges # for something else.) ; idmap uid = 10000-20000 ; idmap gid = 10000-20000 ; template shell = /bin/bash # The following was the default behaviour in sarge, # but samba upstream reverted the default because it might induce # performance issues in large organizations. # See Debian bug #368251 for some of the consequences of *not* # having this setting and smb.conf(5) for details. ; winbind enum groups = yes ; winbind enum users = yes # Setup usershare options to enable non-root users to share folders # with the net usershare command. # Maximum number of usershare. 0 (default) means that usershare is disabled. ; usershare max shares = 100 # Allow users who've been granted usershare privileges to create # public shares, not just authenticated ones #======================= Share Definitions ======================= # Un-comment the following (and tweak the other settings below to suit) # to enable the default home directory shares. This will share each # user's home director as \\server\username ;[homes] ; comment = Home Directories ; browseable = no # By default, the home directories are exported read-only. Change the # next parameter to 'no' if you want to be able to write to them. ; read only = yes # File creation mask is set to 0700 for security reasons. If you want to # create files with group=rw permissions, set next parameter to 0775. ; create mask = 0700 # Directory creation mask is set to 0700 for security reasons. If you want to # create dirs. with group=rw permissions, set next parameter to 0775. ; directory mask = 0700 # By default, \\server\username shares can be connected to by anyone # with access to the samba server. Un-comment the following parameter # to make sure that only "username" can connect to \\server\username # The following parameter makes sure that only "username" can connect # # This might need tweaking when using external authentication schemes ; valid users = %S # Un-comment the following and create the netlogon directory for Domain Logons # (you need to configure Samba to act as a domain controller too.) ;[netlogon] ; comment = Network Logon Service ; path = /home/samba/netlogon ; guest ok = yes ; read only = yes # Un-comment the following and create the profiles directory to store # users profiles (see the "logon path" option above) # (you need to configure Samba to act as a domain controller too.) # The path below should be writable by all users so that their # profile directory may be created the first time they log on ;[profiles] ; comment = Users profiles ; path = /home/samba/profiles ; guest ok = no ; browseable = no ; create mask = 0600 ; directory mask = 0700 [printers] comment = All Printers browseable = no path = /var/spool/samba printable = yes guest ok = no read only = yes create mask = 0700 # Windows clients look for this share name as a source of downloadable # printer drivers [print$] comment = Printer Drivers browseable = yes writeable = no path = /var/lib/samba/printers # Uncomment to allow remote administration of Windows print drivers. # You may need to replace 'lpadmin' with the name of the group your # admin users are members of. # Please note that you also need to set appropriate Unix permissions # to the drivers directory for these users to have write rights in it ; write list = root, @lpadmin # A sample share for sharing your CD-ROM with others. ;[cdrom] ; comment = Samba server's CD-ROM ; read only = yes ; locking = no ; path = /cdrom ; guest ok = yes # The next two parameters show how to auto-mount a CD-ROM when the # cdrom share is accesed. For this to work /etc/fstab must contain # an entry like this: # # /dev/scd0 /cdrom iso9660 defaults,noauto,ro,user 0 0 # # The CD-ROM gets unmounted automatically after the connection to the # # If you don't want to use auto-mounting/unmounting make sure the CD # is mounted on /cdrom # ; preexec = /bin/mount /cdrom ; postexec = /bin/umount /cdrom [mediafiles] path = /media/multimedia/

Read the article
Advanced TSQL Tuning: Why Internals Knowledge Matters

- by Paul White

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. Query tuning is not complete as soon as the query returns results quickly in the development or test environments. In production, your query will compete for memory, CPU, locks, I/O and other resources on the server. Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data. In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row. The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type. A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL, CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL, CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table. The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results. This seems slow, considering that the table only has 50,000 rows. Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time. Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring. The script below monitors STATISTICS IO output and the amount of tempdb used by the test query. We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB. The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row. With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first). This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts. In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted. SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed. Sorts typically require a very large number of comparisons, so this is usually a very effective optimization. One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself. SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use. The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory. Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant. The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources. More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1). Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB. The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file. Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case. In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms. If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much. Why is the data size so much smaller? The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored. By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output. You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead. SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows. The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes. The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page. There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option. By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row. SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage. You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now. This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query). SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant. No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers. Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’. Why? Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed. SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages. This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this? Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need. 245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory. Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily? That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column. That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal. I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator. Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted. We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need. The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb. The small remaining inefficiency is in reading the id column values from the clustered primary key index. As a clustered index, it contains all the in-row data at its leaf. The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead. The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure. This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only. This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data. The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings. It isn’t about the internal differences between TEXT and MAX data types either. It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load. No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance. Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows. 5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is! If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb. Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough. Tuning for logical reads and adding covering indexes is not enough. If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on. The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008. They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator. Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

Read the article
Inside the Concurrent Collections: ConcurrentDictionary

- by Simon Cooper

Using locks to implement a thread-safe collection is rather like using a sledgehammer - unsubtle, easy to understand, and tends to make any other tool redundant. Unlike the previous two collections I looked at, ConcurrentStack and ConcurrentQueue, ConcurrentDictionary uses locks quite heavily. However, it is careful to wield locks only where necessary to ensure that concurrency is maximised. This will, by necessity, be a higher-level look than my other posts in this series, as there is quite a lot of code and logic in ConcurrentDictionary. Therefore, I do recommend that you have ConcurrentDictionary open in a decompiler to have a look at all the details that I skip over. The problem with locks There's several things to bear in mind when using locks, as encapsulated by the lock keyword in C# and the System.Threading.Monitor class in .NET (if you're unsure as to what lock does in C#, I briefly covered it in my first post in the series): Locks block threads The most obvious problem is that threads waiting on a lock can't do any work at all. No preparatory work, no 'optimistic' work like in ConcurrentQueue and ConcurrentStack, nothing. It sits there, waiting to be unblocked. This is bad if you're trying to maximise concurrency. Locks are slow Whereas most of the methods on the Interlocked class can be compiled down to a single CPU instruction, ensuring atomicity at the hardware level, taking out a lock requires some heavy lifting by the CLR and the operating system. There's quite a bit of work required to take out a lock, block other threads, and wake them up again. If locks are used heavily, this impacts performance. Deadlocks When using locks there's always the possibility of a deadlock - two threads, each holding a lock, each trying to aquire the other's lock. Fortunately, this can be avoided with careful programming and structured lock-taking, as we'll see. So, it's important to minimise where locks are used to maximise the concurrency and performance of the collection. Implementation As you might expect, ConcurrentDictionary is similar in basic implementation to the non-concurrent Dictionary, which I studied in a previous post. I'll be using some concepts introduced there, so I recommend you have a quick read of it. So, if you were implementing a thread-safe dictionary, what would you do? The naive implementation is to simply have a single lock around all methods accessing the dictionary. This would work, but doesn't allow much concurrency. Fortunately, the bucketing used by Dictionary allows a simple but effective improvement to this - one lock per bucket. This allows different threads modifying different buckets to do so in parallel. Any thread making changes to the contents of a bucket takes the lock for that bucket, ensuring those changes are thread-safe. The method that maps each bucket to a lock is the GetBucketAndLockNo method: private void GetBucketAndLockNo( int hashcode, out int bucketNo, out int lockNo, int bucketCount) { // the bucket number is the hashcode (without the initial sign bit) // modulo the number of buckets bucketNo = (hashcode & 0x7fffffff) % bucketCount; // and the lock number is the bucket number modulo the number of locks lockNo = bucketNo % m_locks.Length; } However, this does require some changes to how the buckets are implemented. The 'implicit' linked list within a single backing array used by the non-concurrent Dictionary adds a dependency between separate buckets, as every bucket uses the same backing array. Instead, ConcurrentDictionary uses a strict linked list on each bucket: This ensures that each bucket is entirely separate from all other buckets; adding or removing an item from a bucket is independent to any changes to other buckets. Modifying the dictionary All the operations on the dictionary follow the same basic pattern: void AlterBucket(TKey key, ...) { int bucketNo, lockNo; 1: GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, m_buckets.Length); 2: lock (m_locks[lockNo]) { 3: Node headNode = m_buckets[bucketNo]; 4: Mutate the node linked list as appropriate } } For example, when adding another entry to the dictionary, you would iterate through the linked list to check whether the key exists already, and add the new entry as the head node. When removing items, you would find the entry to remove (if it exists), and remove the node from the linked list. Adding, updating, and removing items all follow this pattern. Performance issues There is a problem we have to address at this point. If the number of buckets in the dictionary is fixed in the constructor, then the performance will degrade from O(1) to O(n) when a large number of items are added to the dictionary. As more and more items get added to the linked lists in each bucket, the lookup operations will spend most of their time traversing a linear linked list. To fix this, the buckets array has to be resized once the number of items in each bucket has gone over a certain limit. (In ConcurrentDictionary this limit is when the size of the largest bucket is greater than the number of buckets for each lock. This check is done at the end of the TryAddInternal method.) Resizing the bucket array and re-hashing everything affects every bucket in the collection. Therefore, this operation needs to take out every lock in the collection. Taking out mutiple locks at once inevitably summons the spectre of the deadlock; two threads each hold a lock, and each trying to acquire the other lock. How can we eliminate this? Simple - ensure that threads never try to 'swap' locks in this fashion. When taking out multiple locks, always take them out in the same order, and always take out all the locks you need before starting to release them. In ConcurrentDictionary, this is controlled by the AcquireLocks, AcquireAllLocks and ReleaseLocks methods. Locks are always taken out and released in the order they are in the m_locks array, and locks are all released right at the end of the method in a finally block. At this point, it's worth pointing out that the locks array is never re-assigned, even when the buckets array is increased in size. The number of locks is fixed in the constructor by the concurrencyLevel parameter. This simplifies programming the locks; you don't have to check if the locks array has changed or been re-assigned before taking out a lock object. And you can be sure that when a thread takes out a lock, another thread isn't going to re-assign the lock array. This would create a new series of lock objects, thus allowing another thread to ignore the existing locks (and any threads controlling them), breaking thread-safety. Consequences of growing the array Just because we're using locks doesn't mean that race conditions aren't a problem. We can see this by looking at the GrowTable method. The operation of this method can be boiled down to: private void GrowTable(Node[] buckets) { try { 1: Acquire first lock in the locks array // this causes any other thread trying to take out // all the locks to block because the first lock in the array // is always the one taken out first // check if another thread has already resized the buckets array // while we were waiting to acquire the first lock 2: if (buckets != m_buckets) return; 3: Calculate the new size of the backing array 4: Node[] array = new array[size]; 5: Acquire all the remaining locks 6: Re-hash the contents of the existing buckets into array 7: m_buckets = array; } finally { 8: Release all locks } } As you can see, there's already a check for a race condition at step 2, for the case when the GrowTable method is called twice in quick succession on two separate threads. One will successfully resize the buckets array (blocking the second in the meantime), when the second thread is unblocked it'll see that the array has already been resized & exit without doing anything. There is another case we need to consider; looking back at the AlterBucket method above, consider the following situation: Thread 1 calls AlterBucket; step 1 is executed to get the bucket and lock numbers. Thread 2 calls GrowTable and executes steps 1-5; thread 1 is blocked when it tries to take out the lock in step 2. Thread 2 re-hashes everything, re-assigns the buckets array, and releases all the locks (steps 6-8). Thread 1 is unblocked and continues executing, but the calculated bucket and lock numbers are no longer valid. Between calculating the correct bucket and lock number and taking out the lock, another thread has changed where everything is. Not exactly thread-safe. Well, a similar problem was solved in ConcurrentStack and ConcurrentQueue by storing a local copy of the state, doing the necessary calculations, then checking if that state is still valid. We can use a similar idea here: void AlterBucket(TKey key, ...) { while (true) { Node[] buckets = m_buckets; int bucketNo, lockNo; GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, buckets.Length); lock (m_locks[lockNo]) { // if the state has changed, go back to the start if (buckets != m_buckets) continue; Node headNode = m_buckets[bucketNo]; Mutate the node linked list as appropriate } break; } } TryGetValue and GetEnumerator And so, finally, we get onto TryGetValue and GetEnumerator. I've left these to the end because, well, they don't actually use any locks. How can this be? Whenever you change a bucket, you need to take out the corresponding lock, yes? Indeed you do. However, it is important to note that TryGetValue and GetEnumerator don't actually change anything. Just as immutable objects are, by definition, thread-safe, read-only operations don't need to take out a lock because they don't change anything. All lockless methods can happily iterate through the buckets and linked lists without worrying about locking anything. However, this does put restrictions on how the other methods operate. Because there could be another thread in the middle of reading the dictionary at any time (even if a lock is taken out), the dictionary has to be in a valid state at all times. Every change to state has to be made visible to other threads in a single atomic operation (all relevant variables are marked volatile to help with this). This restriction ensures that whatever the reading threads are doing, they never read the dictionary in an invalid state (eg items that should be in the collection temporarily removed from the linked list, or reading a node that has had it's key & value removed before the node itself has been removed from the linked list). Fortunately, all the operations needed to change the dictionary can be done in that way. Bucket resizes are made visible when the new array is assigned back to the m_buckets variable. Any additions or modifications to a node are done by creating a new node, then splicing it into the existing list using a single variable assignment. Node removals are simply done by re-assigning the node's m_next pointer. Because the dictionary can be changed by another thread during execution of the lockless methods, the GetEnumerator method is liable to return dirty reads - changes made to the dictionary after GetEnumerator was called, but before the enumeration got to that point in the dictionary. It's worth listing at this point which methods are lockless, and which take out all the locks in the dictionary to ensure they get a consistent view of the dictionary: Lockless: TryGetValue GetEnumerator The indexer getter ContainsKey Takes out every lock (lockfull?): Count IsEmpty Keys Values CopyTo ToArray Concurrent principles That covers the overall implementation of ConcurrentDictionary. I haven't even begun to scratch the surface of this sophisticated collection. That I leave to you. However, we've looked at enough to be able to extract some useful principles for concurrent programming: Partitioning When using locks, the work is partitioned into independant chunks, each with its own lock. Each partition can then be modified concurrently to other partitions. Ordered lock-taking When a method does need to control the entire collection, locks are taken and released in a fixed order to prevent deadlocks. Lockless reads Read operations that don't care about dirty reads don't take out any lock; the rest of the collection is implemented so that any reading thread always has a consistent view of the collection. That leads us to the final collection in this little series - ConcurrentBag. Lacking a non-concurrent analogy, it is quite different to any other collection in the class libraries. Prepare your thinking hats!

Read the article
HttpModule Init method is called several times - why?

- by MartinF

I was creating a http module and while debugging I noticed something which at first (at least) seemed like weird behaviour. When i set a breakpoint in the init method of the httpmodule i can see that the http module init method is being called several times even though i have only started up the website for debugging and made one single request (sometimes it is hit only 1 time, other times as many as 10 times). I know that I should expect several instances of the HttpApplication to be running and for each the http modules will be created, but when i request a single page it should be handled by a single http application object and therefore only fire the events associated once, but still it fires the events several times for each request which makes no sense - other than it must have been added several times within that httpApplication - which means it is the same httpmodule init method which is being called every time and not a new http application being created each time it hits my break point (see my code example at the bottom etc.). What could be going wrong here ? is it because i am debugging and set a breakpoint in the http module? It have noticed that it seems that if i startup the website for debugging and quickly step over the breakpoint in the httpmodule it will only hit the init method once and the same goes for the eventhandler. If I instead let it hang at the breakpoint for a few seconds the init method is being called several times (seems like it depends on how long time i wait before stepping over the breakpoint). Maybe this could be some build in feature to make sure that the httpmodule is initialised and the http application can serve requests , but it also seems like something that could have catastrophic consequences. This could seem logical, as it might be trying to finish the request and since i have set the break point it thinks something have gone wrong and try to call the init method again ? soo it can handle the request ? But is this what is happening and is everything fine (i am just guessing), or is it a real problem ? What i am specially concerned about is that if something makes it hang on the "production/live" server for a few seconds a lot of event handlers are added through the init and therefore each request to the page suddenly fires the eventhandler several times. This behaviour could quickly bring any site down. I have looked at the "original" .net code used for the httpmodules for formsauthentication and the rolemanagermodule etc but my code isnt any different that those modules uses. My code looks like this. public void Init(HttpApplication app) { if (CommunityAuthenticationIntegration.IsEnabled) { FormsAuthenticationModule formsAuthModule = (FormsAuthenticationModule) app.Modules["FormsAuthentication"]; formsAuthModule.Authenticate += new FormsAuthenticationEventHandler(this.OnAuthenticate); } } here is an example how it is done in the RoleManagerModule from the .NET framework public void Init(HttpApplication app) { if (Roles.Enabled) { app.PostAuthenticateRequest += new EventHandler(this.OnEnter); app.EndRequest += new EventHandler(this.OnLeave); } } Do anyone know what is going on? (i just hope someone out there can tell me why this is happening and assure me that everything is perfectly fine) :) UPDATE: I have tried to narrow down the problem and so far i have found that the Init method being called is always on a new object of my http module (contary to what i thought before). I seems that for the first request (when starting up the site) all of the HttpApplication objects being created and its modules are all trying to serve the first request and therefore all hit the eventhandler that is being added. I cant really figure out why this is happening. If i request another page all the HttpApplication's created (and their moduless) will again try to serve the request causing it to hit the eventhandler multiple times. But it also seems that if i then jump back to the first page (or another one) only one HttpApplication will start to take care of the request and everything is as expected - as long as i dont let it hang at a break point. If i let it hang at a breakpoint it begins to create new HttpApplication's objects and starts adding HttpApplications (more than 1) to serve/handle the request (which is already in process of being served by the HttpApplication which is currently stopped at the breakpoint). I guess or hope that it might be some intelligent "behind the scenes" way of helping to distribute and handle load and / or errors. But I have no clue. I hope some out there can assure me that it is perfectly fine and how it is supposed to be?

Read the article
How can I use the FOR attribute of a LABEL tag without the ID attribute on the INPUT tag

- by Shawn

Is there a solution to the problem illustrated in the code below? Start by opening the code in a browser to get straight to the point and not have to look through all that code before knowing what you're looking for. <html> <head> <title>Input ID creates problems</title> <style type="text/css"> #prologue, #summary { margin: 5em; } </style> </head> <body> <h1>Input ID creates a bug</h1> <p id="prologue"> In this example, I make a list of checkboxes representing things which could appear in a book. If you want some in your book, you check them: </p> <form> <ul> <li> <input type="checkbox" id="prologue" /> <label for="prologue">prologue</label> </li> <li> <input type="checkbox" id="chapter" /> <label for="chapter">chapter</label> </li> <li> <input type="checkbox" id="summary" /> <label for="summary">summary</label> </li> <li> <input type="checkbox" id="etc" /> <label for="etc">etc</label> <label> </li> </ul> </form> <p id="summary"> For each checkbox, I want to assign an ID so that clicking a label checks the corresponding checkbox. The problems occur when other elements in the page already use those IDs. In this case, a CSS declaration was made to add margins to the two paragraphs which IDs are "prologue" and "summary", but because of the IDs given to the checkboxes, the checkboxes named "prologue" and "summary" are also affected by this declaration. The following links simply call a javascript function which writes out the element whose id is <a href="javascript:alert(document.getElementById('prologue'));">prologue</a> and <a href="javascript:alert(document.getElementById('summary'));">summary</a>, respectively. In the first case (prologue), the script writes out [object HTMLParagraphElement], because the first element found with id "prologue" is a paragraph. But in the second case (summary), the script writes out [object HTMLInputElement] because the first element found with id "summary" is an input. In the case of another script, the consequences of this mix up could have been much more dramatic. Now try clicking on the label prologue in the list above. It does not check the checkbox as clicking on any other label. This is because it finds the paragraph whose ID is also "prologue" and tries to check that instead. By the way, if there were another checkbox whose id was "prologue", then clicking on the label would check the one which appears first in the code. </p> <p> An easy fix for this would be to chose other IDs for the checkboxes, but this doesn't apply if these IDs are given dynamically, by a php script for example. Another easy fix for this would be to write labels like this: <pre> <label><input type="checkbox" />prologue</label> </pre> and not need to give an ID to the checkboxes. But this only works if the label and checkbox are next to each other. </p> <p> Well, that's the problem. I guess the ideal solution would be to link a label to a checkboxe using another mechanism (not using ID). I think the perfect way to do this would be to match a label to the input element whose NAME (not ID) is the same as the label's FOR attribute. What do you think? </p> </body> </html>

Read the article
Java: Cannot find a method's symbol even though that method is declared later in the class. The remaining code is looking for a class.

- by Midimistro

This is an assignment that we use strings in Java to analyze a phone number. The error I am having is anything below tester=invalidCharacters(c); does not compile because every line past tester=invalidCharacters(c); is looking for a symbol or the class. In get invalidResults, all I am trying to do is evaluate a given string for non-alphabetical characters such as *,(,^,&,%,@,#,), and so on. What to answer: Why is it producing an error, what will work, and is there an easier method WITHOUT using regex. Here is the link to the assignment: http://cis.csuohio.edu/~hwang/teaching/cis260/assignments/assignment9.html public class PhoneNumber { private int areacode; private int number; private int ext; /////Constructors///// //Third Constructor (given one string arg) "xxx-xxxxxxx" where first three are numbers and the remaining (7) are numbers or letters public PhoneNumber(String newNumber){ //Note: Set default ext to 0 ext=0; ////Declare Temporary Storage and other variables//// //for the first three numbers String areaCodeString; //for the remaining seven characters String newNumberString; //For use in testing the second half of the string boolean containsLetters; boolean containsInvalid; /////Separate the two parts of string///// //Get the area code part of the string areaCodeString=newNumber.substring(0,2); //Convert the string and set it to the area code areacode=Integer.parseInt(areaCodeString); //Skip the "-" and Get the remaining part of the string newNumberString=newNumber.substring(4); //Create an array of characters from newNumberString to reuse in later methods for int length=newNumberString.length(); char [] myCharacters= new char [length]; int i; for (i=0;i<length;i++){ myCharacters [i]=newNumberString.charAt(i); } //Test if newNumberString contains letters & converting them into numbers String reNewNumber=""; //Test for invalid characters containsInvalid=getInvalidResults(newNumberString,length); if (containsInvalid==false){ containsLetters=getCharResults(newNumberString,length); if (containsLetters==true){ for (i=0;i<length;i++){ myCharacters [i]=(char)convertLetNum((myCharacters [i])); reNewNumber=reNewNumber+myCharacters[i]; } } } if (containsInvalid==false){ number=Integer.parseInt(reNewNumber); } else{ System.out.println("Error!"+"\t"+newNumber+" contains illegal characters. This number will be ignored and skipped."); } } //////Primary Methods/Behaviors/////// //Compare this phone number with the one passed by the caller public boolean equals(PhoneNumber pn){ boolean equal; String concat=(areacode+"-"+number); String pN=pn.toString(); if (concat==pN){ equal=true; } else{ equal=false; } return equal; } //Convert the stored number to a certain string depending on extension public String toString(){ String returned; if(ext==0){ returned=(areacode+"-"+number); } else{ returned=(areacode+"-"+number+" ext "+ext); } return returned; } //////Secondary Methods/////// //Method for testing if the second part of the string contains any letters public static boolean getCharResults(String newNumString,int getLength){ //Recreate a character array int i; char [] myCharacters= new char [getLength]; for (i=0;i<getLength;i++){ myCharacters [i]=newNumString.charAt(i); } boolean doesContainLetter=false; int j; for (j=0;j<getLength;j++){ if ((Character.isDigit(myCharacters[j])==true)){ doesContainLetter=false; } else{ doesContainLetter=true; return doesContainLetter; } } return doesContainLetter; } //Method for testing if the second part of the string contains any letters public static boolean getInvalidResults(String newNumString,int getLength){ boolean doesContainInvalid=false; int i; char c; boolean tester; char [] invalidCharacters= new char [getLength]; for (i=0;i<getLength;i++){ invalidCharacters [i]=newNumString.charAt(i); c=invalidCharacters [i]; tester=invalidCharacters(c); if(tester==true)){ doesContainInvalid=false; } else{ doesContainInvalid=true; return doesContainInvalid; } } return doesContainInvalid; } //Method for evaluating string for invalid characters public boolean invalidCharacters(char letter){ boolean returnNum=false; switch (letter){ case 'A': return returnNum; case 'B': return returnNum; case 'C': return returnNum; case 'D': return returnNum; case 'E': return returnNum; case 'F': return returnNum; case 'G': return returnNum; case 'H': return returnNum; case 'I': return returnNum; case 'J': return returnNum; case 'K': return returnNum; case 'L': return returnNum; case 'M': return returnNum; case 'N': return returnNum; case 'O': return returnNum; case 'P': return returnNum; case 'Q': return returnNum; case 'R': return returnNum; case 'S': return returnNum; case 'T': return returnNum; case 'U': return returnNum; case 'V': return returnNum; case 'W': return returnNum; case 'X': return returnNum; case 'Y': return returnNum; case 'Z': return returnNum; default: return true; } } //Method for converting letters to numbers public int convertLetNum(char letter){ int returnNum; switch (letter){ case 'A': returnNum=2;return returnNum; case 'B': returnNum=2;return returnNum; case 'C': returnNum=2;return returnNum; case 'D': returnNum=3;return returnNum; case 'E': returnNum=3;return returnNum; case 'F': returnNum=3;return returnNum; case 'G': returnNum=4;return returnNum; case 'H': returnNum=4;return returnNum; case 'I': returnNum=4;return returnNum; case 'J': returnNum=5;return returnNum; case 'K': returnNum=5;return returnNum; case 'L': returnNum=5;return returnNum; case 'M': returnNum=6;return returnNum; case 'N': returnNum=6;return returnNum; case 'O': returnNum=6;return returnNum; case 'P': returnNum=7;return returnNum; case 'Q': returnNum=7;return returnNum; case 'R': returnNum=7;return returnNum; case 'S': returnNum=7;return returnNum; case 'T': returnNum=8;return returnNum; case 'U': returnNum=8;return returnNum; case 'V': returnNum=8;return returnNum; case 'W': returnNum=9;return returnNum; case 'X': returnNum=9;return returnNum; case 'Y': returnNum=9;return returnNum; case 'Z': returnNum=9;return returnNum; default: return 0; } } } Note: Please Do not use this program to cheat in your own class. To ensure of this, I will take this question down if it has not been answered by the end of 2013, if I no longer need an explanation for it, or if the term for the class has ended.

Read the article
Top things web developers should know about the Visual Studio 2013 release

- by Jon Galloway

ASP.NET and Web Tools for Visual Studio 2013 Release NotesASP.NET and Web Tools for Visual Studio 2013 Release NotesSummary for lazy readers: Visual Studio 2013 is now available for download on the Visual Studio site and on MSDN subscriber downloads) Visual Studio 2013 installs side by side with Visual Studio 2012 and supports round-tripping between Visual Studio versions, so you can try it out without committing to a switch Visual Studio 2013 ships with the new version of ASP.NET, which includes ASP.NET MVC 5, ASP.NET Web API 2, Razor 3, Entity Framework 6 and SignalR 2.0 The new releases ASP.NET focuses on One ASP.NET, so core features and web tools work the same across the platform (e.g. adding ASP.NET MVC controllers to a Web Forms application) New core features include new templates based on Bootstrap, a new scaffolding system, and a new identity system Visual Studio 2013 is an incredible editor for web files, including HTML, CSS, JavaScript, Markdown, LESS, Coffeescript, Handlebars, Angular, Ember, Knockdown, etc. Top links: Visual Studio 2013 content on the ASP.NET site are in the standard new releases area: http://www.asp.net/vnext ASP.NET and Web Tools for Visual Studio 2013 Release Notes Short intro videos on the new Visual Studio web editor features from Scott Hanselman and Mads Kristensen Announcing release of ASP.NET and Web Tools for Visual Studio 2013 post on the official .NET Web Development and Tools Blog Scott Guthrie's post: Announcing the Release of Visual Studio 2013 and Great Improvements to ASP.NET and Entity Framework Okay, for those of you who are still with me, let's dig in a bit. Quick web dev notes on downloading and installing Visual Studio 2013 I found Visual Studio 2013 to be a pretty fast install. According to Brian Harry's release post, installing over pre-release versions of Visual Studio is supported. I've installed the release version over pre-release versions, and it worked fine. If you're only going to be doing web development, you can speed up the install if you just select Web Developer tools. Of course, as a good Microsoft employee, I'll mention that you might also want to install some of those other features, like the Store apps for Windows 8 and the Windows Phone 8.0 SDK, but they do download and install a lot of other stuff (e.g. the Windows Phone SDK sets up Hyper-V and downloads several GB's of VM's). So if you're planning just to do web development for now, you can pick just the Web Developer Tools and install the other stuff later. If you've got a fast internet connection, I recommend using the web installer instead of downloading the ISO. The ISO includes all the features, whereas the web installer just downloads what you're installing. Visual Studio 2013 development settings and color theme When you start up Visual Studio, it'll prompt you to pick some defaults. These are totally up to you -whatever suits your development style - and you can change them later. As I said, these are completely up to you. I recommend either the Web Development or Web Development (Code Only) settings. The only real difference is that Code Only hides the toolbars, and you can switch between them using Tools / Import and Export Settings / Reset. Web Development settings Web Development (code only) settings Usually I've just gone with Web Development (code only) in the past because I just want to focus on the code, although the Standard toolbar does make it easier to switch default web browsers. More on that later. Color theme Sigh. Okay, everyone's got their favorite colors. I alternate between Light and Dark depending on my mood, and I personally like how the low contrast on the window chrome in those themes puts the emphasis on my code rather than the tabs and toolbars. I know some people got pretty worked up over that, though, and wanted the blue theme back. I personally don't like it - it reminds me of ancient versions of Visual Studio that I don't want to think about anymore. So here's the thing: if you install Visual Studio Ultimate, it defaults to Blue. The other versions default to Light. If you use Blue, I won't criticize you - out loud, that is. You can change themes really easily - either Tools / Options / Environment / General, or the smart way: ctrl+q for quick launch, then type Theme and hit enter. Signing in During the first run, you'll be prompted to sign in. You don't have to - you can click the "Not now, maybe later" link at the bottom of that dialog. I recommend signing in, though. It's not hooked in with licensing or tracking the kind of code you write to sell you components. It is doing good things, like syncing your Visual Studio settings between computers. More about that here. So, you don't have to, but I sure do. Overview of shiny new things in ASP.NET land There are a lot of good new things in ASP.NET. I'll list some of my favorite here, but you can read more on the ASP.NET site. One ASP.NET You've heard us talk about this for a while. The idea is that options are good, but choice can be a burden. When you start a new ASP.NET project, why should you have to make a tough decision - with long-term consequences - about how your application will work? If you want to use ASP.NET Web Forms, but have the option of adding in ASP.NET MVC later, why should that be hard? It's all ASP.NET, right? Ideally, you'd just decide that you want to use ASP.NET to build sites and services, and you could use the appropriate tools (the green blocks below) as you needed them. So, here it is. When you create a new ASP.NET application, you just create an ASP.NET application. Next, you can pick from some templates to get you started... but these are different. They're not "painful decision" templates, they're just some starting pieces. And, most importantly, you can mix and match. I can pick a "mostly" Web Forms template, but include MVC and Web API folders and core references. If you've tried to mix and match in the past, you're probably aware that it was possible, but not pleasant. ASP.NET MVC project files contained special project type GUIDs, so you'd only get controller scaffolding support in a Web Forms project if you manually edited the csproj file. Features in one stack didn't work in others. Project templates were painful choices. That's no longer the case. Hooray! I just did a demo in a presentation last week where I created a new Web Forms + MVC + Web API site, built a model, scaffolded MVC and Web API controllers with EF Code First, add data in the MVC view, viewed it in Web API, then added a GridView to the Web Forms Default.aspx page and bound it to the Model. In about 5 minutes. Sure, it's a simple example, but it's great to be able to share code and features across the whole ASP.NET family. Authentication In the past, authentication was built into the templates. So, for instance, there was an ASP.NET MVC 4 Intranet Project template which created a new ASP.NET MVC 4 application that was preconfigured for Windows Authentication. All of that authentication stuff was built into each template, so they varied between the stacks, and you couldn't reuse them. You didn't see a lot of changes to the authentication options, since they required big changes to a bunch of project templates. Now, the new project dialog includes a common authentication experience. When you hit the Change Authentication button, you get some common options that work the same way regardless of the template or reference settings you've made. These options work on all ASP.NET frameworks, and all hosting environments (IIS, IIS Express, or OWIN for self-host) The default is Individual User Accounts: This is the standard "create a local account, using username / password or OAuth" thing; however, it's all built on the new Identity system. More on that in a second. The one setting that has some configuration to it is Organizational Accounts, which lets you configure authentication using Active Directory, Windows Azure Active Directory, or Office 365. Identity There's a new identity system. We've taken the best parts of the previous ASP.NET Membership and Simple Identity systems, rolled in a lot of feedback and made big enhancements to support important developer concerns like unit testing and extensiblity. I've written long posts about ASP.NET identity, and I'll do it again. Soon. This is not that post. The short version is that I think we've finally got just the right Identity system. Some of my favorite features: There are simple, sensible defaults that work well - you can File / New / Run / Register / Login, and everything works. It supports standard username / password as well as external authentication (OAuth, etc.). It's easy to customize without having to re-implement an entire provider. It's built using pluggable pieces, rather than one large monolithic system. It's built using interfaces like IUser and IRole that allow for unit testing, dependency injection, etc. You can easily add user profile data (e.g. URL, twitter handle, birthday). You just add properties to your ApplicationUser model and they'll automatically be persisted. Complete control over how the identity data is persisted. By default, everything works with Entity Framework Code First, but it's built to support changes from small (modify the schema) to big (use another ORM, store your data in a document database or in the cloud or in XML or in the EXIF data of your desktop background or whatever). It's configured via OWIN. More on OWIN and Katana later, but the fact that it's built using OWIN means it's portable. You can find out more in the Authentication and Identity section of the ASP.NET site (and lots more content will be going up there soon). New Bootstrap based project templates The new project templates are built using Bootstrap 3. Bootstrap (formerly Twitter Bootstrap) is a front-end framework that brings a lot of nice benefits: It's responsive, so your projects will automatically scale to device width using CSS media queries. For example, menus are full size on a desktop browser, but on narrower screens you automatically get a mobile-friendly menu. The built-in Bootstrap styles make your standard page elements (headers, footers, buttons, form inputs, tables etc.) look nice and modern. Bootstrap is themeable, so you can reskin your whole site by dropping in a new Bootstrap theme. Since Bootstrap is pretty popular across the web development community, this gives you a large and rapidly growing variety of templates (free and paid) to choose from. Bootstrap also includes a lot of very useful things: components (like progress bars and badges), useful glyphicons, and some jQuery plugins for tooltips, dropdowns, carousels, etc.). Here's a look at how the responsive part works. When the page is full screen, the menu and header are optimized for a wide screen display: When I shrink the page down (this is all based on page width, not useragent sniffing) the menu turns into a nice mobile-friendly dropdown: For a quick example, I grabbed a new free theme off bootswatch.com. For simple themes, you just need to download the boostrap.css file and replace the /content/bootstrap.css file in your project. Now when I refresh the page, I've got a new theme: Scaffolding The big change in scaffolding is that it's one system that works across ASP.NET. You can create a new Empty Web project or Web Forms project and you'll get the Scaffold context menus. For release, we've got MVC 5 and Web API 2 controllers. We had a preview of Web Forms scaffolding in the preview releases, but they weren't fully baked for RTM. Look for them in a future update, expected pretty soon. This scaffolding system wasn't just changed to work across the ASP.NET frameworks, it's also built to enable future extensibility. That's not in this release, but should also hopefully be out soon. Project Readme page This is a small thing, but I really like it. When you create a new project, you get a Project_Readme.html page that's added to the root of your project and opens in the Visual Studio built-in browser. I love it. A long time ago, when you created a new project we just dumped it on you and left you scratching your head about what to do next. Not ideal. Then we started adding a bunch of Getting Started information to the new project templates. That told you what to do next, but you had to delete all of that stuff out of your website. It doesn't belong there. Not ideal. This is a simple HTML file that's not integrated into your project code at all. You can delete it if you want. But, it shows a lot of helpful links that are current for the project you just created. In the future, if we add new wacky project types, they can create readme docs with specific information on how to do appropriately wacky things. Side note: I really like that they used the internal browser in Visual Studio to show this content rather than popping open an HTML page in the default browser. I hate that. It's annoying. If you're doing that, I hope you'll stop. What if some unnamed person has 40 or 90 tabs saved in their browser session? When you pop open your "Thanks for installing my Visual Studio extension!" page, all eleventy billion tabs start up and I wish I'd never installed your thing. Be like these guys and pop stuff Visual Studio specific HTML docs in the Visual Studio browser. ASP.NET MVC 5 The biggest change with ASP.NET MVC 5 is that it's no longer a separate project type. It integrates well with the rest of ASP.NET. In addition to that and the other common features we've already looked at (Bootstrap templates, Identity, authentication), here's what's new for ASP.NET MVC. Attribute routing ASP.NET MVC now supports attribute routing, thanks to a contribution by Tim McCall, the author of http://attributerouting.net. With attribute routing you can specify your routes by annotating your actions and controllers. This supports some pretty complex, customized routing scenarios, and it allows you to keep your route information right with your controller actions if you'd like. Here's a controller that includes an action whose method name is Hiding, but I've used AttributeRouting to configure it to /spaghetti/with-nesting/where-is-waldo public class SampleController : Controller { [Route("spaghetti/with-nesting/where-is-waldo")] public string Hiding() { return "You found me!"; } } I enable that in my RouteConfig.cs, and I can use that in conjunction with my other MVC routes like this: public class RouteConfig { public static void RegisterRoutes(RouteCollection routes) { routes.IgnoreRoute("{resource}.axd/{*pathInfo}"); routes.MapMvcAttributeRoutes(); routes.MapRoute( name: "Default", url: "{controller}/{action}/{id}", defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional } ); } } You can read more about Attribute Routing in ASP.NET MVC 5 here. Filter enhancements There are two new additions to filters: Authentication Filters and Filter Overrides. Authentication filters are a new kind of filter in ASP.NET MVC that run prior to authorization filters in the ASP.NET MVC pipeline and allow you to specify authentication logic per-action, per-controller, or globally for all controllers. Authentication filters process credentials in the request and provide a corresponding principal. Authentication filters can also add authentication challenges in response to unauthorized requests. Override filters let you change which filters apply to a given action method or controller. Override filters specify a set of filter types that should not be run for a given scope (action or controller). This allows you to configure filters that apply globally but then exclude certain global filters from applying to specific actions or controllers. ASP.NET Web API 2 ASP.NET Web API 2 includes a lot of new features. Attribute Routing ASP.NET Web API supports the same attribute routing system that's in ASP.NET MVC 5. You can read more about the Attribute Routing features in Web API in this article. OAuth 2.0 ASP.NET Web API picks up OAuth 2.0 support, using security middleware running on OWIN (discussed below). This is great for features like authenticated Single Page Applications. OData Improvements ASP.NET Web API now has full OData support. That required adding in some of the most powerful operators: $select, $expand, $batch and $value. You can read more about OData operator support in this article by Mike Wasson. Lots more There's a huge list of other features, including CORS (cross-origin request sharing), IHttpActionResult, IHttpRequestContext, and more. I think the best overview is in the release notes. OWIN and Katana I've written about OWIN and Katana recently. I'm a big fan. OWIN is the Open Web Interfaces for .NET. It's a spec, like HTML or HTTP, so you can't install OWIN. The benefit of OWIN is that it's a community specification, so anyone who implements it can plug into the ASP.NET stack, either as middleware or as a host. Katana is the Microsoft implementation of OWIN. It leverages OWIN to wire up things like authentication, handlers, modules, IIS hosting, etc., so ASP.NET can host OWIN components and Katana components can run in someone else's OWIN implementation. Howard Dierking just wrote a cool article in MSDN magazine describing Katana in depth: Getting Started with the Katana Project. He had an interesting example showing an OWIN based pipeline which leveraged SignalR, ASP.NET Web API and NancyFx components in the same stack. If this kind of thing makes sense to you, that's great. If it doesn't, don't worry, but keep an eye on it. You're going to see some cool things happen as a result of ASP.NET becoming more and more pluggable. Visual Studio Web Tools Okay, this stuff's just crazy. Visual Studio has been adding some nice web dev features over the past few years, but they've really cranked it up for this release. Visual Studio is by far my favorite code editor for all web files: CSS, HTML, JavaScript, and lots of popular libraries. Stop thinking of Visual Studio as a big editor that you only use to write back-end code. Stop editing HTML and CSS in Notepad (or Sublime, Notepad++, etc.). Visual Studio starts up in under 2 seconds on a modern computer with an SSD. Misspelling HTML attributes or your CSS classes or jQuery or Angular syntax is stupid. It doesn't make you a better developer, it makes you a silly person who wastes time. Browser Link Browser Link is a real-time, two-way connection between Visual Studio and all connected browsers. It's only attached when you're running locally, in debug, but it applies to any and all connected browser, including emulators. You may have seen demos that showed the browsers refreshing based on changes in the editor, and I'll agree that's pretty cool. But it's really just the start. It's a two-way connection, and it's built for extensiblity. That means you can write extensions that push information from your running application (in IE, Chrome, a mobile emulator, etc.) back to Visual Studio. Mads and team have showed off some demonstrations where they enabled edit mode in the browser which updated the source HTML back on the browser. It's also possible to look at how the rendered HTML performs, check for compatibility issues, watch for unused CSS classes, the sky's the limit. New HTML editor The previous HTML editor had a lot of old code that didn't allow for improvements. The team rewrote the HTML editor to take advantage of the new(ish) extensibility features in Visual Studio, which then allowed them to add in all kinds of features - things like CSS Class and ID IntelliSense (so you type style="" and get a list of classes and ID's for your project), smart indent based on how your document is formatted, JavaScript reference auto-sync, etc. Here's a 3 minute tour from Mads Kristensen. The previous HTML editor had a lot of old code that didn't allow for improvements. The team rewrote the HTML editor to take advantage of the new(ish) extensibility features in Visual Studio, which then allowed them to add in all kinds of features - things like CSS Class and ID IntelliSense (so you type style="" and get a list of classes and ID's for your project), smart indent based on how your document is formatted, JavaScript reference auto-sync, etc. Lots more Visual Studio web dev features That's just a sampling - there's a ton of great features for JavaScript editing, CSS editing, publishing, and Page Inspector (which shows real-time rendering of your page inside Visual Studio). Here are some more short videos showing those features. Lots, lots more Okay, that's just a summary, and it's still quite a bit. Head on over to http://asp.net/vnext for more information, and download Visual Studio 2013 now to get started!

Read the article
Guidance: A Branching strategy for Scrum Teams

- by Martin Hinshelwood

Having a good branching strategy will save your bacon, or at least your code. Be careful when deviating from your branching strategy because if you do, you may be worse off than when you started! This is one possible branching strategy for Scrum teams and I will not be going in depth with Scrum but you can find out more about Scrum by reading the Scrum Guide and you can even assess your Scrum knowledge by having a go at the Scrum Open Assessment. You can also read SSW’s Rules to Better Scrum using TFS which have been developed during our own Scrum implementations. Acknowledgements Bill Heys – Bill offered some good feedback on this post and helped soften the language. Note: Bill is a VS ALM Ranger and co-wrote the Branching Guidance for TFS 2010 Willy-Peter Schaub – Willy-Peter is an ex Visual Studio ALM MVP turned blue badge and has been involved in most of the guidance including the Branching Guidance for TFS 2010 Chris Birmele – Chris wrote some of the early TFS Branching and Merging Guidance. Dr Paul Neumeyer, Ph.D Parallel Processes, ScrumMaster and SSW Solution Architect – Paul wanted to have feature branches coming from the release branch as well. We agreed that this is really a spin-off that needs own project, backlog, budget and Team. Scenario: A product is developed RTM 1.0 is released and gets great sales. Extra features are demanded but the new version will have double to price to pay to recover costs, work is approved by the guys with budget and a few sprints later RTM 2.0 is released. Sales a very low due to the pricing strategy. There are lots of clients on RTM 1.0 calling out for patches. As I keep getting Reverse Integration and Forward Integration mixed up and Bill keeps slapping my wrists I thought I should have a reminder: You still seemed to use reverse and/or forward integration in the wrong context. I would recommend reviewing your document at the end to ensure that it agrees with the common understanding of these terms merge (forward integration) from parent to child (same direction as the branch), and merge (reverse integration) from child to parent (the reverse direction of the branch). - one of my many slaps on the wrist from Bill Heys. As I mentioned previously we are using a single feature branching strategy in our current project. The single biggest mistake developers make is developing against the “Main” or “Trunk” line. This ultimately leads to messy code as things are added and never finished. Your only alternative is to NEVER check in unless your code is 100%, but this does not work in practice, even with a single developer. Your ADD will kick in and your half-finished code will be finished enough to pass the build and the tests. You do use builds don’t you? Sadly, this is a very common scenario and I have had people argue that branching merely adds complexity. Then again I have seen the other side of the universe ... branching structures from he... We should somehow convince everyone that there is a happy between no-branching and too-much-branching. - Willy-Peter Schaub, VS ALM Ranger, Microsoft A key benefit of branching for development is to isolate changes from the stable Main branch. Branching adds sanity more than it adds complexity. We do try to stress in our guidance that it is important to justify a branch, by doing a cost benefit analysis. The primary cost is the effort to do merges and resolve conflicts. A key benefit is that you have a stable code base in Main and accept changes into Main only after they pass quality gates, etc. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft The second biggest mistake developers make is branching anything other than the WHOLE “Main” line. If you branch parts of your code and not others it gets out of sync and can make integration a nightmare. You should have your Source, Assets, Build scripts deployment scripts and dependencies inside the “Main” folder and branch the whole thing. Some departments within MSFT even go as far as to add the environments used to develop the product in there as well; although I would not recommend that unless you have a massive SQL cluster to house your source code. We tried the “add environment” back in South-Africa and while it was “phenomenal”, especially when having to switch between environments, the disk storage and processing requirements killed us. We opted for virtualization to skin this cat of keeping a ready-to-go environment handy. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I think people often think that you should have separate branches for separate environments (e.g. Dev, Test, Integration Test, QA, etc.). I prefer to think of deploying to environments (such as from Main to QA) rather than branching for QA). - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft You can read about SSW’s Rules to better Source Control for some additional information on what Source Control to use and how to use it. There are also a number of branching Anti-Patterns that should be avoided at all costs: You know you are on the wrong track if you experience one or more of the following symptoms in your development environment: Merge Paranoia—avoiding merging at all cost, usually because of a fear of the consequences. Merge Mania—spending too much time merging software assets instead of developing them. Big Bang Merge—deferring branch merging to the end of the development effort and attempting to merge all branches simultaneously. Never-Ending Merge—continuous merging activity because there is always more to merge. Wrong-Way Merge—merging a software asset version with an earlier version. Branch Mania—creating many branches for no apparent reason. Cascading Branches—branching but never merging back to the main line. Mysterious Branches—branching for no apparent reason. Temporary Branches—branching for changing reasons, so the branch becomes a permanent temporary workspace. Volatile Branches—branching with unstable software assets shared by other branches or merged into another branch. Note Branches are volatile most of the time while they exist as independent branches. That is the point of having them. The difference is that you should not share or merge branches while they are in an unstable state. Development Freeze—stopping all development activities while branching, merging, and building new base lines. Berlin Wall—using branches to divide the development team members, instead of dividing the work they are performing. -Branching and Merging Primer by Chris Birmele - Developer Tools Technical Specialist at Microsoft Pty Ltd in Australia In fact, this can result in a merge exercise no-one wants to be involved in, merging hundreds of thousands of change sets and trying to get a consolidated build. Again, we need to find a happy medium. - Willy-Peter Schaub on Merge Paranoia Merge conflicts are generally the result of making changes to the same file in both the target and source branch. If you create merge conflicts, you will eventually need to resolve them. Often the resolution is manual. Merging more frequently allows you to resolve these conflicts close to when they happen, making the resolution clearer. Waiting weeks or months to resolve them, the Big Bang approach, means you are more likely to resolve conflicts incorrectly. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Main line, this is where your stable code lives and where any build has known entities, always passes and has a happy test that passes as well? Many development projects consist of, a single “Main” line of source and artifacts. This is good; at least there is source control . There are however a couple of issues that need to be considered. What happens if: you and your team are working on a new set of features and the customer wants a change to his current version? you are working on two features and the customer decides to abandon one of them? you have two teams working on different feature sets and their changes start interfering with each other? I just use labels instead of branches? That's a lot of “what if’s”, but there is a simple way of preventing this. Branching… In TFS, labels are not immutable. This does not mean they are not useful. But labels do not provide a very good development isolation mechanism. Branching allows separate code sets to evolve separately (e.g. Current with hotfixes, and vNext with new development). I don’t see how labels work here. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Creating a single feature branch means you can isolate the development work on that branch. Its standard practice for large projects with lots of developers to use Feature branching and you can check the Branching Guidance for the latest recommendations from the Visual Studio ALM Rangers for other methods. In the diagram above you can see my recommendation for branching when using Scrum development with TFS 2010. It consists of a single Sprint branch to contain all the changes for the current sprint. The main branch has the permissions changes so contributors to the project can only Branch and Merge with “Main”. This will prevent accidental check-ins or checkouts of the “Main” line that would contaminate the code. The developers continue to develop on sprint one until the completion of the sprint. Note: In the real world, starting a new Greenfield project, this process starts at Sprint 2 as at the start of Sprint 1 you would have artifacts in version control and no need for isolation. Figure: Once the sprint is complete the Sprint 1 code can then be merged back into the Main line. There are always good practices to follow, and one is to always do a Forward Integration from Main into Sprint 1 before you do a Reverse Integration from Sprint 1 back into Main. In this case it may seem superfluous, but this builds good muscle memory into your developer’s work ethic and means that no bad habits are learned that would interfere with additional Scrum Teams being added to the Product. The process of completing your sprint development: The Team completes their work according to their definition of done. Merge from “Main” into “Sprint1” (Forward Integration) Stabilize your code with any changes coming from other Scrum Teams working on the same product. If you have one Scrum Team this should be quick, but there may have been bug fixes in the Release branches. (we will talk about release branches later) Merge from “Sprint1” into “Main” to commit your changes. (Reverse Integration) Check-in Delete the Sprint1 branch Note: The Sprint 1 branch is no longer required as its useful life has been concluded. Check-in Done But you are not yet done with the Sprint. The goal in Scrum is to have a “potentially shippable product” at the end of every Sprint, and we do not have that yet, we only have finished code. Figure: With Sprint 1 merged you can create a Release branch and run your final packaging and testing In 99% of all projects I have been involved in or watched, a “shippable product” only happens towards the end of the overall lifecycle, especially when sprints are short. The in-between releases are great demonstration releases, but not shippable. Perhaps it comes from my 80’s brain washing that we only ship when we reach the agreed quality and business feature bar. - Willy-Peter Schaub, VS ALM Ranger, Microsoft Although you should have been testing and packaging your code all the way through your Sprint 1 development, preferably using an automated process, you still need to test and package with stable unchanging code. This is where you do what at SSW we call a “Test Please”. This is first an internal test of the product to make sure it meets the needs of the customer and you generally use a resource external to your Team. Then a “Test Please” is conducted with the Product Owner to make sure he is happy with the output. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: If you find a deviation from the expected result you fix it on the Release branch. If during your final testing or your “Test Please” you find there are issues or bugs then you should fix them on the release branch. If you can’t fix them within the time box of your Sprint, then you will need to create a Bug and put it onto the backlog for prioritization by the Product owner. Make sure you leave plenty of time between your merge from the development branch to find and fix any problems that are uncovered. This process is commonly called Stabilization and should always be conducted once you have completed all of your User Stories and integrated all of your branches. Even once you have stabilized and released, you should not delete the release branch as you would with the Sprint branch. It has a usefulness for servicing that may extend well beyond the limited life you expect of it. Note: Don't get forced by the business into adding features into a Release branch instead that indicates the unspoken requirement is that they are asking for a product spin-off. In this case you can create a new Team Project and branch from the required Release branch to create a new Main branch for that product. And you create a whole new backlog to work from. Figure: When the Team decides it is happy with the product you can create a RTM branch. Once you have fixed all the bugs you can, and added any you can’t to the Product Backlog, and you Team is happy with the result you can create a Release. This would consist of doing the final Build and Packaging it up ready for your Sprint Review meeting. You would then create a read-only branch that represents the code you “shipped”. This is really an Audit trail branch that is optional, but is good practice. You could use a Label, but Labels are not Auditable and if a dispute was raised by the customer you can produce a verifiable version of the source code for an independent party to check. Rare I know, but you do not want to be at the wrong end of a legal battle. Like the Release branch the RTM branch should never be deleted, or only deleted according to your companies legal policy, which in the UK is usually 7 years. Figure: If you have made any changes in the Release you will need to merge back up to Main in order to finalise the changes. Nothing is really ever done until it is in Main. The same rules apply when merging any fixes in the Release branch back into Main and you should do a reverse merge before a forward merge, again for the muscle memory more than necessity at this stage. Your Sprint is now nearly complete, and you can have a Sprint Review meeting knowing that you have made every effort and taken every precaution to protect your customer’s investment. Note: In order to really achieve protection for both you and your client you would add Automated Builds, Automated Tests, Automated Acceptance tests, Acceptance test tracking, Unit Tests, Load tests, Web test and all the other good engineering practices that help produce reliable software. Figure: After the Sprint Planning meeting the process begins again. Where the Sprint Review and Retrospective meetings mark the end of the Sprint, the Sprint Planning meeting marks the beginning. After you have completed your Sprint Planning and you know what you are trying to achieve in Sprint 2 you can create your new Branch to develop in. How do we handle a bug(s) in production that can’t wait? Although in Scrum the only work done should be on the backlog there should be a little buffer added to the Sprint Planning for contingencies. One of these contingencies is a bug in the current release that can’t wait for the Sprint to finish. But how do you handle that? Willy-Peter Schaub asked an excellent question on the release activities: In reality Sprint 2 starts when sprint 1 ends + weekend. Should we not cater for a possible parallelism between Sprint 2 and the release activities of sprint 1? It would introduce FI’s from main to sprint 2, I guess. Your “Figure: Merging print 2 back into Main.” covers, what I tend to believe to be reality in most cases. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I agree, and if you have a single Scrum team then your resources are limited. The Scrum Team is responsible for packaging and release, so at least one run at stabilization, package and release should be included in the Sprint time box. If more are needed on the current production release during the Sprint 2 time box then resource needs to be pulled from Sprint 2. The Product Owner and the Team have four choices (in order of disruption/cost): Backlog: Add the bug to the backlog and fix it in the next Sprint Buffer Time: Use any buffer time included in the current Sprint to fix the bug quickly Make time: Remove a Story from the current Sprint that is of equal value to the time lost fixing the bug(s) and releasing. Note: The Team must agree that it can still meet the Sprint Goal. Cancel Sprint: Cancel the sprint and concentrate all resource on fixing the bug(s) Note: This can be a very costly if the current sprint has already had a lot of work completed as it will be lost. The choice will depend on the complexity and severity of the bug(s) and both the Product Owner and the Team need to agree. In this case we will go with option #2 or #3 as they are uncomplicated but severe bugs. Figure: Real world issue where a bug needs fixed in the current release. If the bug(s) is urgent enough then then your only option is to fix it in place. You can edit the release branch to find and fix the bug, hopefully creating a test so it can’t happen again. Follow the prior process and conduct an internal and customer “Test Please” before releasing. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: After you have fixed the bug you need to ship again. You then need to again create an RTM branch to hold the version of the code you released in escrow. Figure: Main is now out of sync with your Release. We now need to get these new changes back up into the Main branch. Do a reverse and then forward merge again to get the new code into Main. But what about the branch, are developers not working on Sprint 2? Does Sprint 2 now have changes that are not in Main and Main now have changes that are not in Sprint 2? Well, yes… and this is part of the hit you take doing branching. But would this scenario even have been possible without branching? Figure: Getting the changes in Main into Sprint 2 is very important. The Team now needs to do a Forward Integration merge into their Sprint and resolve any conflicts that occur. Maybe the bug has already been fixed in Sprint 2, maybe the bug no longer exists! This needs to be identified and resolved by the developers before they continue to get further out of Sync with Main. Note: Avoid the “Big bang merge” at all costs. Figure: Merging Sprint 2 back into Main, the Forward Integration, and R0 terminates. Sprint 2 now merges (Reverse Integration) back into Main following the procedures we have already established. Figure: The logical conclusion. This then allows the creation of the next release. By now you should be getting the big picture and hopefully you learned something useful from this post. I know I have enjoyed writing it as I find these exploratory posts coupled with real world experience really help harden my understanding. Branching is a tool; it is not a silver bullet. Don’t over use it, and avoid “Anti-Patterns” where possible. Although the diagram above looks complicated I hope showing you how it is formed simplifies it as much as possible. Technorati Tags: Branching,Scrum,VS ALM,TFS 2010,VS2010

Read the article
Sniffing out SQL Code Smells: Inconsistent use of Symbolic names and Datatypes

- by Phil Factor

It is an awkward feeling. You’ve just delivered a database application that seems to be working fine in production, and you just run a few checks on it. You discover that there is a potential bug that, out of sheer good chance, hasn’t kicked in to produce an error; but it lurks, like a smoking bomb. Worse, maybe you find that the bug has started its evil work of corrupting the data, but in ways that nobody has, so far detected. You investigate, and find the damage. You are somehow going to have to repair it. Yes, it still very occasionally happens to me. It is not a nice feeling, and I do anything I can to prevent it happening. That’s why I’m interested in SQL code smells. SQL Code Smells aren’t necessarily bad practices, but just show you where to focus your attention when checking an application. Sometimes with databases the bugs can be subtle. SQL is rather like HTML: the language does its best to try to carry out your wishes, rather than to be picky about your bugs. Most of the time, this is a great benefit, but not always. One particular place where this can be detrimental is where you have implicit conversion between different data types. Most of the time it is completely harmless but we’re concerned about the occasional time it isn’t. Let’s give an example: String truncation. Let’s give another even more frightening one, rounding errors on assignment to a number of different precision. Each requires a blog-post to explain in detail and I’m not now going to try. Just remember that it is not always a good idea to assign data to variables, parameters or even columns when they aren’t the same datatype, especially if you are relying on implicit conversion to work its magic.For details of the problem and the consequences, see here: SR0014: Data loss might occur when casting from {Type1} to {Type2} . For any experienced Database Developer, this is a more frightening read than a Vampire Story. This is why one of the SQL Code Smells that makes me edgy, in my own or other peoples’ code, is to see parameters, variables and columns that have the same names and different datatypes. Whereas quite a lot of this is perfectly normal and natural, you need to check in case one of two things have gone wrong. Either sloppy naming, or mixed datatypes. Sure it is hard to remember whether you decided that the length of a log entry was 80 or 100 characters long, or the precision of a number. That is why a little check like this I’m going to show you is excellent for tidying up your code before you check it back into source Control! 1/ Checking Parameters only If you were just going to check parameters, you might just do this. It simply groups all the parameters, either input or output, of all the routines (e.g. stored procedures or functions) by their name and checks to see, in the HAVING clause, whether their data types are all the same. If not, it lists all the examples and their origin (the routine) Even this little check can occasionally be scarily revealing. ;WITH userParameter AS ( SELECT c.NAME AS ParameterName, OBJECT_SCHEMA_NAME(c.object_ID) + '.' + OBJECT_NAME(c.object_ID) AS ObjectName, t.name + ' ' + CASE --we may have to put in the length WHEN t.name IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN t.name IN ('nchar', 'nvarchar') THEN c.max_length / 2 ELSE c.max_length END) END + ')' WHEN t.name IN ('decimal', 'numeric') THEN '(' + CONVERT(VARCHAR(4), c.precision) + ',' + CONVERT(VARCHAR(4), c.Scale) + ')' ELSE '' END --we've done with putting in the length + CASE WHEN XML_collection_ID <> 0 THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE( (SELECT QUOTENAME(ss.name) + '.' + QUOTENAME(sc.name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = c.XML_collection_ID),'NULL') + ')' ELSE '' END AS [DataType] FROM sys.parameters c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys' AND parameter_id>0)SELECT CONVERT(CHAR(80),objectName+'.'+ParameterName),DataType FROM UserParameterWHERE ParameterName IN (SELECT ParameterName FROM UserParameter GROUP BY ParameterName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY ParameterName so, in a very small example here, we have a @ClosingDelimiter variable that is only CHAR(1) when, by the looks of it, it should be up to ten characters long, or even worse, a function that should be a char(1) and seems to let in a string of ten characters. Worth investigating. Then we have a @Comment variable that can't decide whether it is a VARCHAR(2000) or a VARCHAR(MAX) 2/ Columns and Parameters Actually, once we’ve cleared up the mess we’ve made of our parameter-naming in the database we’re inspecting, we’re going to be more interested in listing both columns and parameters. We can do this by modifying the routine to list columns as well as parameters. Because of the slight complexity of creating the string version of the datatypes, we will create a fake table of both columns and parameters so that they can both be processed the same way. After all, we want the datatypes to match Unfortunately, parameters do not expose all the attributes we are interested in, such as whether they are nullable (oh yes, subtle bugs happen if this isn’t consistent for a datatype). We’ll have to leave them out for this check. Voila! A slight modification of the first routine ;WITH userObject AS ( SELECT Name AS DataName,--the actual name of the parameter or column ('@' removed) --and the qualified object name of the routine OBJECT_SCHEMA_NAME(ObjectID) + '.' + OBJECT_NAME(ObjectID) AS ObjectName, --now the harder bit: the definition of the datatype. TypeName + ' ' + CASE --we may have to put in the length. e.g. CHAR (10) WHEN TypeName IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN MaxLength = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN TypeName IN ('nchar', 'nvarchar') THEN MaxLength / 2 ELSE MaxLength END) END + ')' WHEN TypeName IN ('decimal', 'numeric')--a BCD number! THEN '(' + CONVERT(VARCHAR(4), Precision) + ',' + CONVERT(VARCHAR(4), Scale) + ')' ELSE '' END --we've done with putting in the length + CASE WHEN XML_collection_ID <> 0 --tush tush. XML THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE( (SELECT TOP 1 QUOTENAME(ss.name) + '.' + QUOTENAME(sc.Name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = XML_collection_ID),'NULL') + ')' ELSE '' END AS [DataType], DataObjectType FROM (Select t.name AS TypeName, REPLACE(c.name,'@','') AS Name, c.max_length AS MaxLength, c.precision AS [Precision], c.scale AS [Scale], c.[Object_id] AS ObjectID, XML_collection_ID, is_XML_Document,'P' AS DataobjectType FROM sys.parameters c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID AND parameter_id>0 UNION all Select t.name AS TypeName, c.name AS Name, c.max_length AS MaxLength, c.precision AS [Precision], c.scale AS [Scale], c.[Object_id] AS ObjectID, XML_collection_ID,is_XML_Document, 'C' AS DataobjectType FROM sys.columns c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys' )f)SELECT CONVERT(CHAR(80),objectName+'.' + CASE WHEN DataobjectType ='P' THEN '@' ELSE '' END + DataName),DataType FROM UserObjectWHERE DataName IN (SELECT DataName FROM UserObject GROUP BY DataName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY DataName Hmm. I can tell you I found quite a few minor issues with the various tabases I tested this on, and found some potential bugs that really leap out at you from the results. Here is the start of the result for AdventureWorks. Yes, AccountNumber is, for some reason, a Varchar(10) in the Customer table. Hmm. odd. Why is a city fifty characters long in that view? The idea of the description of a colour being 256 characters long seems over-ambitious. Go down the list and you'll spot other mistakes. There are no bugs, but just mess. We started out with a listing to examine parameters, then we mixed parameters and columns. Our last listing is for a slightly more in-depth look at table columns. You’ll notice that we’ve delibarately removed the indication of whether a column is persisted, or is an identity column because that gives us false positives for our code smells. If you just want to browse your metadata for other reasons (and it can quite help in some circumstances) then uncomment them! ;WITH userColumns AS ( SELECT c.NAME AS columnName, OBJECT_SCHEMA_NAME(c.object_ID) + '.' + OBJECT_NAME(c.object_ID) AS ObjectName, REPLACE(t.name + ' ' + CASE WHEN is_computed = 1 THEN ' AS ' + --do DDL for a computed column (SELECT definition FROM sys.computed_columns cc WHERE cc.object_id = c.object_id AND cc.column_ID = c.column_ID) --we may have to put in the length WHEN t.Name IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN c.Max_Length = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN t.Name IN ('nchar', 'nvarchar') THEN c.Max_Length / 2 ELSE c.Max_Length END) END + ')' WHEN t.name IN ('decimal', 'numeric') THEN '(' + CONVERT(VARCHAR(4), c.precision) + ',' + CONVERT(VARCHAR(4), c.Scale) + ')' ELSE '' END + CASE WHEN c.is_rowguidcol = 1 THEN ' ROWGUIDCOL' ELSE '' END + CASE WHEN XML_collection_ID <> 0 THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE((SELECT QUOTENAME(ss.name) + '.' + QUOTENAME(sc.name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = c.XML_collection_ID), 'NULL') + ')' ELSE '' END + CASE WHEN is_identity = 1 THEN CASE WHEN OBJECTPROPERTY(object_id, 'IsUserTable') = 1 AND COLUMNPROPERTY(object_id, c.name, 'IsIDNotForRepl') = 0 AND OBJECTPROPERTY(object_id, 'IsMSShipped') = 0 THEN '' ELSE ' NOT FOR REPLICATION ' END ELSE '' END + CASE WHEN c.is_nullable = 0 THEN ' NOT NULL' ELSE ' NULL' END + CASE WHEN c.default_object_id <> 0 THEN ' DEFAULT ' + object_Definition(c.default_object_id) ELSE '' END + CASE WHEN c.collation_name IS NULL THEN '' WHEN c.collation_name <> (SELECT collation_name FROM sys.databases WHERE name = DB_NAME()) COLLATE Latin1_General_CI_AS THEN COALESCE(' COLLATE ' + c.collation_name, '') ELSE '' END,' ',' ') AS [DataType]FROM sys.columns c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys')SELECT CONVERT(CHAR(80),objectName+'.'+columnName),DataType FROM UserColumnsWHERE columnName IN (SELECT columnName FROM UserColumns GROUP BY columnName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY columnName If you take a look down the results against Adventureworks, you'll see once again that there are things to investigate, mostly, in the illustration, discrepancies between null and non-null datatypes So I here you ask, what about temporary variables within routines? If ever there was a source of elusive bugs, you'll find it there. Sadly, these temporary variables are not stored in the metadata so we'll have to find a more subtle way of flushing these out, and that will, I'm afraid, have to wait!

Read the article
Fedora 16 can connect to samba share using smbclient but not in nautilus 3.2.1

- by Nathan Jones

I have a machine running Ubuntu 11.10 Server acting as a Samba server to share my home directory. Everything works fine on my Windows 7 machine, but on my Fedora 16 laptop, if I use Nautilus to try to access the share using smb://192.168.0.8/nathan in the location bar, it just has the loading cursor and does nothing. It never shows any errors, nothing. Using smbclient works just fine, but I'd like to get it working in Nautilus. I know that there can be problems with SELinux and Samba, so I created a file called booleans.local that contains samba_enable_home_dirs=1. My smb.conf file looks like this: # For Unix password sync to work on a Debian GNU/Linux system, the following # parameters must be set (thanks to Ian Kahan <<[email protected]> for # sending the correct chat script for the passwd program in Debian Sarge). passwd program = /usr/bin/passwd %u passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . # This boolean controls whether PAM will be used for password changes # when requested by an SMB client instead of the program listed in # 'passwd program'. The default is 'no'. pam password change = yes # This option controls how unsuccessful authentication attempts are mapped # to anonymous connections map to guest = bad user ########## Domains ########### # Is this machine able to authenticate users. Both PDC and BDC # must have this setting enabled. If you are the BDC you must # change the 'domain master' setting to no # ; domain logons = yes # # The following setting only takes effect if 'domain logons' is set # It specifies the location of the user's profile directory # from the client point of view) # The following required a [profiles] share to be setup on the # samba server (see below) ; logon path = \\%N\profiles\%U # Another common choice is storing the profile in the user's home directory # (this is Samba's default) # logon path = \\%N\%U\profile # The following setting only takes effect if 'domain logons' is set # It specifies the location of a user's home directory (from the client # point of view) ; logon drive = H: # logon home = \\%N\%U # The following setting only takes effect if 'domain logons' is set # It specifies the script to run during logon. The script must be stored # in the [netlogon] share # NOTE: Must be store in 'DOS' file format convention ; logon script = logon.cmd # This allows Unix users to be created on the domain controller via the SAMR # RPC pipe. The example command creates a user account with a disabled Unix # password; please adapt to your needs ; add user script = /usr/sbin/adduser --quiet --disabled-password --gecos "" %u # This allows machine accounts to be created on the domain controller via the # SAMR RPC pipe. # The following assumes a "machines" group exists on the system ; add machine script = /usr/sbin/useradd -g machines -c "%u machine account" -d /var/lib/samba -s /bin/false %u # This allows Unix groups to be created on the domain controller via the SAMR # RPC pipe. ; add group script = /usr/sbin/addgroup --force-badname %g ########## Printing ########## # If you want to automatically load your printer list rather # than setting them up individually then you'll need this # load printers = yes # lpr(ng) printing. You may wish to override the location of the # printcap file ; printing = bsd ; printcap name = /etc/printcap # CUPS printing. See also the cupsaddsmb(8) manpage in the # cupsys-client package. ; printing = cups ; printcap name = cups ############ Misc ############ # Using the following line enables you to customise your configuration # on a per machine basis. The %m gets replaced with the netbios name # of the machine that is connecting ; include = /home/samba/etc/smb.conf.%m # Most people will find that this option gives better performance. # See smb.conf(5) and /usr/share/doc/samba-doc/htmldocs/Samba3-HOWTO/speed.html # for details # You may want to add the following on a Linux system: # SO_RCVBUF=8192 SO_SNDBUF=8192 # socket options = TCP_NODELAY # The following parameter is useful only if you have the linpopup package # installed. The samba maintainer and the linpopup maintainer are # working to ease installation and configuration of linpopup and samba. ; message command = /bin/sh -c '/usr/bin/linpopup "%f" "%m" %s; rm %s' & # Domain Master specifies Samba to be the Domain Master Browser. If this # machine will be configured as a BDC (a secondary logon server), you # must set this to 'no'; otherwise, the default behavior is recommended. # domain master = auto # Some defaults for winbind (make sure you're not using the ranges # for something else.) ; idmap uid = 10000-20000 ; idmap gid = 10000-20000 ; template shell = /bin/bash # The following was the default behaviour in sarge, # but samba upstream reverted the default because it might induce # performance issues in large organizations. # See Debian bug #368251 for some of the consequences of *not* # having this setting and smb.conf(5) for details. ; winbind enum groups = yes ; winbind enum users = yes # Setup usershare options to enable non-root users to share folders # with the net usershare command. # Maximum number of usershare. 0 (default) means that usershare is disabled. ; usershare max shares = 100 # Allow users who've been granted usershare privileges to create # public shares, not just authenticated ones usershare allow guests = yes #======================= Share Definitions ======================= # Un-comment the following (and tweak the other settings below to suit) # to enable the default home directory shares. This will share each # user's home director as \\server\username [homes] comment = Home Directories browseable = yes # By default, the home directories are exported read-only. Change the # next parameter to 'no' if you want to be able to write to them. read only = no # File creation mask is set to 0700 for security reasons. If you want to # create files with group=rw permissions, set next parameter to 0775. ; create mask = 0775 # Directory creation mask is set to 0700 for security reasons. If you want to # create dirs. with group=rw permissions, set next parameter to 0775. ; directory mask = 0775 # By default, \\server\username shares can be connected to by anyone # with access to the samba server. Un-comment the following parameter # to make sure that only "username" can connect to \\server\username # The following parameter makes sure that only "username" can connect # # This might need tweaking when using external authentication schemes valid users = %S # Un-comment the following and create the netlogon directory for Domain Logons # (you need to configure Samba to act as a domain controller too.) ;[netlogon] ; comment = Network Logon Service ; path = /home/samba/netlogon ; guest ok = yes ; read only = yes # Un-comment the following and create the profiles directory to store # users profiles (see the "logon path" option above) # (you need to configure Samba to act as a domain controller too.) # The path below should be writable by all users so that their # profile directory may be created the first time they log on ;[profiles] ; comment = Users profiles ; path = /home/samba/profiles ; guest ok = no ; browseable = no ; create mask = 0600 ; directory mask = 0700 [printers] comment = All Printers browseable = no path = /var/spool/samba printable = yes guest ok = no read only = no create mask = 0700 # Windows clients look for this share name as a source of downloadable # printer drivers [print$] comment = Printer Drivers path = /var/lib/samba/printers browseable = yes read only = yes guest ok = no # Uncomment to allow remote administration of Windows print drivers. # You may need to replace 'lpadmin' with the name of the group your # admin users are members of. # Please note that you also need to set appropriate Unix permissions # to the drivers directory for these users to have write rights in it ; write list = root, @lpadmin # A sample share for sharing your CD-ROM with others. ;[cdrom] ; comment = Samba server's CD-ROM ; read only = yes ; locking = no ; path = /cdrom ; guest ok = yes # The next two parameters show how to auto-mount a CD-ROM when the # cdrom share is accesed. For this to work /etc/fstab must contain # an entry like this: # # /dev/scd0 /cdrom iso9660 defaults,noauto,ro,user 0 0 # # The CD-ROM gets unmounted automatically after the connection to the # # If you don't want to use auto-mounting/unmounting make sure the CD # is mounted on /cdrom # ; preexec = /bin/mount /cdrom ; postexec = /bin/umount /cdrom smbusers: <nathan> = <"nathan"> Any help would be very much appreciated! Thanks!

Read the article
What is correct HTTP status code when redirecting to a login page?

- by PHP_Jedi

When a user is not logged in and tries to access an page that requires login, what is the correct HTTP status code for a redirect to the login page? I don't feel that any of the 3xx fit that description. 10.3.1 300 Multiple Choices The requested resource corresponds to any one of a set of representations, each with its own specific location, and agent- driven negotiation information (section 12) is being provided so that the user (or user agent) can select a preferred representation and redirect its request to that location. Unless it was a HEAD request, the response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content- Type header field. Depending upon the format and the capabilities of the user agent, selection of the most appropriate choice MAY be performed automatically. However, this specification does not define any standard for such automatic selection. If the server has a preferred choice of representation, it SHOULD include the specific URI for that representation in the Location field; user agents MAY use the Location field value for automatic redirection. This response is cacheable unless indicated otherwise. 10.3.2 301 Moved Permanently The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise. The new permanent URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s). If the 301 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued. Note: When automatically redirecting a POST request after receiving a 301 status code, some existing HTTP/1.0 user agents will erroneously change it into a GET request. 10.3.3 302 Found The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field. The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s). If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued. Note: RFC 1945 and RFC 2068 specify that the client is not allowed to change the method on the redirected request. However, most existing user agent implementations treat 302 as if it were a 303 response, performing a GET on the Location field-value regardless of the original request method. The status codes 303 and 307 have been added for servers that wish to make unambiguously clear which kind of reaction is expected of the client. 10.3.4 303 See Other The response to the request can be found under a different URI and SHOULD be retrieved using a GET method on that resource. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource. The new URI is not a substitute reference for the originally requested resource. The 303 response MUST NOT be cached, but the response to the second (redirected) request might be cacheable. The different URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s). Note: Many pre-HTTP/1.1 user agents do not understand the 303 status. When interoperability with such clients is a concern, the 302 status code may be used instead, since most user agents react to a 302 response as described here for 303. 10.3.5 304 Not Modified If the client has performed a conditional GET request and access is allowed, but the document has not been modified, the server SHOULD respond with this status code. The 304 response MUST NOT contain a message-body, and thus is always terminated by the first empty line after the header fields. The response MUST include the following header fields: - Date, unless its omission is required by section 14.18.1 If a clockless origin server obeys these rules, and proxies and clients add their own Date to any response received without one (as already specified by [RFC 2068], section 14.19), caches will operate correctly. - ETag and/or Content-Location, if the header would have been sent in a 200 response to the same request - Expires, Cache-Control, and/or Vary, if the field-value might differ from that sent in any previous response for the same variant If the conditional GET used a strong cache validator (see section 13.3.3), the response SHOULD NOT include other entity-headers. Otherwise (i.e., the conditional GET used a weak validator), the response MUST NOT include other entity-headers; this prevents inconsistencies between cached entity-bodies and updated headers. If a 304 response indicates an entity not currently cached, then the cache MUST disregard the response and repeat the request without the conditional. If a cache uses a received 304 response to update a cache entry, the cache MUST update the entry to reflect any new field values given in the response. 10.3.6 305 Use Proxy The requested resource MUST be accessed through the proxy given by the Location field. The Location field gives the URI of the proxy. The recipient is expected to repeat this single request via the proxy. 305 responses MUST only be generated by origin servers. Note: RFC 2068 was not clear that 305 was intended to redirect a single request, and to be generated by origin servers only. Not observing these limitations has significant security consequences. 10.3.7 306 (Unused) The 306 status code was used in a previous version of the specification, is no longer used, and the code is reserved. 10.3.8 307 Temporary Redirect The requested resource resides temporarily under a different URI. Since the redirection MAY be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field. The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s) , since many pre-HTTP/1.1 user agents do not understand the 307 status. Therefore, the note SHOULD contain the information necessary for a user to repeat the original request on the new URI. If the 307 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued. I'm using 302 for now, until I find THE correct answer.

Read the article
A way of doing real-world test-driven development (and some thoughts about it)

- by Thomas Weller

Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like 'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make: It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below: First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator { /// <summary> /// Defines a very simple calculator component for demo purposes. /// </summary> public interface ICalculator { /// <summary> /// Gets the result of the last successful operation. /// </summary> /// <value>The last result.</value> /// <remarks> /// Will be <see langword="null" /> before the first successful operation. /// </remarks> double? LastResult { get; } } // interface ICalculator } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1) First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test { [TestFixture] public class CalculatorTest { private readonly ServiceContainer container = new ServiceContainer(); [Test] public void CalculatorLastResultIsInitiallyNull() { ICalculator calculator = container.GetService<ICalculator>(); Assert.IsNull(calculator.LastResult); } } // class CalculatorTest } // namespace Calculator.Test This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more. Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type: Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get { throw new NotImplementedException(); } } } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest { #region Fields private readonly ServiceContainer container = new ServiceContainer(); #endregion // Fields #region Setup/TearDown [FixtureSetUp] public void FixtureSetUp() { container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll"); } ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property: [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get; private set; } } One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method: ... /// <summary> /// Adds the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the additon.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Add(double operand1, double operand2); } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this: Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location. This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…). Back to our Calculator again: Two more R# – clicks implement the Add() skeleton: ... public double Add(double operand1, double operand2) { throw new NotImplementedException(); } } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest: [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } // in Calculator: public double Add(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio): Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, result); } [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, calculator.LastResult); } ... // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Subtract(double operand1, double operand2); ... // Calculator.cs: public double Subtract(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } return (this.LastResult = operand1 - operand2).Value; } Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { #region ICalculator public double? LastResult { get; private set; } public double Add(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 + operand2).Value; } public double Subtract(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 - operand2).Value; } #endregion // ICalculator #region Implementation (Helper) private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } } #endregion // Implementation (Helper) } // class Calculator } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…). As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks - is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

Read the article

Search Results

Search found 464 results on 19 pages for 'the consequences of cheat'.

Page 19/19 | < Previous Page | 15 16 17 18 19

- by Mark Rackley

- by Tarun Arora

- by user127236

- by Paul White

- by Simon Cooper

- by MartinF

- by Shawn

- by Midimistro

- by Jon Galloway

- by Martin Hinshelwood

- by Phil Factor

- by Nathan Jones

- by PHP_Jedi

- by Thomas Weller

< Previous Page | 15 16 17 18 19