Search Results

Search found 8274 results on 331 pages for 'offtopic but important'.

Page 19/331 | < Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26 | Next Page >

How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
The Product Owner

- by Robert May

In a previous post, I outlined the rules of Scrum. This post details one of those rules. Picking a most important part of Scrum is difficult. All of the rules are required, but if there were one rule that is “more” required that every other rule, its having a good Product Owner. Simply put, the Product Owner can make or break the project. Duties of the Product Owner A Product Owner has many duties and responsibilities. I’ll talk about each of these duties in detail below. A Product Owner: Discovers and records stories for the backlog. Prioritizes stories in the Product Backlog, Release Backlog and Iteration Backlog. Determines Release dates and Iteration Dates. Develops story details and helps the team understand those details. Helps QA to develop acceptance tests. Interact with the Customer to make sure that the product is meeting the customer’s needs. Discovers and Records Stories for the Backlog When I do Scrum, I always use User Stories as the means for capturing functionality that’s required in the system. Some people will use Use Cases, but the same rule applies. The Product Owner has the ultimate responsibility for figuring out what functionality will be in the system. Many different mechanisms for capturing this input can be used. User interviews are great, but all sources should be considered, including talking with Customer Support types. Often, they hear what users are struggling with the most and are a great source for stories that can make the application easier to use. Care should be taken when soliciting user stories from technical types such as programmers and the people that manage them. They will almost always give stories that are very technical in nature and may not have a direct benefit for the end user. Stories are about adding value to the company. If the stories don’t have direct benefit to the end user, the Product Owner should question whether or not the story should be implemented. In general, technical stories should be included as tasks in User Stories. Technical stories are often needed, but the ultimate value to the user is in user based functionality, so technical stories should be considered nothing more than overhead in providing that user functionality. Until the iteration prior to development, stories should be nothing more than short, one line placeholders. An exercise called Story Planning can be used to brainstorm and come up with stories. I’ll save the description of this activity for another blog post. For more information on User Stories, please read the book User Stories Applied by Mike Cohn. Prioritizes Stories in the Product Backlog, Release Backlog and Iteration Backlog Prioritization of stories is one of the most difficult tasks that a Product Owner must do. A key concept of Scrum done right is the need to have the team working from a single set of prioritized stories. If the team does not have a single set of prioritized stories, Scrum will likely fail at your organization. The Product Owner is the ONLY person who has the responsibility to prioritize that list. The Product Owner must be very diplomatic and sincerely listen to the people around him so that he can get the priorities correct. Just listening will still not yield the proper priorities. Care must also be taken to ensure that Return on Investment is also considered. Ultimately, determining which stories give the most value to the company for the least cost is the most important factor in determining priorities. Product Owners should be willing to look at cold, hard numbers to determine the order for stories. Even when many people want a feature, if that features is costly to develop, it may not have as high of a return on investment as features that are cheaper, but not as popular. The act of prioritization often causes conflict in an environment. Customer Service thinks that feature X is the most important, because it will stop people from calling. Operations thinks that feature Y is the most important, because it will stop servers from crashing. Developers think that feature Z is most important because it will make writing software much easier for them. All of these are useful goals, but the team can have only one list of items, and each item must have a priority that is different from all other stories. The Product Owner will determine which feature gives the best return on investment and the other features will have to wait their turn, which means that someone will not have their top priority feature implemented first. A weak Product Owner will refuse to do prioritization. I’ve heard from multiple Product Owners the following phrase, “Well, it’s all got to be done, so what does it matter what order we do it in?” If your product owner is using this phrase, you need a new Product Owner. Order is VERY important. In Scrum, every release is potentially shippable. If the wrong priority items are developed, then the value added in each release isn’t what it should be. Additionally, the Product Owner with this mindset doesn’t understand Agile. A product is NEVER finished, until the company has decided that it is no longer a going concern and they are no longer going to sell the product. Therefore, prioritization isn’t an event, its something that continues every day. The logical extension of the phrase “It’s all got to be done” is that you will never ship your product, since a product is never “done.” Once stories have been prioritized, assigning them to the Release Backlog and the Iteration Backlog becomes relatively simple. The top priority items are copied into the respective backlogs in order and the task is complete. The team does have the right to shuffle things around a little in the iteration backlog. For example, they may determine that working on story C with story A is appropriate because they’re related, even though story B is technically a higher priority than story C. Or they may decide that story B is too big to complete in the time available after Story A has tasks created, so they’ll work on Story C since it’s smaller. They can’t, however, go deep into the backlog to pick stories to implement. The team and the Product Owner should work together to determine what’s best for the company. Prioritization is time consuming, but its one of the most important things a Product Owner does. Determines Release Dates and Iteration Dates Product owners are responsible for determining release dates for a product. A common misconception that Product Owners have is that every “release” needs to correspond with an actual release to customers. This is not the case. In general, releases should be no more than 3 months long. You may decide to release the product to the customers, and many companies do release the product to customers, but it may also be an internal release. If a release date is too far away, developers will fall into the trap of not feeling a sense of urgency. The date is far enough away that they don’t need to give the release their full attention. Additionally, important tasks, such as performance tuning, regression testing, user documentation, and release preparation, will not happen regularly, making them much more difficult and time consuming to do. The more frequently you do these tasks, the easier they are to accomplish. The Product Owner will be a key participant in determining whether or not a release should be sent out to the customers. The determination should be made on whether or not the features contained in the release are valuable enough and complete enough that the customers will see real value in the release. Often, some features will take more than three months to get them to a state where they qualify for a release or need additional supporting features to be released. The product owner has the right to make this determination. In addition to release dates, the Product Owner also will help determine iteration dates. In general, an iteration length should be chosen and the team should follow that iteration length for an extended period of time. If the iteration length is changed every iteration, you’re not doing Scrum. Iteration lengths help the team and company get into a rhythm of developing quality software. Iterations should be somewhere between 2 and 4 weeks in length. Any shorter, and significant software will likely not be developed. Any longer, and the team won’t feel urgency and planning will become very difficult. Iterations may not be extended during the iteration. Companies where Scrum isn’t really followed will often use this as a strategy to complete all stories. They don’t want to face the harsh reality of what their true performance is, and looking good is more important than seeking visibility and improving the process and team. Companies like this typically don’t allow failure. This is unhealthy. Failure is part of life and unless we learn from it, we can’t improve. I would much rather see a team push out stories to the next iteration and then have healthy discussions about why they failed rather than extend the iteration and not deal with the core problems. If iteration length varies, retrospectives become more difficult. For example, evaluating the performance of the team’s estimation efforts becomes much more difficult if the iteration length varies. Also, the team must have a velocity measurement. If the iteration length varies, measuring velocity becomes impossible and upper management no longer will have the ability to evaluate the teams performance. People external to the team will no longer have the ability to determine when key features are likely to be developed. Variable iterations cause the entire company to fail and likely cause Scrum to fail at an organization. Develops Story Details and Helps the Team Understand Those Details A key concept in Scrum is that the stories are nothing more than a placeholder for a conversation. Stories should be nothing more than short, one line statements about the functionality. The team will then converse with the Product Owner about the details about that story. The product owner needs to have a very good idea about what the details of the story are and needs to be able to help the team understand those details. Too often, we see this requirement as being translated into the need for comprehensive documentation about the story, including old fashioned requirements documentation. The team should only develop the documentation that is required and should not develop documentation that is only created because their is a process to do so. In general, what we see that works best is the iteration before a team starts development work on a story, the Product Owner, with other appropriate business analysts, will develop the details of that story. They’ll figure out what business rules are required, potentially make paper prototypes or other light weight mock-ups, and they seek to understand the story and what is implied. Note that the time allowed for this task is deliberately short. The Product Owner only has a single iteration to develop all of the stories for the next iteration. If more than one iteration is used, I’ve found that teams will end up with Big Design Up Front and traditional requirements documents. This is a waste of time, since the team will need to then have discussions with the Product Owner to figure out what the requirements document says. Instead of this, skip making the pretty pictures and detailing the nuances of the requirements and build only what is minimally needed by the team to do development. If something comes up during development, you can address it at that time and figure out what you want to do. The goal is to keep things as light weight as possible so that everyone can move as quickly as possible. Helps QA to Develop Acceptance Tests In Scrum, no story can be counted until it is accepted by QA. Because of this, acceptance tests are very important to the team. In general, acceptance tests need to be developed prior to the iteration or at the very beginning of the iteration so that the team can make sure that the tasks that they develop will fulfill the acceptance criteria. The Product Owner will help the team, including QA, understand what will make the story acceptable. Note that the Product Owner needs to be careful about specifying that the feature will work “Perfectly” at the end of the iteration. In general, features are developed a little bit at a time, so only the bit that is being developed should be considered as necessary for acceptance. A weak Product Owner will make statements like “Do it right the first time.” Not only are these statements damaging to the team (like they would try to do it WRONG the first time . . .), they’re also ignoring the iterative nature of Scrum. Additionally, a weak product owner will seek to add scope in the acceptance testing. For example, they will refuse to determine acceptance at the beginning of the iteration, and then, after the team has planned and committed to the iteration, they will expand scope by defining acceptance. This often causes the team to miss the iteration because scope that wasn’t planned on is included. There are ways that the team can mitigate this problem. For example, include extra “Product Owner” time to deal with the uncertainty that you know will be introduced by the Product Owner. This will slow the perceived velocity of the team and is not ideal, since they’ll be doing more work than they get credit for. Interact with the Customer to Make Sure that the Product is Meeting the Customer’s Needs Once development is complete, what the team has worked on should be put in front of real live people to see if it meets the needs of the customer. One of the great things about Agile is that if something doesn’t work, we can revisit it in a future iteration! This frees up the team to make the best decision now and know that if that decision proves to be incorrect, the team can revisit it and change that decision. Features are about adding value to the customer, so if the customer doesn’t find them useful, then having the team make tweaks is valuable. In general, most software will be 80 to 90 percent “right” after the initial round and only minor tweaks are required. If proper coding standards are followed, these tweaks are usually minor and easy to accomplish. Product Owners that are doing a good job will encourage real users to see and use the software, since they know that they are trying to add value to the customer. Poor product owners will think that they know the answers already, that their customers are silly and do stupid things and that they don’t need customer input. If you have a product owner that is afraid to show the team’s work to real customers, you probably need a different product owner. Up Next, “Who Makes a Good Product Owner.” Followed by, “Messing with the Team.” Technorati Tags: Scrum,Product Owner

Read the article
what is the best and valid way for cross browser min-height?

- by metal-gear-solid

for #main-content I don't want to give any fix height because content can be long and short but if content is short then it should take minimum height 500px. i need compatibility in all browser. Is thery any w3c valid and cross browser way without using !important because i read !important should not be used In conclusion, don’t use the !important declaration unless you’ve tried everything else first, and keep in mind any drawbacks. If you do use it, it would probably make sense, if possible, to put a comment in your CSS next to any styles that are being overridden, to ensure better code maintainability. I tried to cover everything significant in relation to use of the !important declaration, so please offer comments if you think there’s anything I’ve missed, or if I’ve misstated anything, and I’ll be happy to make any needed corrections. http://www.impressivewebs.com/everything-you-need-to-know-about-the-important-css-declaration/

Read the article
Visual Studio Extensions

- by Scott Dorman

Originally posted on: http://geekswithblogs.net/sdorman/archive/2013/10/18/visual-studio-extensions.aspxAs a product, Visual Studio has been around for a long time. In fact, it’s been 18 years since the first Visual Studio product was launched. In that time, there have been some major changes but perhaps the most important (or at least influential) changes for the course of the product have been in the last few years. While we can argue over what was and wasn’t an important change or what has and hasn’t changed, I want to talk about what I think is the single most important change Microsoft has made to Visual Studio. Specifically, I’m referring to the Visual Studio Gallery (first introduced in Visual Studio 2010) and the ability for third-parties to easily write extensions which can add new functionality to Visual Studio or even change existing functionality. I know Visual Studio had this ability before the Gallery existed, but it was expensive (both from a financial and development resource) perspective for a company or individual to write such an extension. The Visual Studio Gallery changed all of that. As of today, there are over 4000 items in the Gallery. Microsoft itself has over 100 items in the Gallery and more are added all of the time. Why is this such an important feature? Simply put, it allows third-parties (companies such as JetBrains, Telerik, Red Gate, Devart, and DevExpress, just to name a few) to provide enhanced developer productivity experiences directly within the product by providing new functionality or changing existing functionality. However, there is an even more important function that it serves. It also allows Microsoft to do the same. By providing extensions which add new functionality or change existing functionality, Microsoft is not only able to rapidly innovate on new features and changes but to also get those changes into the hands of developers world-wide for feedback. The end result is that these extensions become very robust and often end up becoming part of a later product release. An excellent example of this is the new CodeLens feature of Visual Studio 2013. This is, perhaps, the single most important developer productivity enhancement released in the last decade and already has huge potential. As you can see, out of the box CodeLens supports showing you information about references, unit tests and TFS history. Fortunately, CodeLens is also accessible to Visual Studio extensions, and Microsoft DevLabs has already written such an extension to show code “health.” This extension shows different code metrics to help make sure your code is maintainable. At this point, you may have already asked yourself, “With over 4000 extensions, how do I find ones that are good?” That’s a really good question. Fortunately, the Visual Studio Gallery has a ratings system in place, which definitely helps but that’s still a lot of extensions to look through. To that end, here is my personal list of favorite extensions. This is something I started back when Visual Studio 2010 was first released, but so much has changed since then that I thought it would be good to provide an updated list for Visual Studio 2013. These are extensions that I have installed and use on a regular basis as a developer that I find indispensible. This list is in no particular order. NuGet Package Manager for Visual Studio 2013 Microsoft CodeLens Code Health Indicator Visual Studio Spell Checker Indent Guides Web Essentials 2013 VSCommands for Visual Studio 2013 Productivity Power Tools (right now this is only for Visual Studio 2012, but it should be updated to support Visual Studio 2013.) Everyone has their own set of favorites, so mine is probably not going to match yours. If there is an extension that you really like, feel free to leave me a comment!

Read the article
Can it be important to call remove() on an EJB 3 stateless session bean? Perhaps on weblogic?

- by Michael Borgwardt

I'm in the process of migrating an EJB 2 application to EJB 3 (and what a satisfying task it is to delete all those deployment descriptors!). It used to run on Weblogic and now runs on Glassfish. The task is going very smoothly, but one thing makes me wary: The current code takes great care to ensure that EJBObject.remove() is called on the bean when it's done its job, and other developers have (unfortunately very vague) memories of "bad things" happening when that was not done. However, with EJB3, the implementation class does not implement EJBObject, so there is no remove() method to be called. And my understanding is that there isn't really any point at all in calling it on stateless session beans, since they are, well, stateless. Could these "bad things" have been weblogic-specific? If not, what else? Should I avoid the full EJB3 lightweightness and keep a remote interface that extends EJBObject? Or just write it off as cargo-cult programming and delete all those try/finally clauses? I'm leaning towards the latter, but now feeling very comfortable with it.

Read the article
The Incremental Architect´s Napkin – #3 – Make Evolvability inevitable

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/06/04/the-incremental-architectacutes-napkin-ndash-3-ndash-make-evolvability-inevitable.aspxThe easier something to measure the more likely it will be produced. Deviations between what is and what should be can be readily detected. That´s what automated acceptance tests are for. That´s what sprint reviews in Scrum are for. It´s no small wonder our software looks like it looks. It has all the traits whose conformance with requirements can easily be measured. And it´s lacking traits which cannot easily be measured. Evolvability (or Changeability) is such a trait. If an operation is correct, if an operation if fast enough, that can be checked very easily. But whether Evolvability is high or low, that cannot be checked by taking a measure or two. Evolvability might correlate with certain traits, e.g. number of lines of code (LOC) per function or Cyclomatic Complexity or test coverage. But there is no threshold value signalling “evolvability too low”; also Evolvability is hardly tangible for the customer. Nevertheless Evolvability is of great importance - at least in the long run. You can get away without much of it for a short time. Eventually, though, it´s needed like any other requirement. Or even more. Because without Evolvability no other requirement can be implemented. Evolvability is the foundation on which all else is build. Such fundamental importance is in stark contrast with its immeasurability. To compensate this, Evolvability must be put at the very center of software development. It must become the hub around everything else revolves. Since we cannot measure Evolvability, though, we cannot start watching it more. Instead we need to establish practices to keep it high (enough) at all times. Chefs have known that for long. That´s why everybody in a restaurant kitchen is constantly seeing after cleanliness. Hygiene is important as is to have clean tools at standardized locations. Only then the health of the patrons can be guaranteed and production efficiency is constantly high. Still a kitchen´s level of cleanliness is easier to measure than software Evolvability. That´s why important practices like reviews, pair programming, or TDD are not enough, I guess. What we need to keep Evolvability in focus and high is… to continually evolve. Change must not be something to avoid but too embrace. To me that means the whole change cycle from requirement analysis to delivery needs to be gone through more often. Scrum´s sprints of 4, 2 even 1 week are too long. Kanban´s flow of user stories across is too unreliable; it takes as long as it takes. Instead we should fix the cycle time at 2 days max. I call that Spinning. No increment must take longer than from this morning until tomorrow evening to finish. Then it should be acceptance checked by the customer (or his/her representative, e.g. a Product Owner). For me there are several resasons for such a fixed and short cycle time for each increment: Clear expectations Absolute estimates (“This will take X days to complete.”) are near impossible in software development as explained previously. Too much unplanned research and engineering work lurk in every feature. And then pervasive interruptions of work by peers and management. However, the smaller the scope the better our absolute estimates become. That´s because we understand better what really are the requirements and what the solution should look like. But maybe more importantly the shorter the timespan the more we can control how we use our time. So much can happen over the course of a week and longer timespans. But if push comes to shove I can block out all distractions and interruptions for a day or possibly two. That´s why I believe we can give rough absolute estimates on 3 levels: Noon Tonight Tomorrow Think of a meeting with a Product Owner at 8:30 in the morning. If she asks you, how long it will take you to implement a user story or bug fix, you can say, “It´ll be fixed by noon.”, or you can say, “I can manage to implement it until tonight before I leave.”, or you can say, “You´ll get it by tomorrow night at latest.” Yes, I believe all else would be naive. If you´re not confident to get something done by tomorrow night (some 34h from now) you just cannot reliably commit to any timeframe. That means you should not promise anything, you should not even start working on the issue. So when estimating use these four categories: Noon, Tonight, Tomorrow, NoClue - with NoClue meaning the requirement needs to be broken down further so each aspect can be assigned to one of the first three categories. If you like absolute estimates, here you go. But don´t do deep estimates. Don´t estimate dozens of issues; don´t think ahead (“Issue A is a Tonight, then B will be a Tomorrow, after that it´s C as a Noon, finally D is a Tonight - that´s what I´ll do this week.”). Just estimate so Work-in-Progress (WIP) is 1 for everybody - plus a small number of buffer issues. To be blunt: Yes, this makes promises impossible as to what a team will deliver in terms of scope at a certain date in the future. But it will give a Product Owner a clear picture of what to pull for acceptance feedback tonight and tomorrow. Trust through reliability Our trade is lacking trust. Customers don´t trust software companies/departments much. Managers don´t trust developers much. I find that perfectly understandable in the light of what we´re trying to accomplish: delivering software in the face of uncertainty by means of material good production. Customers as well as managers still expect software development to be close to production of houses or cars. But that´s a fundamental misunderstanding. Software development ist development. It´s basically research. As software developers we´re constantly executing experiments to find out what really provides value to users. We don´t know what they need, we just have mediated hypothesises. That´s why we cannot reliably deliver on preposterous demands. So trust is out of the window in no time. If we switch to delivering in short cycles, though, we can regain trust. Because estimates - explicit or implicit - up to 32 hours at most can be satisfied. I´d say: reliability over scope. It´s more important to reliably deliver what was promised then to cover a lot of requirement area. So when in doubt promise less - but deliver without delay. Deliver on scope (Functionality and Quality); but also deliver on Evolvability, i.e. on inner quality according to accepted principles. Always. Trust will be the reward. Less complexity of communication will follow. More goodwill buffer will follow. So don´t wait for some Kanban board to show you, that flow can be improved by scheduling smaller stories. You don´t need to learn that the hard way. Just start with small batch sizes of three different sizes. Fast feedback What has been finished can be checked for acceptance. Why wait for a sprint of several weeks to end? Why let the mental model of the issue and its solution dissipate? If you get final feedback after one or two weeks, you hardly remember what you did and why you did it. Resoning becomes hard. But more importantly youo probably are not in the mood anymore to go back to something you deemed done a long time ago. It´s boring, it´s frustrating to open up that mental box again. Learning is harder the longer it takes from event to feedback. Effort can be wasted between event (finishing an issue) and feedback, because other work might go in the wrong direction based on false premises. Checking finished issues for acceptance is the most important task of a Product Owner. It´s even more important than planning new issues. Because as long as work started is not released (accepted) it´s potential waste. So before starting new work better make sure work already done has value. By putting the emphasis on acceptance rather than planning true pull is established. As long as planning and starting work is more important, it´s a push process. Accept a Noon issue on the same day before leaving. Accept a Tonight issue before leaving today or first thing tomorrow morning. Accept a Tomorrow issue tomorrow night before leaving or early the day after tomorrow. After acceptance the developer(s) can start working on the next issue. Flexibility As if reliability/trust and fast feedback for less waste weren´t enough economic incentive, there is flexibility. After each issue the Product Owner can change course. If on Monday morning feature slices A, B, C, D, E were important and A, B, C were scheduled for acceptance by Monday evening and Tuesday evening, the Product Owner can change her mind at any time. Maybe after A got accepted she asks for continuation with D. But maybe, just maybe, she has gotten a completely different idea by then. Maybe she wants work to continue on F. And after B it´s neither D nor E, but G. And after G it´s D. With Spinning every 32 hours at latest priorities can be changed. And nothing is lost. Because what got accepted is of value. It provides an incremental value to the customer/user. Or it provides internal value to the Product Owner as increased knowledge/decreased uncertainty. I find such reactivity over commitment economically very benefical. Why commit a team to some workload for several weeks? It´s unnecessary at beast, and inflexible and wasteful at worst. If we cannot promise delivery of a certain scope on a certain date - which is what customers/management usually want -, we can at least provide them with unpredecented flexibility in the face of high uncertainty. Where the path is not clear, cannot be clear, make small steps so you´re able to change your course at any time. Premature completion Customers/management are used to premeditating budgets. They want to know exactly how much to pay for a certain amount of requirements. That´s understandable. But it does not match with the nature of software development. We should know that by now. Maybe there´s somewhere in the world some team who can consistently deliver on scope, quality, and time, and budget. Great! Congratulations! I, however, haven´t seen such a team yet. Which does not mean it´s impossible, but I think it´s nothing I can recommend to strive for. Rather I´d say: Don´t try this at home. It might hurt you one way or the other. However, what we can do, is allow customers/management stop work on features at any moment. With spinning every 32 hours a feature can be declared as finished - even though it might not be completed according to initial definition. I think, progress over completion is an important offer software development can make. Why think in terms of completion beyond a promise for the next 32 hours? Isn´t it more important to constantly move forward? Step by step. We´re not running sprints, we´re not running marathons, not even ultra-marathons. We´re in the sport of running forever. That makes it futile to stare at the finishing line. The very concept of a burn-down chart is misleading (in most cases). Whoever can only think in terms of completed requirements shuts out the chance for saving money. The requirements for a features mostly are uncertain. So how does a Product Owner know in the first place, how much is needed. Maybe more than specified is needed - which gets uncovered step by step with each finished increment. Maybe less than specified is needed. After each 4–32 hour increment the Product Owner can do an experient (or invite users to an experiment) if a particular trait of the software system is already good enough. And if so, she can switch the attention to a different aspect. In the end, requirements A, B, C then could be finished just 70%, 80%, and 50%. What the heck? It´s good enough - for now. 33% money saved. Wouldn´t that be splendid? Isn´t that a stunning argument for any budget-sensitive customer? You can save money and still get what you need? Pull on practices So far, in addition to more trust, more flexibility, less money spent, Spinning led to “doing less” which also means less code which of course means higher Evolvability per se. Last but not least, though, I think Spinning´s short acceptance cycles have one more effect. They excert pull-power on all sorts of practices known for increasing Evolvability. If, for example, you believe high automated test coverage helps Evolvability by lowering the fear of inadverted damage to a code base, why isn´t 90% of the developer community practicing automated tests consistently? I think, the answer is simple: Because they can do without. Somehow they manage to do enough manual checks before their rare releases/acceptance checks to ensure good enough correctness - at least in the short term. The same goes for other practices like component orientation, continuous build/integration, code reviews etc. None of that is compelling, urgent, imperative. Something else always seems more important. So Evolvability principles and practices fall through the cracks most of the time - until a project hits a wall. Then everybody becomes desperate; but by then (re)gaining Evolvability has become as very, very difficult and tedious undertaking. Sometimes up to the point where the existence of a project/company is in danger. With Spinning that´s different. If you´re practicing Spinning you cannot avoid all those practices. With Spinning you very quickly realize you cannot deliver reliably even on your 32 hour promises. Spinning thus is pulling on developers to adopt principles and practices for Evolvability. They will start actively looking for ways to keep their delivery rate high. And if not, management will soon tell them to do that. Because first the Product Owner then management will notice an increasing difficulty to deliver value within 32 hours. There, finally there emerges a way to measure Evolvability: The more frequent developers tell the Product Owner there is no way to deliver anything worth of feedback until tomorrow night, the poorer Evolvability is. Don´t count the “WTF!”, count the “No way!” utterances. In closing For sustainable software development we need to put Evolvability first. Functionality and Quality must not rule software development but be implemented within a framework ensuring (enough) Evolvability. Since Evolvability cannot be measured easily, I think we need to put software development “under pressure”. Software needs to be changed more often, in smaller increments. Each increment being relevant to the customer/user in some way. That does not mean each increment is worthy of shipment. It´s sufficient to gain further insight from it. Increments primarily serve the reduction of uncertainty, not sales. Sales even needs to be decoupled from this incremental progress. No more promises to sales. No more delivery au point. Rather sales should look at a stream of accepted increments (or incremental releases) and scoup from that whatever they find valuable. Sales and marketing need to realize they should work on what´s there, not what might be possible in the future. But I digress… In my view a Spinning cycle - which is not easy to reach, which requires practice - is the core practice to compensate the immeasurability of Evolvability. From start to finish of each issue in 32 hours max - that´s the challenge we need to accept if we´re serious increasing Evolvability. Fortunately higher Evolvability is not the only outcome of Spinning. Customer/management will like the increased flexibility and “getting more bang for the buck”.

Read the article
Are these non-standard applications of rendering practical in games?

- by maul

I've recently got into 3D and I came up with a few different "tricky" rendering techniques. Unfortunately I don't have the time to work on this myself, but I'd like to know if these are known methods and if they can be used in practice. Hybrid rendering Now I know that ray-tracing is still not fast enough for real-time rendering, at least on home computers. I also know that hybrid rendering (a combination of rasterization and ray-tracing) is a well known theory. However I had the following idea: one could separate a scene into "important" and "not important" objects. First you render the "not important" objects using traditional rasterization. In this pass you also render the "important" objects using a special shader that simply marks these parts on the image using a special color, or some stencil/depth buffer trickery. Then in the second pass you read back the results of the first pass and start ray tracing, but only from the pixels that were marked by the "important" object's shader. This would allow you to only ray-trace exactly what you need to. Could this be fast enough for real-time effects? Rendered physics I'm specifically talking about bullet physics - intersection of a very small object (point/bullet) that travels across a straight line with other, relatively slow-moving, fairly constant objects. More specifically: hit detection. My idea is that you could render the scene from the point of view of the gun (or the bullet). Every object in the scene would draw a different color. You only need to render a 1x1 pixel window - the center of the screen (again, from the gun's point of view). Then you simply check that central pixel and the color tells you what you hit. This is pixel-perfect hit detection based on the graphical representation of objects, which is not common in games. Afaik traditional OpenGL "picking" is a similar method. This could be extended in a few ways: For larger (non-bullet) objects you render a larger portion of the screen. If you put a special-colored plane in the middle of the scene (exactly where the bullet will be after the current frame) you get a method that works as the traditional slow-moving iterative physics test as well. You could simulate objects that the bullet can pass through (with decreased velocity) using alpha blending or some similar trick. So are these techniques in use anywhere, and/or are they practical at all?

Read the article
How can I highlight empty fields in ASP.NET MVC 2 before model binding has occurred?

- by Richard Poole

I'm trying to highlight certain form fields (let's call them important fields) when they're empty. In essence, they should behave a bit like required fields, but they should be highlighted if they are empty when the user first GETs the form, before POST & model validation has occurred. The user can also ignore the warnings and submit the form when these fields are empty (i.e. empty important fields won't cause ModelState.IsValid to be false). Ideally it needs to work server-side (empty important fields are highlighted with warning message on GET) and client-side (highlighted if empty when losing focus). I've thought of a few ways of doing this, but I'm hoping some bright spark can come up with a nice elegant solution... Just use a CSS class to flag important fields Update every view/template to render important fields with an important CSS class. Write some jQuery to highlight empty important fields when the DOM is ready and hook their blur events so highlights & warning messages can be shown/hidden as appropriate. Pros: Quick and easy. Cons: Unnecessary duplication of importance flags and warning messages across views & templates. Clients with JavaScript disabled will never see highlights/warnings. Custom data annotation and client-side validator Create classes similar to RequiredAttribute, RequiredAttributeAdapter and ModelClientValidationRequiredRule, and register the adapter with DataAnnotationsModelValidatorProvider.RegisterAdapter. Create a client-side validator like this that responds to the blur event. Pros: Data annotation follows DRY principle (Html.ValidationMessageFor<T> picks up field importance and warning message from attribute, no duplication). Cons: Must call TryValidateModel from GET actions to ensure empty fields are decorated. Not technically validation (client- & server-side rules don't match) so it's at the mercy of framework changes. Clients with JavaScript disabled will never see highlights/warnings. Clone the entire validation framework It strikes me that I'm trying to achieve exactly the same thing as validation but with warnings rather than errors. It needs to run before model binding (and therefore validation) has occurred. Perhaps it's worth designing a similar framework with annotations like Required, RegularExpression, StringLength, etc. that somehow cause Html.TextBoxFor<T> etc. to render the warning CSS class and Html.ValidationMessageFor<T> to emit the warning message and JSON needed to enable client-side blur checks. Pros: Sounds like something MVC 2 could do with out of the box. Cons: Way too much effort for my current requirement! I'm swaying towards option 1. Can anyone think of a better solution?

Read the article
Prioritize One Network Share Over Another And/Or Cap Network Share Traffic? (Windows Server 2008 R2 Enterprise)

- by FullTimeCoderPartTimeSysAdmin

One of my fileservers is a hyper v VM running Windows Server 2008 R2 Enterprise. The NIC in the VM maps to a 1 GB NIC on the host that is dedicated just to this VM. I have two shares on the file server. One is very important and used by a few users. The other is less important but used by many users. The issue I'm having: When a ton of users are accessing the unimportant share, it can choke out requests to the important share. What I'd like to do: I'd like to some prioritize requests for for file on the more important share, or even dedicate a portion of the NIC's bandwidth just to requests for files on that share. Is there any way to do that? Alternately, can I add another NIC and specify that all traffic to one share goes over one NIC and traffic to the other share goes over the other?

Read the article
SQLAuthority News – 2 Whitepapers Announced – AlwaysOn Architecture Guide: Building a High Availability and Disaster Recovery Solution

- by pinaldave

Understanding AlwaysOn Architecture is extremely important when building a solution with failover clusters and availability groups. Microsoft has just released two very important white papers related to this subject. Both the white papers are written by top experts in industry and have been reviewed by excellent panel of experts. Every time I talk with various organizations who are adopting the SQL Server 2012 they are always excited with the concept of the new feature AlwaysOn. One of the requests I often here is the related to detailed documentations which can help enterprises to build a robust high availability and disaster recovery solution. I believe following two white paper now satisfies the request. AlwaysOn Architecture Guide: Building a High Availability and Disaster Recovery Solution by Using AlwaysOn Availability Groups SQL Server 2012 AlwaysOn Availability Groups provides a unified high availability and disaster recovery (HADR) solution. This paper details the key topology requirements of this specific design pattern on important concepts like quorum configuration considerations, steps required to build the environment, and a workflow that shows how to handle a disaster recovery. AlwaysOn Architecture Guide: Building a High Availability and Disaster Recovery Solution by Using Failover Cluster Instances and Availability Groups SQL Server 2012 AlwaysOn Failover Cluster Instances (FCI) and AlwaysOn Availability Groups provide a comprehensive high availability and disaster recovery solution. This paper details the key topology requirements of this specific design pattern on important concepts like asymmetric storage considerations, quorum model selection, quorum votes, steps required to build the environment, and a workflow. If you are not going to implement AlwaysOn feature, this two Whitepapers are still a great reference material to review as it will give you complete idea regarding what it takes to implement AlwaysOn architecture and what kind of efforts needed. One should at least bookmark above two white papers for future reference. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, T SQL, Technology Tagged: AlwaysOn

Read the article
Key ATG architecture principles

- by Glen Borkowski

Overview The purpose of this article is to describe some of the important foundational concepts of ATG. This is not intended to cover all areas of the ATG platform, just the most important subset - the ones that allow ATG to be extremely flexible, configurable, high performance, etc. For more information on these topics, please see the online product manuals. Modules The first concept is called the 'ATG Module'. Simply put, you can think of modules as the building blocks for ATG applications. The ATG development team builds the out of the box product using modules (these are the 'out of the box' modules). Then, when a customer is implementing their site, they build their own modules that sit 'on top' of the out of the box ATG modules. Modules can be very simple - containing minimal definition, and perhaps a small amount of configuration. Alternatively, a module can be rather complex - containing custom logic, database schema definitions, configuration, one or more web applications, etc. Modules generally will have dependencies on other modules (the modules beneath it). For example, the Commerce Reference Store module (CRS) requires the DCS (out of the box commerce) module. Modules have a ton of value because they provide a way to decouple a customers implementation from the out of the box ATG modules. This allows for a much easier job when it comes time to upgrade the ATG platform. Modules are also a very useful way to group functionality into a single package which can be leveraged across multiple ATG applications. One very important thing to understand about modules, or more accurately, ATG as a whole, is that when you start ATG, you tell it what module(s) you want to start. One of the first things ATG does is to look through all the modules you specified, and for each one, determine a list of modules that are also required to start (based on each modules dependencies). Once this final, ordered list is determined, ATG continues to boot up. One of the outputs from the ordered list of modules is that each module can contain it's own classes and configuration. During boot, the ordered list of modules drives the unified classpath and configpath. This is what determines which classes override others, and which configuration overrides other configuration. Think of it as a layered approach. The structure of a module is well defined. It simply looks like a folder in a filesystem that has certain other folders and files within it. Here is a list of items that can appear in a module: MyModule: META-INF - this is required, along with a file called MANIFEST.MF which describes certain properties of the module. One important property is what other modules this module depends on. config - this is typically present in most modules. It defines a tree structure (folders containing properties files, XML, etc) that maps to ATG components (these are described below). lib - this contains the classes (typically in jarred format) for any code defined in this module j2ee - this is where any web-apps would be stored. src - in case you want to include the source code for this module, it's standard practice to put it here sql - if your module requires any additions to the database schema, you should place that schema here Here's a screenshots of a module: Modules can also contain sub-modules. A dot-notation is used when referring to these sub-modules (i.e. MyModule.Versioned, where Versioned is a sub-module of MyModule). Finally, it is important to completely understand how modules work if you are going to be able to leverage them effectively. There are many different ways to design modules you want to create, some approaches are better than others, especially if you plan to share functionality between multiple different ATG applications. Components A component in ATG can be thought of as a single item that performs a certain set of related tasks. An example could be a ProductViews component - used to store information about what products the current customer has viewed. Components have properties (also called attributes). The ProductViews component could have properties like lastProductViewed (stores the ID of the last product viewed) or productViewList (stores the ID's of products viewed in order of their being viewed). The previous examples of component properties would typically also offer get and set methods used to retrieve and store the property values. Components typically will also offer other types of useful methods aside from get and set. In the ProductViewed component, we might want to offer a hasViewed method which will tell you if the customer has viewed a certain product or not. Components are organized in a tree like hierarchy called 'nucleus'. Nucleus is used to locate and instantiate ATG Components. So, when you create a new ATG component, it will be able to be found 'within' nucleus. Nucleus allows ATG components to reference one another - this is how components are strung together to perform meaningful work. It's also a mechanism to prevent redundant configuration - define it once and refer to it from everywhere. Here is a screenshot of a component in nucleus: Components can be extremely simple (i.e. a single property with a get method), or can be rather complex offering many properties and methods. To be an ATG component, a few things are required: a class - you can reference an existing out of the box class or you could write your own a properties file - this is used to define your component the above items must be located 'within' nucleus by placing them in the correct spot in your module's config folder Within the properties file, you will need to point to the class you want to use: $class=com.mycompany.myclass You may also want to define the scope of the class (request, session, or global): $scope=session In summary, ATG Components live in nucleus, generally have links to other components, and provide some meaningful type of work. You can configure components as well as extend their functionality by writing code. Repositories Repositories (a.k.a. Data Anywhere Architecture) is the mechanism that ATG uses to access data primarily stored in relational databases, but also LDAP or other backend systems. ATG applications are required to be very high performance, and data access is critical in that if not handled properly, it could create a bottleneck. ATG's repository functionality has been around for a long time - it's proven to be extremely scalable. Developers new to ATG need to understand how repositories work as this is a critical aspect of the ATG architecture. Repositories essentially map relational tables to objects in ATG, as well as handle caching. ATG defines many repositories out of the box (i.e. user profile, catalog, orders, etc), and this is comprised of both the underlying database schema along with the associated repository definition files (XML). It is fully expected that implementations will extend / change the out of the box repository definitions, so there is a prescribed approach to doing this. The first thing to be sure of is to encapsulate your repository definition additions / changes within your own module (as described above). The other important best practice is to never modify the out of the box schema - in other words, don't add columns to existing ATG tables, just create your own new tables. These will help ensure you can easily upgrade your application at a later date. xml-combination As mentioned earlier, when you start ATG, the order of the modules will determine the final configpath. Files within this configpath are 'layered' such that modules on top can override configuration of modules below it. This is the same concept for repository definition files. If you want to add a few properties to the out of the box user profile, you simply need to create an XML file containing only your additions, and place it in the correct location in your module. At boot time, your definition will be combined (hence the term xml-combination) with the lower, out of the box modules, with the result being a user profile that contains everything (out of the box, plus your additions). Aside from just adding properties, there are also ways to remove and change properties. types of properties Aside from the normal 'database backed' properties, there are a few other interesting types: transient properties - these are properties that are in memory, but not backed by any database column. These are useful for temporary storage. java-backed properties - by nature, these are transient, but in addition, when you access this property (by called the get method) instead of looking up a piece of data, it performs some logic and returns the results. 'Age' is a good example - if you're storing a birth date on the profile, but your business rules are defined in terms of someones age, you could create a simple java-backed property to look at the birth date and compare it to the current date, and return the persons age. derived properties - this is what allows for inheritance within the repository structure. You could define a property at the category level, and have the product inherit it's value as well as override it. This is useful for setting defaults, with the ability to override. caching There are a number of different caching modes which are useful at different times depending on the nature of the data being cached. For example, the simple cache mode is useful for things like user profiles. This is because the user profile will typically only be used on a single instance of ATG at one time. Simple cache mode is also useful for read-only types of data such as the product catalog. Locked cache mode is useful when you need to ensure that only one ATG instance writes to a particular item at a time - an example would be a customers order. There are many options in terms of configuring caching which are outside the scope of this article - please refer to the product manuals for more details. Other important concepts - out of scope for this article There are a whole host of concepts that are very important pieces to the ATG platform, but are out of scope for this article. Here's a brief description of some of them: formhandlers - these are ATG components that handle form submissions by users. pipelines - these are configurable chains of logic that are used for things like handling a request (request pipeline) or checking out an order. special kinds of repositories (versioned, files, secure, ...) - there are a couple different types of repositories that are used in various situations. See the manuals for more information. web development - JSP/ DSP tag library - ATG provides a traditional approach to developing web applications by providing a tag library called the DSP library. This library is used throughout your JSP pages to interact with all the ATG components. messaging - a message sub-system used as another way for components to interact. personalization - ability for business users to define a personalized user experience for customers. See the other blog posts related to personalization.

Read the article
Choosing Technology To Include In Software Design

How many of us have been forced to select one technology over another when designing a new system? What factors do we and should we consider? How can we ensure the correct business decision is made? When faced with this type of decision it is important to gather as much information possible regarding each technology being considered as well as the project itself. Additionally, I tend to delay my decision about the technology until it is ultimately necessary to be made. The reason why I tend to delay such an important design decision is due to the fact that as the project progresses requirements and other factors can alter a decision for selecting the best technology for a project. Important factors to consider when making technology decisions: Time to Implement and Maintain Total Cost of Technology (including Implementation and maintenance) Adaptability of Technology Implementation Team’s Skill Sets Complexity of Technology (including Implementation and maintenance) orecasted Return On Investment (ROI) Forecasted Profit on Investment (POI) Of the factors to consider the ROI and POI weigh the heaviest because the take in to consideration the other factors when calculating the profitability and return on investments.For a real world example let us consider developing a web based lead management system for a new company. This system can either be hosted on Microsoft Windows based web server or on a Linux based web server. Important Factors for this Example Implementation Team’s Skill Sets Member 1 Skill Set: Classic ASP, ASP.Net, and MS SQL Server Experience: 10 years Member 2 Skill Set: PHP, MySQL, Photoshop and MS SQL Server Experience: 3 years Member 3 Skill Set: C++, VB6, ASP.Net, and MS SQL Server Experience: 12 years Total Cost of Technology (including Implementation and maintenance) Linux Initial Year: $5,000 (Random Value) Additional Years: $3,000 (Random Value) Windows Initial Year: $10,000 (Random Value) Additional Years: $3,000 (Random Value) Complexity of Technology Linux Large Learning Curve with user driven documentation Estimated learning cost: $30,000 Windows Minimal based on Teams skills with Microsoft based documentation Estimated learning cost: $5,000 ROI Linux Total Cost Initial Total Cost: $35,000 Additional Cost $3,000 per year Windows Total Cost Initial Total Cost: $15,000 Additional Cost $3,000 per year Based on the hypothetical numbers it would make more sense to select windows based web server because the initial investment of the technology is much lower initially compared to the Linux based web server.

Read the article
The Expert Secret to Search Engine Optimization - Effective Website Optimization

Throwing keywords into a program that shows you how popular they are and then using those keywords without doing a little bit of preliminary research and answering some very important questions can just spell disaster. There are three questions that are extremely important to ask yourself before just doing random search engine optimization. And believe it or not those three questions are not, "What are the most popular keywords for my particular website?" Those questions are much more fundamental and strategic and they can be much more important to your overall efforts in getting your site ranked on the search engines.

Read the article
ASP.NET 3.5 Debugging Using Visual Web Developer Express 2008

One of the most important features in Visual Web Developer Express 2 8 in developing ASP.NET 3.5 websites is the debugging feature. Having a debugger is important in troubleshooting source code and application-related problems. It will save you a lot of time if you encounter and fix problems during the design and testing stage. This article is all about basic debugging in ASP.NET using Visual Web Developer Express its information will provide you with an important tool for designing and creating ASP.NET websites.... Cloud Servers in Demand - GoGrid Start Small and Grow with Your Business. $0.10/hour

Read the article
JavaOne Blog RSS is here!

- by Cassandra Clark

tweetmeme_url = 'http://blogs.oracle.com/javaone/2010/06/javaone_blog_rss_is_here.html'; Share .FBConnectButton_Small{background-position:-5px -232px !important;border-left:1px solid #1A356E;} .FBConnectButton_Text{margin-left:12px !important ;padding:2px 3px 3px !important;} Don't be the last one to know all the juicy details about JavaOne. Subscribe to the newly implemented RSS feed and see the news as soon as it is posted. We have a long list of updates to come in the next few weeks; Java University, Schedule Builder, contests, quizzes and much much more.

Read the article
Firefox ignore GTK theme for localhost stuff

- by Mario De Schaepmeester

I know this is pretty much a duplicate of How can one make firefox ignore my GTK theme entirely?, but the answers on that one are no permanent solution. It works by launching firefox from the terminal. I would like to know a solution that works for every instance of firefox no matter how it was created. There is the possibility to edit the userContent.css file, but the settings you make randomly do not apply to some sites or in some situations, strangely, even with the !important added... I have a dark GTK theme and this results in some textboxes having a black background with black text with a userContent.css that has input, textarea { color: black !important; background-color: white !important; } Update I changed a setting in about:config from true to false, namely browser.display.use_system_colors. Everything appears normal and well now, for one exception: everything that runs on localhost. This includes PHPMyAdmin and a website I am making. I would like to know if there is a solution to this.

Read the article
I prefer C/C++ over Unity and other tools: is it such a big downer for a game developper ?

- by jokoon

We have a big game project on Unity at school on which we are 12 to work on. My teacher seems to be convinced it's an important tool to teach students, since it makes students look from the high level to the lower level. I can understand his view, and I'm wondering: Is unity such an important engine in game developping companies ? Are there a lot of companies using it because they can't afford to use something else ? He is talking like Unity is a big player in game making, but I only see it fit small indie game companies who want to do a game as fast as possible. Do you think Unity is that much important in the industry ? Does it endangers the value of C++ skills ? It's not that I don't like Unity, it's just that I don't learn nothing with it, I prefer to achieve little steps with Ogre or SFML instead. Also, we also have C++ practice exercises, but those are just practice with theory, nothing much.

Read the article
Unimportant words on page being seen as keywords, due to repetiveness

- by user21100

I have a list of products on my site, and each product has a number of descriptors such as features, price, etc. Next to each product, I list the 10 features, with a graphical icon which lets the user know whether the product has that particular feature or not. In all, I have about 230 products, and I have to add the same list of features to describe each product, so you can see the enormous redundancy here of these "feature names". These "feature names", ex., "water proof", are not important keywords at all, yet due to the sheer volume of these words, Google is seeing them as my most important keywords. Is there any way to get around this, or to tell the bots to place (less) emphasis on these repetitive words, and not view them as important keywords?

Read the article
What are the best SEO techniques for a professional blog? [closed]

- by Kyle

Possible Duplicate: What are the best ways to increase your site's position in Google? Beginner to SEO here, starting with a personal site, looking for some insight and feedback. Question: what's more important, domain name or site title? Question: how important are the meta tags (description and keywords) on your site? Description should be under 60 chars right? How many keywords is ideal? Question: #1 most important SEO principle = ?? (my guess is getting others to link to your site) -thanks.

Read the article
WCF + AppFabric training (4+1 days)

- by Sahil Malik

SharePoint 2010 Training: more information If there is one part of .NET that I think is the most important for you to master, it has to be WCF. It is something I have used, learnt, and talked about extensively. If there is one part of future looking technologies that I think will be extremely important going forward, it is AppFabric, both for Windows Server and Windows Azure. Both these topics are so incredibly valuable that I exude with excitement every time I touch them or talk about them. I have finally put together an exhaustive training on these two extremely relevant and important technologies, that you as a .NET developer must know. Here are the details, Read full article ....

Read the article
Deduping your redundancies

- by nospam(at)example.com (Joerg Moellenkamp)

Robin Harris of Storagemojo pointed to an interesting article about about deduplication and it's impact to the resiliency of your data against data corruption on ACM Queue. The problem in short: A considerable number of filesystems store important metadata at multiple locations. For example the ZFS rootblock is copied to three locations. Other filesystems have similar provisions to protect their metadata. However you can easily proof, that the rootblock pointer in the uberblock of ZFS for example is pointing to blocks with absolutely equal content in all three locatition (with zdb -uu and zdb -r). It has to be that way, because they are protected by the same checksum. A number of devices offer block level dedup, either as an option or as part of their inner workings. However when you store three identical blocks on them and the devices does block level dedup internally, the device may just deduplicated your redundant metadata to a block stored just once that is stored on the non-voilatile storage. When this block is corrupted, you have essentially three corrupted copies. Three hit with one bullet. This is indeed an interesting problem: A device doing deduplication doesn't know if a block is important or just a datablock. This is the reason why I like deduplication like it's done in ZFS. It's an integrated part and so important parts don't get deduplicated away. A disk accessed by a block level interface doesn't know anything about the importance of a block. A metadata block is nothing different to it's inner mechanism than a normal data block because there is no way to tell that this is important and that those redundancies aren't allowed to fall prey to some clever deduplication mechanism. Robin talks about this in regard of the Sandforce disk controllers who use a kind of dedup to reduce some of the nasty effects of writing data to flash, but the problem is much broader. However this is relevant whenever you are using a device with block level deduplication. It's just the point that you have to activate it for most implementation by command, whereas certain devices do this by default or by design and you don't know about it. However I'm not perfectly sure about that ? given that storage administration and server administration are often different groups with different business objectives I would ask your storage guys if they have activated dedup without telling somebody elase on their boxes in order to speak less often with the storage sales rep. The problem is even more interesting with ZFS. You may use ditto blocks to protect important data to store multiple copies of data in the pool to increase redundancy, even when your pool just consists out of one disk or just a striped set of disk. However when your device is doing dedup internally it may remove your redundancy before it hits the nonvolatile storage. You've won nothing. Just spend your disk quota on the the LUNs in the SAN and you make your disk admin happy because of the good dedup ratio However you can just fall in this specific "deduped ditto block"trap when your pool just consists out of a single device, because ZFS writes ditto blocks on different disks, when there is more than just one disk. Yet another reason why you should spend some extra-thought when putting your zpool on a single LUN, especially when the LUN is sliced and dices out of a large heap of storage devices by a storage controller. However I have one problem with the articles and their specific mention of ZFS: You can just hit by this problem when you are using the deduplicating device for the pool. However in the specifically mentioned case of SSD this isn't the usecase. Most implementations of SSD in conjunction with ZFS are hybrid storage pools and so rotating rust disk is used as pool and SSD are used as L2ARC/sZIL. And there it simply doesn't matter: When you really have to resort to the sZIL (your system went down, it doesn't matter of one block or several blocks are corrupt, you have to fail back to the last known good transaction group the device. On the other side, when a block in L2ARC is corrupt, you simply read it from the pool and in HSP implementations this is the already mentioned rust. In conjunction with ZFS this is more interesting when using a storage array, that is capable to do dedup and where you use LUNs for your pool. However as mentioned before, on those devices it's a user made decision to do so, and so it's less probable that you deduplicating your redundancies. Other filesystems lacking acapability similar to hybrid storage pools are more "haunted" by this problem of SSD using dedup-like mechanisms internally, because those filesystem really store the data on the the SSD instead of using it just as accelerating devices. However at the end Robin is correct: It's jet another point why protecting your data by creating redundancies by dispersing it several disks (by mirror or parity RAIDs) is really important. No dedup mechanism inside a device can dedup away your redundancy when you write it to a totally different and indepenent device.

Read the article
What is Agile Modeling and why do I need it?

What is Agile Modeling and why do I need it? Agile Modeling is an add-on to existing agile methodologies like Extreme programming (XP) and Rational Unified Process (RUP). Agile Modeling enables developers to develop a customized software development process that actually meets their current development needs and is flexible enough to adjust in the future. According to Scott Ambler, Agile Modeling consists of five core values that enable this methodology to be effective and light weight Agile Modeling Core Values: Communication Simplicity Feedback Courage Humility Communication is a key component to any successful project. Open communication between stakeholder and the development team is essential when developing new applications or maintaining legacy systems. Agile models promote communication amongst software development teams and stakeholders. Furthermore, Agile Models provide a common understanding of an application for members of a software development team allowing them to have a universal common point of reference. The use of simplicity in Agile Models enables the exploration of new ideas and concepts through the use of basic diagrams instead of investing the time in writing tens or hundreds of lines of code. Feedback in regards to application development is essential. Feedback allows a development team to confirm that the development path is on track. Agile Models allow for quick feedback from shareholders because minimal to no technical expertise is required to understand basic models. Courage is important because you need to make important decisions and be able to change direction by either discarding or refactoring your work when some of your decisions prove inadequate, according to Scott Ambler. As a member of a development team, we must admit that we do not know everything even though some of us think we do. This is where humility comes in to play. Everyone is a knowledge expert in their own specific domain. If you need help with your finances then you would consult an accountant. If you have a problem or are in need of help with a topic why would someone not consult with a subject expert? An effective approach is to assume that everyone involved with your project has equal value and therefore should be treated with respect. Agile Model Characteristics: Purposeful Understandable Sufficiently Accurate Sufficiently Consistent Sufficiently Detailed Provide Positive Value Simple as Possible Just Fulfill Basic Requirements According to Scott Ambler, Agile models are the most effective possible because the time that is invested in the model is just enough effort to complete the job. Furthermore, if a model isn’t good enough yet then additional effort can be invested to get more value out of the model. However if a model is good enough, for the current needs, or surpass the current needs, then any additional work done on the model would be a waste. It is important to remember that good enough is in the eye of the beholder, so this can be tough. In order for Agile Models to work effectively Active Stakeholder need to participation in the modeling process. Finally it is also very important to model with others, this allows for additionally input ensuring that all the shareholders needs are reflected in the models. How can Agile Models be incorporated in to our projects? Agile Models can be incorporated in to our project during the requirement gathering and design phases. As requirements are gathered the models should be updated to incorporate the new project details as they are defined and updated. Additionally, the Agile Models created during the requirement phase can be the bases for the models created during the design phase. It is important to only add to the model when the changes fit within the agile model characteristics and they do not over complicate the design.

Read the article
MySQL – Video Course – MySQL Backup and Recovery Fundamentals

- by Pinal Dave

Data is the one of the most crucial things for any organization and keeping data safe is the biggest challenge for any DBA. This is true for any organizations. Think about the scenario that you have a database which is extremely important and suddenly you accidently delete the most important table from that database. I am sure this is a very difficult time. In times like this people often get stressed or just make even second mistake. In my career of 10 years I have done often this mistake and often got stressed out due to un-availability of the database backup. In the SQL Server field, we have plenty of the help on this subject, but in MySQL domain there is not enough help. For the same reason I have build this MySQL course on Backup and Recovery. Course Outline Data is very important to any application and business. It is very important that every business plan for data safety. Database backup strategies are often discussed after the disaster has already happened. In this introductory course we will explore a few of the basic backup strategies every business should implement for data safely. We will explore how we can recover our server quickly after any unfriendly incident to our MySQL database. Click to View Course Here are various important aspects which we have discussed in this course. How to take backup of single database? How to take backup of multiple database? How to backup various database objects? How to restore a single database? How to restore multiple databases? How to use MySQL Workbench for Backup and Restore? How to restore Point in Time for any database? What is the best time to backup? How to copy database from one server to another server? All of the above concepts and many more subjects are covered in the MySQL Backup and Recovery Fundamentals course. It is available on Pluralsight. Scenarios As learning about Backup and Recovery can be very much boring, I decided to create two fictitious characters and demonstrate the entire course based on their conversation. The story is about Mike and Rahul. Mike is Sr. Database administrator in USA and Rahul is an intern in India. Rahul aspires to become a senior database administrator and this is a story about his challenges and how he overcomes those challenges. I had a great time to build this course and I have got very good feedback on this course. I encourage all of you to attempt to learn MySQL Backup and Recovery Fundamental course with this innovative effort. It will be very valuable to know your feedback. You will need a valid Pluralsight subscription to watch this course. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Tips and Tricks, T SQL

Read the article
Big Data – Basics of Big Data Architecture – Day 4 of 21

- by Pinal Dave

In yesterday’s blog post we understood how Big Data evolution happened. Today we will understand basics of the Big Data Architecture. Big Data Cycle Just like every other database related applications, bit data project have its development cycle. Though three Vs (link) for sure plays an important role in deciding the architecture of the Big Data projects. Just like every other project Big Data project also goes to similar phases of the data capturing, transforming, integrating, analyzing and building actionable reporting on the top of the data. While the process looks almost same but due to the nature of the data the architecture is often totally different. Here are few of the question which everyone should ask before going ahead with Big Data architecture. Questions to Ask How big is your total database? What is your requirement of the reporting in terms of time – real time, semi real time or at frequent interval? How important is the data availability and what is the plan for disaster recovery? What are the plans for network and physical security of the data? What platform will be the driving force behind data and what are different service level agreements for the infrastructure? This are just basic questions but based on your application and business need you should come up with the custom list of the question to ask. As I mentioned earlier this question may look quite simple but the answer will not be simple. When we are talking about Big Data implementation there are many other important aspects which we have to consider when we decide to go for the architecture. Building Blocks of Big Data Architecture It is absolutely impossible to discuss and nail down the most optimal architecture for any Big Data Solution in a single blog post, however, we can discuss the basic building blocks of big data architecture. Here is the image which I have built to explain how the building blocks of the Big Data architecture works. Above image gives good overview of how in Big Data Architecture various components are associated with each other. In Big Data various different data sources are part of the architecture hence extract, transform and integration are one of the most essential layers of the architecture. Most of the data is stored in relational as well as non relational data marts and data warehousing solutions. As per the business need various data are processed as well converted to proper reports and visualizations for end users. Just like software the hardware is almost the most important part of the Big Data Architecture. In the big data architecture hardware infrastructure is extremely important and failure over instances as well as redundant physical infrastructure is usually implemented. NoSQL in Data Management NoSQL is a very famous buzz word and it really means Not Relational SQL or Not Only SQL. This is because in Big Data Architecture the data is in any format. It can be unstructured, relational or in any other format or from any other data source. To bring all the data together relational technology is not enough, hence new tools, architecture and other algorithms are invented which takes care of all the kind of data. This is collectively called NoSQL. Tomorrow Next four days we will answer the Buzz Words – Hadoop. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Thoughts on Thoughts on TDD

Brian Harry wrote a post entitled Thoughts on TDD that I thought I was going to let lie, but I find that I need to write a response. I find myself in agreement with Brian on many points in the post, but I disagree with his conclusion. Not surprisingly, I agree with the things that he likes about TDD. Focusing on the usage rather than the implementation is really important, and this is important whether you use TDD or not. And YAGNI was a big theme in my Seven Deadly Sins of Programming series. Now, on to what he doesnt like. He says that he finds it inefficient to have tests that he has to change every time he refactors. Here is where we part company. If you are having to do a lot of test rewriting (say, more than a couple of minutes work to get back to green) *often* when you are refactoring your code, I submit that either you are testing things that you dont need to test (internal details rather than external implementation), your code perhaps isnt as decoupled as it could be, or maybe you need a visit to refactorers anonymous. I also like to refactor like crazy, but as we all know, the huge downside of refactoring is that we often break things. Important things. Subtle things. Which makes refactoring risky. *Unless* we have a set of tests that have great coverage. And TDD (or Example-based Design, which I prefer as a term) gives those to us. Now, I dont know what sort of coverage Brian gets with the unit tests that he writes, but I do know that for the majority of the developers Ive worked with and I count myself in that bucket the coverage of unit tests written afterwards is considerably inferior to the coverage of unit tests that come from TDD. For me, it all comes down to the answer to the following question: How do you ensure that your code works now and will continue to work in the future? Im willing to put up with a little efficiency on the front side to get that benefit later. Its not the writing of the code thats the expensive part, its everything else that comes after. I dont think that stepping through test cases in the debugger gets you what you want. You can verify what the current behavior is, sure, and do it fairly cheaply, but you dont help the guy in the future who doesnt know what conditions were important if he has to change your code. His second part that he doesnt like backing into an architecture (go read to see what he means). Ive certainly had to work with code that was like this before, and its a nightmare the code that nobody wants to touch. But thats not at all the kind of code that you get with TDD, because if youre doing it right youre doing the write a failing tests, make it pass, refactor approach. Now, you may miss some useful refactorings and generalizations for this, but if you do, you can refactor later because you have the tests that make it safe to do so, and your code tends to be easy to refactor because the same things that make code easy to write unit tests for make it easy to refactor. I also think Brian is missing an important point. We arent all as smart as he is. Im reminded a bit of the lesson of Intentional Programming, Charles Simonyis paradigm for making programming easier. I played around with Intentional Programming when it was young, and came to the conclusion that it was a pretty good thing if you were as smart as Simonyi is, but it was pretty much a disaster if you were an average developer. In this case, TDD gives you a way to work your way into a good, flexible, and functional architecture when you dont have somebody of Brians talents to help you out. And thats a good thing.Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article

Search Results

Search found 8274 results on 331 pages for 'offtopic but important'.

Page 19/331 | < Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26 | Next Page >

- by FransBouma

- by Robert May

- by metal-gear-solid

- by Scott Dorman

- by Michael Borgwardt

- by Ralf Westphal

- by maul

- by Richard Poole

- by FullTimeCoderPartTimeSysAdmin

- by pinaldave

- by Glen Borkowski

- by Cassandra Clark

- by Mario De Schaepmeester

- by jokoon

- by user21100

- by Kyle

- by Sahil Malik

- by nospam(at)example.com (Joerg Moellenkamp)

- by Pinal Dave

- by Pinal Dave

< Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26 | Next Page >