Search Results

Search found 5267 results on 211 pages for 'use cases'.

Page 10/211 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

Bug Triage

In this blog post brain dump, I'll attempt to describe the process my team tries to follow when dealing with new bug reports (specifically, code defect reports). This is not official Microsoft policy, just the way we do things… if you do things differently and want to share, you can do so at the bottom in the comments (or on your blog).Feature Triage TeamA subset of the feature crew, the triage team (which has representations from the PM, Dev and QA disciplines), looks at all unassigned bugs at regular intervals. This can be weekly or daily (or other frequency) dependent on which part of the product cycle we are in and what the untriaged bug load looks like. They discuss each bug considering the evidence and make a decision of whether the bug goes from Not Yet Assigned to Assigned (plus the name of the DEV to fix this) or whether it goes from Active to Resolved (which means it gets assigned back to the requestor for closure or further debate if they were not present at the triage meeting). Close to critical milestones, the feature triage team needs to further justify bugs they take to additional higher-level triage teams.Bug Opened = Not Yet AssignedSomeone (typically an SDET from the QA team) creates the bug item (e.g. in TFS), ensuring they populate all the relevant fields including: Title, Description, Repro Steps (including the Actual Result at the end of the steps), attachments of code and/or screenshots, Build number that they observed the issue in, regression details if applicable, how it was found, if a test case exists or needs to be created etc. They also indicate their opinion on the Priority and Severity. The bug status is left as Not Yet Assigned."Issue" versus "Fix for issue"The solution to some bugs is easy to determine, e.g. "bug: the column name is misspelled". Obviously the fix is to correct the spelling – still, the triage team should be explicit and enter the correct spelling in the bug's Description. Note that a bad bug name here would be "bug: fix the spelling of the column" (it describes the solution, rather than the problem).Other solutions are trickier to establish, e.g. "bug: the column header is not accessible (can only be clicked on with the mouse, not reached via keyboard)". What is the correct solution here? The last thing to do is leave this undetermined and just assign it to a developer. The solution has to be entered in the description. Behind this type of a bug usually hides a spec defect or a new feature request.The person opening the bug should focus on describing the issue, rather than the solution. The person indicates what the fix is in their opinion by stating the Expected Result (immediately after stating the Actual Result). If they have a complex suggested solution, that should be split out in a separate part, but the triage team has the final say before assigning it. If the solution is lengthy/complicated to describe, the bug can be assigned to the PM. Note: the strict interpretation suggests that any bug with no clear, obvious solution is always a hole in the spec and should always go to the PM. This also ensures the spec gets updated.Not Yet Assigned - Not Yet Assigned (on someone else's plate)If the bug is observed in our feature, but the cause is actually another team, we change the Area Path (which is the way we identify teams in TFS) and leave it as Not Yet Assigned. The triage team may add more comments as appropriate including potentially changing the repro steps. In some cases, we may even resolve the bug in our area path and open a new bug in the area path of the other team.Even though there is no action on a dev on the team, the bug still needs to be tracked. One way of doing this is to implement some notification system that informs the team when the tracked bug changed status; another way is to occasionally run a global query (against all area paths) for bugs that have been opened by a member of the team and follow up with the current owners for stale bugs.Not Yet Assigned - ResolvedThis state transition can only be made by the Feature Triage Team.0. Sometimes the bug description is not clear and in that case it gets Resolved as More Information Needed, so the original requestor can provide it.After understanding what the bug item is about, the first decision is to determine whether it needs to go to a dev.1. If it is a known bug, it gets resolved as "Duplicate" and linked to the existing bug.2. If it is "By Design" it gets resolved as such, indicating that the triage team does not think this is a bug.3. If the bug does not repro on latest bits, it is resolved as "No Repro"4. The most painful: If it is decided that we cannot fix it for this release it gets resolved as "Postponed" or "Won't Fix". The former is typically due to resources and time constraints, while the latter is due to deciding that it is not important enough to consume our resources in any release (yes, not all bugs must be fixed!). For both cases, there are other factors that contribute to the decision such as: existence of a reasonable workaround, frequency we expect users to encounter the issue, dependencies on other team to offer a solution, whether it breaks a core scenario, whether it prohibits customer feedback on a major feature, is it a regression from a previous release, impact of the fix on other partner teams (e.g. User Education, User Experience, Localization/Globalization), whether this is the right fix, does the fix impact performance goals, and last but not least, severity of bug (e.g. loss of customer data, security threat, crash, hang). The bar for fixing a bug goes up as the release date approaches. The triage team becomes hardnosed about which bugs to take, while the developers are busy resolving assigned bugs thus everyone drives for Zero Bug Bounce (ZBB). ZBB is when you have 0 active bugs older than 48 hours.Not Yet Assigned - AssignedIf the bug is something we decide to fix in this release and the solution is known, then it is assigned to a DEV. This is either the developer that will do the work, or a Lead that can further assign it to one of his developer team based on a load balancing algorithm of their choosing.Sometimes, the triage team needs the dev to do some investigation work before deciding whether to take the fix; similarly, the checkin for the fix may be gated on code review by the triage team. In these cases, these instructions are provided in the comments section of the bug and when the developer is done they notify the triage team for final decision.Additionally, a Priority and Severity (from 0 to 4) has to be entered, e.g. a P0 means "drop anything you are doing and fix this now" whereas a P4 is something you get to after all P0,1,2,3 bugs are fixed.From a testing perspective, if the bug was found through ad-hoc testing or an external team, the decision is made whether test cases should be added to avoid future regressions. This is communicated to the QA team.Assigned - ResolvedWhen the developer receives the bug (they should be checking daily for new bugs on their plate looking at bugs in order of priority and from older to newer) they can send it back to triage if the information is not clear. Otherwise, they investigate the bug, setting the Sub Status to "Investigating"; if they cannot make progress, they set the Sub Status to "Blocked" and discuss this with triage or whoever else can help them get unblocked. Once they are unblocked, they set the Sub Status to "Working on Solution"; once they are code complete they send a code review request, setting the Sub Status to "Fix Available". After the iterative code review process is over and everyone is happy with the fix, the developer checks it in and changes the state of the bug from Active (and Assigned to them) to Resolved (and Assigned to someone else).The developer needs to ensure that when the status is changed to Resolved that it is assigned to a QA person. For example, maybe the PM opened the bug, but it should be a QA person that will verify the fix - the developer needs to manually change the assignee in that case. Typically the QA person will send an email to the original requestor notifying them that the fix is verified.Resolved - ??In all cases above, note that the final state was Resolved. What happens after that? The final step should be Closed. The bug is closed once the QA person verifying the fix is happy with it. If the person is not happy, then they change the state from Resolved to Active, thus sending it back to the developer. If the developer and QA person cannot reach agreement, then triage can be brought into it. An easy way to do that is change the status back to Not Yet Assigned with appropriate comments so the triage team can re-review.It is important to note that only QA can close a bug. That means that if the opener of the bug was a PM, when the bug gets resolved by the dev it may land on the PM's plate and after a quick review, the PM would re-assign to an SDET, which is the only role that can close bugs. One exception to this is if the person that filed the bug is external: in that case, we leave it Resolved and assigned to them and also send them a notification that they need to verify the fix. Another exception is if specialized developer knowledge is needed for verifying the bug fix (e.g. it was a refactoring suggestion bug typically not observable by the user) in which case it is fine to have a developer verify the fix, and ideally a different developer to the one that opened the bug.Other links on bug triageA quick search reveals that others have talked about this subject, e.g. here, here, here, here and here.Your take?If you have other best practices your team uses to deal with incoming bug reports, feel free to share in the comments below or on your blog. Comments about this post welcome at the original blog.

Read the article
The Business case for Big Data

- by jasonw

The Business Case for Big Data Part 1 What's the Big Deal Okay, so a new buzz word is emerging. It's gone beyond just a buzzword now, and I think it is going to change the landscape of retail, financial services, healthcare....everything. Let me spend a moment to talk about what i'm going to talk about. Massive amounts of data are being collected every second, more than ever imaginable, and the size of this data is more than can be practically managed by today’s current strategies and technologies. There is a revolution at hand centering on this groundswell of data and it will change how we execute our businesses through greater efficiencies, new revenue discovery and even enable innovation. It is the revolution of Big Data. This is more than just a new buzzword is being tossed around technology circles.This blog series for Big Data will explain this new wave of technology and provide a roadmap for businesses to take advantage of this growing trend. Cases for Big Data There is a growing list of use cases for big data. We naturally think of Marketing as the low hanging fruit. Many projects look to analyze twitter feeds to find new ways to do marketing. I think of a great example from a TED speech that I recently saw on data visualization from Facebook from my masters studies at University of Virginia. We can see when the most likely time for breaks-ups occurs by looking at status changes and updates on users Walls. This is the intersection of Big Data, Analytics and traditional structured data. Ted Video Marketers can use this to sell more stuff. I really like the following piece on looking at twitter feeds to measure mood. The following company was bought by a hedge fund. They could predict how the S&P was going to do within three days at an 85% accuracy. Link to the article Here we see a convergence of predictive analytics and Big Data. So, we'll look at a lot of these business cases and start talking about what this means for the business. It's more than just finding ways to use Hadoop + NoSql and we'll talk about that too. How do I start in Big Data? That's what is coming next post.

Read the article
Quick guide to Oracle IRM 11g: Classification design

- by Simon Thorpe

Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

Read the article
Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How do you structure computer science University notes?

- by Sai Perchard

I am completing a year of postgraduate study in CS next semester. I am finishing a law degree this year, and I will use this to briefly explain what I mean when I refer to the 'structure' of University notes. My preferred structure for authoring law notes: Word Two columns 0.5cm margins (top, right, bottom, middle, left) Body text (10pt, regular), 3 levels of headings (14/12/10pt, bold), 3 levels of bulleted lists Color A background for cases Color B background for legislation I find that it's crucial to have a good structure from the outset. My key advice to a law student would be to ensure styles allows cases and legislation to be easily identified from supporting text, and not to include too much detail regarding the facts of cases. More than 3 levels of headings is too deep. More than 3 levels of a bulleted list is too deep. In terms of CS, I am interested in similar advice; for example, any strategies that have been successfully employed regarding structure, and general advice regarding note taking. Has latex proved better than Word? Code would presumably need to be stylistically differentiated, and use a monospaced font - perhaps code could be written in TextMate so that it could be copied to retain syntax highlighting? (Are notes even that useful in a CS degree? I am tempted to simply use a textbook. They are crucial in law.) I understand that different people may employ varying techniques and that people will have personal preferences, however I am interested in what these different techniques are. Update Thank you for the responses so far. To clarify, I am not suggesting that the approach should be comparable to that I employ for law. I could have been clearer. The consensus so far seems to be - just learn it. Structure of notes/notes themselves are not generally relevant. This is what I was alluding to when I said I was just tempted to use a textbook. Re the comment that said textbooks are generally useless - I strongly disagree. Sure, perhaps the recommended textbook is useless. But if I'm going to learn a programming language, I will (1) identify what I believe to be the best textbook, and (2) read it. I was unsure if the combination of theory with code meant that lecture notes may be a more efficient way to study for an exam. I imagine that would depend on the subject. A subject specifically on a programming language, reading a textbook and coding would be my preferred approach. But I was unsure if, given a subject containing substantive theory that may not be covered in a single textbook, people may have preferences regarding note taking and structure.

Read the article
Kuppinger Cole Paper on Entitlements Server

- by Naresh Persaud

Kuppinger Cole recently released a paper discussing external authorization describing how organizations can "future proof" their enterprise security by deploying Oracle Entitlements Server. By taking a declarative security approach, security policy can be flexible and distributed across multiple applications consistently. You can get a copy of the report here. In fact Oracle Entitlements Server is being used in many places to secure data and sensitive business transactions. The paper covers the major use cases for Entitlements Server as well as Kuppinger Cole's assessment of the market. Here are some additional resources that reinforce the cases discussed in the paper. Today applications for cloud and mobile applications can utilize RESTful interfaces. Click on this link to learn how. OES can also be used to secure data in Oracle Databases. To learn more check out the new Oracle U OES 11g course.

Read the article
Oracle OpenWorld Healthcare Integration Session Highlights Challenges & Solutions

- by Bruce Tierney

In today’s session co-presented by Steve Schenks, Integration Architect from Ascension Health and Oracle’s Sundar Shenbagam and Suresh Sharma (apparently your initials must be SS to present during this session), interesting insights in many different areas including Steve’s descriptions of the challenges with their previous environment: Disparate hardware and software is an issue common across healthcare and most other industries…Larry Ellison spoke on this topic during Sundays’ keynote address. In the last part of session, Suresh is planning to go over some of the best practices and lesson learned to implement successful healthcare applications and will discuss the different options to model Sequencing (FIF0) use cases (one of most common use cases in the provider market). The session was “Implementing Successful Healthcare Applications with Oracle SOA Suite” – Session # CON8546. For more information about this session, please contact Senior Principal Product Manager Suresh Sharma

Read the article
Oracle OpenWorld Healthcare Integration Session Highlights Challenges & Solutions

- by Nitesh Jain

In today’s session co-presented by Steve Schenks, Integration Architect from Ascension Health and Oracle’s Sundar Shenbagam and Suresh Sharma (apparently your initials must be SS to present during this session), interesting insights in many different areas including Steve’s descriptions of the challenges with their previous environment: Disparate hardware and software is an issue common across healthcare and most other industries…Larry Ellison spoke on this topic during Sundays’ keynote address. In the last part of session, Suresh is planning to go over some of the best practices and lesson learned to implement successful healthcare applications and will discuss the different options to model Sequencing (FIF0) use cases (one of most common use cases in the provider market). The session was “Implementing Successful Healthcare Applications with Oracle SOA Suite” – Session # CON8546. For more information about this session, please contact Senior Principal Product Manager Suresh Sharma Ref : https://blogs.oracle.com/SOA/entry/oracle_openworld_healthcare_integration_session

Read the article
The Threats are Outside the Risks are Inside

- by Naresh Persaud

In the past few years we have seen the threats against the enterprise increase dramatically. The number of attacks originating externally have outpaced the number of attacks driven by insiders. During the CSO Summit at Open World, Sonny Singh examined the phenomenon and shared Oracle's security story. While the threats are largely external, the risks are largely inside. Criminals are going after our sensitive customer data. In some cases the attacks are advanced. In most cases the attacks are very simple. Taking a security inside out approach can provide a cost effective way to secure an organization's most valuable assets. &amp;amp;lt;/span&amp;amp;gt;border-width:1px 1px 0;margin-bottom:5px&amp;amp;quot; allowfullscreen=&amp;amp;quot;&amp;amp;quot;&amp;amp;gt; Cso oow12-summit-sonny-sing hv4 from OracleIDM

Read the article
In retrospect, has it been a good idea to use three-valued logic for SQL NULL comparisons?

- by Heinzi

In SQL, NULL means "unknown value". Thus, every comparison with NULL yields NULL (unknown) rather than TRUE or FALSE. From a conceptional point of view, this three-valued logic makes sense. From a practical point of view, every learner of SQL has, one time or another, made the classic WHERE myField = NULL mistake or learned the hard way that NOT IN does not do what one would expect when NULL values are present. It is my impression (please correct me if I am wrong) that the cases where this three-valued logic helps (e.g. WHERE myField IS NOT NULL AND myField <> 2 can be shortened to WHERE myField <> 2) are rare and, in those cases, people tend to use the longer version anyway for clarity, just like you would add a comment when using a clever, non-obvious hack. Is there some obvious advantage that I am missing? Or is there a general consensus among the development community that this has been a mistake?

Read the article
Should devs, testers and business users have one unified test script?

- by Carlos Jaime C. De Leon

In development, I would normally have my own test scripts that would document the data, scenarios and execution steps that I plan to test; this is my dev test plan. When the functionality has been deployed to Test, testers test it using their own test script that they wrote. In UAT, the business user then tests using their own test plan. In retrospect, it looks like this provides a better coverage, with dev tests having a mix of black and white box testing, while testers and business users focus on black box testing. But on the other hand, this brings up distinct test cases that only are executed per stage (ie. some cases which testers thought of are only executed on Test stage) and it would like the dev missed it, which makes it a finding/bug. Is it worth consolidating the test scripts from the start? Thus using one unified test script, or is it abit difficult to do this upfront?

Read the article
Linq To Dataset: Display the contents of several tables in a data control

In several cases we will be required to work with data in different datasets, or on the same dataset but at different DataTables

Read the article
Good and easy way to share files on local machine

- by jb

I would like to have a directory that has following properties: Many users can copy files into it These files can be deleted/changed by these users (user A can delete/modify file that was copied into this directory) it cant be done using normal file permissions (because permissions are retained on copy). Here is what I found on the net: brainstorm idea blueprint Some use cases: Sharing music on local machine Simple git repository sharing (just make a bare repository writeable to many people) --- i know that there are solutions like gitosis Allow many developers to modify test instance of php app without giving them root (i guess they would copy files) --- I'm leading a team of nonprofit junior developers and I need to keep that one simple! EDIT AFAIK setting SGID bit is not enugh, it only affects newly created files --- and basic workflow for these use cases ivnolves copying and other operations (which cleave file's gid unchanged)

Read the article
Test case as a function or test case as a class

- by GodMan

I am having a design problem in test automation:- Requirements - Need to test different servers (using unix console and not GUI) through automation framework. Tests which I'm going to run - Unit, System, Integration Question: While designing a test case, I am thinking that a Test Case should be a part of a test suite (test suite is a class), just as we have in Python's pyunit framework. But, should we keep test cases as functions for a scalable automation framework or should be keep test cases as separate classes(each having their own setup, run and teardown methods) ? From automation perspective, Is the idea of having a test case as a class more scalable, maintainable or as a function?

Read the article
C#/.NET Little Wonders – Cross Calling Constructors

- by James Michael Hare

Just a small post today, it’s the final iteration before our release and things are crazy here! This is another little tidbit that I love using, and it should be fairly common knowledge, yet I’ve noticed many times that less experienced developers tend to have redundant constructor code when they overload their constructors. The Problem – repetitive code is less maintainable Let’s say you were designing a messaging system, and so you want to create a class to represent the properties for a Receiver, so perhaps you design a ReceiverProperties class to represent this collection of properties. Perhaps, you decide to make ReceiverProperties immutable, and so you have several constructors that you can use for alternative construction: 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: { 13: ReceiverType = receiverType; 14: Source = source; 15: IsDurable = isDurable; 16: IsBuffered = true; 17: } 18: 19: // Constructs a set of receiver properties with buffering on and durability off. 20: public ReceiverProperties(ReceiverType receiverType, string source) 21: { 22: ReceiverType = receiverType; 23: Source = source; 24: IsDurable = false; 25: IsBuffered = true; 26: } Note: keep in mind this is just a simple example for illustration, and in same cases default parameters can also help clean this up, but they have issues of their own. While strictly speaking, there is nothing wrong with this code, logically, it suffers from maintainability flaws. Consider what happens if you add a new property to the class? You have to remember to guarantee that it is set appropriately in every constructor call. This can cause subtle bugs and becomes even uglier when the constructors do more complex logic, error handling, or there are numerous potential overloads (especially if you can’t easily see them all on one screen’s height). The Solution – cross-calling constructors I’d wager nearly everyone knows how to call your base class’s constructor, but you can also cross-call to one of the constructors in the same class by using the this keyword in the same way you use base to call a base constructor. 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: : this(receiverType, source, isDurable, true) 13: { 14: } 15: 16: // Constructs a set of receiver properties with buffering on and durability off. 17: public ReceiverProperties(ReceiverType receiverType, string source) 18: : this(receiverType, source, false, true) 19: { 20: } Notice, there is much less code. In addition, the code you have has no repetitive logic. You can define the main constructor that takes all arguments, and the remaining constructors with defaults simply cross-call the main constructor, passing in the defaults. Yes, in some cases default parameters can ease some of this for you, but default parameters only work for compile-time constants (null, string and number literals). For example, if you were creating a TradingDataAdapter that relied on an implementation of ITradingDao which is the data access object to retreive records from the database, you might want two constructors: one that takes an ITradingDao reference, and a default constructor which constructs a specific ITradingDao for ease of use: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4: 5: // other constructor logic 6: } 7: 8: public TradingDataAdapter() 9: { 10: _tradingDao = new SqlTradingDao(); 11: 12: // same constructor logic as above 13: } As you can see, this isn’t something we can solve with a default parameter, but we could with cross-calling constructors: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4: 5: // other constructor logic 6: } 7: 8: public TradingDataAdapter() 9: : this(new SqlTradingDao()) 10: { 11: } So in cases like this where you have constructors with non compiler-time constant defaults, default parameters can’t help you and cross-calling constructors is one of your best options. Summary When you have just one constructor doing the job of initializing the class, you can consolidate all your logic and error-handling in one place, thus ensuring that your behavior will be consistent across the constructor calls. This makes the code more maintainable and even easier to read. There will be some cases where cross-calling constructors may be sub-optimal or not possible (if, for example, the overloaded constructors take completely different types and are not just “defaulting” behaviors). You can also use default parameters, of course, but default parameter behavior in a class hierarchy can be problematic (default values are not inherited and in fact can differ) so sometimes multiple constructors are actually preferable. Regardless of why you may need to have multiple constructors, consider cross-calling where you can to reduce redundant logic and clean up the code. Technorati Tags: C#,.NET,Little Wonders

Read the article
A Warning to Those Using sys.dm_exec_query_stats

- by Adam Machanic

The sys.dm_exec_query_stats view is one of my favorite DMVs. It has replaced a large chunk of what I used to use SQL Trace for--pulling metrics about what queries are running and how often--and it makes this kind of data collection painless and automatic. What's not to love? But use cases for the view are a topic for another post. Today I want to quickly point out an inconsistency. If you're using this view heavily, as I am, you should know that in some cases your queries will not get a row. One...(read more)

Read the article
Storage Vendors, Users Cope With Tough Economy

Storage remains a priority for end users, but in some cases budgets have been cut and vendors are more willing to deal.

Read the article
Read only array, deep copy or retrieve copies one by one? (Performance and Memory)

- by Arthur Wulf White

In a garbage collection based system, what is the most effective way to handle a read only array if such a structure does not exist natively in the language. Is it better to return a copy of an array or allow other classes to retrieve copies of the objects stored in the array one by one? @JustinSkiles: It is not very broad. It is performance related and can actually be answered specifically for two general cases. You only need very few items: in this situation it's more effective to retrieve copies of the objects one by one. You wish to iterate over 30% or more objects. In this cases it is superior to retrieve all the array at once. This saves on functions calls. Function calls are very expansive when compared to reading directly from an array. A good specific answer could include performance, reading from an array and reading indirectly through a function. It is a simple performance related question.

Read the article
When is a Use Case layer needed?

- by Meta-Knight

In his blog post The Clean Architecture Uncle Bob suggests a 4-layer architecture. I understand the separation between business rules, interfaces and infrastructure, but I wonder if/when it's necessary to have separate layers for domain objects and use cases. What added value will it bring, compared to just having the uses cases as "domain services" in the domain layer? The only useful info I've found on the web about a use case layer is an article by Martin Fowler, who seems to contradict Uncle Bob about its necessity: At some point I may run into the problems, and then I'll make a Use Case Controller - but only then. And even when I do that I rarely consider the Use Case Controllers to occupy a separate layer in the system architecture. Edit: I stumbled upon a video of Uncle Bob's Architecture: The Lost Years keynote, in which he explains this architecture in depth. Very informative.

Read the article
What is the most effective way to add functionality to unfamiliar, structurally unsound code?

- by Coder

This is probably something everyone has to face during the development sooner or later. You have an existing code written by someone else, and you have to extend it to work under new requirements. Sometimes it's simple, but sometimes the modules have medium to high coupling and medium to low cohesion, so the moment you start touching anything, everything breaks. And you don't feel that it's fixed correctly when you get the new and old scenarios working again. One approach would be to write tests, but in reality, in all cases I've seen, that was pretty much impossible (reliance on GUI, missing specifications, threading, complex dependencies and hierarchies, deadlines, etc). So everything sort of falls back to good ol' cowboy coding approach. But I refuse to believe there is no other systematic way that would make everything easier. Does anyone know a better approach, or the name of the methodology that should be used in such cases?

Read the article
Robust line of sight test on the inside of a polygon with tolerance

- by David Gouveia

Foreword This is a followup to this question and the main problem I'm trying to solve. My current solution is an hack which involves inflating the polygon, and doing most calculations on the inflated polygon instead. My goal is to remove this step completely, and correctly solve the problem with calculations only. Problem Given a concave polygon and treating all of its edges as if they were walls in a level, determine whether two points A and B are in line of sight of each other, while accounting for some degree of floating point errors. I'm currently basing my solution on a series of line-segment interection tests. In other words: If any of the end points are outside the polygon, they are not in line of sight. If both end points are inside the polygon, and the line segment from A to B crosses any of the edges from the polygon, then they are not in line of sight. If both end points are inside the polygon, and the line segment from A to B does not cross any of the edges from the polygon, then they are in line of sight. But the problem is dealing correctly with all the edge cases. In particular, it must be able to deal with all the situations depicted below, where red lines are examples that should be rejected, and green lines are examples that should be accepted. I probably missed a few other situations, such as when the line segment from A to B is colinear with an edge, but one of the end points is outside the polygon. One point of particular interest is the difference between 1 and 9. In both cases, both end points are vertices of the polygon, and there are no edges being intersected, but 1 should be rejected while 9 should be accepted. How to distinguish these two? I could check some middle point within the segment to see if it falls inside or not, but it's easy to come up with situations in which it would fail. Point 7 was also pretty tricky and I had to to treat it as a special case, which checks if two points are adjacent vertices of the polygon directly. But there are also other chances of line segments being col linear with the edges of the polygon, and I'm still not entirely sure how I should handle those cases. Is there any well known solution to this problem?

Read the article
"continue" and "break" for static analysis

- by B. VB.

I know there have been a number of discussions of whether break and continue should be considered harmful generally (with the bottom line being - more or less - that it depends; in some cases they enhance clarity and readability, but in other cases they do not). Suppose a new project is starting development, with plans for nightly builds including a run through a static analyzer. Should it be part of the coding guidelines for the project to avoid (or strongly discourage) the use of continue and break, even if it can sacrifice a little readability and require excessive indentation? I'm most interested in how this applies to C code. Essentially, can the use of these control operators significantly complicate the static analysis of the code possibly resulting in additional false negatives, that would otherwise register a potential fault if break or continue were not used? (Of course a complete static analysis proving the correctness of an aribtrary program is an undecidable proposition, so please keep responses about any hands-on experience with this you have, and not on theoretical impossibilities) Thanks in advance!

Read the article
Why are people using C instead of C++? [closed]

- by Darth

Possible Duplicate: When to use C over C++, and C++ over C? Many times I've stumbled upon people saying that C++ is not always better than C. Great example here would be the Linux kernel, where they simply decided to use C instead of C++ because it had better compilers at the time. But that's many years ago and a lot has changed. So the question is, why are people still using C over C++? I gues there are probably some cases (like embedded devices), where there simply isn't a good C++ compiler, or am I wrong here? What are the other cases when it is better to go with C instead of C++?

Read the article

Search Results

Search found 5267 results on 211 pages for 'use cases'.

Page 10/211 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by jasonw

- by Simon Thorpe

- by extended_events

- by FransBouma

- by extended_events

- by Sai Perchard

- by Naresh Persaud

- by Bruce Tierney

- by Nitesh Jain

- by Naresh Persaud

- by Heinzi

- by Carlos Jaime C. De Leon

- by jb

- by GodMan

- by James Michael Hare

- by Adam Machanic

- by Arthur Wulf White

- by Meta-Knight

- by Coder

- by David Gouveia

- by B. VB.

- by Darth

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >