Search Results

Search found 11759 results on 471 pages for 'isolation level'.

Page 72/471 | < Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >

  • Results from two queries at once in sqlite?

    - by SF.
    I'm currently trying to optimize the sluggish process of retrieving a page of log entries from the SQLite database. I noticed I almost always retrieve next entries along with count of available entries: SELECT time, level, type, text FROM Logs WHERE level IN (%s) ORDER BY time DESC, id DESC LIMIT LOG_REQ_LINES OFFSET %d* LOG_REQ_LINES ; together with total count of records that can match current query: SELECT count(*) FROM Logs WHERE level IN (%s); (for a display "page n of m") I wonder, if I could concatenate the two queries, and ask them both in one sqlite3_exec() simply concatenating the query string. How should my callback function look then? Can I distinguish between the different types of data by argc? What other optimizations would you suggest?

    Read the article

  • log file not generated while accessing a jar having log4j support.

    - by naveen garimella
    I have a x.jar which is being used by some client y.jar. Both x.jar and y.jar are at the same package level. i have put the log4j.xml at the same package level. But the log file is never generated. Can i know why? I have tried few other options as well but no luck till now. i have added log4j-1.2.16.jar to ClassPath: variable in manifest files of both x.jar and y.jar. Put log4j.xml at the class level of y.jar which actually calls the x.jar classes. package structure is as follows: x.jar --manifest.mf has a entry ClassPath:log4j-1.2.16.jar y.jar --manifest.mf has a entry ClassPath:log4j-1.2.16.jar log4j-1.2.16.jar log4j.xml --has a RollingFileAppender. Can any one suggest whether i am missing anything?

    Read the article

  • Android App crashing on one device only

    - by Daniel1402
    I am working on a new game that works perfectly on my test devices, 7-inch tablets and smartphones. But it crashes on my Galaxy Tab2 10-inch tablet with an Out of memory error. It always crashes when I start to play a second game! I have spent a full week checking the codes and I cannot figure out what is wrong. When I play from the menu screen, everything works fine. When I want to replay a game level from the level screen, the game will crash on the second launch. The level screen is made of 3 fragments, each with 32 buttons (4kB in size). I tried to keep only one fragment in memory with viewPager.setOffscreenPageLimit(1); but it does not solve the problem. Could someone stir me in some direction as to where to look for the potential problem? Why is the 10-inch tablet the only one to crash? Thanks.

    Read the article

  • How to handle request/response propagation up and down a widget hierarchy in a GUI app?

    - by fig-gnuton
    Given a GUI application where widgets can be composed of other widgets: If the user triggers an event resulting in a lower level widget needing data from a model, what's the cleanest way to be able to send that request to a controller (or the datastore itself)? And subsequently get the response back to that widget? Presumably one wouldn't want the controller or datastore to be a singleton directly available to all levels of widgets, or is this an acceptable use of singleton? Or should a top level controller be injected as a dependency through a widget hierarchy, as far down as the lowest level widget that might need that controller? Or a different approach entirely?

    Read the article

  • Help me vaildate these points regarding Ruby

    - by Bragaadeesh
    Hi, I have started learning Ruby for the past 2,3 weeks and I have come up with some findings on the language. Can someone please validate these points. Implemented in many other high level languages such as C, Java, .Net etc., Is slow for the obvious reason that it cannot beat any of the already known high level languages. Should never be compared with any other high level language. Not suitable for large applications. Completely open source and is in a budding state. Has a framework called Rails which claims that it would be good for Agile development Community out there is getting better day by day and finding help immediately should not be a problem as time goes by. Has significant changes between releases which many developers wont welcome right away. Running time cannot be comprehensively estimated since the language has several underlying implementation in several languages. Books are always outdated by the time when you finish them. Thanks.

    Read the article

  • Timezone settings in MySQL - Using NOW()?

    - by matt74tm
    SOmewhat related to Doing calculations in MySQL vs PHP Right now, our database assumes that the system time is in UTC and uses that to calculate NOW(). PHP explicitly sets the timezone as UTC (so its impervious to server time zone shifts). An accidental shift of timezones on the server messed this relationship up at the database level and i'm now trying to figure out the ideal congiguration: configure Mysql to be in UTC, but also from the perspective that: our application may be on someone else's server where they might have a different TZ (so i cant set the timezone at the mysql/server level). How do i configure it at the specific database level?

    Read the article

  • how to make several redirection

    - by Olesya Manevich
    Right now I am updating my web site and I want to change all links to non-extension links like example.com/test.php to example.com/test. THis solution I found I put in .htaccess file: RewriteCond %{THE_REQUEST} ^GET\ /(([^/?]*/)*[^/?]+)\.php RewriteRule ^.+\.php$ /%1 [L,R=301] But at the same time I need to make redirection for links what located in lower level of my web-site like example.com/level/test2.php to example.com/test2.php. What should I write in .htaccess to redirect my files from lower level to higher and remove .php extension at the same time? p.s. Thank you very much! and sorry for my poor English.

    Read the article

  • Trouble running setup package after Publishing in Visual Studio 2008

    - by Andrew Cooper
    I've got a small winform application that I've written that is running fine in the IDE. It builds with no errors or warnings. It's not using any third party controls. I'm coding in C# in Visual Studio 2008. When I Build -- Publish the application, everything seems to work fine. However, when I go and attempt to install the application via the setup.exe file I get an error message that says, "Application cannot be started." The error details are below: ERROR DETAILS Following errors were detected during this operation. * [3/18/2010 10:50:56 AM] System.Runtime.InteropServices.COMException - The referenced assembly is not installed on your system. (Exception from HRESULT: 0x800736B3) - Source: System.Deployment - Stack trace: at System.Deployment.Internal.Isolation.IStore.GetAssemblyInformation(UInt32 Flags, IDefinitionIdentity DefinitionIdentity, Guid& riid) at System.Deployment.Internal.Isolation.Store.GetAssemblyManifest(UInt32 Flags, IDefinitionIdentity DefinitionIdentity) at System.Deployment.Application.ComponentStore.GetAssemblyManifest(DefinitionIdentity asmId) at System.Deployment.Application.ComponentStore.GetSubscriptionStateInternal(DefinitionIdentity subId) at System.Deployment.Application.SubscriptionStore.GetSubscriptionStateInternal(SubscriptionState subState) at System.Deployment.Application.ComponentStore.CollectCrossGroupApplications(Uri codebaseUri, DefinitionIdentity deploymentIdentity, Boolean& identityGroupFound, Boolean& locationGroupFound, String& identityGroupProductName) at System.Deployment.Application.SubscriptionStore.CommitApplication(SubscriptionState& subState, CommitApplicationParams commitParams) at System.Deployment.Application.ApplicationActivator.InstallApplication(SubscriptionState& subState, ActivationDescription actDesc) at System.Deployment.Application.ApplicationActivator.PerformDeploymentActivation(Uri activationUri, Boolean isShortcut, String textualSubId, String deploymentProviderUrlFromExtension, BrowserSettings browserSettings, String& errorPageUrl) at System.Deployment.Application.ApplicationActivator.ActivateDeploymentWorker(Object state) I'm not sure what else to do. The only slightly odd thing I used in this application is the SQL Compact Server. Any help would be appreciated. Thanks, Andrew

    Read the article

  • singleton pattern in Windows Activation Service

    - by Joshua
    Hello I have a few WCF services that are currently being self hosted, in a very basic NT Service. I want to expand my application to add provisioning of WCF Services, and updates, as well as isolation (I want each WCF Service to be in its own AppDomain). These WCF Services contain logic that needs to be run on a regular basis, pinging the database, and getting information from external devices so that when a request comes in the data is readily available. I'm thinking about trying out Windows Activation Service, because i really like the provisioning, and isolation that comes with a managed services infrastructure. If I didn't use WAS I would essentially have to write the same code myself. From what I understand though WAS does not really support the model of having a service that is running before someone actually calls a method on the service. the article I read here MSDN Article Link states "That means in essence that out-of-the-box WAS hosting is not something that is really suited for sessionful or singleton services. It is more suitable for stateless per-call services." it does say that "Out of the box" so I'm wondering if anyone has used WAS to host a WCF service that really behaves more like an NT Service (starting and stopping independantly of having a method called upon it). Or any other ideas would be great. I was planning on writting this infrastructure myself, to host WCF services in a custom ServiceHost, and put their execution in a seporate AppDomain, as well as allow for provision of these services after initial installation, along with updates. However, I would MUCH MUCH MUCH rather not own that code if I don't have to. thanks Joshua

    Read the article

  • MySQL return Deadlock with insert row and FK is locked 'for update'

    - by constantin-slednev
    Hello developers! I get deadlock error in my mysql transaction. The simple example of my situation: Thread1 > set autocommit=0; Query OK, 0 rows affected (0.00 sec) Thread1 > SET TRANSACTION ISOLATION LEVEL READ COMMITTED; Query OK, 0 rows affected (0.00 sec) Thread1 > SELECT * FROM A WHERE ID=1000 FOR UPDATE; 1 row in set (0.00 sec) Thread2 > set autocommit=0; Query OK, 0 rows affected (0.00 sec) Thread2 > SET TRANSACTION ISOLATION LEVEL READ COMMITTED; Query OK, 0 rows affected (0.00 sec) Thread2 > INSERT INTO B (AID, NAME) VALUES (1000, 'Hello world'); SLEEP Query OK, 1 row affected (4.99 sec) Thread1 > INSERT INTO B (AID, NAME) VALUES (1000, 'Hello world2'); ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction B.AID -> FK -> A.ID I see three solutions: catch deadlock error in code and retry query. use innodb_locks_unsafe_for_binlog in my.cnf lock (for update) table A in Thread2 before insert Can you give me more solutions ? Current solutions do not fit me.

    Read the article

  • Correct way to generate order numbers in SQL Server

    - by Anton Gogolev
    This question certainly applies to a much broader scope, but here it is. I have a basic ecommerce app, where users can, naturally enough, place orders. Said orders need to have a unique number, which I'm trying to generate right now. Each order is Vendor-specific. Basically, I have an OrderNumberInfo (VendorID, OrderNumber) table. Now whenever a customer places an order I need to increment OrderNumber for a particuar Vendor and return that value. Naturally, I don't want other processes to interfere with me, so I need to exclusively lock this row somehow: begin tranaction declare @n int select @n = OrderNumber from OrderNumberInfo where VendorID = @vendorID update OrderNumberInfo set OrderNumber = @n + 1 where OrderNumber = @n and VendorID = @vendorID commit transaction Now, I've read about select ... with (updlock rowlock), pessimistic locking, etc., but just cannot fit all this in a coherent picture: How do these hints play with SQL Server 2008s' snapshot isolation? Do they perform row-level, page-level or even table-level locks? How does this tolerate multiple users trying to generate numbers for a single Vendor? What isolation levels are appropriate here? And generally - what is the way to do such things?

    Read the article

  • JDBC transaction dead-lock solution required?

    - by user49767
    It's a scenario described my friend and challenged to find solution. He is using Oracle database and JDBC connection with read committed as transaction isolation level. In one of the transaction, he updates a record and executes selects statement and commits the transaction. when everything happening within single thread, things are fine. But when multiple requests are handled, dead-lock happens. Thread-A updates a record. Thread B updates another record. Thread-A issues select statement and waits for Thread-B's transaction to complete the commit operation. Thread-B issues select statement and waits for Thread-A's transaction to complete the commit operation. Now above causes dead-lock. Since they use command pattern, the base framework allows to issue commit only once (at the end of all the db operation), so they are unable to issue commit immediately after select statement. My argument was Thread-A supposed to select all the records which are committed and hence should not be issue. But he said that Thread-A will surely wait till Thread-B commits the record. is that true? What are all the ways, to avoid the above issue? is it possible to change isolation-level? (without changing underlying java framework) Little information about base framework, it is something similar to Struts action, their each and every request handled by one action, transaction begins before execution and commits after execution.

    Read the article

  • Spring Security ACL: NotFoundException from JDBCMutableAclService.createAcl

    - by user340202
    Hello, I've been working on this task for too long to abandon the idea of using Spring Security to achieve it, but I wish that the community will provide with some support that will help reduce the regret that I have for choosing Spring Security. Enough ranting and now let's get to the point. I'm trying to create an ACL by using JDBCMutableAclService.createAcl as follows: [code] public void addPermission(IWFArtifact securedObject, Sid recipient, Permission permission, Class clazz) { ObjectIdentity oid = new ObjectIdentityImpl(clazz.getCanonicalName(), securedObject.getId()); this.addPermission(oid, recipient, permission); } @Override @Transactional(propagation = Propagation.REQUIRED, isolation = Isolation.READ_UNCOMMITTED, readOnly = false) public void addPermission(ObjectIdentity oid, Sid recipient, Permission permission) { SpringSecurityUtils.assureThreadLocalAuthSet(); MutableAcl acl; try { acl = this.mutableAclService.createAcl(oid); } catch (AlreadyExistsException e) { acl = (MutableAcl) this.mutableAclService.readAclById(oid); } // try { // acl = (MutableAcl) this.mutableAclService.readAclById(oid); // } catch (NotFoundException nfe) { // acl = this.mutableAclService.createAcl(oid); // } acl.insertAce(acl.getEntries().length, permission, recipient, true); this.mutableAclService.updateAcl(acl); } [/code] The call throws a NotFoundException from the line: [code] // Retrieve the ACL via superclass (ensures cache registration, proper retrieval etc) Acl acl = readAclById(objectIdentity); [/code] I believe this is caused by something related to Transactional, and that's why I have tested with many TransactionDefinition attributes. I have also doubted the annotation and tried with declarative transaction definition, but still with no luck. One important point is that I have used the statement used to insert the oid in the database earlier in the method directly on the database and it worked, and also threw a unique constraint exception at me when it tried to insert it in the method. I'm using Spring Security 2.0.8 and IceFaces 1.8 (which doesn't support spring 3.0 but definetely supprorts 2.0.x, specially when I keep caling SpringSecurityUtils.assureThreadLocalAuthSet()). My AppServer is Tomcat 6.0, and my DB Server is MySQL 6.0 I wish to get back a reply soon because I need to get this task off my way

    Read the article

  • Testing Hibernate DAO, without building the universe around it.

    - by Varun Mehta
    We have an application built using spring/Hibernate/MySQL, now we want to test the DAO layer, but here are a few shortcomings we face. Consider the use case of multiple objects connected to one another, eg: Book has Pages. The Page object cannot exist without the Book as book_id is mandatory FK in Page. For testing a Page I have to create a Book. This simple usecase is easy to manage, but if you start building a Library, till you don't create the whole universe surrounding the Book and Page, you cannot test it! So to test Page; Create Library Create Section Create Genre Create Author Create Book Create Page Now test Page. Is there an easy way to by pass this "universe creation" and just test he page object in isolation. I also want to be able to test HQLs related to Page. eg: SELECT new com.test.BookPage (book.id, page.name) FROM Book book, Page page. JUnit is supposed to run in isolation, so I have to write the whole test case to create the Page. Any tips will be useful.

    Read the article

  • What is New in ASP.NET 4.0 Code Access Security

    - by HosamKamel
    ASP.NET Code Access Security (CAS) is a feature that helps protect server applications on hosting multiple Web sites, ASP.NET lets you assign a configurable trust level that corresponds to a predefined set of permissions. ASP.NET has predefined ASP.NET Trust Levels and Policy Files that you can assign to applications, you also can assign custom trust level and policy files. Most web hosting companies run ASP.NET applications in Medium Trust to prevent that one website affect or harm another site etc. As .NET Framework's Code Access Security model has evolved, ASP.NET 4.0 Code Access Security also has introduced several changes and improvements.   A Full post addresses the new changes in ASP.NET 4.0 is published at Asp.Net QA Team Here http://weblogs.asp.net/asptest/archive/2010/04/23/what-is-new-in-asp-net-4-0-code-access-security.aspx

    Read the article

  • When should I use Areas in TFS instead of Team Projects

    - by Martin Hinshelwood
    Well, it depends…. If you are a small company that creates a finite number of internal projects then you will find it easier to create a single project for each of your products and have TFS do the heavy lifting with reporting, SharePoint sites and Version Control. But what if you are not… Update 9th March 2010 Michael Fourie gave me some feedback which I have integrated. Ed Blankenship via @edblankenship offered encouragement and a nice quote. Ewald Hofman gave me a couple of Cons, and maybe a few more soon. Ewald’s company, Avanade, currently uses Areas, but it looks like the manual management is getting too much and the project is getting cluttered. What if you are likely to have hundreds of projects, possibly with a multitude of internal and external projects? You might have 1 project for a customer or 10. This is the situation that most consultancies find themselves in and thus they need a more sustainable and maintainable option. What I am advocating is that we should have 1 “Team Project” per customer, and use areas to create “sub projects” within that single “Team Project”. "What you describe is what we generally do internally and what we recommend. We make very heavy use of area path to categorize the work within a larger project." - Brian Harry, Microsoft Technical Fellow & Product Unit Manager for Team Foundation Server   "We tend to use areas to segregate multiple projects in the same team project and it works well." - Tiago Pascoal, Visual Studio ALM MVP   "In general, I believe this approach provides consistency [to multi-product engagements] and lowers the administration and maintenance costs. All good." - Michael Fourie, Visual Studio ALM MVP   “@MrHinsh BTW, I'm very much a fan of very large, if not huge, team projects in TFS. Just FYI :) Use Areas & Iterations.” Ed Blankenship, Visual Studio ALM MVP   This would mean that SSW would have a single Team Project called “SSW” that contains all of our internal projects and consequently all of the Areas and Iteration move down one hierarchy to accommodate this. Where we would have had “\SSW\Sprint 1” we now have “\SSW\SqlDeploy\Sprint1” with “SqlDeploy” being our internal project. At the moment SSW has over 70 internal projects and more than 170 total projects in TFS. This method has long term benefits that help to simplify the support model for companies that often have limited internal support time and many projects. But, there are implications as TFS does not provide this model “out-of-the-box”. These implications stretch across Areas, Iterations, Queries, Project Portal and Version Control. Michael made a good comment, he said: I agree with your approach, assuming that in a multi-product engagement with a client, they are happy to adopt the same process template across all products. If they are not, then it’ll either be easy to convince them or there is a valid reason for having a different template - Michael Fourie, Visual Studio ALM MVP   At SSW we have a standard template that we use and this is applied across the board, to all of our projects. We even apply any changes to the core process template to all of our existing projects as well. If you have multiple projects for the same clients on multiple templates and you want to keep it that way, then this approach will not work for you. However, if you want to standardise as we have at SSW then this approach may benefit you as well. Implications around Areas Areas should be used for topological classification/isolation of work items. You can think of this as architecture areas, organisational areas or even the main features of your application. In our scenario there is an additional top level item that represents the Project / Product that we want to chop our Team Project into. Figure: Creating a sub area to represent a product/project is easy. <teamproject> <teamproject>\<Functional Area/module whatever> Becomes: <teamproject> <teamproject>\<ProjectName>\ <teamproject>\<ProjectName>\<Functional Area/module whatever> Implications around Iterations Iterations should be used for chronological classification/isolation of work items. This could include isolated time boxes, milestones or release timelines and really depends on the logical flow of your project or projects. Due to the new level in Area we need to add the same level to Iteration. This is primarily because it is unlikely that the sprints in each of your projects/products will start and end at the same time. This is just a reality of managing multiple projects. Figure: Adding the same Area value to Iteration as the top level item adds flexibility to Iteration. <teamproject>\Sprint 1 Or <teamproject>\Release 1\Sprint 1 Becomes: <teamproject>\<ProjectName>\Sprint 1 Or <teamproject>\<ProjectName>\Release 1\Sprint 1 Implications around Queries Queries are used to filter your work items based on a specified level of granularity. There are a number of queries that are built into a project created using the MSF Agile 5.0 template, but we now have multiple projects and it would be a pain to have to edit all of the work items every time we changed project, and that would only allow one team to work on one project at a time.   Figure: The Queries that are created in a normal MSF Agile 5.0 project do not quite suit our new needs. In order for project contributors to be able to query based on their project we need a couple of things. The first thing I did was to create an “_Area Template” folder that has a copy of the project layout with all the queries setup to filter based on the “_Area Template” Area and the “_Sprint template” you can see in the Area and Iteration views. Figure: The template is currently easily drag and drop, but you then need to edit the queries to point at the right Area and Iteration. This needs a tool. I then created an “Areas” folder to hold all of the area specific queries. So, when you go to create a new TFS Sub-Project you just drag “_Area Template” while holding “Ctrl” and drop it onto “Areas”. There is a little setup here. That said I managed it in around 10 minutes which is not so bad, and I can imagine it being quite easy to build a tool to create these queries Figure: These new queries can be configured in around 10 minutes, which includes setting up the Area and Iteration as well. Version Control What about your source code? Well, that is the easiest of the lot. Just create a sub folder for each of your projects/products.   Figure: Creating sub folders in source control is easy as “Right click | Create new folder”. <teamproject>\DEV\Main\ Becomes: <teamproject>\<ProjectName>\DEV\Main\ Conclusion I think it is up to each company to make a call on how you want to configure your Team Projects and it depends completely on how many projects/products you are going to have for each customer including yourself. If we decide to utilise this route it will require some configuration to get our 170+ projects into this format, and I will probably be writing some tools to help. Pros You only have one project to upgrade when a process template changes – After going through an upgrade of over 170 project prior to the changes in the RC I can tell you that that many projects is no fun. Standardises your Process Template – You will always have the same Process implementation across projects/products without exception You get tighter control over the permissions – Yes, you can do this on a standard Team Project, but it gets a lot easier with practice. You can “move” work items from one “product” to another – Have we not always wanted to do that. You can rename your projects – Wahoo: everyone wants to do this, now you can. One set of Reporting Services reports to manage – You set an area and iteration to run reports anyway, so you may as well set both. Simplified Check-In Policies– There is only one set of check-in policies per client. This simplifies administration of policies. Simplified Alerts – As alerts are applied across multiple projects this simplifies your alert rules as per client. Cons All of these cons could be mitigated by a custom tool that helps automate creation of “Sub-projects” within Team Projects. This custom tool could create areas, Iteration, permissions, SharePoint and queries. It just does not exist yet :) You need to configure the Areas and Iterations You need to configure the permissions You may need to configure sub sites for SharePoint (depends on your requirement) – If you have two projects/products in the same Team Project then you will not see the burn down for each one out-of-the-box, but rather a cumulative for the Team Project. This is not really that much of a problem as you would have to configure your burndown graphs for your current iteration anyway. note: When you create a sub site to a TFS linked portal it will inherit the settings of its parent site :) This is fantastic as it means that you can easily create sub sites and then set the Area and Iteration path in each of the reports to be the correct one. Every team wants their own customization (via Ewald Hofman) - small teams of 2 persons against teams of 30 – or even outsourcing – need their own process, you cannot allow that because everybody gets the same work item types. note: Luckily at SSW this is not a problem as our template is standardised across all projects and customers. Large list of builds (via Ewald Hofman) – As the build list in Team Explorer is just a flat list it can get very cluttered. note: I would mitigate this by removing any build that has not been run in over 30 days. The build template and workflow will still be available in version control, but it will clean the list. Feedback Now that I have explained this method, what do you think? What other pros and cons can you see? What do you think of this approach? Will you be using it? What tools would you like to support you?   Technorati Tags: Visual Studio ALM,TFS Administration,TFS,Team Foundation Server,Project Planning,TFS Customisation

    Read the article

  • The Incremental Architect&rsquo;s Napkin - #5 - Design functions for extensibility and readability

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities.   PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

    Read the article

  • Control diamond square algorithm to generate islands/pangea.

    - by Gabriel A. Zorrilla
    I generated a height map with the diamond square algorithm. The thing is i do not manage to create islands, this is, restrict the height other than water level range to a certain value in the center of the map. I manualy seeded a circle in the middle of the map but the rest of the map still receives heights over the water level. I dont fully understand the Perlin noise algorithm so i'd like to work with my current implementation of the diamond square algorithm which took me 3 days to interpret and code in PHP. :P

    Read the article

  • Quick guide to Oracle IRM 11g: Classification design

    - by Simon Thorpe
    Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

    Read the article

  • Option Trading: Getting the most out of the event session options

    - by extended_events
    You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers   Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is  largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • SOA’s People Problem by Bob Rhubart

    - by JuergenKress
    Are reluctant passengers slowing down your SOA train? Based on my conversations with various experts in service-oriented architecture (SOA), the consensus is that SOA tools and technology have achieved a high level of maturity. Some even use the term industrialization to describe the current state of SOA. Given that scenario, one might assume that SOA has been wildly successful for every organization that has adopted its principles. Obviously SOA could not have achieved its current level of maturity and industrialization without having reached a tipping point in the volume of success stories to drive continued adoption. But some organizations continue to struggle with SOA. The problem, according to some experts, has little to do with tools or technologies. “One of the greatest challenges to implementing SOA has nothing to do with the intrinsic complexity behind a SOA technology platform,” says Oracle ACE Luis Augusto Weir, senior Oracle solution director at HCL AXON. “The real difficulty lies in dealing with people and processes from different parts of the business and aligning them to deliver enterprisewide solutions.” What can an organization do to meet that challenge? “Staff the right people,” says Weir. “For example, the role of a SOA architect should be as much about integrating people as it is about integrating systems. Dealing with people from different departments, backgrounds, and agendas is a huge challenge. The SOA architect role requires someone that not only has a sound architectural and technological background but also has charisma and human skills, and can communicate equally well to the business and technical teams.” The SOA architect’s communication skills are instrumental in establishing service orientation as the guiding principle across the organization. “A consistent architecture comprising both business services and IT services can comprehensively redefine the role of IT at the process level,” says Danilo Schmiedel, solution architect at Opitz Consulting. That helps to shift the focus from siloes to services and get SOA on track. To that end, Oracle ACE Director Lonneke Dikmans, a managing partner at Vennster, stresses the importance of replacing individual, uncoordinated projects with a focused program that promotes communication, cooperation, and service reuse. “Having support among lead developers and architects helps, as does having sponsors that see the business case and understand the strategic value,” she says. Read the complete article here. SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Facebook Wiki Technorati Tags: Bob Rhubard,OTN,Lonneke Dikmans,SOA Community,Oracle SOA,Oracle BPM,Community,OPN,Jürgen Kress

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.  Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter.  As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.  

    Read the article

  • LLBLGen Pro v3.5 has been released!

    - by FransBouma
    Last weekend we released LLBLGen Pro v3.5! Below the list of what's new in this release. Of course, not everything is on this list, like the large amount of work we put in refactoring the runtime framework. The refactoring was necessary because our framework has two paradigms which are added to the framework at a different time, and from a design perspective in the wrong order (the paradigm we added first, SelfServicing, should have been built on top of Adapter, the other paradigm, which was added more than a year after the first released version). The refactoring made sure the framework re-uses more code across the two paradigms (they already shared a lot of code) and is better prepared for the future. We're not done yet, but refactoring a massive framework like ours without breaking interfaces and existing applications is ... a bit of a challenge ;) To celebrate the release of v3.5, we give every customer a 30% discount! Use the coupon code NR1ORM with your order :) The full list of what's new: Designer Rule based .NET Attribute definitions. It's now possible to specify a rule using fine-grained expressions with an attribute definition to define which elements of a given type will receive the attribute definition. Rules can be assigned to attribute definitions on the project level, to make it even easier to define attribute definitions in bulk for many elements in the project. More information... Revamped Project Settings dialog. Multiple project related properties and settings dialogs have been merged into a single dialog called Project Settings, which makes it easier to configure the various settings related to project elements. It also makes it easier to find features previously not used  by many (e.g. type conversions) More information... Home tab with Quick Start Guides. To make new users feel right at home, we added a home tab with quick start guides which guide you through four main use cases of the designer. System Type Converters. Many common conversions have been implemented by default in system type converters so users don't have to develop their own type converters anymore for these type conversions. Bulk Element Setting Manipulator. To change setting values for multiple project elements, it was a little cumbersome to do that without a lot of clicking and opening various editors. This dialog makes changing settings for multiple elements very easy. EDMX Importer. It's now possible to import entity model data information from an existing Entity Framework EDMX file. Other changes and fixes See for the full list of changes and fixes the online documentation. LLBLGen Pro Runtime Framework WCF Data Services (OData) support has been added. It's now possible to use your LLBLGen Pro runtime framework powered domain layer in a WCF Data Services application using the VS.NET tools for WCF Data Services. WCF Data Services is a Microsoft technology for .NET 4 to expose your domain model using OData. More information... New query specification and execution API: QuerySpec. QuerySpec is our new query specification and execution API as an alternative to Linq and our more low-level API. It's build, like our Linq provider, on top of our lower-level API. More information... SQL Server 2012 support. The SQL Server DQE allows paging using the new SQL Server 2012 style. More information... System Type converters. For a common set of types the LLBLGen Pro runtime framework contains built-in type conversions so you don't need to write your own type converters anymore. Public/NonPublic property support. It's now possible to mark a field / navigator as non-public which is reflected in the runtime framework as an internal/friend property instead of a public property. This way you can hide properties from the public interface of a generated class and still access it through code added to the generated code base. FULL JOIN support. It's now possible to perform FULL JOIN joins using the native query api and QuerySpec. It's left to the developer to check whether the used target database supports FULL (OUTER) JOINs. Using a FULL JOIN with entity fetches is not recommended, and should only be used when both participants in the join aren't the target of the fetch. Dependency Injection Tracing. It's now possible to enable tracing on dependency injection. Enable tracing at level '4' on the traceswitch 'ORMGeneral'. This will emit trace information about which instance of which type got an instance of type T injected into property P. Entity Instances in projections in Linq. It's now possible to return an entity instance in a custom Linq projection. It's now also possible to pass this instance to a method inside the query projection. Inheritance fully supported in this construct. Entity Framework support The Entity Framework has been updated in the recent year with code-first support and a new simpler context api: DbContext (with DbSet). The amount of code to generate is smaller and the context simpler. LLBLGen Pro v3.5 comes with support for DbContext and DbSet and generates code which utilizes these new classes. NHibernate support NHibernate v3.2+ built-in proxy factory factory support. By default the built-in ProxyFactoryFactory is selected. FluentNHibernate Session Manager uses 1.2 syntax. Fluent NHibernate mappings generate a SessionManager which uses the v1.2 syntax for the ProxyFactoryFactory location Optionally emit schema / catalog name in mappings Two settings have been added which allow the user to control whether the catalog name and/or schema name as known in the project in the designer is emitted into the mappings.

    Read the article

  • SyncToBlog #12 Windows Azure and Cloud Links

    - by Eric Nelson
    Some more “syncing to paper” :) Steve Marx wrote a very interesting article about using Hosted Web Core in an Azure Worker Role. Hosted Web Core is a new feature in IIS 7 that enables developers to create applications that load the core IIS functionality. Wade Wegner is a new Technical Evangelist for Windows Azure platform AppFabric Example from Wade (and how I found him) Host WCF Services in IIS with Service Bus Endpoints Google and vmware “get engaged” over cloud http://googlecode.blogspot.com/2010/05/enabling-cloud-portability-with-google.html A new cloud comparison site – slick but limited coverage (it is not at Azure level, rather BPOS level) www.cloudhypermarket.com  The Rise of NoSQL Database (devx free registration required) Moe Khosravy talks about Codename "Dallas"  to my colleague David G (14min video) New videos Calculating the cost of Azure and Calculating the cost of SQL Azure Related Links: Previous SyncToBlog posts My delicious bookmarks

    Read the article

  • Scrum for a single programmer?

    - by Rob Perkins
    I'm billed as the "Windows Expert" in my very small company, which consists of myself, a mechanical engineer working in a sales and training role, and the company's president, working in a design, development, and support role. My role is equally as general, but primarily I design and implement whatever programming on our product needs to get done in order for our stuff to run on whichever versions of Windows are current. I just finished watching a high-level overview of the Scrum paradigm, given in a webcast. My question is: Is it worth my time to learn more about this approach to product development, given that my development work items are usually given at a very high level, such as "internationalize and localize the product". If it is, how would you suggest adapting Scrum for the use of just one programmer? What tools, cloud-based or otherwise, would be useful to that end? If it is not, what approach would you suggest for a single programmer to organize his efforts from day to day? (Perhaps the question reduces to that simple question.)

    Read the article

< Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >