Search Results

Search found 9199 results on 368 pages for 'topological sort'.

Page 61/368 | < Previous Page | 57 58 59 60 61 62 63 64 65 66 67 68  | Next Page >

  • Need ideas for an innovation week

    - by slandau
    So 4 times a year we have an innovation week (to even out the odd sprint releases). This whole week is dedicated to experimenting with new technology/ideas that could potentially help progress the software department or the company as a whole, and serve as sort of a starting point for new ideas and brainstorming. For example, the last one contained a lot of projects. One was the re-design of our web app into more of a Web 2.0 look and feel using JQuery and a lot of cool CSS tricks. Another was a proposal for a new bug tracking software as opposed to the clearly outdated one we use, and another was a very cool JQuery/Js design that could show the same page to multiple users on different computers and allow each of them to take "charge" of the page, disabling the other one from doing anything, and vice versa, seeing all updates in real time -- sort of like Netmeeting through Js. Well, this is my first one as a new employee so I wanted to think of something cool. We get one week (anywhere from 40-60 hours or so), and we usually pair up or do this in groups of 3-4, depending on how many projects there are. Projects have to get approved but usually that doesn't prove to be too difficult. We are in the financial analysis software industry if the domain was leading you guys to think of anything helpful. I am primarily working on a web app in MVC 2 at the moment using a lot of JQuery and a C# backend. Do you guys have any idea of something that would be cool/beneficial/worth it?

    Read the article

  • Get 100 highest numbers from an infinite list

    - by Sachin Shanbhag
    One of my friend was asked this interview question - "There is a constant flow of numbers coming in from some infinite list of numbers out of which you need to maintain a datastructure as to return the top 100 highest numbers at any given point of time. Assume all the numbers are whole numbers only." This is simple, you need to keep a sorted list in descending order and keep a track on lowest number in that list. If new number obtained is greater than that lowest number then you have to remove that lowest number and insert the new number in sorted list as required. Then question was extended - "Can you make sure that the Order for insertion should be O(1)? Is it possible?" As far as I knew, even if you add a new number to list and sort it again using any sort algorithm, it would by best be O(logn) for quicksort (I think). So my friend told it was not possible. But he was not convinced, he asked to maintain any other data structure rather than a list. I thought of balanced Binary tree, but even there you will not get the insertion with order of 1. So the same question I have too now. Wanted to know if there is any such data structure which can do insertion in Order of 1 for above problem or it is not possible at all.

    Read the article

  • Where can I find design exercises to work on?

    - by Oak
    I feel it's important to continue practicing my problem-solving skills. Writing my own mini-projects is one way, but another is to try and solve problems posted online. It's easy to find interesting programming quizzes online that require applying clever algorithms to solve - Project Euler is one well-known example. However, in a lot of real-life projects the design of the software - especially in the initial phases - has a large impact and at later stages it cannot be tweaked as easily as plain algorithms. In order to improve these skills, I'm looking for any collection of design problems. When I say "design", I mean the abstract design of a software solution - for example what modules will there be and what are the dependencies between them, how data will flow in the program, what sort of data needs to be saved in the database, etc. Design problems are those problems that are critical to solve in the early stages of any project, but their solution is a whiteboard diagram without a single line of code. Of course these sort of problems do not have a single correct solution, but I'll be especially happy with any place that also displays pros and cons of the typical solutions that might be used to approach the problem.

    Read the article

  • suggestions for lonewolf dev setup

    - by d33j
    I'm looking for some suggestions for a better development setup. Background: I'm a crusty old software engineer (mostly java of late) and I have around 50 - 100 incomplete java projects scattered everywhere, usb keys, HDDs, and spanning across 5 or 6 computers etc, which have been put on hold for a few years (ie: family). I have no version control at home. I've been using IntelliJ for around 10 years, so that's the only constant. I'm thinking of nominating one machine as a headless server to put all my projects on, maybe a ubuntu box, that way It won't matter which device I'm on, all my projects can be accessed (and I don't have to waste time actually looking for them). I don't need to access code over the net. These are my own 'happy place' projects so I only work on them when I'm at home, however I can see the benefit of the tasking app being online, that way if I think of something while on public transport lets say, I can add it then & there, but it's not a requirement. I can wait until I get home to create tasks. Summary: So I need some sort of version control so I can rollback mistakes, and some sort of simple tasking software where I can assign tasks for myself later on when I get time. I use Subversion, Sonar, Jira and Crucible at work but I think it's a little bit of an overkill for me though. What do you suggest?

    Read the article

  • Which things instantly ring alarm bells when looking at code? [closed]

    - by FinnNk
    I attended a software craftsmanship event a couple of weeks ago and one of the comments made was "I'm sure we all recognize bad code when we see it" and everyone nodded sagely without further discussion. This sort of thing always worries me as there's that truism that everyone thinks they're an above average driver. Although I think I can recognize bad code I'd love to learn more about what other people consider to be code smells as it's rarely discussed in detail on people's blogs and only in a handful of books. In particular I think it'd be interesting to hear about anything that's a code smell in one language but not another. I'll start off with an easy one: Code in source control that has a high proportion of commented out code - why is it there? was it meant to be deleted? is it a half finished piece of work? maybe it shouldn't have been commented out and was only done when someone was testing something out? Personally I find this sort of thing really annoying even if it's just the odd line here and there, but when you see large blocks interspersed with the rest of the code it's totally unacceptable. It's also usually an indication that the rest of the code is likely to be of dubious quality as well.

    Read the article

  • Sorting for 2D Drawing

    - by Nexian
    okie, looked through quite a few similar questions but still feel the need to ask mine specifically (I know, crazy). Anyhoo: I am drawing a game in 2D (isometric) My objects have their own arrays. (i.e. Tiles[], Objects[], Particles[], etc) I want to have a draw[] array to hold anything that will be drawn. Because it is 2D, I assume I must prioritise depth over any other sorting or things will look weird. My game is turn based so Tiles and Objects won't be changing position every frame. However, Particles probably will. So I am thinking I can populate the draw[] array (probably a vector?) with what is on-screen and have it add/remove object, tile & particle references when I pan the screen or when a tile or object is specifically moved. No idea how often I'm going to have to update for particles right now. I want to do this because my game may have many thousands of objects and I want to iterate through as few as possible when drawing. I plan to give each element a depth value to sort by. So, my questions: Does the above method sound like a good way to deal with the actual drawing? What is the most efficient way to sort a vector? Most of the time it wont require efficiency. But for panning the screen it will. And I imagine if I have many particles on screen moving across multiple tiles, it may happen quite often. For reference, my screen will be drawing about 2,800 objects at any one time. When panning, it will be adding/removing about ~200 elements every second, and each new element will need adding in the correct location based on depth.

    Read the article

  • Sorting Algorithm : output

    - by Aaditya
    I faced this problem on a website and I quite can't understand the output, please help me understand it :- Bogosort, is a dumb algorithm which shuffles the sequence randomly until it is sorted. But here we have tweaked it a little, so that if after the last shuffle several first elements end up in the right places we will fix them and don't shuffle those elements furthermore. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle we get (1, 2, 5, 4, 3, 6) we will keep 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm. Calculate the expected amount of shuffles for the improved algorithm to sort the sequence of the first n natural numbers given that no elements are in the right places initially. Input: 2 6 10 Output: 2 1826/189 877318/35343 For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions. I just can't understand the output.

    Read the article

  • Is there an idiot's guide to software licensing somewhere?

    - by Karpie
    Basically, my knowledge on the issue is zilch other than the fact that open-source and closed-source exists. I'm a web developer (not a designer in the slightest), so I look online for things like icons. I've always been a big fan of these icons, which have a Creative Commons Attribution 2.5 License. As far as I can see, this license says 'do whatever you want with them as long as you have a link back to me somewhere'. Is that assumption correct? Just today I found a new icon set, with a much more confusing license (found here), and to be quite honest I have no idea if I'm allowed to use them or not. At the moment I want to just use them for toy stuff that might never see the light of day, but then my source code is stored on Github, is it legal to store the icons there where they're publicly accessible? If I put them on my personal website that might have ads on it to make me five cents every now and then, is that legal? If I use them on a site that offers a free service to users, is that legal? If that site then starts making money (via things like paid subscriptions) or gets bought out by someone (highly unlikely but one day possible) is that legal? Is there some noob guide out there that explains all this stuff, because I would hate to start using this sort of stuff now only to have to change it all later. Even if I buy the icons, there's still licensing issues that I don't understand! :( And this sort of stuff keeps popping up more and more often...

    Read the article

  • What are some good tips for a developer trying to design a scalable MySQL database?

    - by CFL_Jeff
    As the question states, I am a developer, not a DBA. I have experience with designing good ER schemas and am fairly knowledgeable about normalization and good schema design. I have also worked with data warehouses that use dimensional modeling with fact tables and dim tables. However, all of the database-driven applications I've developed at previous jobs have been internal applications on the company's intranet, never receiving "real-world traffic". Furthermore, at previous jobs, I have always had a DBA or someone who knew much more than me about these things. At this new job I just started, I've been asked to develop a public-facing application with a MySQL backend and the data stored by this application is expected to grow very rapidly. Oh, and we don't have a DBA. Well, I guess I am the DBA. ;) As far as designing a database to be scalable, I don't even know where to start. Does anyone have any good tips or know of any good educational materials for a developer who has been sort of shoved into a DBA/database designer role and has been tasked with designing a scalable database to support an application like this? Have any other developers been through this sort of thing? What did you do to quickly become good at this role? I've found some good slides on the subject here but it's hard to glean details from slides. Wish I could've attended that guy's talk. I also found a good blog entry called 5 Ways to Boost MySQL Scalability which had some good information, though some of it was over my head. tl;dr I just want to make sure the database doesn't have to be completely redesigned when it scales up, and I'm looking for tips to get it right the first time. The answer I'm looking for is a "list of things every developer should know about making a scalable MySQL database so your application doesn't perform like crap when the data gets huge".

    Read the article

  • How to identify web development benchmarking questions? [closed]

    - by GenericJam
    I am in my final year of college and I have to put forward some sort of thesis for my final year project. The project is a web based attendance system that I am building for the college. I have it about 70% complete in Java. After completing it in Java, the plan is for me to rewrite the server bit in Erlang and then release the bitter rivals in a head to head cage match. The idea being that there is some sort of grounds for comparison. There are a few hurdles along the way, such as me learning Erlang. I understand that a performance comparison like this isn't strictly scientific as there are many factors such as the programmer (myself); the hardware it runs on; etc... but it is meant to be a reasonable comparison of the merits of using Java vs. Erlang for web development. I need help in identifying what the relevant questions are that my project could address. Even though the project scope is fixed, I am trying to shoehorn in some worthwhile scientific inquiries.

    Read the article

  • Getting into the details of game engine programming

    - by Darkslash
    I am interested in learning game programming, but I really have an interest in the lower level engineering in games. I have OpenGL experience, and I am really interested in learning more about implementing AI, Physics, etc. I have a computer science degree, so I really like getting into technical stuff. Many times when I ask about this sort of thing, I get a lot of "Use an engine", "Use Unity3d", "Why waste your time writing code that already exists", etc, etc. My idea was to use simpler libraries such as SFML or XNA so that I could learn how to implement the more complex systems. The thing is, although I do want to write games, I want to learn things that using something like Unity simply doesn't teach you. My goal is not to make a current generation quality 3D game to sell, I just want to make some cool smaller games and learn all I can about the programming side of game development. Is this something that people just do not do anymore? It seems like everywhere I turn people are using Unity or UDK or GameMaker. I fully understand why you would use a tool like these, but I cant see how they would suit my purposes. So where does someone like myself turn? Am I trying to learn something that people just do not bother doing anymore? Is the innovation in this area gone and just all about gameplay now? I'm sorry if this question seems silly, but I am genuinely interested in knowing more about this and meeting more people who are interested in this sort of thing.

    Read the article

  • Draw "vision cone" / targetting element onto game world

    - by gkimsey
    I'm wanting to indicate various things using a "pie slice" sort of shape as below. Similar to vision cones in stealth game minimaps, or targetting indicators in RTS type games for frontal area attacks. Something generic enough to be used for both would be ideal. I need to be able to procedurally (and efficiently) change things like the slice width and length, color, transparency, position in the world, etc. For my particular situation, there's no concern with elevation, funky terrain, or really any third axis at all as far as this element is concerned. I have two first inclinations on how to accomplish this: 1) Manually generate the vertices for a main triangle, (possibly two, superimposed to get the border effect), a handful more to approximate the arc at the end, and roll it into a mesh. 2) Use some sort of 2D drawing library to create a circle and mask it off at the right angles, render to texture, and use that. For reference, I have some experience with Ogre3D, but I'm not attached to it as this is a mostly academic pursuit at the moment. Other technologies that might be better at accomplishing this are more than welcome. Finally, I'm kind of curious about how to do a "flashlight" or similar 3D effect that could produce the same result, but on all surfaces in the lit area.

    Read the article

  • Reverse-Engineer Driver for Backlit Keyboard

    - by user87847
    Here's my situation: I recently purchased a Sager NP9170 (same as the Clevo P170EM) and it has a multi-colored, backlit keyboard. Under Windows 7, you can launch an app that allows you to change the color of the backlighting to any of a handful of colors (blue, green, red, etc). I want that same functionality under Linux. I haven't been able to find any software that does this, so I guess I'm going to have to write it myself. I'm a programmer by trade, but I've haven't done much low level programming, and I've certainly never written a device driver, so I was wondering if anyone could answer these two questions: 1) Is there any software already out there that does this sort of thing? I've looked fairly thoroughly but haven't found anything applicable. 2) Where would I start in trying to reverse engineer this sort of thing? Any useful articles, tutorials, books that might help? And just to clarify: The backlighting already works, that's not the problem. I just want to be able to change the color of the backlighting. This functionality is supported by the hardware. The laptop came with windows software that does this and I want the same functionality in Linux. I am willing to write this software myself, I just want to know the best way to go about it. Thanks!

    Read the article

  • Best CMS for review-type sites

    - by Pru
    Is there an ideal CMS for making a review site? By review site, I mean like a restaurant review site where you have each entry belonging to different major categories like Cuisine and City. Then users can browse and filter by each or by combination (Chinese Food in Los Angeles, with suggestions of other Chinese restaurants in LA, etc). Furthermore, I'd want it to support other fields like price, parking, kid-friendliness, etc. And to have users be able to filter by those criteria. I've been told that with a combination of custom taxonomies, plug-ins and many clever little queries, that Wordpress 3.x can handle this. But I'm having a heck of a time with it getting into the nitty gritty, and that's where I find the community support is lacking. The sort of stuff you'd think would work in WP, like making one parent category for Cuisine and one for City, don't really work once you get further in and start trying to pull it all together. Then you find these blog posts where people say, "This example shows that one could create a huge movie review site using custom taxonomies..." but when you go and try it you hit all sorts of challenges and oddities that point a big long finger at Wordpress being in fact a blogging platform. The best I came up with was one category for the cuisine and one tag for the city, then I created a couple of custom tag-like taxonomies for the other features. It's quite a mess to try to figure out how to assemble all of that into a natural, intuitive site. I expect a few versions down the road WP will be able to do these sorts of sites out of the box. So I thought I'd take a step back before I run back into the Wordpress fray and find out if maybe there is another platform better suited to this sort of relational content site. Directory scripts in some ways offer many of the features I'm looking for, but I need something more flexible and, hopefully, interactive (comments, reviews). I'm especially looking for feedback from people who've crafted sites like this. Thanks!

    Read the article

  • Why is the remove function not working for hashmaps? [migrated]

    - by John Marston
    I am working on a simple project that obtains data from an input file, gets what it needs and prints it to a file. I am basically getting word frequency so each key is a string and the value is its frequency in the document. The problem however, is that I need to print out these values to a file in descending order of frequency. After making my hashmap, this is the part of my program that sorts it and writes it to a file. //Hashmap I create Map<String, Integer> map = new ConcurrentHashMap<String, Integer>(); //function to sort hashmap while (map.isEmpty() == false){ for (Entry<String, Integer> entry: map.entrySet()){ if (entry.getValue() > valueMax){ max = entry.getKey(); System.out.println("max: " + max); valueMax = entry.getValue(); System.out.println("value: " + valueMax); } } map.remove(max); out.write(max + "\t" + valueMax + "\n"); System.out.println(max + "\t" + valueMax); } When I run this i get: t 9 t 9 t 9 t 9 t 9 .... so it appears the remove function is not working as it keeps getting the same value. I'm thinking i have an issue with a scope rule or I just don't understand hashmaps very well. If anyone knows of a better way to sort a hashmap and print it, I would welcome a suggestion. thanks

    Read the article

  • Matching users based on a series of questions

    - by SeanWM
    I'm trying to figure out a way to match users based on specific personality traits. Each trait will have its own category. I figure in my user table I'll add a column for each category: id name cat1 cat2 cat3 1 Sean ? ? ? 2 Other ? ? ? Let's say I ask each user 3 questions in each category. For each question, you can answer one of the following: No, Maybe, Yes How would I calculate one number based off the answers in those 3 questions that would hold a value I can compare other users to? I was thinking having some sort of weight. Like: No -> 0 Maybe -> 1 Yes -> 2 Then doing some sort of meaningful calculation. I want to end up with something like this so I can query the users and find who matches close: id name cat1 cat2 cat3 1 Sean 4 5 1 2 Other 1 2 5 In the situation above, the users don't really match. I'd want to match with someone with a +1 or -1 of my score in each category. I'm not a math guy so I'm just looking for some ideas to get me started.

    Read the article

  • High-level strategy for distinguishing a regular string from invalid JSON (ie. JSON-like string detection)

    - by Jonline
    Disclaimer On Absence of Code: I have no code to post because I haven't started writing; was looking for more theoretical guidance as I doubt I'll have trouble coding it but am pretty befuddled on what approach(es) would yield best results. I'm not seeking any code, either, though; just direction. Dilemma I'm toying with adding a "magic method"-style feature to a UI I'm building for a client, and it would require intelligently detecting whether or not a string was meant to be JSON as against a simple string. I had considered these general ideas: Look for a sort of arbitrarily-determined acceptable ratio of the frequency of JSON-like syntax (ie. regex to find strings separated by colons; look for colons between curly-braces, etc.) to the number of quote-encapsulated strings + nulls, bools and ints/floats. But the smaller the data set, the more fickle this would get look for key identifiers like opening and closing curly braces... not sure if there even are more easy identifiers, and this doesn't appeal anyway because it's so prescriptive about the kinds of mistakes it could find try incrementally parsing chunks, as those between curly braces, and seeing what proportion of these fractional statements turn out to be valid JSON; this seems like it would suffer less than (1) from smaller datasets, but would probably be much more processing-intensive, and very susceptible to a missing or inverted brace Just curious if the computational folks or algorithm pros out there had any approaches in mind that my semantics-oriented brain might have missed. PS: It occurs to me that natural language processing, about which I am totally ignorant, might be a cool approach; but, if NLP is a good strategy here, it sort of doesn't matter because I have zero experience with it and don't have time to learn & then implement/ this feature isn't worth it to the client.

    Read the article

  • how to use a batch file to delete a line of text in a bunch of text files? [on hold]

    - by wbt
    I have a bunch of txt files in my D drive which are placed randomly in different locations.Some files also contain symbols.I want a batch file so that i can delete their specific lines completely at the same time without doing it one by one for each file and please refer to a code which does not create a new text file at some other location with the changes being incorporated i.e I do not want the input.txt and output.txt thing.I just need the original files to be replaced with the changes as soon as i click the batch file. e.g D:\abc\1.txt D:\xyz\2.txt etc I want both of their 3rd lines erased completely with a single click and the new file must be saved with the same name in the same location i.e the new changed text files must replace the old text files with their respective lines removed.Maybe some sort of *.txt thing i.e i should be able to change all the files with the .txt extensions in a drive via a single batch file perhaps in another drive,not placing my batch file into each and every folder separately and then running them.Alternatively a vbs file is also welcomed. SORRY FOR THE LONG AND THOROUGH MESSAGE BUT I'M WONDERING ALL OVER THE INTERNET FOR THE LAST TWO DAYS JUST FOR THIS ONE BATCH FILE.ALL THE INFORMATION I GET IS A SORT OF JARGON FOR ME AS I AM NOT A GEEK WITH THE SCRIPTING.PLEASE DESCRIBE THE CODE TOO.YOUR HELP IS MUCH APPRECIATED

    Read the article

  • Android onClickListener options and help on creating a non-static array adapter

    - by CoderInTraining
    I am trying to make an application that gets data dynamically and displays it in a listView. Here is my code that I have with static data. There are a couple of things I want to try and do and can't figure out how. MainActivity.java package text.example.project; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import android.app.ListActivity; import android.os.Bundle; import android.view.View; import android.widget.AdapterView; import android.widget.AdapterView.OnItemClickListener; import android.widget.ArrayAdapter; import android.widget.ListView; import android.widget.TextView; import android.widget.Toast; public class MainActivity extends ListActivity { //declarations private boolean isItem; private ArrayAdapter<String> item1Adapter; private ArrayAdapter<String> item2Adapter; private ArrayAdapter<String> item3Adapter; /** Called when the activity is first created. */ @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); Collections.sort(ITEM1); Collections.sort(ITEM2); Collections.sort(ITEM3); item1Adapter = new ArrayAdapter<String>(this, R.layout.list_item, ITEM1); item2Adapter = new ArrayAdapter<String>(this, R.layout.list_item, ITEM2); item3Adapter = new ArrayAdapter<String>(this, R.layout.list_item, ITEM3); setListAdapter(item1Adapter); isItem = true; ListView lv = getListView(); lv.setTextFilterEnabled(true); lv.setOnItemClickListener(new OnItemClickListener() { @Override public void onItemClick(AdapterView<?> parent, View view, int position, long id) { // When clicked, show a toast with the TextView text if (isItem) { //ITEM1.add("another\n\t" + Calendar.getInstance().getTime()); Collections.sort(ITEM1); item2Adapter.notifyDataSetChanged(); setListAdapter(item2Adapter); isItem = false; } else { item1Adapter.notifyDataSetChanged(); setListAdapter(item1Adapter); isItem = true; } Toast.makeText(getApplicationContext(), ((TextView) view).getText(), Toast.LENGTH_SHORT).show(); } }); } // need to turn dynamic static ArrayList<String> ITEM1 = new ArrayList<String> (Arrays.asList( "ITEM1-1", "ITEM1-2" )); static ArrayList<String> ITEM2 = new ArrayList<String> (Arrays.asList( "ITEM2-1", "ITEM2-2" )); static ArrayList<String> ITEM3 = new ArrayList<String> (Arrays.asList("ITEM3-1", "ITEM3-2")); } list_item.xml <?xml version="1.0" encoding="utf-8"?> <TextView xmlns:android="http://schemas.android.com/apk/res/android" android:layout_width="fill_parent" android:layout_height="fill_parent" android:padding="10dp" android:textSize="16sp" > </TextView> What I want to do is first pull from a dynamic source. I need to do what is almost exactly like this tutorial... http://androiddevelopement.blogspot.in/2011/06/android-xml-parsing-tutorial-using.html ... however, I can't get the tutorial to work... I also would like to know how I can make the list item clicks so that if I click on "ITEM1-1" it goes to the menu "ITEM2-1" and "ITEM2-2". and if "ITEM1-2" is clicked, then it goes to the menu with "ITEM3-1" and "ITEM3-2"... I am totally a noob at this whole android development thing. So any suggestions would be great!

    Read the article

  • A way of doing real-world test-driven development (and some thoughts about it)

    - by Thomas Weller
    Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like  'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make:  It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below:   First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how  the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator {     /// <summary>     /// Defines a very simple calculator component for demo purposes.     /// </summary>     public interface ICalculator     {         /// <summary>         /// Gets the result of the last successful operation.         /// </summary>         /// <value>The last result.</value>         /// <remarks>         /// Will be <see langword="null" /> before the first successful operation.         /// </remarks>         double? LastResult { get; }       } // interface ICalculator   } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1)   First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test {     [TestFixture]     public class CalculatorTest     {         private readonly ServiceContainer container = new ServiceContainer();           [Test]         public void CalculatorLastResultIsInitiallyNull()         {             ICalculator calculator = container.GetService<ICalculator>();               Assert.IsNull(calculator.LastResult);         }       } // class CalculatorTest   } // namespace Calculator.Test       This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more.  Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type:        Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult         {             get             {                 throw new NotImplementedException();             }         }     } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest {     #region Fields       private readonly ServiceContainer container = new ServiceContainer();       #endregion // Fields       #region Setup/TearDown       [FixtureSetUp]     public void FixtureSetUp()     {        container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll");     }       ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property:     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult { get; private set; }     }         One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method:         ...   /// <summary>         /// Adds the specified operands.         /// </summary>         /// <param name="operand1">The operand1.</param>         /// <param name="operand2">The operand2.</param>         /// <returns>The result of the additon.</returns>         /// <exception cref="ArgumentException">         /// Argument <paramref name="operand1"/> is &lt; 0.<br/>         /// -- or --<br/>         /// Argument <paramref name="operand2"/> is &lt; 0.         /// </exception>         double Add(double operand1, double operand2);       } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this:   Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location.  This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…).     Back to our Calculator again: Two more R# – clicks implement the Add() skeleton:         ...           public double Add(double operand1, double operand2)         {             throw new NotImplementedException();         }       } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest:   [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); }   // in Calculator: public double Add(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }     if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }     throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio):   Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs:   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, result); }   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, calculator.LastResult); }   ...   // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is &lt; 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is &lt; 0. /// </exception> double Subtract(double operand1, double operand2);   ...   // Calculator.cs:   public double Subtract(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }       if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }       return (this.LastResult = operand1 - operand2).Value; }   Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         #region ICalculator           public double? LastResult { get; private set; }           public double Add(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 + operand2).Value;         }           public double Subtract(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 - operand2).Value;         }           #endregion // ICalculator           #region Implementation (Helper)           private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2)         {             if (operand1 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand1");             }               if (operand2 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand2");             }         }           #endregion // Implementation (Helper)       } // class Calculator   } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…).  As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks -  is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

    Read the article

  • Logical Domain Modeling Made Simple

    - by Knut Vatsendvik
    How can logical domain modeling be made simple and collaborative? Many non-technical end-users, managers and business domain experts find it difficult to understand the visual models offered by many UML tools. This creates trouble in capturing and verifying the information that goes into a logical domain model. The tools are also too advanced and complex for a non-technical user to learn and use. We have therefore, in our current project, ended up with using Confluence as tool for designing the logical domain model with the help of a few very useful plugins. Big thanks to Ole Nymoen and Per Spilling for their expertise in this field that made this posting possible. Confluence Plugins Here is a list of Confluence plugins used in this solution. Install these before trying out the macros used below. Plugin Description Copy Space Allows a space administrator to copy a space, including the pages within the space Metadata Supports adding metadata to Wiki pages Label Manages labeling of pages Linking Contains macros for linking to templates, the dashboard and other Table Enhances the table capability in Confluence Creating a Confluence Space First we need to create a new confluence space for the domain model. Click the link Create a Space located below the list of spaces on the Dashboard. Please contact your Confluence administrator is you do not have permissions to do this.   For illustrative purpose all attributes and entities in this posting are based on my imaginary project manager domain model. When a logical domain model is good enough for being implemented, do a copy of the Confluence Space (see Copy Space plugin). In this way you create a stable version of the logical domain model while further design can continue with the new copied space. Typical will the implementation phase result in a database design and/or a XSD schema design. Add Space Templates Go to the Home page of your Confluence Space. Navigate to the Browse drop-down menu and click on Advanced. Then click the Templates option in the left navigation panel. Click Add New Space Template to add the following three templates. Name: attribute {metadata-list} || Name | | || Type | | || Format | | || Description | | {metadata-list} {add-label:attribute} Name: primary-type {metadata-list} || Name | || || Type | || || Format | || || Description | || {metadata-list} {add-label:primary-type} Name: complex-type {metadata-list} || Name | || || Description |  || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | {add-label:complex-type,entity} The metadata-list macro (see Metadata plugin) will save a list of metadata values to the page. The add-label macro (see Label plugin) will automatically label the page. Primary Types Page Our first page to add will act as container for our primary types. Switch to Wiki markup when adding the following content to the page. | (+) {add-page:template=primary-type|parent=@self}Add new primary type{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|root=@self|pages=@descendents} Once the page is created, click the Add new primary type (create-page macro) to start creating a new pages. Here is an example of input to the LocalDate page. Embrace the LocalDate with square brackets [] to make the page linkable. Again switch to Wiki markup before editing. {metadata-list} || Name | [LocalDate] || || Type | Date || || Format | YYYY-MM-DD || || Description | Date in local time zone. YYYY = year, MM = month and DD = day || {metadata-list} {add-label:primary-type} The metadata-report macro will show a tabular report of all child pages.   Attributes Page The next page will act as container for all of our attributes. | (+) {add-page:template=attribute|parent=@self|title=attribute}Add new attribute{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|pages=@descendants} Here is an example of input to the startDate page. {metadata-list} || Name | [startDate] || || Type | [LocalDate] || || Format | {metadata-from:LocalDate|Format} || || Description | The projects start date || {metadata-list} {add-label:attribute} Using the metadata-from macro we fetch the text from the previously created LocalDate page. Complex Types Page The last page in this example shows how attributes can be combined together to form more complex types.   h3. Intro Overview of complex types in the domain model. | (+) {add-page:template=complex-type|parent=@self}Add a new complex type{add-page}\\ | {metadata-report:Name,Description|sort=Name|root=@self|pages=@descendents} Here is an example of input to the ProjectType page. {metadata-list} || Name | [ProjectType] || || Description | Represents a project || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [projectId] | {metadata-from:projectId|Type} | {metadata-from:projectId|Format} | {metadata-from:projectId|Description} | | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | | [description] | {metadata-from:description|Type} | {metadata-from:description|Format} | {metadata-from:description|Description} | | [startDate] | {metadata-from:startDate|Type} | {metadata-from:startDate|Format} | {metadata-from:startDate|Description} | {add-label:complex-type,entity} Gives us this Conclusion Using a web-based corporate Wiki like Confluence to create a logical domain model increases the collaboration between people with different roles in the enterprise. It’s my believe that this helps the domain model to be more accurate, and better documented. In our real project we have more pages than illustrated here to complete the documentation. We do also still use UML tools to create different types of diagrams that Confluence do not support. As a last tip, an ImageMap plugin can make those diagrams clickable when used in pages. Enjoy!

    Read the article

  • Talend Enterprise Data Integration overperforms on Oracle SPARC T4

    - by Amir Javanshir
    The SPARC T microprocessor, released in 2005 by Sun Microsystems, and now continued at Oracle, has a good track record in parallel execution and multi-threaded performance. However it was less suited for pure single-threaded workloads. The new SPARC T4 processor is now filling that gap by offering a 5x better single-thread performance over previous generations. Following our long-term relationship with Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, we decided to test some of their integration components with the T4 chip, more precisely on a T4-1 system, in order to verify first hand if this new processor stands up to its promises. Several tests were performed, mainly focused on: Single-thread performance of the new SPARC T4 processor compared to an older SPARC T2+ processor Overall throughput of the SPARC T4-1 server using multiple threads The tests consisted in reading large amounts of data --ten's of gigabytes--, processing and writing them back to a file or an Oracle 11gR2 database table. They are CPU, memory and IO bound tests. Given the main focus of this project --CPU performance--, bottlenecks were removed as much as possible on the memory and IO sub-systems. When possible, the data to process was put into the ZFS filesystem cache, for instance. Also, two external storage devices were directly attached to the servers under test, each one divided in two ZFS pools for read and write operations. Multi-thread: Testing throughput on the Oracle T4-1 The tests were performed with different number of simultaneous threads (1, 2, 4, 8, 12, 16, 32, 48 and 64) and using different storage devices: Flash, Fibre Channel storage, two stripped internal disks and one single internal disk. All storage devices used ZFS as filesystem and volume management. Each thread read a dedicated 1GB-large file containing 12.5M lines with the following structure: customerID;FirstName;LastName;StreetAddress;City;State;Zip;Cust_Status;Since_DT;Status_DT 1;Ronald;Reagan;South Highway;Santa Fe;Montana;98756;A;04-06-2006;09-08-2008 2;Theodore;Roosevelt;Timberlane Drive;Columbus;Louisiana;75677;A;10-05-2009;27-05-2008 3;Andrew;Madison;S Rustle St;Santa Fe;Arkansas;75677;A;29-04-2005;09-02-2008 4;Dwight;Adams;South Roosevelt Drive;Baton Rouge;Vermont;75677;A;15-02-2004;26-01-2007 […] The following graphs present the results of our tests: Unsurprisingly up to 16 threads, all files fit in the ZFS cache a.k.a L2ARC : once the cache is hot there is no performance difference depending on the underlying storage. From 16 threads upwards however, it is clear that IO becomes a bottleneck, having a good IO subsystem is thus key. Single-disk performance collapses whereas the Sun F5100 and ST6180 arrays allow the T4-1 to scale quite seamlessly. From 32 to 64 threads, the performance is almost constant with just a slow decline. For the database load tests, only the best IO configuration --using external storage devices-- were used, hosting the Oracle table spaces and redo log files. Using the Sun Storage F5100 array allows the T4-1 server to scale up to 48 parallel JVM processes before saturating the CPU. The final result is a staggering 646K lines per second insertion in an Oracle table using 48 parallel threads. Single-thread: Testing the single thread performance Seven different tests were performed on both servers. Given the fact that only one thread, thus one file was read, no IO bottleneck was involved, all data being served from the ZFS cache. Read File ? Filter ? Write File: Read file, filter data, write the filtered data in a new file. The filter is set on the “Status” column: only lines with status set to “A” are selected. This limits each output file to about 500 MB. Read File ? Load Database Table: Read file, insert into a single Oracle table. Average: Read file, compute the average of a numeric column, write the result in a new file. Division & Square Root: Read file, perform a division and square root on a numeric column, write the result data in a new file. Oracle DB Dump: Dump the content of an Oracle table (12.5M rows) into a CSV file. Transform: Read file, transform, write the result data in a new file. The transformations applied are: set the address column to upper case and add an extra column at the end, which is the concatenation of two columns. Sort: Read file, sort a numeric and alpha numeric column, write the result data in a new file. The following table and graph present the final results of the tests: Throughput unit is thousand lines per second processed (K lines/second). Improvement is the % of improvement between the T5140 and T4-1. Test T4-1 (Time s.) T5140 (Time s.) Improvement T4-1 (Throughput) T5140 (Throughput) Read/Filter/Write 125 806 645% 100 16 Read/Load Database 195 1111 570% 64 11 Average 96 557 580% 130 22 Division & Square Root 161 1054 655% 78 12 Oracle DB Dump 164 945 576% 76 13 Transform 159 1124 707% 79 11 Sort 251 1336 532% 50 9 The improvement of single-thread performance is quite dramatic: depending on the tests, the T4 is between 5.4 to 7 times faster than the T2+. It seems clear that the SPARC T4 processor has gone a long way filling the gap in single-thread performance, without sacrifying the multi-threaded capability as it still shows a very impressive scaling on heavy-duty multi-threaded jobs. Finally, as always at Oracle ISV Engineering, we are happy to help our ISV partners test their own applications on our platforms, so don't hesitate to contact us and let's see what the SPARC T4-based systems can do for your application! "As describe in this benchmark, Talend Enterprise Data Integration has overperformed on T4. I was generally happy to see that the T4 gave scaling opportunities for many scenarios like complex aggregations. Row by row insertion in Oracle DB is faster with more than 650,000 rows per seconds without using any bulk Oracle capabilities !" Cedric Carbone, Talend CTO.

    Read the article

  • SQL Table stored as a Heap - the dangers within

    - by MikeD
    Nearly all of the time I create a table, I include a primary key, and often that PK is implemented as a clustered index. Those two don't always have to go together, but in my world they almost always do. On a recent project, I was working on a data warehouse and a set of SSIS packages to import data from an OLTP database into my data warehouse. The data I was importing from the business database into the warehouse was mostly new rows, sometimes updates to existing rows, and sometimes deletes. I decided to use the MERGE statement to implement the insert, update or delete in the data warehouse, I found it quite performant to have a stored procedure that extracted all the new, updated, and deleted rows from the source database and dump it into a working table in my data warehouse, then run a stored proc in the warehouse that was the MERGE statement that took the rows from the working table and updated the real fact table. Use Warehouse CREATE TABLE Integration.MergePolicy (PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date, Operation varchar(5)) CREATE TABLE fact.Policy (PolicyKey int identity primary key, PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date) CREATE PROC Integration.MergePolicy as begin begin tran Merge fact.Policy as tgtUsing Integration.MergePolicy as SrcOn (tgt.PolicyId = Src.PolicyId) When not matched by Target then Insert (PolicyId, PolicyTypeKey, Premium, Deductible, EffectiveDate)values (src.PolicyId, src.PolicyTypeKey, src.Premium, src.Deductible, src.EffectiveDate) When matched and src.Operation = 'U' then Update set PolicyTypeKey = src.PolicyTypeKey,Premium = src.Premium,Deductible = src.Deductible,EffectiveDate = src.EffectiveDate When matched and src.Operation = 'D' then Delete ;delete from Integration.WorkPolicy commit end Notice that my worktable (Integration.MergePolicy) doesn't have any primary key or clustered index. I didn't think this would be a problem, since it was relatively small table and was empty after each time I ran the stored proc. For one of the work tables, during the initial loads of the warehouse, it was getting about 1.5 million rows inserted, processed, then deleted. Also, because of a bug in the extraction process, the same 1.5 million rows (plus a few hundred more each time) was getting inserted, processed, and deleted. This was being sone on a fairly hefty server that was otherwise unused, and no one was paying any attention to the time it was taking. This week I received a backup of this database and loaded it on my laptop to troubleshoot the problem, and of course it took a good ten minutes or more to run the process. However, what seemed strange to me was that after I fixed the problem and happened to run the merge sproc when the work table was completely empty, it still took almost ten minutes to complete. I immediately looked back at the MERGE statement to see if I had some sort of outer join that meant it would be scanning the target table (which had about 2 million rows in it), then turned on the execution plan output to see what was happening under the hood. Running the stored procedure again took a long time, and the plan output didn't show me much - 55% on the MERGE statement, and 45% on the DELETE statement, and table scans on the work table in both places. I was surprised at the relative cost of the DELETE statement, because there were really 0 rows to delete, but I was expecting to see the table scans. (I was beginning now to suspect that my problem was because the work table was being stored as a heap.) Then I turned on STATS_IO and ran the sproc again. The output was quite interesting.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Policy'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'MergePolicy'. Scan count 1, logical reads 433276, physical reads 60, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. I've reproduced the above from memory, the details aren't exact, but the essential bit was the very high number of logical reads on the table stored as a heap. Even just doing a SELECT Count(*) from Integration.MergePolicy incurred that sort of output, even though the result was always 0. I suppose I should research more on the allocation and deallocation of pages to tables stored as a heap, but I haven't, and my original assumption that a table stored as a heap with no rows would only need to read one page to answer any query was definitely proven wrong. It's likely that some sort of physical defragmentation of the table may have cleaned that up, but it seemed that the easiest answer was to put a clustered index on the table. After doing so, the execution plan showed a cluster index scan, and the IO stats showed only a single page read. (I aborted my first attempt at adding a clustered index on the table because it was taking too long - instead I ran TRUNCATE TABLE Integration.MergePolicy first and added the clustered index, both of which took very little time). I suspect I may not have noticed this if I had used TRUNCATE TABLE Integration.MergePolicy instead of DELETE FROM Integration.MergePolicy, since I'm guessing that the truncate operation does some rather quick releasing of pages allocated to the heap table. In the future, I will likely be much more careful to have a clustered index on every table I use, even the working tables. Mike  

    Read the article

  • Breaking 1NF to model subset constraints. Does this sound sane?

    - by Chris Travers
    My first question here. Appologize if it is in the wrong forum but this seems pretty conceptual. I am looking at doing something that goes against conventional wisdom and want to get some feedback as to whether this is totally insane or will result in problems, so critique away! I am on PostgreSQL 9.1 but may be moving to 9.2 for this part of this project. To re-iterate: Does it seem sane to break 1NF in this way? I am not looking for debugging code so much as where people see problems that this might lead. The Problem In double entry accounting, financial transactions are journal entries with an arbitrary number of lines. Each line has either a left value (debit) or a right value (credit) which can be modelled as a single value with negatives as debits and positives as credits or vice versa. The sum of all debits and credits must equal zero (so if we go with a single amount field, sum(amount) must equal zero for each financial journal entry). SQL-based databases, pretty much required for this sort of work, have no way to express this sort of constraint natively and so any approach to enforcing it in the database seems rather complex. The Write Model The journal entries are append only. There is a possibility we will add a delete model but it will be subject to a different set of restrictions and so is not applicable here. If and when we allow deletes, we will probably do them using a simple ON DELETE CASCADE designation on the foreign key, and require that deletes go through a dedicated stored procedure which can enforce the other constraints. So inserts and selects have to be accommodated but updates and deletes do not for this task. My Proposed Solution My proposed solution is to break first normal form and model constraints on arrays of tuples, with a trigger that breaks the rows out into another table. CREATE TABLE journal_line ( entry_id bigserial primary key, account_id int not null references account(id), journal_entry_id bigint not null, -- adding references later amount numeric not null ); I would then add "table methods" to extract debits and credits for reporting purposes: CREATE OR REPLACE FUNCTION debits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount < 0 THEN $1.amount * -1 ELSE NULL END; $$; CREATE OR REPLACE FUNCTION credits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount > 0 THEN $1.amount ELSE NULL END; $$; Then the journal entry table (simplified for this example): CREATE TABLE journal_entry ( entry_id bigserial primary key, -- no natural keys :-( journal_id int not null references journal(id), date_posted date not null, reference text not null, description text not null, journal_lines journal_line[] not null ); Then a table method and and check constraints: CREATE OR REPLACE FUNCTION running_total(journal_entry) returns numeric language sql immutable as $$ SELECT sum(amount) FROM unnest($1.journal_lines); $$; ALTER TABLE journal_entry ADD CONSTRAINT CHECK (((journal_entry.running_total) = 0)); ALTER TABLE journal_line ADD FOREIGN KEY journal_entry_id REFERENCES journal_entry(entry_id); And finally we'd have a breakout trigger: CREATE OR REPLACE FUNCTION je_breakout() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$ BEGIN IF TG_OP = 'INSERT' THEN INSERT INTO journal_line (journal_entry_id, account_id, amount) SELECT NEW.id, account_id, amount FROM unnest(NEW.journal_lines); RETURN NEW; ELSE RAISE EXCEPTION 'Operation Not Allowed'; END IF; END; $$; And finally CREATE TRIGGER AFTER INSERT OR UPDATE OR DELETE ON journal_entry FOR EACH ROW EXECUTE_PROCEDURE je_breaout(); Of course the example above is simplified. There will be a status table that will track approval status allowing for separation of duties, etc. However the goal here is to prevent unbalanced transactions. Any feedback? Does this sound entirely insane? Standard Solutions? In getting to this point I have to say I have looked at four different current ERP solutions to this problems: Represent every line item as a debit and a credit against different accounts. Use of foreign keys against the line item table to enforce an eventual running total of 0 Use of constraint triggers in PostgreSQL Forcing all validation here solely through the app logic. My concerns are that #1 is pretty limiting and very hard to audit internally. It's not programmer transparent and so it strikes me as being difficult to work with in the future. The second strikes me as being very complex and required a series of contraints and foreign keys against self to make work, and therefore it strikes me as complex, hard to sort out at least in my mind, and thus hard to work with. The fourth could be done as we force all access through stored procedures anyway and this is the most common solution (have the app total things up and throw an error otherwise). However, I think proof that a constraint is followed is superior to test cases, and so the question becomes whether this in fact generates insert anomilies rather than solving them. If this is a solved problem it isn't the case that everyone agrees on the solution....

    Read the article

  • k-d tree implementation [closed]

    - by user466441
    when i run my code and debugged,i got this error - this 0x00093584 {_Myproxy=0x00000000 _Mynextiter=0x00000000 } std::_Iterator_base12 * const - _Myproxy 0x00000000 {_Mycont=??? _Myfirstiter=??? } std::_Container_proxy * _Mycont CXX0017: Error: symbol "" not found _Myfirstiter CXX0030: Error: expression cannot be evaluated + _Mynextiter 0x00000000 {_Myproxy=??? _Mynextiter=??? } std::_Iterator_base12 * but i dont know what does it means,code is this #include<iostream> #include<vector> #include<algorithm> using namespace std; struct point { float x,y; }; vector<point>pointleft(4); vector<point>pointright(4); //we are going to implement two comparison function for x and y coordinates,we need it in calculation of median (we should sort vector //by x or y according to depth informaton,is depth even or odd. bool sortby_X(point &a,point &b) { return a.x<b.x; } bool sortby_Y(point &a,point &b) { return a.y<b.y; } //so i am going to implement to median finding algorithm,one for finding median by x and another find median by y point medianx(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_X); temp=points[(points.size()/2)]; return temp; } point mediany(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_Y); temp=points[(points.size()/2)]; return temp; } //now construct basic tree structure struct Tree { float x,y; Tree(point a) { x=a.x; y=a.y; } Tree *left; Tree *right; }; Tree * build_kd( Tree *root,vector<point>points,int depth) { point temp; if(points.size()==1)// that point is as a leaf { if(root==NULL) root=new Tree(points[0]); return root; } if(depth%2==0) { temp=medianx(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if (points[i].x<temp.x) pointleft[i]=points[i]; else pointright[i]=points[i]; } } else { temp=mediany(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if(points[i].y<temp.y) pointleft[i]=points[i]; else pointright[i]=points[i]; } } return build_kd(root->left,pointleft,depth+1); return build_kd(root->right,pointright,depth+1); } void print(Tree *root) { while(root!=NULL) { cout<<root->x<<" " <<root->y; print(root->left); print(root->right); } } int main() { int depth=0; Tree *root=NULL; vector<point>points(4); float x,y; int n=4; for(int i=0;i<n;i++) { cin>>x>>y; points[i].x=x; points[i].y=y; } root=build_kd(root,points,depth); print(root); return 0; } i am trying ti implement in c++ this pseudo code tuple function build_kd_tree(int depth, set points): if points contains only one point: return that point as a leaf. if depth is even: Calculate the median x-value. Create a set of points (pointsLeft) that have x-values less than the median. Create a set of points (pointsRight) that have x-values greater than or equal to the median. else: Calculate the median y-value. Create a set of points (pointsLeft) that have y-values less than the median. Create a set of points (pointsRight) that have y-values greater than or equal to the median. treeLeft = build_kd_tree(depth + 1, pointsLeft) treeRight = build_kd_tree(depth + 1, pointsRight) return(median, treeLeft, treeRight) please help me what this error means?

    Read the article

< Previous Page | 57 58 59 60 61 62 63 64 65 66 67 68  | Next Page >