Search Results

Search found 1848 results on 74 pages for 'algorithms'.


  • Using boost::iterator

    - by Neil G
    I wrote a sparse vector class (see #1, #2.) I would like to provide two kinds of iterators: The first set, the regular iterators, can point any element, whether set or unset. If they are read from, they return either the set value or value_type(), if they are written to, they create the element and return the lvalue reference. Thus, they are: Random Access Traversal Iterator and Readable and Writable Iterator The second set, the sparse iterators, iterate over only the set elements. Since they don't need to lazily create elements that are written to, they are: Random Access Traversal Iterator and Readable and Writable and Lvalue Iterator I also need const versions of both, which are not writable. I can fill in the blanks, but not sure how to use boost::iterator_adaptor to start out. Here's what I have so far: template<typename T> class sparse_vector { public: typedef size_t size_type; typedef T value_type; private: typedef T& true_reference; typedef const T* const_pointer; typedef sparse_vector<T> self_type; struct ElementType { ElementType(size_type i, T const& t): index(i), value(t) {} ElementType(size_type i, T&& t): index(i), value(t) {} ElementType(size_type i): index(i) {} ElementType(ElementType const&) = default; size_type index; value_type value; }; typedef vector<ElementType> array_type; public: typedef T* pointer; typedef T& reference; typedef const T& const_reference; private: size_type size_; mutable typename array_type::size_type sorted_filled_; mutable array_type data_; // lots of code for various algorithms... public: class sparse_iterator : public boost::iterator_adaptor< sparse_iterator // Derived , array_type::iterator // Base (the internal array) (this paramater does not compile! -- says expected a type, got 'std::vector::iterator'???) , boost::use_default // Value , boost::random_access_traversal_tag? // CategoryOrTraversal > class iterator_proxy { ??? }; class iterator : public boost::iterator_facade< iterator // Derived , ????? // Base , ????? // Value , boost::?????? // CategoryOrTraversal > { }; };

    Read the article

  • How to avoid repetition when working with primitive types?

    - by I82Much
    I have the need to perform algorithms on various primitive types; the algorithm is essentially the same with the exception of which type the variables are. So for instance, /** * Determine if <code>value</code> is the bitwise OR of elements of <code>validValues</code> array. * For instance, our valid choices are 0001, 0010, and 1000. * We are given a value of 1001. This is valid because it can be made from * ORing together 0001 and 1000. * On the other hand, if we are given a value of 1111, this is invalid because * you cannot turn on the second bit from left by ORing together those 3 * valid values. */ public static boolean isValid(long value, long[] validValues) { for (long validOption : validValues) { value &= ~validOption; } return value != 0; } public static boolean isValid(int value, int[] validValues) { for (int validOption : validValues) { value &= ~validOption; } return value != 0; } How can I avoid this repetition? I know there's no way to genericize primitive arrays, so my hands seem tied. I have instances of primitive arrays and not boxed arrays of say Number objects, so I do not want to go that route either. I know there are a lot of questions about primitives with respect to arrays, autoboxing, etc., but I haven't seen it formulated in quite this way, and I haven't seen a decisive answer on how to interact with these arrays. I suppose I could do something like: public static<E extends Number> boolean isValid(E value, List<E> numbers) { long theValue = value.longValue(); for (Number validOption : numbers) { theValue &= ~validOption.longValue(); } return theValue != 0; } and then public static boolean isValid(long value, long[] validValues) { return isValid(value, Arrays.asList(ArrayUtils.toObject(validValues))); } public static boolean isValid(int value, int[] validValues) { return isValid(value, Arrays.asList(ArrayUtils.toObject(validValues))); } Is that really much better though? Any thoughts in this matter would be appreciated.
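
    One common way to avoid the duplication without boxing is to write the algorithm once for the widest integer type and let the narrower overload widen and delegate. A minimal sketch of that idea (note that the original return value != 0 appears inverted relative to its own doc comment, so this sketch returns value == 0; the class name is illustrative):

        public final class BitMaskValidator {

            // Core implementation: clear every valid bit pattern; whatever remains
            // could not have been produced by ORing elements of validValues together.
            public static boolean isValid(long value, long[] validValues) {
                for (long validOption : validValues) {
                    value &= ~validOption;
                }
                return value == 0;
            }

            // The int overload widens (zero-extending, to keep the bit patterns intact)
            // and delegates, so the algorithm itself exists in exactly one place.
            public static boolean isValid(int value, int[] validValues) {
                long[] widened = new long[validValues.length];
                for (int i = 0; i < validValues.length; i++) {
                    widened[i] = validValues[i] & 0xFFFFFFFFL;
                }
                return isValid(value & 0xFFFFFFFFL, widened);
            }
        }

    The only repetition left is the widening loop, which carries no logic of its own.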

    Read the article

  • Which programming language to choose? (for a specific problem/domain, details inside)

    - by Bijan
    I am building a trading portfolio management system that is responsible for production, optimization, and simulation of non-high frequency trading portfolios (dealing with 1min or 3min bars of data, not tick data). I plan on employing Amazon web services to take on the entire load of the application. I have four choices that I am considering as the language: a) Java b) C++ c) C# d) Python Here are the extremes of the project scope. This isn't how it will be, maybe ever, but it's within the scope of the requirements: Weekly simulation of 10,000,000 trading systems. (Each trading system is expected to have its own data mining methods, including feature selection algorithms which are extremely computationally expensive. Imagine 500-5000 features using wrappers. These are not run often by any means, but it's still a consideration) Real-time production of a portfolio w/ 100,000 trading strategies Taking in 1 min or 3 min data from every stock/futures market around the globe (approx 100,000) Portfolio optimization of portfolios with up to 100,000 strategies. (rather intensive algorithm) Speed is a concern, but I believe that Java can handle the load. I just want to make sure that Java CAN handle the above requirements comfortably. I don't want to do the project in C++, but I will if it's required. The reason C# is on there is because I thought it was a good alternative to Java, even though I don't like Windows at all and would prefer Java if all things are the same. Python - I've read some things on PyPy and Psyco that claim Python can be optimized with JIT compiling to run at near C-like speeds... That's pretty much the only reason it is on this list, besides the fact that Python is a great language and would probably be the most enjoyable language to code in, which is not a factor at all for this project, but a perk. To sum up: - real time production - weekly simulations of a large number of systems - weekly/monthly optimizations of portfolios - large numbers of connections to collect data from There is no dealing with millisecond or even second based trades. The only consideration is whether Java can possibly deal with this kind of load when spread out over the necessary number of EC2 servers. Thank you guys so much for your wisdom.

    Read the article

  • Finding N contiguous zero bits in an integer to the left of the MSB position of another integer

    - by James Morris
    The problem is: given an integer val1 find the position of the highest bit set (Most Significant Bit) then, given a second integer val2 find a contiguous region of unset bits, with the minimum number of zero bits given by width to the left of the position (ie, in the higher bits). Here is the C code for my solution: typedef unsigned int t; unsigned const t_bits = sizeof(t) * CHAR_BIT; _Bool test_fit_within_left_of_msb( unsigned width, t val1, t val2, unsigned* offset_result) { unsigned offbit = 0; unsigned msb = 0; t mask; t b; while(val1 >>= 1) ++msb; while(offbit + width < t_bits - msb) { mask = (((t)1 << width) - 1) << (t_bits - width - offbit); b = val2 & mask; if (!b) { *offset_result = offbit; return true; } if (offbit++) /* this conditional bothers me! */ b <<= offbit - 1; while(b <<= 1) offbit++; } return false; } Aside from faster ways of finding the MSB of the first integer, the commented test for a zero offbit seems a bit extraneous, but necessary to skip the highest bit of type t if it is set. I have also implemented similar algorithms but working to the right of the MSB of the first number, so they don't require this seemingly extra condition. How can I get rid of this extra condition, or even, are there far more optimal solutions? Edit: Some background not strictly required. The offset result is a count of bits from the high bit, not from the low bit as maybe expected. This will be part of a wider algorithm which scans a 2D array for a 2D area of zero bits. Here, for testing, the algorithm has been simplified. val1 represents the first integer which does not have all bits set found in a row of the 2D array. From this the 2D version would scan down which is what val2 represents. Here's some output showing success and failure: t_bits:32 t_high: 10000000000000000000000000000000 ( 2147483648 ) --------- ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000010000000 ( 128 ) val2: 01000001000100000000100100001001 ( 1091569929 ) msb: 7 offbit:0 + width: 8 = 8 mask: 11111111000000000000000000000000 ( 4278190080 ) b: 01000001000000000000000000000000 ( 1090519040 ) offbit:8 + width: 8 = 16 mask: 00000000111111110000000000000000 ( 16711680 ) b: 00000000000100000000000000000000 ( 1048576 ) offbit:12 + width: 8 = 20 mask: 00000000000011111111000000000000 ( 1044480 ) b: 00000000000000000000000000000000 ( 0 ) offbit:12 iters:10 ***** found room for width:8 at offset: 12 ***** ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000001000000 ( 64 ) val2: 00010000000000001000010001000001 ( 268469313 ) msb: 6 offbit:0 + width: 13 = 13 mask: 11111111111110000000000000000000 ( 4294443008 ) b: 00010000000000000000000000000000 ( 268435456 ) offbit:4 + width: 13 = 17 mask: 00001111111111111000000000000000 ( 268402688 ) b: 00000000000000001000000000000000 ( 32768 ) ***** mask: 00001111111111111000000000000000 ( 268402688 ) offbit:17 iters:15 ***** no room found for width:13 ***** (iters is the count of iterations of the inner while loop)
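
    For comparison, here is a plain linear scan of the same search written in Java (several questions on this page ask for Java); it keeps the original loop bound but simply re-tests every window instead of skipping ahead, which removes the conditional entirely at the cost of a few extra iterations. It assumes 0 < width < 32 and val1 != 0:

        class ZeroRunFinder {
            // Returns the offset, counted from the high bit, of the first window of
            // `width` zero bits in val2 lying strictly above the MSB of val1, or -1 if none fits.
            static int fitWithinLeftOfMsb(int width, int val1, int val2) {
                int msb = 31 - Integer.numberOfLeadingZeros(val1); // bit position of val1's highest set bit
                for (int offbit = 0; offbit + width < 32 - msb; offbit++) {
                    int mask = ((1 << width) - 1) << (32 - width - offbit); // `width` ones, offbit bits below the top
                    if ((val2 & mask) == 0) {
                        return offbit;
                    }
                }
                return -1;
            }
        }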

    Read the article

  • Looking for an Open Source Project in need of help

    - by hvidgaard
    Hi StackOverflow! I'm a CS student on well on my way to graduate. I have had a difficult time of finding relevant student jobs (they seems to be taken merely hours after the notice gets on the board) , so instead I'm looking for an open source project in need of help. I'm aware that I should choose one that I use, but I'm not aware of any OS-project that I use that needs help. That's why I'm asking you. I don't have any deep experience, but I here are some of my biggest projects so far: BitTorrent-ish client in Python (a subset of BitTorrent) HTTP 1.1 webserver in Java Compiler from a subset of Java to run on JRE Flash-framework project to model an iPad look and feel (not to run actual iPad programs) complete with an API for programs. Complete MySQL database for a booking system, with departure and arrival times, so you could only book valid tickets (with a Java frontend). I know, Java and languages like AS3 and C# feels natural per se, Python, and have done a fair bit of hacking around in C, but I don't feel very comfortable with it. Mostly I'm afraid to make a fuckup because I have such a high degree of control. I would like to think I'm well aware of good software design practices, but in reality what I do is ask myself "would I like to use/maintain this?", and I love to refactor my code because I see optimizations. I love algorithms and to make them run in the best possible time. I don't have any preferred domain to work in, but I wouldn't mind it to be graphics or math heavy. Ideally I'm looking for a project in C++ to learn the in's and out's of it, but I'm well aware that I don't know that language very well. I would like to have a mentor-like figure until I'm confident enough to stand on my own, not one to review all my code (I'm sure someone will to start with anyway), but to ask questions about the project and language in question. I do have a wife and two children, so don't expect me to put in 10+ hours every week. In return I can work on my own, I strive to program modular and maintainable code. Know how to read an API, use Google, StackOverflow and online resources in general. If you have any questions, shoot. I'm looking forward to your suggestions.

    Read the article

  • How to check for palindrome using Python logic

    - by DrOnline
    My background is only a 6-month college class in basic C/C++, and I'm trying to convert to Python. I may be talking nonsense, but it seems to me that C, at least at my level, is very for-loop intensive. I solve most problems with these loops. And it seems to me the biggest mistake people make when going from C to Python is trying to implement C logic using Python, which makes things run slowly, and it's just not making the most of the language. I see on this website: http://hyperpolyglot.org/scripting (search for "c-style for") that Python doesn't have C-style for loops. It might be outdated, but I interpret it to mean Python has its own methods for this. I've tried looking around, but I can't find much up-to-date (Python 3) advice for this. How can I solve a palindrome challenge in Python, without using the for loop? I've done this in C in class, but I want to do it in Python, on a personal basis. The problem is from Project Euler, great site btw. def isPalindrome(n): lst = [int(n) for n in str(n)] l=len(lst) if l==0 || l==1: return True elif len(lst)%2==0: for k in range (l) ##### else: while (k<=((l-1)/2)): if (list[]): ##### for i in range (999, 100, -1): for j in range (999,100, -1): if isPalindrome(i*j): print(i*j) break I'm missing a lot of code here. The five hashes are just reminders for myself. Concrete questions: 1) In C, I would make a for loop comparing index 0 to index max, and then index 0+1 with max-1, until something something. How to best do this in Python? 2) My for loop (for i in range(999, 100, -1)), is this a bad way to do it in Python? 3) Does anybody have any good advice, or good websites or resources for people in my position? I'm not a programmer, I don't aspire to be one, I just want to learn enough so that when I write my bachelor's degree thesis (electrical engineering), I don't have to simultaneously LEARN an applicable programming language while trying to obtain good results in the project. "How to go from basic C to great application of Python", that sort of thing. 4) Any specific bits of code to make a great solution to this problem would also be appreciated, I need to learn good algorithms. I am envisioning 3 situations: if the value is zero or a single digit, if it is of odd length, and if it is of even length. I was planning to write for loops... PS: The problem is: Find the highest value product of two 3 digit integers that is also a palindrome.
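
    In idiomatic Python the check itself needs no index loop at all: compare the digit string with its reverse (str(n) == str(n)[::-1]). For consistency with the other examples added to this listing, here is the same reverse-and-compare idea, plus the brute-force search for the Euler problem, sketched in Java; the early break relies on products only shrinking as j decreases:

        class PalindromeProduct {
            static boolean isPalindrome(long n) {
                String s = Long.toString(n);
                return s.equals(new StringBuilder(s).reverse().toString()); // reverse-and-compare, no index juggling
            }

            static long largestPalindromeProduct() {
                long best = 0;
                for (int i = 999; i >= 100; i--) {
                    for (int j = i; j >= 100; j--) {       // start at i so each pair is tried once
                        long product = (long) i * j;
                        if (product <= best) {
                            break;                          // products only get smaller as j decreases
                        }
                        if (isPalindrome(product)) {
                            best = product;
                        }
                    }
                }
                return best;
            }
        }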

    Read the article

  • Finding the most frequent subtrees in a collection of (parse) trees

    - by peter.murray.rust
    I have a collection of trees whose nodes are labelled (but not uniquely). Specifically, the trees are from a collection of parsed sentences (see http://en.wikipedia.org/wiki/Treebank). I wish to extract the most common subtrees from the collection - performance is not (yet) an issue. I'd be grateful for algorithms (ideally Java) or pointers to tools which do this for treebanks. Note that order of child nodes is important. EDIT @mjv. We are working in a limited domain (chemistry) which has a stylised language, so the variety of the trees is not huge - probably similar to children's readers. Simple tree for "the cat sat on the mat": <sentence> <nounPhrase> <article/> <noun/> </nounPhrase> <verbPhrase> <verb/> <prepositionPhrase> <preposition/> <nounPhrase> <article/> <noun/> </nounPhrase> </prepositionPhrase> </verbPhrase> </sentence> Here the sentence contains two identical part-of-speech subtrees (the actual tokens "cat" and "mat" are not important in matching). So the algorithm would need to detect this. Note that not all nounPhrases are identical - "the big black cat" could be: <nounPhrase> <article/> <adjective/> <adjective/> <noun/> </nounPhrase> The length of sentences will be longer - between 15 and 30 nodes. I would expect to get useful results from 1000 trees. If this does not take more than a day or so, that's acceptable. Obviously the shorter the tree, the more frequent it will be, so nounPhrase will be very common. EDIT If this is to be solved by flattening the tree, then I think it would be related to Longest Common Substring, not Longest Common Subsequence. But note that I don't necessarily just want the longest - I want a list of all those long enough to be "interesting" (criterion yet to be decided).
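
    One workable approach (certainly not the only one) is to give every subtree a canonical, order-preserving string encoding and count those strings in a hash map; the most frequent complete subtrees are then simply the keys with the highest counts. A minimal Java sketch, where the Node type and its field names are illustrative rather than taken from any treebank API:

        import java.util.*;

        class Node {
            String label;
            List<Node> children = new ArrayList<>();
        }

        class SubtreeCounter {
            // Canonical encoding of the subtree rooted at n, e.g. nounPhrase(article()noun()).
            // Child order is preserved, so differently ordered children encode differently.
            static String encode(Node n, Map<String, Integer> counts) {
                StringBuilder sb = new StringBuilder(n.label).append('(');
                for (Node child : n.children) {
                    sb.append(encode(child, counts));
                }
                String encoded = sb.append(')').toString();
                counts.merge(encoded, 1, Integer::sum);   // count every rooted subtree, not just whole sentences
                return encoded;
            }

            static List<Map.Entry<String, Integer>> mostFrequent(List<Node> roots, int topK) {
                Map<String, Integer> counts = new HashMap<>();
                for (Node root : roots) {
                    encode(root, counts);
                }
                List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
                entries.sort((a, b) -> b.getValue() - a.getValue());
                return entries.subList(0, Math.min(topK, entries.size()));
            }
        }

    This counts complete subtrees rooted at each node, which covers the repeated nounPhrase example; mining arbitrary embedded fragments is a harder problem and would need a dedicated algorithm.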

    Read the article

  • Android library to get pitch from WAV file

    - by Sakura
    I have a list of sampled data from the WAV file. I would like to pass these values into a library and get the frequency of the music played in the WAV file. For now, I will have 1 frequency in the WAV file and I would like to find a library that is compatible with Android. I understand that I need to use FFT to get the frequency domain. Are there any good libraries for that? I found that KissFFT is quite popular, but I am not very sure how compatible it is with Android. Is there an easier, good library that can perform the task I want? EDIT: I tried to use JTransforms to get the FFT of the WAV file but always failed at getting the correct frequency of the file. Currently, the WAV file contains a sine curve of 440Hz, music note A4. However, I got the result as 441. Then I tried to get the frequency of G4, and I got the result as 882Hz, which is incorrect. The frequency of G4 is supposed to be 783Hz. Could it be due to not enough samples? If yes, how many samples should I take? //DFT DoubleFFT_1D fft = new DoubleFFT_1D(numOfFrames); double max_fftval = -1; int max_i = -1; double[] fftData = new double[numOfFrames * 2]; for (int i = 0; i < numOfFrames; i++) { // copying audio data to the fft data buffer, imaginary part is 0 fftData[2 * i] = buffer[i]; fftData[2 * i + 1] = 0; } fft.complexForward(fftData); for (int i = 0; i < fftData.length; i += 2) { // complex numbers -> vectors, so we compute the length of the vector, which is sqrt(realpart^2+imaginarypart^2) double vlen = Math.sqrt((fftData[i] * fftData[i]) + (fftData[i + 1] * fftData[i + 1])); //fd.append(Double.toString(vlen)); // fd.append(","); if (max_fftval < vlen) { // if this length is bigger than our stored biggest length max_fftval = vlen; max_i = i; } } //double dominantFreq = ((double)max_i / fftData.length) * sampleRate; double dominantFreq = (max_i/2.0) * sampleRate / numOfFrames; fd.append(Double.toString(dominantFreq)); Can someone help me out? EDIT2: I managed to fix the problem mentioned above by increasing the number of samples to 100000; however, sometimes I am getting the overtones as the frequency. Any idea how to fix it? Should I use Harmonic Product Spectrum or autocorrelation algorithms?
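
    Two observations, offered as a sketch rather than a definitive fix: the FFT bin spacing is sampleRate / numOfFrames, so short buffers cannot separate nearby pitches, and picking the single largest bin will happily land on an overtone. The Harmonic Product Spectrum is one standard remedy: multiply the magnitude spectrum by downsampled copies of itself so that only the fundamental reinforces itself. A Java sketch operating on a magnitude array like the vlen values already computed above (names are illustrative):

        class PitchEstimator {
            // magnitude[k] is the FFT magnitude of bin k; bin k corresponds to k * sampleRate / n Hz.
            // harmonics is the number of spectrum copies to multiply (3 to 5 is typical).
            static double dominantFrequencyHps(double[] magnitude, double sampleRate, int n, int harmonics) {
                int limit = magnitude.length / harmonics;
                int bestBin = 1;
                double bestScore = -1.0;
                for (int bin = 1; bin < limit; bin++) {
                    double score = 1.0;
                    for (int h = 1; h <= harmonics; h++) {
                        score *= magnitude[bin * h];       // downsampled spectra line up only at the fundamental
                    }
                    if (score > bestScore) {
                        bestScore = score;
                        bestBin = bin;
                    }
                }
                return bestBin * sampleRate / n;
            }
        }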

    Read the article

  • Initialize a Variable Again.

    - by SoulBeaver
    That may sound a little confusing. Basically, I have a function CCard newCard() { /* Used to store the string variables intermittantly */ std::stringstream ssPIN, ssBN; int picker1, picker2; int pin, bankNum; /* Choose 5 random variables, store them in stream */ for( int loop = 0; loop < 5; ++loop ) { picker1 = rand() % 8 + 1; picker2 = rand() % 8 + 1; ssPIN << picker1; ssBN << picker2; } /* Convert them */ ssPIN >> pin; ssBN >> bankNum; CCard card( pin, bankNum ); return card; } that creates a new CCard variable and returns it to the caller CCard card = newCard(); My teacher advised me that doing this is a violation of OOP principles and has to be put in the class. He told me to use this method as a constructor. Which I did: CCard::CCard() { m_Sperre = false; m_Guthaben = rand() % 1000; /* Work */ /* Convert them */ ssPIN >> m_Geheimzahl; ssBN >> m_Nummer; } All variables with m_ are member variables. However, the constructor works when I initialize the card normally CCard card(); at the start of the program. However, I also have a function, that is supposed to create a new card and return it to the user, this function is now broken. The original command: card = newCard(); isn't available anymore, and card = new CCard(); doesn't work. What other options do I have? I have a feeling using the constructor won't work, and that I probably should just create a class method newCard, but I want to see if it is somehow at all possible to do it the way the teacher wanted. This is creating a lot of headaches for me. I told the teacher that this is a stupid idea and not everything has to be classed in OOP. He has since told me that Java or C# don't allow code outside of classes, which sounds a little incredible. Not sure that you can do this in C++, especially when templated functions exist, or generic algorithms. Is it true that this would be bad code for OOP in C++ if I didn't force it into a class?

    Read the article

  • Blending pixels from Two Bitmaps

    - by MarkPowell
    I'm beating my head against a wall here, and I'm fairly certain I'm doing something stupid, so time to make my stupidity public. I'm trying to take two images, blend them together into a third image using standard blending algorithms (Hardlight, softlight, overlay, multiply, etc). Because Android does not have such blend properties build in, I've gone down the path of taking each pixel and combine them using an algorithm. However, the results are garbage. Any help would be appreciated. Below is the code, which I've tried to strip out all the "junk", but some may have made it through. I'll clean it up if something isn't clear. Bitmap src = BitmapFactory.decodeResource(getResources(), R.drawable.base, options); Bitmap mutableBitmap = src.copy(Bitmap.Config.RGB_565, true); int imageId = getResources().getIdentifier("drawable/" + filter, null, getPackageName()); Bitmap filterBitmap = BitmapFactory.decodeResource(getResources(), imageId, options); float scaleWidth = ((float) mutableBitmap.getWidth()) / filterBitmap.getWidth(); float scaleHeight = ((float) mutableBitmap.getHeight()) / filterBitmap.getHeight(); IntBuffer buffSrc = IntBuffer.allocate(src.getWidth() * src.getHeight()); mutableBitmap.copyPixelsToBuffer(buffSrc); buffSrc.rewind(); IntBuffer buffFilter = IntBuffer.allocate(resizedFilterBitmap.getWidth() * resizedFilterBitmap.getHeight()); resizedFilterBitmap.copyPixelsToBuffer(buffFilter); buffFilter.rewind(); IntBuffer buffOut = IntBuffer.allocate(src.getWidth() * src.getHeight()); buffOut.rewind(); while (buffOut.position() < buffOut.limit()) { int filterInt = buffFilter.get(); int srcInt = buffSrc.get(); int alphaValueFilter = Color.alpha(filterInt); int redValueFilter = Color.red(filterInt); int greenValueFilter = Color.green(filterInt); int blueValueFilter = Color.blue(filterInt); int alphaValueSrc = Color.alpha(srcInt); int redValueSrc = Color.red(srcInt); int greenValueSrc = Color.green(srcInt); int blueValueSrc = Color.blue(srcInt); int alphaValueFinal = convert(alphaValueFilter, alphaValueSrc); int redValueFinal = convert(redValueFilter, redValueSrc); int greenValueFinal = convert(greenValueFilter, greenValueSrc); int blueValueFinal = convert(blueValueFilter, blueValueSrc); int pixel = Color.argb(alphaValueFinal, redValueFinal, greenValueFinal, blueValueFinal); buffOut.put(pixel); } buffOut.rewind(); mutableBitmap.copyPixelsFromBuffer(buffOut); BitmapDrawable drawable = new BitmapDrawable(getResources(), mutableBitmap); imageView.setImageDrawable(drawable); } int convert (int in1, int in2) { //simple multiply for example return in1 * in2 / 255; }
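
    A likely culprit: the bitmaps are copied in RGB_565, so the ints read back through copyPixelsToBuffer are packed 5-6-5 values, and Color.red()/green()/blue() (which expect ARGB_8888-style ints) return garbage for them. Bitmap.getPixels(), by contrast, always hands back ARGB-packed colours regardless of the underlying config. A sketch of a multiply blend along those lines, assuming both bitmaps have already been scaled to the same size:

        import android.graphics.Bitmap;
        import android.graphics.Color;

        class BlendUtils {
            // Multiply-blend `filter` onto `base`; both bitmaps are assumed to be the same size.
            static Bitmap multiplyBlend(Bitmap base, Bitmap filter) {
                int w = base.getWidth();
                int h = base.getHeight();
                int[] basePx = new int[w * h];
                int[] filterPx = new int[w * h];
                base.getPixels(basePx, 0, w, 0, 0, w, h);     // getPixels always yields ARGB-packed ints
                filter.getPixels(filterPx, 0, w, 0, 0, w, h);
                for (int i = 0; i < basePx.length; i++) {
                    int a = Color.alpha(basePx[i]) * Color.alpha(filterPx[i]) / 255;
                    int r = Color.red(basePx[i])   * Color.red(filterPx[i])   / 255;
                    int g = Color.green(basePx[i]) * Color.green(filterPx[i]) / 255;
                    int b = Color.blue(basePx[i])  * Color.blue(filterPx[i])  / 255;
                    basePx[i] = Color.argb(a, r, g, b);       // other blend modes only change these four lines
                }
                Bitmap out = Bitmap.createBitmap(w, h, Bitmap.Config.ARGB_8888);
                out.setPixels(basePx, 0, w, 0, 0, w, h);
                return out;
            }
        }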

    Read the article

  • Primary language - C++/Qt, C#, Java?

    - by Airjoe
    I'm looking for some input, but let me start with a bit of background (for tl;dr skip to end). I'm an IT major with a concentration in networking. While I'm not a CS major nor do I want to program as a vocation, I do consider myself a programmer and do pretty well with the concepts involved. I've been programming since about 6th grade, started out with a proprietary game creation language that made my transition into C++ at college pretty easy. I like to make programs for myself and friends, and have been paid to program for local businesses. A bit about that- I wrote some programs for a couple local businesses in my senior year in high school. I wrote management systems for local shops (inventory, phone/pos orders, timeclock, customer info, and more stuff I can't remember). It definitely turned out to be over my head, as I had never had any formal programming education. It was a great learning experience, but damn was it crappy code. Oh yeah, by the way, it was all vb6. So, I've used vb6 pretty extensively, I've used c++ in my classes (intro to programming up to algorithms), used Java a little bit in another class (had to write a ping client program, pretty easy) and used Java for some simple Project Euler problems to help learn syntax and such when writing the program for the class. I've also used C# a bit for my own simple personal projects (simple programs, one which would just generate an HTTP request on a list of websites and notify if one responded unexpectedly or not at all, and another which just held a list of things to do and periodically reminded me to do them), things I would've written in vb6 a year or two ago. I've just started using Qt C++ for some undergrad research I'm working on. Now I've had some formal education, I [think I] understand organization in programming a lot better (I didn't even use classes in my vb6 programs where I really should have), how it's important to structure code, split into functions where appropriate, document properly, efficiency both in memory and speed, dynamic and modular programming etc. I was looking for some input on which language to pick up as my "primary". As I'm not a "real programmer", it will be mostly hobby projects, but will include some 'real' projects I'm sure. From my perspective: QtC++ and Java are cross platform, which is cool. Java and C# run in a virtual machine, but I'm not sure if that's a big deal (something extra to distribute, possibly a bit slower? I think Qt would require additional distributables too, right?). I don't really know too much more than this, so I appreciate any help, thanks! TL;DR Am an avocational programmer looking for a language, want quick and straight forward development, liked vb6, will be working with database driven GUI apps- should I go with QtC++, Java, C#, or perhaps something else?

    Read the article

  • Java: Combination of recursive loops which has different FOR loop inside; Output: FOR loops indexes

    - by vvinjj
    Recursion is currently a fresh and difficult topic for me; however, I need to use it in one of my algorithms. Here is the challenge: I need a method where I specify the number of recursions (the number of nested FOR loops) and the number of iterations for each FOR loop. The result should show me something similar to a counter, except that each column of the counter is limited to a specific number. ArrayList<Integer> specs= new ArrayList<Integer>(); specs.add(5); //for(int i=0 to 5; i++) specs.add(7); specs.add(9); specs.add(2); specs.add(8); specs.add(9); public void recursion(ArrayList<Integer> specs){ //number of nested loops will be equal to: specs.size(); //each item in specs, specifies the For loop max count e.g: //First outside loop will be: for(int i=0; i< specs.get(0); i++) //Second loop inside will be: for(int i=0; i< specs.get(1); i++) //... } The results should be similar to the output of this manually nested loop: int[] i; i = new int[7]; for( i[6]=0; i[6]<5; i[6]++){ for( i[5]=0; i[5]<7; i[5]++){ for(i[4] =0; i[4]<9; i[4]++){ for(i[3] =0; i[3]<2; i[3]++){ for(i[2] =0; i[2]<8; i[2]++){ for(i[1] =0; i[1]<9; i[1]++){ //... System.out.println(i[1]+" "+i[2]+" "+i[3]+" "+i[4]+" "+i[5]+" "+i[6]); } } } } } } I have already spent 3 days on this and still have no results; I searched the internet, but the examples are too different. Therefore, I am posting a programming question on the internet for the first time in my life. Thank you in advance; you are free to change the code for efficiency, I just need the same results.
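
    The usual pattern is a single recursive method that runs its own for loop up to the limit for its depth, stores the current index in an array, and recurses; reaching the deepest level plays the role of the innermost loop body. A minimal Java sketch that prints the same index combinations as the hand-written nested loops (the column ordering in the println is up to you):

        import java.util.Arrays;
        import java.util.List;

        class NestedLoops {
            // One recursive method replaces all the hand-written nested for loops.
            // limits.get(depth) is the iteration count of the loop at that depth.
            static void recursion(List<Integer> limits, int depth, int[] indices) {
                if (depth == limits.size()) {
                    System.out.println(Arrays.toString(indices)); // innermost loop body: one full index combination
                    return;
                }
                for (int i = 0; i < limits.get(depth); i++) {
                    indices[depth] = i;
                    recursion(limits, depth + 1, indices);        // recursing = entering the next nested loop
                }
            }

            public static void main(String[] args) {
                List<Integer> specs = Arrays.asList(5, 7, 9, 2, 8, 9);
                recursion(specs, 0, new int[specs.size()]);
            }
        }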

    Read the article

  • Approaches for Content-based Item Recommendations

    - by PartlyCloudy
    Hello, I'm currently developing an application where I want to group similar items. Items (like videos) can be created by users, and their attributes can be altered or extended later (like new tags). Instead of relying on users' preferences as most collaborative filtering mechanisms do, I want to compare item similarity based on the items' attributes (like similar length, similar colors, similar set of tags, etc.). The computation is necessary for two main purposes: suggesting x similar items for a given item, and clustering into groups of similar items. My application so far follows an asynchronous design, and I want to decouple this clustering component as far as possible. The creation of new items or the addition of new attributes for an existing item will be advertised by publishing events the component can then consume. Computations can be provided best-effort and "snapshotted", which means that I'm okay with the best result possible at a given point in time, although result quality will eventually increase. So I am now searching for appropriate algorithms to compute both similar items and clusters. An important constraint is scalability. Initially the application has to handle a few thousand items, but later millions of items might be possible as well. Of course, computations will then be executed on additional nodes, but the algorithm itself should scale. It would also be nice if the algorithm supported some kind of incremental mode on partial changes of the data. My initial thought of comparing each item with every other item and storing the numerical similarity sounds a little bit crude. Also, it requires n*(n-1)/2 entries for storing all similarities, and any change or new item will eventually cause n similarity computations. Thanks in advance! UPDATE tl;dr To clarify what I want, here is my targeted scenario: Users generate entries (think of documents) Users edit entry metadata (think of tags) And here is what my system should provide: List of similar entries to a given item as recommendation Clusters of similar entries Both calculations should be based on: The metadata/attributes of entries (i.e. usage of similar tags) Thus, the distance of two entries using appropriate metrics NOT based on user votes, preferences or actions (unlike collaborative filtering). Although users may create entries and change attributes, the computation should only take into account the items and their attributes, and not the users associated with them (just like a system where only items, and no users, exist). Ideally, the algorithm should support: permanent changes of attributes of an entry incrementally compute similar entries/clusters on changes scale something better than a simple distance table, if possible (because of the O(n²) space complexity)
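
    For the tag part of such an attribute-based similarity, a simple set-overlap measure such as Jaccard similarity is a common starting point; numeric attributes (length, colour histograms) can be normalised and compared separately, with the parts combined as a weighted sum. A tiny Java sketch of the tag component (names are illustrative):

        import java.util.HashSet;
        import java.util.Set;

        class TagSimilarity {
            // Jaccard similarity of two tag sets: |intersection| / |union|, in [0, 1].
            static double jaccard(Set<String> a, Set<String> b) {
                if (a.isEmpty() && b.isEmpty()) {
                    return 1.0;
                }
                Set<String> intersection = new HashSet<>(a);
                intersection.retainAll(b);
                Set<String> union = new HashSet<>(a);
                union.addAll(b);
                return (double) intersection.size() / union.size();
            }
        }

    To avoid the full O(n²) similarity table, an inverted index from tag to items restricts comparisons to items that share at least one tag, and it can be updated incrementally as attribute-change events arrive.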

    Read the article

  • string s; &s+1; Legal? UB?

    - by John Dibling
    Consider the following code: #include <cstdlib> #include <iostream> #include <string> #include <vector> #include <algorithm> #include <iterator> using namespace std; int main() { string myAry[] = { "Mary", "had", "a", "Little", "Lamb" }; const size_t numStrs = sizeof(myAry)/sizeof(myAry[0]); vector<string> myVec(&myAry[0], &myAry[numStrs]); copy( myVec.begin(), myVec.end(), ostream_iterator<string>(cout, " ")); return 0; } Of interest here is &myAry[numStrs]: numStrs is equal to 5, so &myAry[numStrs] points to something that doesn't exist: the sixth element in the array. There is another example of this in the above code: myVec.end(), which points to one-past-the-end of the vector myVec. It's perfectly legal to take the address of this element that doesn't exist. We know the size of string, so we know where the address of the 6th element of a C-style array of strings must point to. So long as we only evaluate this pointer and never dereference it, we're fine. We can even compare it to other pointers for equality. The STL does this all the time in algorithms that act on a range of iterators. The end() iterator points past the end, and the loops keep looping while a counter != end(). So now consider this: #include <cstdlib> #include <iostream> #include <string> #include <vector> #include <algorithm> #include <iterator> using namespace std; int main() { string myStr = "Mary"; string* myPtr = &myStr; vector<string> myVec2(myPtr, &myPtr[1]); copy( myVec2.begin(), myVec2.end(), ostream_iterator<string>(cout, " ")); return 0; } Is this code legal and well-defined? It is legal and well-defined to take the address of an array element past the end, as in &myAry[numStrs], so should it be legal and well-defined to pretend that myPtr is also an array?

    Read the article

  • hashing password giving different results

    - by geoff
    I am taking over a system that a previous developer wrote. The system has an administrator approve a user account, and when they do that the system uses the following method to hash a password and save it to the database. It sends the unhashed password to the user. When the user logs in, the system uses the exact same method to hash what the user enters and compares it to the database value. We've run into a couple of cases where the database entry doesn't match the user's entry when it should. So it appears that the method isn't always hashing the value the same way. Does anyone know whether this method of hashing is unreliable, and if so, how to make it reliable? Thanks. private string HashPassword(string password) { string hashedPassword = string.Empty; // Convert plain text into a byte array. byte[] plainTextBytes = Encoding.UTF8.GetBytes(password); // Allocate array, which will hold plain text and salt. byte[] plainTextWithSaltBytes = new byte[plainTextBytes.Length + SALT.Length]; // Copy plain text bytes into resulting array. for(int i = 0; i < plainTextBytes.Length; i++) plainTextWithSaltBytes[i] = plainTextBytes[i]; // Append salt bytes to the resulting array. for(int i = 0; i < SALT.Length; i++) plainTextWithSaltBytes[plainTextBytes.Length + i] = SALT[i]; // Because we support multiple hashing algorithms, we must define // hash object as a common (abstract) base class. We will specify the // actual hashing algorithm class later during object creation. HashAlgorithm hash = new SHA256Managed(); // Compute hash value of our plain text with appended salt. byte[] hashBytes = hash.ComputeHash(plainTextWithSaltBytes); // Create array which will hold hash and original salt bytes. byte[] hashWithSaltBytes = new byte[hashBytes.Length + SALT.Length]; // Copy hash bytes into resulting array. for(int i = 0; i < hashBytes.Length; i++) hashWithSaltBytes[i] = hashBytes[i]; // Append salt bytes to the result. for(int i = 0; i < SALT.Length; i++) hashWithSaltBytes[hashBytes.Length + i] = SALT[i]; // Convert result into a base64-encoded string. hashedPassword = Convert.ToBase64String(hashWithSaltBytes); return hashedPassword; }
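
    Nothing in the method itself is nondeterministic: for a fixed SALT and the same UTF-8 input, SHA-256 always produces the same bytes, so the usual suspects are a SALT that differs between the approval and login code paths (for example, regenerated per run), the password being trimmed or re-encoded in one place but not the other, or a database column that truncates the Base64 string. For comparison, here is the same salt-append, hash, Base64-encode scheme sketched in Java (the salt value shown is illustrative); given identical input and salt it returns the same string every time:

        import java.nio.charset.StandardCharsets;
        import java.security.MessageDigest;
        import java.security.NoSuchAlgorithmException;
        import java.util.Base64;

        class PasswordHasher {
            private static final byte[] SALT = {12, 34, 56, 78};   // illustrative fixed salt; must be identical everywhere

            static String hashPassword(String password) throws NoSuchAlgorithmException {
                byte[] plain = password.getBytes(StandardCharsets.UTF_8);
                byte[] salted = new byte[plain.length + SALT.length];
                System.arraycopy(plain, 0, salted, 0, plain.length);
                System.arraycopy(SALT, 0, salted, plain.length, SALT.length);
                byte[] hash = MessageDigest.getInstance("SHA-256").digest(salted);
                byte[] hashWithSalt = new byte[hash.length + SALT.length];
                System.arraycopy(hash, 0, hashWithSalt, 0, hash.length);
                System.arraycopy(SALT, 0, hashWithSalt, hash.length, SALT.length);
                return Base64.getEncoder().encodeToString(hashWithSalt);   // same value for the same password every time
            }
        }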

    Read the article

  • Best way to have common class shared by both C++ and Ruby?

    - by shuttle87
    I am currently working on a project where a team of us are designing a game, all of us are proficient in ruby and some (but not all) of us are proficient in c++. Initially we made the backend in ruby but we ported it to c++ for more speed. The c++ port of the backend has exactly the same features and algorithms as the original ruby code. However we still have a bunch of code in ruby that does useful things but we want it to now get the data from the c++ classes. Our first thought was that we could save some of the data structures in something like XML or redis and call that, but some of the developers don't like that idea. We don't need anything particularly complex data structures to be passed between the different parts of the code, just tuples, strings and ints. Is there any way of integrating the ruby code so that it can call the c++ stuff natively? Will we need to embed code? Will we have to make a ruby extension? If so are there any good resources/tutorials you could suggest? For example say we have this code in the c++ backend: class The_game{ private: bool printinfo; //print the player diagnostic info at the beginning if true int numplayers; std::vector<Player*> players; string current_action; int action_is_on; // the index of the player in the players array that the action is now on //more code here public: Table(std::vector<Player *> in_players, std::vector<Statistics *> player_stats ,const int in_numplayers); ~Table(); void play_game(); History actions_history; }; class History{ private: int action_sequence_number; std::vector<Action*> hand_actions; public: void print_history(); void add_action(Action* the_action_to_be_added); int get_action_sequence_number(){ return action_sequence_number;} bool history_actions_are_equal(); int last_action_size(int street,int number_of_actions_ago); History(); ~History(); }; Is there any way to natively call something in the actions_history via The_game object in ruby? (The objects in the original ruby code all had the same names and functionality) By this I mean: class MyRubyClass def method1(arg1) puts arg1 self.f() # ... but still available puts cpp_method.the_current_game.actions_history.get_action_sequence_number() end # Constructor: def initialize(arg) puts "In constructor with arg #{arg}" #get the c++ object here and call it cpp_method end end Is this possible? Any advice or suggestions are appreciated.

    Read the article

  • Oracle Data Mining a Star Schema: Telco Churn Case Study

    - by charlie.berger
    There is a complete and detailed Telco Churn case study "How to" Blog Series just posted by Ari Mozes, ODM Dev. Manager.  In it, Ari provides detailed guidance in how to leverage various strengths of Oracle Data Mining including the ability to: mine Star Schemas and join tables and views together to obtain a complete 360 degree view of a customer combine transactional data e.g. call record detail (CDR) data, etc. define complex data transformation, model build and model deploy analytical methodologies inside the Database  His blog is posted in a multi-part series.  Below are some opening excerpts for the first 3 blog entries.  This is an excellent resource for any novice to skilled data miner who wants to gain competitive advantage by mining their data inside the Oracle Database.  Many thanks Ari! Mining a Star Schema: Telco Churn Case Study (1 of 3) One of the strengths of Oracle Data Mining is the ability to mine star schemas with minimal effort.  Star schemas are commonly used in relational databases, and they often contain rich data with interesting patterns.  While dimension tables may contain interesting demographics, fact tables will often contain user behavior, such as phone usage or purchase patterns.  Both of these aspects - demographics and usage patterns - can provide insight into behavior.Churn is a critical problem in the telecommunications industry, and companies go to great lengths to reduce the churn of their customer base.  One case study1 describes a telecommunications scenario involving understanding, and identification of, churn, where the underlying data is present in a star schema.  That case study is a good example for demonstrating just how natural it is for Oracle Data Mining to analyze a star schema, so it will be used as the basis for this series of posts...... Mining a Star Schema: Telco Churn Case Study (2 of 3) This post will follow the transformation steps as described in the case study, but will use Oracle SQL as the means for preparing data.  Please see the previous post for background material, including links to the case study and to scripts that can be used to replicate the stages in these posts.1) Handling missing values for call data recordsThe CDR_T table records the number of phone minutes used by a customer per month and per call type (tariff).  For example, the table may contain one record corresponding to the number of peak (call type) minutes in January for a specific customer, and another record associated with international calls in March for the same customer.  This table is likely to be fairly dense (most type-month combinations for a given customer will be present) due to the coarse level of aggregation, but there may be some missing values.  Missing entries may occur for a number of reasons: the customer made no calls of a particular type in a particular month, the customer switched providers during the timeframe, or perhaps there is a data entry problem.  In the first situation, the correct interpretation of a missing entry would be to assume that the number of minutes for the type-month combination is zero.  In the other situations, it is not appropriate to assume zero, but rather derive some representative value to replace the missing entries.  The referenced case study takes the latter approach.  
The data is segmented by customer and call type, and within a given customer-call type combination, an average number of minutes is computed and used as a replacement value.In SQL, we need to generate additional rows for the missing entries and populate those rows with appropriate values.  To generate the missing rows, Oracle's partition outer join feature is a perfect fit.  select cust_id, cdre.tariff, cdre.month, minsfrom cdr_t cdr partition by (cust_id) right outer join     (select distinct tariff, month from cdr_t) cdre     on (cdr.month = cdre.month and cdr.tariff = cdre.tariff);   ....... Mining a Star Schema: Telco Churn Case Study (3 of 3) Now that the "difficult" work is complete - preparing the data - we can move to building a predictive model to help identify and understand churn.The case study suggests that separate models be built for different customer segments (high, medium, low, and very low value customer groups).  To reduce the data to a single segment, a filter can be applied: create or replace view churn_data_high asselect * from churn_prep where value_band = 'HIGH'; It is simple to take a quick look at the predictive aspects of the data on a univariate basis.  While this does not capture the more complex multi-variate effects as would occur with the full-blown data mining algorithms, it can give a quick feel as to the predictive aspects of the data as well as validate the data preparation steps.  Oracle Data Mining includes a predictive analytics package which enables quick analysis. begin  dbms_predictive_analytics.explain(   'churn_data_high','churn_m6','expl_churn_tab'); end; /select * from expl_churn_tab where rank <= 5 order by rank; ATTRIBUTE_NAME       ATTRIBUTE_SUBNAME EXPLANATORY_VALUE RANK-------------------- ----------------- ----------------- ----------LOS_BAND                                      .069167052          1MINS_PER_TARIFF_MON  PEAK-5                   .034881648          2REV_PER_MON          REV-5                    .034527798          3DROPPED_CALLS                                 .028110322          4MINS_PER_TARIFF_MON  PEAK-4                   .024698149          5From the above results, it is clear that some predictors do contain information to help identify churn (explanatory value > 0).  The strongest uni-variate predictor of churn appears to be the customer's (binned) length of service.  The second strongest churn indicator appears to be the number of peak minutes used in the most recent month.  The subname column contains the interior piece of the DM_NESTED_NUMERICALS column described in the previous post.  By using the object relational approach, many related predictors are included within a single top-level column. .....   NOTE:  These are just EXCERPTS.  Click here to start reading the Oracle Data Mining a Star Schema: Telco Churn Case Study from the beginning.    

    Read the article

  • CodePlex Daily Summary for Monday, June 14, 2010

    CodePlex Daily Summary for Monday, June 14, 2010New ProjectsBD File Hash: BD File Hash is a convenient file hash and hash compare tool for Windows which currently works with MD5, SHA-1, and SHA-256 algorithms. FileScan: This is an application that searches through a drive or directory structure for files matching a filter. This project was converted from VB to ...genesis9: genesis9HeinanOS: HeinanOS is an operating system developed mainly in C++. HeinanOS is a light OS (1.44 MB image) with a lot of capabilites and many more are being ...MediaBrowserWS - Creates a Web Service for the popular MediaBrowser plugin: Creates a web service in Media Center for accessing your MediaBrowser collection. Allows for external devices (Tablets/phones/laptops) to access a ...MME: New Edition of Managed Menu Extensions for Visual Studio 2010 The Main goal of "MME" is to provide easy access to adding Right Click menus in the ...MVMMapper: Generate the ViewModel and its mapping to the Model when implementing MVVM in .NET. Developed using T4 templates. Current version supports Silver...ProjectArDotNet: Si te agarro te parto! Si te agarro te emperno no me importa que seas menor de edad!Scriptagility for DotNetNuke: Scriptagility is a DotNetNuke module for Javascript developers. This module provides dynamic client scripting infrastructure for developing javascr...simpleLinux Distro: SimpleLinux. is a Linux distributions that is easy to use. Simple Linux website: http://simplelinux.tkTag Cloud Control for asp.net: Tag Cloud Control for asp.net allows the user to display the most important keywords to display in tag cloud. Each Tag has it own navigation url to...thefreeimdb: fsadie qwUppityUp: UppityUp is a simple and light-weight tray application which monitors a remote server and shows a notification when it comes online. This is usefu...Vivid3D 2 - DirectX 10 3D ToolKit: The sequel to my first ever engine wrote several years ago. It is not based on it in anyway. VSIDev: VSI DevXTQXK_WORK: Actionscript 3.0东坡博客: 这是一个ASP。net mvc 2博客。New Releases.NET Extensions - Extension Methods Library: Release 2010.08: Added extension methods for Bitmap manipulation (scaling for now): - Bitmap.ScaleToSize() - Bitmap.ScaleToSizeProportional() - Bitmap.ScaleProport...Black Falcon Software's Database Data-Access-Layers: “SQLHELPER”, “ORAHELPER” - Handling Binary Data: See attached document...BTech Networking Library: BTech Networking Library: Same as pervious just new namespace, extended networking coming soon!!!Community Forums NNTP bridge: Community Forums NNTP Bridge V37: Release of the Community Forums NNTP Bridge to access the social and anwsers MS forums with a single, open source NNTP bridge. This release has ad...Generic Entity Model 2: GEM2 build 54383: This is second BETA release of GEM2! Please see source code change sets for updates! Following implementation is not included in this release: My...Hades: Projet Hadès - Official Demo - Version 0.1.0 Beta: ---------------------------------------------------------------------------- - Projet Hadès - Official Demo - Version 0.1.0 Beta ------------------...HeinanOS: HeinanOS M1 Source Code: You can download HeinanOS M1 Source Code and contribute to HeinanOS development! Be aware that you should not use this code for your own systems! ...HeinanOS: Milestone 1: This is the first major release for HeinanOS 1.0 Please note this is a PRE-RELEASE! 
This release includes the following features: -Bootable DOS-...HKGolden Express: HKGoldenExpress (Build 201006131900): New features: (None) Bug fix: Incorrect message submit date of message/ replies. (Note: Showing message submit date is enabled since Build 20100...HKGolden Express: HKGoldenExpress (Build 201006140110): New features: (None) Bug fix: (None) Improvements: (None) Other changes: Set time zone of message date as Hong Kong. Adjusted the format of messa...MediaCoder.NET: MediaCoder.NET v1.0 Beta 1.5: Installer file for MediaCoder.NET v1.0 beta 1.5. Now converts multiple files.MME: First release: Features of this release 1. One installer MME.msi. However you can also install MMEMenuManagerSetup.vsix which installs a project template that e...MSBuild Launch Pad (mPad): 1.1 Beta 1: Platform selection box is added.MVMMapper: MVMMapper Release v 1.0.1: This release has no downloadable documentation. Please use the Documentation section to get started.NginxTray: NginxTray 0.7 RC2: NginxTray 0.7 RC2PowerAuras: PowerAuras-3.0.0K-beta3: New Auras: Item Name Equipment Slot Tracking Changes from beta1 5 new aura textures Fixed Tracking bug Added graphical equipment slot sele...PowerAuras: PowerAuras-3.0.0K-beta4: New Auras: Item Name Equipment Slot Tracking Changes from beta1 5 new aura textures Fixed Tracking bug Added graphical equipment slot sele...Scriptagility for DotNetNuke: Scriptagility 1.0 (Beta): Initial public release please evaluate and feedbackSharpDevelop: SharpDevelop 4.0 Beta 1: Release notes: http://community.sharpdevelop.net/forums/t/11388.aspxsimpleLinux Distro: Project X3: This is an example of download for simpleLinuxSOAPI - StackOverflow API Parser/Wrapper Generator: SOAPI Beta 3: The SOAPI Beta 3 download will be made availabe later today when the initial documentation is complete. The previously available Beta 1 download h...Sofa: Initial release V1.0: This is the first release of Sofa. As it is made of code being previously used, as we tested it is a stable release. But bugs are always possible,...Tag Cloud Control for asp.net: Tag Cloud Control for asp.net: Tag Cloud Control for asp.net allows the user to display the most important keywords to display in tag cloud. Each Tag has it own navigation url to...UppityUp: UppityUp v0.1: First functional version, supports monitoring availability by ping (ICMP) requests. Fit for general use. Consists of one standalone .exe file - no...VCC: Latest build, v2.1.30613.0: Automatic drop of latest buildWindStyle ExifInfo for Windows Live Writer: 1.1.0.0: Add: Multiple Language(English and Simplified Chinese); Add: Insert multiple files; Fix: Error when insert pictures without Exif info; Update: Icon...Work Recorder - Hold on own time!: WorkRecorder 1.2: +Add a whole day chartXsltDb - DotNetNuke Module Builder: 01.01.24: Syntax highlighting delivered!New samples for RadControls. 
On single page you can find RadTreeView, RadRating, RadChart, RadFormDecorator, RadEdito...xUnit.net Contrib: xunitcontrib 0.4 (ReSharper 5.0 RTM + dotCover): xunitcontrib release 0.4 (ReSharper runner) This release provides a test runner plugin for Resharper 5.0, 4.5 and 4.1, targetting all versions of x...Most Popular ProjectsCommunity Forums NNTP bridgeRIA Services EssentialsNeatUploadBxf (Basic XAML Framework)Agile Personal Development Methodology.NET Transactional File ManagerSOLID by exampleASP.NET MVC Time PlannerWEI ShareSiverlight ProjectMost Active ProjectsjQuery Library for SharePoint Web Servicespatterns & practices – Enterprise LibraryNB_Store - Free DotNetNuke Ecommerce Catalog ModuleRhyduino - Arduino and Managed CodeCommunity Forums NNTP bridgeCassandraemonBlogEngine.NETLightweight Fluent WorkflowMediaCoder.NETAndrew's XNA Helpers

    Read the article

  • What is SharePoint Out of the Box?

    - by Bil Simser
    It’s always fun in the blog-o-sphere and SharePoint bloggers always keep the pot boiling. Bjorn Furuknap recently posted a blog entry titled Why Out-of-the-Box Makes No Sense in SharePoint, quickly followed up by a rebuttal by Marc Anderson on his blog. Okay, now that we have all the players and the stage what’s the big deal? Bjorn started his post saying that you don’t use “out-of-the-box” (OOTB) SharePoint because it makes no sense. I have to disagree with his premise because what he calls OOTB is basically installing SharePoint and admiring it, but not using it. In his post he lays claim that modifying say the OOTB contacts list by removing (or I suppose adding) a column, now puts you in a situation where you’re no longer using the OOTB functionality. Really? Side note. Dear Internet, please stop comparing building software to building houses. Or comparing software architecture to building architecture. Or comparing web sites to making dinner. Are you trying to dumb down something so the general masses understand it? Comparing a technical skill to a construction operation isn’t the way to do this. Last time I checked, most people don’t know how to build houses and last time I checked people reading technical SharePoint blogs are generally technical people that understand the terms you use. Putting metaphors around software development to make it easy to understand is detrimental to the goal. </rant> Okay, where were we? Right, adding columns to lists means you are no longer using the OOTB functionality. Yeah, I still don’t get it. Another statement Bjorn makes is that using the OOTB functionality kills the flexibility SharePoint has in creating exactly what you want. IMHO this really flies in the absolute face of *where* SharePoint *really* shines. For the past year or so I’ve been leaning more and more towards OOTB solutions over custom development for the simple reason that its expensive to maintain systems and code and assets. SharePoint has enabled me to do this simply by providing the tools where I can give users what they need without cracking open up Visual Studio. This might be the fact that my day job is with a regulated company and there’s more scrutiny with spending money on anything new, but frankly that should be the position of any responsible developer, architect, manager, or PM. Do you really want to throw money away because some developer tells you that you need a custom web part when perhaps with some creative thinking or expectation setting with customers you can meet the need with what you already have. The way I read Bjorn’s terminology of “out-of-the-box” is install the software and tell people to go to a website and admire the OOTB system, but don’t change it! For those that know things like WordPress, DotNetNuke, SubText, Drupal or any of those content management/blogging systems, its akin to installing the software and setting up the “Hello World” blog post or page, then staring at it like it’s useful. “Yes, we are using WordPress!”. Then not adding a new post, creating a new category, or adding an About page. Perhaps I’m wrong in my interpretation. This leads us to what is OOTB SharePoint? To many people I’ve talked to the last few hours on twitter, email, etc. it is *not* just installing software but actually using it as it was fit for purpose. What’s the purpose of SharePoint then? 
It has many purposes, but using the OOTB templates Microsoft has given you the ability to collaborate on projects, author/share/publish documents, create pages, track items/contacts/tasks/etc. in a multi-user web based interface, and so on. Microsoft has pretty clear definitions of these different levels of SharePoint we’re talking about and I think it’s important for everyone to know what they are and what they mean. Personalization and Administration To me, this is the OOTB experience. You install the product and then are able to do things like create new lists, sites, edit and personalize pages, create new views, etc. Basically use the platform services available to you with Windows SharePoint Services (or SharePoint Foundation in 2010) to your full advantage. No code, no special tools needed, and very little user training required. Could you take someone who has never done anything in a website or piece of software and unleash them onto a site? Probably not. However I would argue that anyone who’s configured the Outlook reading layout or applied styles to a Word document probably won’t have too much difficulty in using SharePoint OUT OF THE BOX. Customization Here’s where things might get a bit murky but to me this is where you start looking at HTML/ASPX page code through SharePoint Designer, using jQuery scripts and plugging them into Web Part Pages via a Content Editor Web Part, and generally enhancing the site. The JavaScript debate might kick in here claiming it’s no different than C#, and frankly you can totally screw a site up with jQuery on a CEWP just as easily as you can with a C# delegate control deployed to the server file system. However (again, my blog, my opinion) the customization label comes in when I need to access the server (for example creating a custom theme) or have some kind of net-new element I add to the system that wasn’t there OOTB. It’s not content (like a new list or site), it’s code and does something functional. Development Here’s were the propeller hats come on and we’re talking algorithms and unit tests and compilers oh my. Software is deployed to the server, people are writing solutions after some kind of training (perhaps), there might be some specialized tools they use to craft and deploy the solutions, there’s the possibility of exceptions being thrown, etc. There are a lot of definitions here and just like customization it might get murky (do you let non-developers build solutions using development, i.e. jQuery/C#?). In my experience, it’s much more cost effective keeping solutions under the first two umbrellas than leaping into the third one. Arguably you could say that you can’t build useful solutions without *some* kind of code (even just some simple jQuery). I think you can get a *lot* of value just from using the OOTB experience and I don’t think you’re constraining your users that much. I’m not saying Marc or Bjorn are wrong. Like Obi-Wan stated, they’re both correct “from a certain point of view”. To me, SharePoint Out of the Box makes total sense and should not be dismissed. I just don’t agree with the premise that Bjorn is basing his statements on but that’s just my opinion and his is different and never the twain shall meet.

    Read the article

  • Is Berkeley DB a NoSQL solution?

    - by Gregory Burd
Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API; the calls across these APIs generally mimic the Berkeley DB C-API, which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s, as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node; even the most sophisticated databases only had simple fail-over, two-node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or writing your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM", mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage.

Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheap, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law; processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware, entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes caused a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage; new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. They were designed to address massive amounts of data, millions of requests per second, and to scale out across multiple systems.

The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record; it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple key/value look-ups or simple range queries, with only a few queries that required more complex joins.
They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and its design, inspired the next generation of NoSQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was "reliability at massive scale," so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database: either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, chose Berkeley DB Java Edition for its node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only, log-based on-disk storage, a similar type of storage to Berkeley DB's except that it is based on a hash table which must reside entirely in memory, rather than a btree which can live in memory or on disk.

Berkeley DB evolved too: we added high availability (HA) and a replication manager that makes it easy to set up replica groups. Berkeley DB's replication doesn't partition the data; every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget, allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node; Berkeley DB HA does not.

So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases, when you boil them down to a single node you still have to store the data in some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM: the distributed, sometimes-consistent, partitioned, scale-out storage that manages key/value or document sets and generally has some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, and even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely.
Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing; we could jump into this NoSQL market within a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...
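To make the "library, not a server" point concrete, here is a minimal sketch of node-local key/value storage written against the Berkeley DB C++ API (db_cxx.h). It is an illustration only: the file name and the key/value strings are made up, and error handling is pared down to almost nothing.

#include <db_cxx.h>  // Berkeley DB C++ API
#include <cstring>
#include <iostream>

int main()
{
    // Open (or create) a btree database file in the current directory.
    // There is no server process; the library runs inside this process.
    Db db(nullptr, 0);
    db.open(nullptr,       // no transaction
            "sample.db",   // illustrative file name
            nullptr,       // default sub-database
            DB_BTREE,      // access method
            DB_CREATE,     // create the file if it does not exist
            0);

    // Store a key/value pair.
    const char *k = "fruit";
    const char *v = "apple";
    Dbt key(const_cast<char *>(k), std::strlen(k) + 1);
    Dbt val(const_cast<char *>(v), std::strlen(v) + 1);
    db.put(nullptr, &key, &val, 0);

    // Read it back.
    Dbt found;
    if (db.get(nullptr, &key, &found, 0) == 0)
        std::cout << static_cast<char *>(found.get_data()) << std::endl;

    db.close(0);
    return 0;
}

Everything above happens inside the calling process; the distributed, partitioned, REST-fronted layers that define the NoSQL systems discussed here would have to be built on top of calls like these.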

    Read the article

  • Company Review: Google Products

Google, Inc. offers an array of products and services to all of its end-users. However, their search capabilities are the foundation of Google's current success and their primary business focus. Currently, Google offers over twenty different search applications that allow users to search the internet for books, maps, videos, images, products and much more. Their product decisions have allowed users' demands to be met while focusing on the free-based model. This allows users to access Google data free of charge and, along with the accuracy of the search results, indirectly gives Google a strong competitive advantage over other competitors. According to Google, Inc., they offer the following types of searching capabilities:

Alerts - Get email updates on the topics of your choice
Blog Search - Find blogs on your favorite topics
Books - Search the full text of books
Custom Search - Create a customized search experience for your community
Desktop - Search and personalize your computer
Dictionary - Search for definitions of words and phrases
Directory - Search the web, organized by topic or category
Earth - Explore the world from your computer
Finance - Business info, news and interactive charts
GOOG-411 - Find and connect for free with businesses from your phone
Images - Search for images on the web
Maps - View maps and directions
News - Search thousands of news stories
Patent Search - Search the full text of US Patents
Product Search - Search for stuff to buy
Scholar - Search scholarly papers
Toolbar - Add a search box to your browser
Trends - Explore past and present search trends
Videos - Search for videos on the web
Web Search - Search billions of web pages
Web Search Features - Find movies, music, stocks, books and more

Google's free-based business model is only one way it differentiates itself from its competition. There is also a strong focus on the accuracy of search results and the speed with which they are returned to the end-user. Quality function deployment (QFD) is a structured method used to help connect user needs to the design features of a project proposed to address those needs. This method is particularly useful in accounting for needs that are not easily articulated or precisely defined, according to the U.S. Department of Transportation Federal Highway Administration. Because QFD is so customer driven, Google is in a constant state of change, attempting to reengineer its search algorithms and other dependent systems so that end-users' requirements are constantly being met. Value engineering is a key example of this: Google is constantly trying to improve all aspects of its products, system maintainability, and system interoperability. Bridgefield Group defines value engineering as an organized methodology that identifies and selects the lowest lifecycle cost options in design, materials and processes that achieves the desired level of performance, reliability and customer satisfaction. In addition, it seeks to remove unnecessary costs in the above areas and is often a joint effort with cross-functional internal teams and relevant suppliers. Common issues that appear when developing large scale systems like Google's search applications include modular design of a product and/or service and providing accurate value analysis.
A design approach that adheres to four fundamental tenets of cohesiveness, encapsulation, self-containment, and high binding to design a system component as an independently operable unit subject to change is how the Open System Joint Task Force defines modular design. More specifically, M. S. Schmaltz defines modular software design as follows: rather than having a large collection of statements strung together in one partition of in-line code, we segment or divide the statements into logical groups called modules. Each module performs one or two tasks, and then passes control to another module. By breaking up the code into "bite-sized chunks", so to speak, we are able to better control the flow of data and control. This is especially true in large software systems. Value analysis is a process to evaluate products and services based on effectiveness, safety, and cost. Value analysis involves assessing the quality as well as the cost of a product or service, as defined by the Healthcare Financial Management Association. "Operations Management deals with the design and management of products, processes, services and supply chains. It considers the acquisition, development, and utilization of resources that firms need to deliver the goods and services their clients want." (MIT, 2010)

Google, Inc. encourages an open environment among all employees, also known as Googlers. This is reinforced by a cross-functional team, made up of members from multiple departments, assigned to every project so that every department, like marketing, finance, and quality assurance, has input on every project. In addition, Google is known for their openness to new ideas regardless of the status or seniority of an employee. In fact, Google allows 20% of an employee's time to be devoted to developing new ideas and/or pet projects. HumTech.com defines a cross-functional team as a collection of people with varied levels of skills and experience brought together to accomplish a task. As the name implies, cross-functional team members come from different organizational units. Cross-functional teams may be permanent or ad hoc.

Google's search application product strategy primarily focuses on mass customization. This allows Google to create a base search application and allows results to be returned to the end-users quickly based on specific parameters and search settings. In addition, they also store the data that is returned in case others desire the same results based on other end-users supplying the same customized settings. This allows Google to appear to render search results in virtually real time to the user while allowing for complete customization of the searching criteria. Greg Vogl, a professor at Uganda Martyrs University, defines mass customization as when a business gives its customers the opportunity to tailor its products or services to the customer's specifications. The IT staff at Google play a key role in ensuring that the search application's product strategy is maintained, simply because the IT staff designs, develops, and maintains all of their proprietary applications. In fact, they also maintain all network infrastructure to ensure that it is available to all end-users.
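The mass-customization point above is, at its core, a caching idea: once one user's customized query has been answered, the stored result can be reused for any later request that supplies the same settings. Purely as a toy illustration (the types and the RunExpensiveSearch function below are hypothetical, not anything Google has published), a parameter-keyed result cache might look like this:

#include <map>
#include <string>
#include <vector>

// Hypothetical customized search settings, serialized into a cache key.
struct SearchSettings
{
    std::string query;
    std::string language;
    bool safeSearch;

    std::string CacheKey() const
    {
        return query + "|" + language + "|" + (safeSearch ? "1" : "0");
    }
};

class ResultCache
{
public:
    // Returns the stored results when an identical customized query has been
    // answered before; otherwise runs the search and stores the results for reuse.
    std::vector<std::string> Search(const SearchSettings &settings)
    {
        const std::string key = settings.CacheKey();
        auto it = cache_.find(key);
        if (it != cache_.end())
            return it->second;  // reuse an earlier user's identical result
        std::vector<std::string> results = RunExpensiveSearch(settings);
        cache_[key] = results;
        return results;
    }

private:
    // Stand-in for the real search pipeline (hypothetical).
    std::vector<std::string> RunExpensiveSearch(const SearchSettings &s)
    {
        return { "result for: " + s.query };
    }

    std::map<std::string, std::vector<std::string>> cache_;
};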
References:
http://www.google.com/intl/en/options/
http://ops.fhwa.dot.gov/freight/publications/ftat_user_guide/sec5.htm
http://www.bridgefieldgroup.com/bridgefieldgroup/glos9.htm#V
http://www.acq.osd.mil/osjtf/termsdef.html
http://www.cise.ufl.edu/~mssz/Pascal-CGS2462/prog-dsn.html
http://www.hfma.org/publications/business_caring_newsletter/exclusives/Supply+and+Inventory+Terms+Defined.htm
http://mitsloan.mit.edu/omg/om-definition.php
http://www.humtech.com/opm/grtl/ols/ols3.cfm
http://www.gregvogl.net/courses/mis1/glossary.htm

    Read the article

  • My Code Kata–A Solution Kata

    - by Glav
There are many developers and coders out there who like to do code katas to keep their coding ability up to scratch and to practice their skills. I think it is a good idea. While I like the concept, I find them dead boring and of minimal purpose. Yes, they serve to hone your skills but that's about it. They are often quite abstract, in that they usually focus on a small problem set requiring specific solutions. That is fair enough, as that is how they are designed, but again, I find them quite boring. What I personally like to do is go for something a little larger and a little more fun. It takes a little more time and is not as easily executed as a kata, but it serves the same purposes from a practice perspective and allows me to continue to solve some problems that are not directly part of the initial goal. This means I can cover a broader learning range and have a bit more fun. If I am lucky, sometimes they even end up being useful tools.

With that in mind, I thought I'd share my current 'kata'. It is not really a code kata as it is too big. I prefer to think of it as a 'solution kata'. The code is on bitbucket here. What I wanted to do was create a kind of simplistic virtual world where I can create a player, or a class, stuff it into the world, and see if it survives and can navigate its way to the exit. Requirements were pretty simple:

- Must be able to define a map to describe the world using simple X,Y co-ordinates. Z co-ordinates as well if you feel like getting clever.
- Should have the concept of entrances, exits, solid blocks, and potentially other materials (again, if you want to get clever).
- A coder should be able to easily write a class which will act as an inhabitant of the world.
- An inhabitant will receive stimulus from the world in the form of the surrounding environment and be able to make a decision on an action, which it passes back to the 'world' for processing.
- At a minimum, an inhabitant will have sight and speed characteristics which determine how far they can 'see' in the world, and how fast they can move.
- Coders who write a really bad 'inhabitant' should not adversely affect the rest of the world.
- Should allow multiple inhabitants in the world.

So that was the solution I set out to build as a practice exercise and a little bit of fun. It had some interesting problems to solve and I figured, if it turned out ok, I could potentially use this as a 'developer test' for interviews: ask a potential coder to write a class for an inhabitant; show the coder the map they will navigate, but also mention that we will use their code to navigate a map they have not yet seen that is a little more complex. I have been playing with the solution for a short time now and have it working in basic concepts. Below is a screenshot using a very basic console visualiser that shows the map, boundaries, blocks, entrance, exit and players/inhabitants. The yellow asterisks '*' are the players, green 'O' the entrance, purple '^' the exit, and maroon/brown '#' are solid blocks. The players can move around at different speeds, knock into each other, and make directional movement decisions based on what they see and who is around them. It has been quite fun to write and it is also quite fun to develop different players to inject into the world. The code below shows a really simple implementation of an inhabitant that can work out what to do based on stimulus from the world. It is pretty simple and just tries to move in some direction if there is nothing blocking the path.
public class TestPlayer : LivingEntity
{
    public TestPlayer()
    {
        Name = "Beta Boy";
        LifeKey = Guid.NewGuid();
    }

    public override ActionResult DecideActionToPerform(EcoDev.Core.Common.Actions.ActionContext actionContext)
    {
        try
        {
            var action = new MovementAction();

            // move forward if we can
            if (actionContext.Position.ForwardFacingPositions.Length > 0)
            {
                if (CheckAccessibilityOfMapBlock(actionContext.Position.ForwardFacingPositions[0]))
                {
                    action.DirectionToMove = MovementDirection.Forward;
                    return action;
                }
            }
            if (actionContext.Position.LeftFacingPositions.Length > 0)
            {
                if (CheckAccessibilityOfMapBlock(actionContext.Position.LeftFacingPositions[0]))
                {
                    action.DirectionToMove = MovementDirection.Left;
                    return action;
                }
            }
            if (actionContext.Position.RearFacingPositions.Length > 0)
            {
                if (CheckAccessibilityOfMapBlock(actionContext.Position.RearFacingPositions[0]))
                {
                    action.DirectionToMove = MovementDirection.Back;
                    return action;
                }
            }
            if (actionContext.Position.RightFacingPositions.Length > 0)
            {
                if (CheckAccessibilityOfMapBlock(actionContext.Position.RightFacingPositions[0]))
                {
                    action.DirectionToMove = MovementDirection.Right;
                    return action;
                }
            }
            return action;
        }
        catch (Exception ex)
        {
            World.WriteDebugInformation("Player: " + Name, string.Format("Player Generated exception: {0}", ex.Message));
            throw;  // rethrow without resetting the original stack trace
        }
    }

    private bool CheckAccessibilityOfMapBlock(MapBlock block)
    {
        if (block == null ||
            block.Accessibility == MapBlockAccessibility.AllowEntry ||
            block.Accessibility == MapBlockAccessibility.AllowExit ||
            block.Accessibility == MapBlockAccessibility.AllowPotentialEntry)
        {
            return true;
        }
        return false;
    }
}

It is simple and it seems to work well. The world implementation itself decides the stimulus context that is passed to the inhabitant to make an action decision. All movement is carried out on separate threads and timed appropriately to be as fair as possible and to cater for additional skills such as speed, and eventually maybe stamina and strength, with actions like fighting. It is pretty fun to make up random maps and see how your inhabitant does. You can download the code from here. Along the way I have played with parallel extensions to make the compute-intensive stuff spread across all cores, had to heavily factor in the visibility of methods and properties (so the design of classes was paramount), and worked out movement algorithms that play fairly in the world and properly favour the players with higher abilities, as well as a host of other issues. So that is my 'solution kata'. If I keep going with it, I may develop a web interface for it where people can upload assemblies and watch their player within a web browser visualiser, and maybe even a map designer. What do you do to keep the fires burning?

    Read the article

  • Security in Software

The term security has many meanings based on the context and perspective in which it is used. Security from the perspective of software/system development is the continuous process of maintaining confidentiality, integrity, and availability of a system, sub-system, and system data. This definition at a very high level can be restated as the following: computer security is a continuous process dealing with confidentiality, integrity, and availability on multiple layers of a system.

Key Aspects of Software Security:

- Integrity
- Confidentiality
- Availability

Integrity within a system is the concept of ensuring that only authorized users can manipulate information, and only through authorized methods and procedures. An example of this can be seen in a simple lead management application. If the business decided that each sales member can only update their own leads in the system while sales managers can update all leads in the system, then an integrity violation would occur if a sales member attempted to update someone else's leads. It violates the business rule that leads can only be updated by the originating sales member.

Confidentiality within a system is the concept of preventing unauthorized access to specific information or tools. In a perfect world the very existence of confidential information/tools would be unknown to all those who do not have access. When this concept is applied within the context of an application, only the authorized information/tools will be available. If we look at the sales lead management system again, leads can only be updated by the originating sales members. Given this rule, we can say that all sales leads are confidential between the system and the sales person who entered the lead into the system. The other sales team members do not need to know about the leads, let alone need to access them.

Availability within a system is the concept of authorized users being able to access the system. A real world example can again be seen in the lead management system. If that system is hosted on a web server, then IP restriction can be put in place to limit access to the system based on the requesting IP address. If in this example all of the sales members were accessing the system from the 192.168.1.23 IP address, then removing access from all other IPs would be needed to ensure that improper access to the system is prevented while approved users can access the system from an authorized location. In essence, if the requesting user is not coming from an authorized IP address then the system will appear unavailable to them. This is one way of controlling where a system is accessed from.

Through the years several design principles have been identified as being beneficial when integrating security aspects into a system. These principles, in various combinations, allow a system to achieve the previously defined aspects of security based on generic architectural models.

Security Design Principles:

- Least Privilege
- Fail-Safe Defaults
- Economy of Mechanism
- Complete Mediation
- Open Design
- Separation of Privilege
- Least Common Mechanism
- Psychological Acceptability
- Defense in Depth

Least Privilege Design Principle: The Least Privilege design principle requires a minimalistic approach to granting user access rights to specific information and tools.
Additionally, access rights should be time based, so as to limit resource access to the time needed to complete necessary tasks. Granting access beyond this scope allows for unnecessary access and the potential for data to be updated out of the approved context. Assigning access rights this way limits system-damaging attacks from users, whether they are intentional or not. This principle attempts to limit data changes and prevents potential damage from occurring by accident or error by reducing the amount of potential interactions with a resource.

Fail-Safe Defaults Design Principle: The Fail-Safe Defaults design principle pertains to allowing access to resources based on granted access over access exclusion. This principle is a methodology for allowing resources to be accessed only if explicit access is granted to a user. By default users do not have access to any resources until access has been granted. This approach prevents unauthorized users from gaining access to a resource until access is given.

Economy of Mechanism Design Principle: The Economy of Mechanism design principle requires that systems should be designed as simple and small as possible. Design and implementation errors result in unauthorized access to resources that would not be noticed during normal use.

Complete Mediation Design Principle: The Complete Mediation design principle states that every access to every resource must be validated for authorization.

Open Design Design Principle: The Open Design design principle is the concept that the security of a system and its algorithms should not be dependent on the secrecy of its design or implementation.

Separation of Privilege Design Principle: The Separation of Privilege design principle requires that all approved resource access attempts be granted based on more than a single condition. For example, a user should be validated both for active status and for access to the specific resource.

Least Common Mechanism Design Principle: The Least Common Mechanism design principle declares that mechanisms used to access resources should not be shared.

Psychological Acceptability Design Principle: The Psychological Acceptability design principle refers to security mechanisms not making resources more difficult to access than if the security mechanisms were not present.

Defense in Depth Design Principle: The Defense in Depth design principle is the concept that layering resource access authorization verification in a system reduces the chance of a successful attack. This layered approach to resource authorization requires unauthorized users to circumvent each authorization attempt to gain access to a resource.

When designing a system that needs to meet a security quality attribute, architects need to consider the scope of security needs and the minimum required security qualities. Not every system will need to use all of the basic security design principles; each will use one or more in combination based on a company's and architect's threshold for system security, because the existence of security in an application adds an additional layer to the overall system and can affect performance. That is why a definition of minimum acceptable security is needed when a system is designed: this quality attribute needs to be factored in with the other system quality attributes so that the system in question adheres to all qualities based on the priorities of the qualities.

Resources: Barnum, Sean. Gegick, Michael. (2005). Least Privilege.
Retrieved on August 28, 2011 from https://buildsecurityin.us-cert.gov/bsi/articles/knowledge/principles/351-BSI.html Saltzer, Jerry. (2011). BASIC PRINCIPLES OF INFORMATION PROTECTION. Retrieved on August 28, 2011 from  http://web.mit.edu/Saltzer/www/publications/protection/Basic.html Barnum, Sean. Gegick, Michael. (2005). Defense in Depth. Retrieved on August 28, 2011 from  https://buildsecurityin.us-cert.gov/bsi/articles/knowledge/principles/347-BSI.html Bertino, Elisa. (2005). Design Principles for Security. Retrieved on August 28, 2011 from  http://homes.cerias.purdue.edu/~bhargav/cs526/security-9.pdf
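Returning to the lead management example used earlier for integrity and confidentiality, here is a small sketch of what Least Privilege and Complete Mediation can look like in code. The types and the rule are hypothetical; the point is only that every update is checked against the caller's rights, every time, rather than trusting the user interface to hide the button.

#include <stdexcept>
#include <string>

enum class Role { SalesMember, SalesManager };

struct User
{
    std::string id;
    Role role;
};

struct Lead
{
    std::string id;
    std::string ownerId;  // the sales member who originally entered the lead
    std::string notes;
};

// Least Privilege: a sales member may touch only their own leads;
// only managers receive the wider right.
bool CanUpdateLead(const User &user, const Lead &lead)
{
    if (user.role == Role::SalesManager)
        return true;
    return user.role == Role::SalesMember && lead.ownerId == user.id;
}

// Complete Mediation: every update path goes through the check, with no bypass.
void UpdateLeadNotes(const User &user, Lead &lead, const std::string &notes)
{
    if (!CanUpdateLead(user, lead))
        throw std::runtime_error("Not authorized to update this lead");
    lead.notes = notes;
}

Fail-Safe Defaults shows up in the same sketch: any case not explicitly granted falls through to a refusal.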

    Read the article

  • A Good Developer is So Hard to Find

    - by James Michael Hare
    Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones.  The problem is, it seems like no matter how hard we try to find the perfect way to separate the chaff from the wheat, inevitably someone will get hired who falls far short of expectations or someone will get passed over for missing a piece of trivia or a tricky brain teaser that could have been an excellent team member.   In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc) there often exists a much tighter bond between HR and the hiring development staff because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth.  It seems that if you ask two different people what makes a good developer, often you will get three different opinions.   With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process.  Their view can often be that development is simply a skill that one learns and then once aquired, that developer can produce widgets as good as the next like workers on an assembly-line floor.  On the other side, you have many developers that feel that software development is an art unto itself and that the ability to create the most pure design or know the most obscure of keywords or write the shortest-possible obfuscated piece of code is a good coder.  So is it a skill?  An Art?  Or something entirely in between?   Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point.  This just isn't so.  It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document.  I've interviewed candidates who could answer obscure syntax and keyword questions and once they were hired could not code effectively at all.  So development must be more than a skill.   But on the other end, we have art.  Is development an art?  Is our end result to produce art?  I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code most perform some stated function with accuracy and efficiency and maintainability.  None of these three things have anything to do with art, per se.  Art is beauty for its own sake and is a wonderful thing.  But if you apply that same though to development it just doesn't hold.  I've had developers tell me that all that matters is the end result and how you code it is entirely part of the art and I couldn't disagree more.  Yes, the end result, the accuracy, is the prime criteria to be met.  But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon.  
Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution.  So development must be something less than art.   In the end, I think I feel like development is a matter of craftsmanship.  We use our tools and we use our skills and set about to construct something that satisfies a purpose and yet is also elegant and efficient.  There is skill involved, and there is an art, but really it boils down to being able to craft code.  Crafting code is far more than writing code.  Anyone can write code if they know the syntax, but so few people can actually craft code that solves a purpose and craft it well.  So this is what I want to find, I want to find code craftsman!  But how?   I used to ask coding-trivia questions a long time ago and many people still fall back on this.  The thought is that if you ask the candidate some piece of coding trivia and they know the answer it must follow that they can craft good code.  For example:   What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const-instance of that class/struct?  (answer: mutable)   So what do we prove if a candidate can answer this?  Only that they know what mutable means.  One would hope that this would infer that they'd know how to use it, and more importantly when and if it should ever be used!  But it rarely does!  The problem with triva questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kind of tests because they are short and easy to defend if a legal issue arrises on hiring decisions.  After all it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test.  But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds.  There are times I've hired candidates who knew every trivia question I could throw out them and couldn't craft.  And then there are times I've interviewed candidates who failed all my trivia but who I took a chance on who were my best finds ever.    So if not trivia, then what?  Brain teasers?  The thought is, these type of questions measure the thinking power of a candidate.  The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google em) to recognize the "catch" or knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates.  In these cases, I think testing someone with brain teasers only tests their ability to answer brain teasers, not the ability to craft code. So how do we measure someone's ability to craft code?  Here's a novel idea: have them code!  Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil and have them construct a piece of code.  It just makes sense that if we're going to hire someone to code we should actually watch them code.  
When they're done, we can judge them on several criteria:

- Correctness - does the candidate's solution accurately solve the problem proposed?
- Accuracy - is the candidate's solution reasonably syntactically correct?
- Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job?
- Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability?
- Persona - are they eager and willing or aloof and egotistical? Will they work well within your team?

It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.
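As a hypothetical example of the kind of exercise this approach produces, ask the candidate to collapse runs of repeated characters in a string ("aaabccd" becomes "abcd") and then walk the result through the criteria above: does it work, is it syntactically sound, is it a sensible single pass, and could a teammate read it six months later. A reasonable whiteboard answer might look like this:

#include <string>

// Collapse runs of the same character: "aaabccd" -> "abcd".
// Single pass, O(n) time, O(n) space for the result.
std::string CollapseRuns(const std::string &input)
{
    std::string result;
    result.reserve(input.size());
    for (char c : input)
    {
        if (result.empty() || result.back() != c)
            result.push_back(c);
    }
    return result;
}

None of this is about trick knowledge; it is a small, honest window into how the candidate crafts code.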

    Read the article

< Previous Page | 66 67 68 69 70 71 72 73 74  | Next Page >