Search Results

Search found 5298 results on 212 pages for 'marching cubes algorithm'.

Page 203/212 | < Previous Page | 199 200 201 202 203 204 205 206 207 208 209 210  | Next Page >

  • Parallelism in .NET – Part 18, Task Continuations with Multiple Tasks

    - by Reed
    In my introduction to Task continuations I demonstrated how the Task class provides a more expressive alternative to traditional callbacks.  Task continuations provide a much cleaner syntax to traditional callbacks, but there are other reasons to switch to using continuations… Task continuations provide a clean syntax, and a very simple, elegant means of synchronizing asynchronous method results with the user interface.  In addition, continuations provide a very simple, elegant means of working with collections of tasks. Prior to .NET 4, working with multiple related asynchronous method calls was very tricky.  If, for example, we wanted to run two asynchronous operations, followed by a single method call which we wanted to run when the first two methods completed, we’d have to program all of the handling ourselves.  We would likely need to take some approach such as using a shared callback which synchronized against a common variable, or using a WaitHandle shared within the callbacks to allow one to wait for the second.  Although this could be accomplished easily enough, it requires manually placing this handling into every algorithm which requires this form of blocking.  This is error prone, difficult, and can easily lead to subtle bugs. Similar to how the Task class static methods providing a way to block until multiple tasks have completed, TaskFactory contains static methods which allow a continuation to be scheduled upon the completion of multiple tasks: TaskFactory.ContinueWhenAll. This allows you to easily specify a single delegate to run when a collection of tasks has completed.  For example, suppose we have a class which fetches data from the network.  This can be a long running operation, and potentially fail in certain situations, such as a server being down.  As a result, we have three separate servers which we will “query” for our information.  Now, suppose we want to grab data from all three servers, and verify that the results are the same from all three. With traditional asynchronous programming in .NET, this would require using three separate callbacks, and managing the synchronization between the various operations ourselves.  The Task and TaskFactory classes simplify this for us, allowing us to write: var server1 = Task.Factory.StartNew( () => networkClass.GetResults(firstServer) ); var server2 = Task.Factory.StartNew( () => networkClass.GetResults(secondServer) ); var server3 = Task.Factory.StartNew( () => networkClass.GetResults(thirdServer) ); var result = Task.Factory.ContinueWhenAll( new[] {server1, server2, server3 }, (tasks) => { // Propogate exceptions (see below) Task.WaitAll(tasks); return this.CompareTaskResults( tasks[0].Result, tasks[1].Result, tasks[2].Result); }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This is clean, simple, and elegant.  The one complication is the Task.WaitAll(tasks); statement. Although the continuation will not complete until all three tasks (server1, server2, and server3) have completed, there is a potential snag.  If the networkClass.GetResults method fails, and raises an exception, we want to make sure to handle it cleanly.  By using Task.WaitAll, any exceptions raised within any of our original tasks will get wrapped into a single AggregateException by the WaitAll method, providing us a simplified means of handling the exceptions.  If we wait on the continuation, we can trap this AggregateException, and handle it cleanly.  Without this line, it’s possible that an exception could remain uncaught and unhandled by a task, which later might trigger a nasty UnobservedTaskException.  This would happen any time two of our original tasks failed. Just as we can schedule a continuation to occur when an entire collection of tasks has completed, we can just as easily setup a continuation to run when any single task within a collection completes.  If, for example, we didn’t need to compare the results of all three network locations, but only use one, we could still schedule three tasks.  We could then have our completion logic work on the first task which completed, and ignore the others.  This is done via TaskFactory.ContinueWhenAny: var server1 = Task.Factory.StartNew( () => networkClass.GetResults(firstServer) ); var server2 = Task.Factory.StartNew( () => networkClass.GetResults(secondServer) ); var server3 = Task.Factory.StartNew( () => networkClass.GetResults(thirdServer) ); var result = Task.Factory.ContinueWhenAny( new[] {server1, server2, server3 }, (firstTask) => { return this.ProcessTaskResult(firstTask.Result); }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, instead of working with all three tasks, we’re just using the first task which finishes.  This is very useful, as it allows us to easily work with results of multiple operations, and “throw away” the others.  However, you must take care when using ContinueWhenAny to properly handle exceptions.  At some point, you should always wait on each task (or use the Task.Result property) in order to propogate any exceptions raised from within the task.  Failing to do so can lead to an UnobservedTaskException.

    Read the article

  • Interesting things – Twitter annotations and your phone as a web server

    - by jamiet
    I overheard/read a couple of things today that really made me, data junkie that I am, take a step back and think, “Hmmm, yeah, that could be really interesting” and I wanted to make a note of them here so that (a) I could bring them to the attention of anyone that happens to read this and (b) I can maybe come back here in a few years and see if either of these have come to fruition. Your phone as a web server While listening to Jon Udell’s (twitter) “Interviews with Innovators Podcast” today in which he interviewed Herbert Van de Sompel (twitter) about his Momento project. During the interview Jon and Herbert made the following remarks: Jon: [some people] really had this vision of a web of servers, the notion that every node on the internet, every connected entity, is potentially a server and a client…we can see where we’re getting to a point where these endpoint devices we have in our pockets are going to be massively capable and it may be in the not too distant future that significant chunks of the web archive will be cached all over the place including on your own machine… Herbert: wasn’t it Opera who at one point turned your browser into a server? That really got my brain ticking. We all carry a mobile phone with us and therefore we all potentially carry a mobile web server with us as well and to my mind the only thing really stopping that from happening is the capabilities of the phone hardware, the capabilities of the network infrastructure and the will to just bloody do it. Certainly all the standards required for addressing a web server on a phone already exist (to this uninitiated observer DNS and IPv6 seem to solve that problem) so why not? I tweeted about the idea and Rory Street answered back with “why would you want a phone to be a web server?”: Its a fair question and one that I would like to try and answer. Mobile phones are increasingly becoming our window onto the world as we use them to upload messages to Twitter, record our location on FourSquare or interact with our friends on Facebook but in each of these cases some other service is acting as our intermediary; to see what I’m thinking you have to go via Twitter, to see where I am you have to go to FourSquare (I’m using ‘I’ liberally, I don’t actually use FourSquare before you ask). Why should this have to be the case? Why can’t that data be decentralised? Why can’t we be masters of our own data universe? If my phone acted as a web server then I could expose all of that information without needing those intermediary services. I see a time when we can pass around URLs such as the following: http://jamiesphone.net/location/current - Where is Jamie right now? http://jamiesphone.net/location/2010-04-21 – Where was Jamie on 21st April 2010? http://jamiesphone.net/thoughts/current – What’s on Jamie’s mind right now? http://jamiesphone.net/blog – What documents is Jamie sharing with me? http://jamiesphone.net/calendar/next7days – Where is Jamie planning to be over the next 7 days? and those URLs get served off of the phone in our pockets. If we govern that data then we can control who has access to it and (crucially) how long its available for. Want to wipe yourself off the face of the web? its pretty easy if you’re in control of all the data – just turn your phone off. None of this exists today but I look forward to a time when it does. Opera really were onto something last June when they announced Opera Unite (admittedly Unite only works because Opera provide an intermediary DNS-alike system – it isn’t totally decentralised). Opening up Twitter annotations Last week Twitter held their first developer conference called Chirp where they announced an upcoming new feature called ‘Twitter Annotations’; in short this will allow us to attach metadata to a Tweet thus enhancing the tweet itself. Think of it as a richer version of hashtags. To think of it another way Twitter are turning their data into a humongous Entity-Attribute-Value or triple-tuple store. That alone has huge implications both for the web and Twitter as a whole – the ability to enrich that 140 characters data and thus make it more useful is indeed compelling however today I stumbled upon a blog post from Eugene Mandel entitled Tweet Annotations – a Way to a Metadata Marketplace? where he proposed the idea of allowing tweets to have metadata added by people other than the person who tweeted the original tweet. This idea really fascinated me especially when I read some of the potential uses that Eugene and his commenters suggested. They included: Amazon could attach an ISBN to a tweet that mentions a book. Specialist clients apps for book lovers could be built up around this metadata. Advertisers could pay to place adverts in metadata. The revenue generated from those adverts could be shared with the tweeter or people who add the metadata. Granted, allowing anyone to add metadata to a tweet has the potential to create a spam problem the like of which we haven’t even envisaged but spam hasn’t halted the growth of the web and neither should it halt the growth of data annotations either. The original tweeter should of course be able to determine who can add metadata and whether it should be moderated. As Eugene says himself: Opening publishing tweet annotations to anyone will open the way to a marketplace of metadata where client developers, data mining companies and advertisers can add new meaning to Twitter and build innovative businesses. What Eugene and his followers did not mention is what I think is potentially the most fascinating use of opening up annotations. Google’s success today is built on their page rank algorithm that measures the validity of a web page by the number of incoming links to it and the page rank of the sites containing those links – its a system built on reputation. Twitter annotations could open up a new paradigm however – let’s call it People rank- where reputation can be measured by the metadata that people choose to apply to links and the websites containing those links. Its not hard to see why Google and Microsoft have paid big bucks to get access to the Twitter firehose! Neither of these features, phones as a web server or the ability to add annotations to other people’s tweets, exist today but I strongly believe that they could dramatically enhance the web as we know it today. I hope to look back on this blog post in a few years in the knowledge that these ideas have been put into place. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • How can I get penetration depth from Minkowski Portal Refinement / Xenocollide?

    - by Raven Dreamer
    I recently got an implementation of Minkowski Portal Refinement (MPR) successfully detecting collision. Even better, my implementation returns a good estimate (local minimum) direction for the minimum penetration depth. So I took a stab at adjusting the algorithm to return the penetration depth in an arbitrary direction, and was modestly successful - my altered method works splendidly for face-edge collision resolution! What it doesn't currently do, is correctly provide the minimum penetration depth for edge-edge scenarios, such as the case on the right: What I perceive to be happening, is that my current method returns the minimum penetration depth to the nearest vertex - which works fine when the collision is actually occurring on the plane of that vertex, but not when the collision happens along an edge. Is there a way I can alter my method to return the penetration depth to the point of collision, rather than the nearest vertex? Here's the method that's supposed to return the minimum penetration distance along a specific direction: public static Vector3 CalcMinDistance(List<Vector3> shape1, List<Vector3> shape2, Vector3 dir) { //holding variables Vector3 n = Vector3.zero; Vector3 swap = Vector3.zero; // v0 = center of Minkowski sum v0 = Vector3.zero; // Avoid case where centers overlap -- any direction is fine in this case //if (v0 == Vector3.zero) return Vector3.zero; //always pass in a valid direction. // v1 = support in direction of origin n = -dir; //get the differnce of the minkowski sum Vector3 v11 = GetSupport(shape1, -n); Vector3 v12 = GetSupport(shape2, n); v1 = v12 - v11; //if the support point is not in the direction of the origin if (v1.Dot(n) <= 0) { //Debug.Log("Could find no points this direction"); return Vector3.zero; } // v2 - support perpendicular to v1,v0 n = v1.Cross(v0); if (n == Vector3.zero) { //v1 and v0 are parallel, which means //the direction leads directly to an endpoint n = v1 - v0; //shortest distance is just n //Debug.Log("2 point return"); return n; } //get the new support point Vector3 v21 = GetSupport(shape1, -n); Vector3 v22 = GetSupport(shape2, n); v2 = v22 - v21; if (v2.Dot(n) <= 0) { //can't reach the origin in this direction, ergo, no collision //Debug.Log("Could not reach edge?"); return Vector2.zero; } // Determine whether origin is on + or - side of plane (v1,v0,v2) //tests linesegments v0v1 and v0v2 n = (v1 - v0).Cross(v2 - v0); float dist = n.Dot(v0); // If the origin is on the - side of the plane, reverse the direction of the plane if (dist > 0) { //swap the winding order of v1 and v2 swap = v1; v1 = v2; v2 = swap; //swap the winding order of v11 and v12 swap = v12; v12 = v11; v11 = swap; //swap the winding order of v11 and v12 swap = v22; v22 = v21; v21 = swap; //and swap the plane normal n = -n; } /// // Phase One: Identify a portal while (true) { // Obtain the support point in a direction perpendicular to the existing plane // Note: This point is guaranteed to lie off the plane Vector3 v31 = GetSupport(shape1, -n); Vector3 v32 = GetSupport(shape2, n); v3 = v32 - v31; if (v3.Dot(n) <= 0) { //can't enclose the origin within our tetrahedron //Debug.Log("Could not reach edge after portal?"); return Vector3.zero; } // If origin is outside (v1,v0,v3), then eliminate v2 and loop if (v1.Cross(v3).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v2 = v3; v21 = v31; v22 = v32; n = (v1 - v0).Cross(v3 - v0); continue; } // If origin is outside (v3,v0,v2), then eliminate v1 and loop if (v3.Cross(v2).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v1 = v3; v11 = v31; v12 = v32; n = (v3 - v0).Cross(v2 - v0); continue; } bool hit = false; /// // Phase Two: Refine the portal int phase2 = 0; // We are now inside of a wedge... while (phase2 < 20) { phase2++; // Compute normal of the wedge face n = (v2 - v1).Cross(v3 - v1); n.Normalize(); // Compute distance from origin to wedge face float d = n.Dot(v1); // If the origin is inside the wedge, we have a hit if (d > 0 ) { //Debug.Log("Do plane test here"); float T = n.Dot(v2) / n.Dot(dir); Vector3 pointInPlane = (dir * T); return pointInPlane; } // Find the support point in the direction of the wedge face Vector3 v41 = GetSupport(shape1, -n); Vector3 v42 = GetSupport(shape2, n); v4 = v42 - v41; float delta = (v4 - v3).Dot(n); float separation = -(v4.Dot(n)); if (delta <= kCollideEpsilon || separation >= 0) { //Debug.Log("Non-convergance detected"); //Debug.Log("Do plane test here"); return Vector3.zero; } // Compute the tetrahedron dividing face (v4,v0,v1) float d1 = v4.Cross(v1).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v2) float d2 = v4.Cross(v2).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v3) float d3 = v4.Cross(v3).Dot(v0); if (d1 < 0) { if (d2 < 0) { // Inside d1 & inside d2 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } else { // Inside d1 & outside d2 ==> eliminate v3 v3 = v4; v31 = v41; v32 = v42; } } else { if (d3 < 0) { // Outside d1 & inside d3 ==> eliminate v2 v2 = v4; v21 = v41; v22 = v42; } else { // Outside d1 & outside d3 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } } } return Vector3.zero; } }

    Read the article

  • A Guided Tour of Complexity

    - by JoshReuben
    I just re-read Complexity – A Guided Tour by Melanie Mitchell , protégé of Douglas Hofstadter ( author of “Gödel, Escher, Bach”) http://www.amazon.com/Complexity-Guided-Tour-Melanie-Mitchell/dp/0199798109/ref=sr_1_1?ie=UTF8&qid=1339744329&sr=8-1 here are some notes and links:   Evolved from Cybernetics, General Systems Theory, Synergetics some interesting transdisciplinary fields to investigate: Chaos Theory - http://en.wikipedia.org/wiki/Chaos_theory – small differences in initial conditions (such as those due to rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible. System Dynamics / Cybernetics - http://en.wikipedia.org/wiki/System_Dynamics – study of how feedback changes system behavior Network Theory - http://en.wikipedia.org/wiki/Network_theory – leverage Graph Theory to analyze symmetric  / asymmetric relations between discrete objects Algebraic Topology - http://en.wikipedia.org/wiki/Algebraic_topology – leverage abstract algebra to analyze topological spaces There are limits to deterministic systems & to computation. Chaos Theory definitely applies to training an ANN (artificial neural network) – different weights will emerge depending upon the random selection of the training set. In recursive Non-Linear systems http://en.wikipedia.org/wiki/Nonlinear_system – output is not directly inferable from input. E.g. a Logistic map: Xt+1 = R Xt(1-Xt) Different types of bifurcations, attractor states and oscillations may occur – e.g. a Lorenz Attractor http://en.wikipedia.org/wiki/Lorenz_system Feigenbaum Constants http://en.wikipedia.org/wiki/Feigenbaum_constants express ratios in a bifurcation diagram for a non-linear map – the convergent limit of R (the rate of period-doubling bifurcations) is 4.6692016 Maxwell’s Demon - http://en.wikipedia.org/wiki/Maxwell%27s_demon - the Second Law of Thermodynamics has only a statistical certainty – the universe (and thus information) tends towards entropy. While any computation can theoretically be done without expending energy, with finite memory, the act of erasing memory is permanent and increases entropy. Life & thought is a counter-example to the universe’s tendency towards entropy. Leo Szilard and later Claude Shannon came up with the Information Theory of Entropy - http://en.wikipedia.org/wiki/Entropy_(information_theory) whereby Shannon entropy quantifies the expected value of a message’s information in bits in order to determine channel capacity and leverage Coding Theory (compression analysis). Ludwig Boltzmann came up with Statistical Mechanics - http://en.wikipedia.org/wiki/Statistical_mechanics – whereby our Newtonian perception of continuous reality is a probabilistic and statistical aggregate of many discrete quantum microstates. This is relevant for Quantum Information Theory http://en.wikipedia.org/wiki/Quantum_information and the Physics of Information - http://en.wikipedia.org/wiki/Physical_information. Hilbert’s Problems http://en.wikipedia.org/wiki/Hilbert's_problems pondered whether mathematics is complete, consistent, and decidable (the Decision Problem – http://en.wikipedia.org/wiki/Entscheidungsproblem – is there always an algorithm that can determine whether a statement is true).  Godel’s Incompleteness Theorems http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems  proved that mathematics cannot be both complete and consistent (e.g. “This statement is not provable”). Turing through the use of Turing Machines (http://en.wikipedia.org/wiki/Turing_machine symbol processors that can prove mathematical statements) and Universal Turing Machines (http://en.wikipedia.org/wiki/Universal_Turing_machine Turing Machines that can emulate other any Turing Machine via accepting programs as well as data as input symbols) that computation is limited by demonstrating the Halting Problem http://en.wikipedia.org/wiki/Halting_problem (is is not possible to know when a program will complete – you cannot build an infinite loop detector). You may be used to thinking of 1 / 2 / 3 dimensional systems, but Fractal http://en.wikipedia.org/wiki/Fractal systems are defined by self-similarity & have non-integer Hausdorff Dimensions !!!  http://en.wikipedia.org/wiki/List_of_fractals_by_Hausdorff_dimension – the fractal dimension quantifies the number of copies of a self similar object at each level of detail – eg Koch Snowflake - http://en.wikipedia.org/wiki/Koch_snowflake Definitions of complexity: size, Shannon entropy, Algorithmic Information Content (http://en.wikipedia.org/wiki/Algorithmic_information_theory - size of shortest program that can generate a description of an object) Logical depth (amount of info processed), thermodynamic depth (resources required). Complexity is statistical and fractal. John Von Neumann’s other machine was the Self-Reproducing Automaton http://en.wikipedia.org/wiki/Self-replicating_machine  . Cellular Automata http://en.wikipedia.org/wiki/Cellular_automaton are alternative form of Universal Turing machine to traditional Von Neumann machines where grid cells are locally synchronized with their neighbors according to a rule. Conway’s Game of Life http://en.wikipedia.org/wiki/Conway's_Game_of_Life demonstrates various emergent constructs such as “Glider Guns” and “Spaceships”. Cellular Automatons are not practical because logical ops require a large number of cells – wasteful & inefficient. There are no compilers or general program languages available for Cellular Automatons (as far as I am aware). Random Boolean Networks http://en.wikipedia.org/wiki/Boolean_network are extensions of cellular automata where nodes are connected at random (not to spatial neighbors) and each node has its own rule –> they demonstrate the emergence of complex  & self organized behavior. Stephen Wolfram’s (creator of Mathematica, so give him the benefit of the doubt) New Kind of Science http://en.wikipedia.org/wiki/A_New_Kind_of_Science proposes the universe may be a discrete Finite State Automata http://en.wikipedia.org/wiki/Finite-state_machine whereby reality emerges from simple rules. I am 2/3 through this book. It is feasible that the universe is quantum discrete at the plank scale and that it computes itself – Digital Physics: http://en.wikipedia.org/wiki/Digital_physics – a simulated reality? Anyway, all behavior is supposedly derived from simple algorithmic rules & falls into 4 patterns: uniform , nested / cyclical, random (Rule 30 http://en.wikipedia.org/wiki/Rule_30) & mixed (Rule 110 - http://en.wikipedia.org/wiki/Rule_110 localized structures – it is this that is interesting). interaction between colliding propagating signal inputs is then information processing. Wolfram proposes the Principle of Computational Equivalence - http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html - all processes that are not obviously simple can be viewed as computations of equivalent sophistication. Meaning in information may emerge from analogy & conceptual slippages – see the CopyCat program: http://cognitrn.psych.indiana.edu/rgoldsto/courses/concepts/copycat.pdf Scale Free Networks http://en.wikipedia.org/wiki/Scale-free_network have a distribution governed by a Power Law (http://en.wikipedia.org/wiki/Power_law - much more common than Normal Distribution). They are characterized by hubs (resilience to random deletion of nodes), heterogeneity of degree values, self similarity, & small world structure. They grow via preferential attachment http://en.wikipedia.org/wiki/Preferential_attachment – tipping points triggered by positive feedback loops. 2 theories of cascading system failures in complex systems are Self-Organized Criticality http://en.wikipedia.org/wiki/Self-organized_criticality and Highly Optimized Tolerance http://en.wikipedia.org/wiki/Highly_optimized_tolerance. Computational Mechanics http://en.wikipedia.org/wiki/Computational_mechanics – use of computational methods to study phenomena governed by the principles of mechanics. This book is a great intuition pump, but does not cover the more mathematical subject of Computational Complexity Theory – http://en.wikipedia.org/wiki/Computational_complexity_theory I am currently reading this book on this subject: http://www.amazon.com/Computational-Complexity-Christos-H-Papadimitriou/dp/0201530821/ref=pd_sim_b_1   stay tuned for that review!

    Read the article

  • SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

    - by pinaldave
    It’s no secret that bad data leads to bad decisions and poor results.  However, how do you prevent dirty data from taking up residency in your data store?  Some might argue that it’s the responsibility of the person sending you the data.  While that may be true, in practice that will rarely hold up.  It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data.  What constitutes bad data?  There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain.  Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster.  Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1:     Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level.  Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data.  Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code.  I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example.   The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2:     Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint.   A regular expression is used to identify patterns within data.   The patterns could be something as simple as a date format or something much more complex such as a street address.  For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period.  So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be.   How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be:  ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above:  ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ .  In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do.  Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3:     Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email.  Email addresses are often entered incorrectly because people are trying to avoid spam.  While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet:  \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed:  ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4:     Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations.  To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator.  Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File.  The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified.  In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected.  This makes diagnosing errors much easier! Tip 5:    Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions.  For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes.  Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation.  It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results.  For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here.  Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Is Berkeley DB a NoSQL solution?

    - by Gregory Burd
    Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

    Read the article

  • techniques for an AI for a highly cramped turn-based tactics game

    - by Adam M.
    I'm trying to write an AI for a tactics game in the vein of Final Fantasy Tactics or Vandal Hearts. I can't change the game rules in any way, only upgrade the AI. I have experience programming AI for classic board games (basically minimax and its variants), but I think the branching factor is too great for the approach to be reasonable here. I'll describe the game and some current AI flaws that I'd like to fix. I'd like to hear ideas for applicable techniques. I'm a decent enough programmer, so I only need the ideas, not an implementation (though that's always appreciated). I'd rather not expend effort chasing (too many) dead ends, so although speculation and brainstorming are good and probably helpful, I'd prefer to hear from somebody with actual experience solving this kind of problem. For those who know it, the game is the land battle mini-game in Sid Meier's Pirates! (2004) and you can skim/skip the next two paragraphs. For those who don't, here's briefly how it works. The battle is turn-based and takes place on a 16x16 grid. There are three terrain types: clear (no hindrance), forest (hinders movement, ranged attacks, and sight), and rock (impassible, but does not hinder attacks or sight). The map is randomly generated with roughly equal amounts of each type of terrain. Because there are many rock and forest tiles, movement is typically very cramped. This is tactically important. The terrain is not flat; higher terrain gives minor bonuses. The terrain is known to both sides. The player is always the attacker and the AI is always the defender, so it's perfectly valid for the AI to set up a defensive position and just wait. The player wins by killing all defenders or by getting a unit to the city gates (a tile on the other side of the map). There are very few units on each side, usually 4-8. Because of this, it's crucial not to take damage without gaining some advantage from it. Units can take multiple actions per turn. All units on one side move before any units on the other side. Order of execution is important, and interleaving of actions between units is often useful. Units have melee and ranged attacks. Melee attacks vary widely in strength; ranged attacks have the same strength but vary in range. The main challenges I face are these: Lots of useful move combinations start with a "useless" move that gains no immediate advantage, or even loses advantage, in order to set up a powerful flank attack in the future. And, since the player units are stronger and have longer range, the AI pretty much always has to take some losses before they can start to gain kills. The AI must be able to look ahead to distinguish between sacrificial actions that provide a future benefit and those that don't. Because the terrain is so cramped, most of the tactics come down to achieving good positioning with multiple units that work together to defend an area. For instance, two defenders can often dominate a narrow pass by positioning themselves so an enemy unit attempting to pass must expose itself to a flank attack. But one defender in the same pass would be useless, and three units can defend a slightly larger pass. Etc. The AI should be able to figure out where the player must go to reach the city gates and how to best position its few units to cover the approaches, shifting, splitting, or combining them appropriately as the player moves. Because flank attacks are extremely deadly (and engineering flank attacks is key to the player strategy), the AI should be competent at moving its units so that they cover each other's flanks unless the sacrifice of a unit would give a substantial benefit. They should also be able to force flank attacks on players, for instance by threatening a unit from two different directions such that responding to one threat exposes the flank to the other. The AI should attack if possible, but sometimes there are no good ways to approach the player's position. In that case, the AI should be able to recognize this and set up a defensive position of its own. But the AI shouldn't be vulnerable to a trivial exploit where the player repeatedly opens and closes a hole in his defense and shoots at the AI as it approaches and retreats. That is, the AI should ideally be able to recognize that the player is capable of establishing a solid defense of an area, even if the defense is not currently in place. (I suppose if a good unit allocation algorithm existed, as needed for the second bullet point, the AI could run it on the player units to see where they could defend.) Because it's important to choose a good order of action and interleave actions between units, it's not as simple as just finding the best move for each unit in turn. All of these can be accomplished with a minimax search in theory, but the search space is too large, so specialized techniques are needed. I thought about techniques such as influence mapping, but I don't see how to use the technique to great effect. I thought about assigning goals to the units. This can help them work together in some limited way, and the problem of "how do I accomplish this goal?" is easier to solve than "how do I win this battle?", but assigning good goals is a hard problem in itself, because it requires knowing whether the goal is achievable and whether it's a good use of resources. So, does anyone have specific ideas for techniques that can help cleverize this AI? Update: I found a related question on Stackoverflow: http://stackoverflow.com/questions/3133273/ai-for-a-final-fantasy-tactics-like-game The selected answer gives a decent approach to choosing between alternative actions, but it doesn't seem to have much ability to look into the future and discern beneficial sacrifices from wasteful ones. It also focuses on a single unit at a time and it's not clear how it could be extended to support cooperation between units in defending or attacking.

    Read the article

  • CI Deployment Of Azure Web Roles Using TeamCity

    - by srkirkland
    After recently migrating an important new website to use Windows Azure “Web Roles” I wanted an easier way to deploy new versions to the Azure Staging environment as well as a reliable process to rollback deployments to a certain “known good” source control commit checkpoint.  By configuring our JetBrains’ TeamCity CI server to utilize Windows Azure PowerShell cmdlets to create new automated deployments, I’ll show you how to take control of your Azure publish process. Step 0: Configuring your Azure Project in Visual Studio Before we can start looking at automating the deployment, we should make sure manual deployments from Visual Studio are working properly.  Detailed information for setting up deployments can be found at http://msdn.microsoft.com/en-us/library/windowsazure/ff683672.aspx#PublishAzure or by doing some quick Googling, but the basics are as follows: Install the prerequisite Windows Azure SDK Create an Azure project by right-clicking on your web project and choosing “Add Windows Azure Cloud Service Project” (or by manually adding that project type) Configure your Role and Service Configuration/Definition as desired Right-click on your azure project and choose “Publish,” create a publish profile, and push to your web role You don’t actually have to do step #4 and create a publish profile, but it’s a good exercise to make sure everything is working properly.  Once your Windows Azure project is setup correctly, we are ready to move on to understanding the Azure Publish process. Understanding the Azure Publish Process The actual Windows Azure project is fairly simple at its core—it builds your dependent roles (in our case, a web role) against a specific service and build configuration, and outputs two files: ServiceConfiguration.Cloud.cscfg: This is just the file containing your package configuration info, for example Instance Count, OsFamily, ConnectionString and other Setting information. ProjectName.Azure.cspkg: This is the package file that contains the guts of your deployment, including all deployable files. When you package your Azure project, these two files will be created within the directory ./[ProjectName].Azure/bin/[ConfigName]/app.publish/.  If you want to build your Azure Project from the command line, it’s as simple as calling MSBuild on the “Publish” target: msbuild.exe /target:Publish Windows Azure PowerShell Cmdlets The last pieces of the puzzle that make CI automation possible are the Azure PowerShell Cmdlets (http://msdn.microsoft.com/en-us/library/windowsazure/jj156055.aspx).  These cmdlets are what will let us create deployments without Visual Studio or other user intervention. Preparing TeamCity for Azure Deployments Now we are ready to get our TeamCity server setup so it can build and deploy Windows Azure projects, which we now know requires the Azure SDK and the Windows Azure PowerShell Cmdlets. Installing the Azure SDK is easy enough, just go to https://www.windowsazure.com/en-us/develop/net/ and click “Install” Once this SDK is installed, I recommend running a test build to make sure your project is building correctly.  You’ll want to setup your build step using MSBuild with the “Publish” target against your solution file.  Mine looks like this: Assuming the build was successful, you will now have the two *.cspkg and *cscfg files within your build directory.  If the build was red (failed), take a look at the build logs and keep an eye out for “unsupported project type” or other build errors, which will need to be addressed before the CI deployment can be completed. With a successful build we are now ready to install and configure the Windows Azure PowerShell Cmdlets: Follow the instructions at http://msdn.microsoft.com/en-us/library/windowsazure/jj554332 to install the Cmdlets and configure PowerShell After installing the Cmdlets, you’ll need to get your Azure Subscription Info using the Get-AzurePublishSettingsFile command. Store the resulting *.publishsettings file somewhere you can get to easily, like C:\TeamCity, because you will need to reference it later from your deploy script. Scripting the CI Deploy Process Now that the cmdlets are installed on our TeamCity server, we are ready to script the actual deployment using a TeamCity “PowerShell” build runner.  Before we look at any code, here’s a breakdown of our deployment algorithm: Setup your variables, including the location of the *.cspkg and *cscfg files produced in the earlier MSBuild step (remember, the folder is something like [ProjectName].Azure/bin/[ConfigName]/app.publish/ Import the Windows Azure PowerShell Cmdlets Import and set your Azure Subscription information (this is basically your authentication/authorization step, so protect your settings file Now look for a current deployment, and if you find one Upgrade it, else Create a new deployment Pretty simple and straightforward.  Now let’s look at the code (also available as a gist here: https://gist.github.com/3694398): $subscription = "[Your Subscription Name]" $service = "[Your Azure Service Name]" $slot = "staging" #staging or production $package = "[ProjectName]\bin\[BuildConfigName]\app.publish\[ProjectName].cspkg" $configuration = "[ProjectName]\bin\[BuildConfigName]\app.publish\ServiceConfiguration.Cloud.cscfg" $timeStampFormat = "g" $deploymentLabel = "ContinuousDeploy to $service v%build.number%"   Write-Output "Running Azure Imports" Import-Module "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell\Azure\*.psd1" Import-AzurePublishSettingsFile "C:\TeamCity\[PSFileName].publishsettings" Set-AzureSubscription -CurrentStorageAccount $service -SubscriptionName $subscription   function Publish(){ $deployment = Get-AzureDeployment -ServiceName $service -Slot $slot -ErrorVariable a -ErrorAction silentlycontinue   if ($a[0] -ne $null) { Write-Output "$(Get-Date -f $timeStampFormat) - No deployment is detected. Creating a new deployment. " } if ($deployment.Name -ne $null) { #Update deployment inplace (usually faster, cheaper, won't destroy VIP) Write-Output "$(Get-Date -f $timeStampFormat) - Deployment exists in $servicename. Upgrading deployment." UpgradeDeployment } else { CreateNewDeployment } }   function CreateNewDeployment() { write-progress -id 3 -activity "Creating New Deployment" -Status "In progress" Write-Output "$(Get-Date -f $timeStampFormat) - Creating New Deployment: In progress"   $opstat = New-AzureDeployment -Slot $slot -Package $package -Configuration $configuration -label $deploymentLabel -ServiceName $service   $completeDeployment = Get-AzureDeployment -ServiceName $service -Slot $slot $completeDeploymentID = $completeDeployment.deploymentid   write-progress -id 3 -activity "Creating New Deployment" -completed -Status "Complete" Write-Output "$(Get-Date -f $timeStampFormat) - Creating New Deployment: Complete, Deployment ID: $completeDeploymentID" }   function UpgradeDeployment() { write-progress -id 3 -activity "Upgrading Deployment" -Status "In progress" Write-Output "$(Get-Date -f $timeStampFormat) - Upgrading Deployment: In progress"   # perform Update-Deployment $setdeployment = Set-AzureDeployment -Upgrade -Slot $slot -Package $package -Configuration $configuration -label $deploymentLabel -ServiceName $service -Force   $completeDeployment = Get-AzureDeployment -ServiceName $service -Slot $slot $completeDeploymentID = $completeDeployment.deploymentid   write-progress -id 3 -activity "Upgrading Deployment" -completed -Status "Complete" Write-Output "$(Get-Date -f $timeStampFormat) - Upgrading Deployment: Complete, Deployment ID: $completeDeploymentID" }   Write-Output "Create Azure Deployment" Publish   Creating the TeamCity Build Step The only thing left is to create a second build step, after your MSBuild “Publish” step, with the build runner type “PowerShell”.  Then set your script to “Source Code,” the script execution mode to “Put script into PowerShell stdin with “-Command” arguments” and then copy/paste in the above script (replacing the placeholder sections with your values).  This should look like the following:   Wrap Up After combining the MSBuild /target:Publish step (which creates the necessary Windows Azure *.cspkg and *.cscfg files) and a PowerShell script step which utilizes the Azure PowerShell Cmdlets, we have a fully deployable build configuration in TeamCity.  You can configure this step to run whenever you’d like using build triggers – for example, you could even deploy whenever a new master branch deploy comes in and passes all required tests. In the script I’ve hardcoded that every deployment goes to the Staging environment on Azure, but you could deploy straight to Production if you want to, or even setup a deployment configuration variable and set it as desired. After your TeamCity Build Configuration is complete, you’ll see something that looks like this: Whenever you click the “Run” button, all of your code will be compiled, published, and deployed to Windows Azure! One additional enormous benefit of automating the process this way is that you can easily deploy any specific source control changeset by clicking the little ellipsis button next to "Run.”  This will bring up a dialog like the one below, where you can select the last change to use for your deployment.  Since Azure Web Role deployments don’t have any rollback functionality, this is a critical feature.   Enjoy!

    Read the article

  • A New Threat To Web Applications: Connection String Parameter Pollution (CSPP)

    - by eric.maurice
    Hi, this is Shaomin Wang. I am a security analyst in Oracle's Security Alerts Group. My primary responsibility is to evaluate the security vulnerabilities reported externally by security researchers on Oracle Fusion Middleware and to ensure timely resolution through the Critical Patch Update. Today, I am going to talk about a serious type of attack: Connection String Parameter Pollution (CSPP). Earlier this year, at the Black Hat DC 2010 Conference, two Spanish security researchers, Jose Palazon and Chema Alonso, unveiled a new class of security vulnerabilities, which target insecure dynamic connections between web applications and databases. The attack called Connection String Parameter Pollution (CSPP) exploits specifically the semicolon delimited database connection strings that are constructed dynamically based on the user inputs from web applications. CSPP, if carried out successfully, can be used to steal user identities and hijack web credentials. CSPP is a high risk attack because of the relative ease with which it can be carried out (low access complexity) and the potential results it can have (high impact). In today's blog, we are going to first look at what connection strings are and then review the different ways connection string injections can be leveraged by malicious hackers. We will then discuss how CSPP differs from traditional connection string injection, and the measures organizations can take to prevent this kind of attacks. In web applications, a connection string is a set of values that specifies information to connect to backend data repositories, in most cases, databases. The connection string is passed to a provider or driver to initiate a connection. Vendors or manufacturers write their own providers for different databases. Since there are many different providers and each provider has multiple ways to make a connection, there are many different ways to write a connection string. Here are some examples of connection strings from Oracle Data Provider for .Net/ODP.Net: Oracle Data Provider for .Net / ODP.Net; Manufacturer: Oracle; Type: .NET Framework Class Library: - Using TNS Data Source = orcl; User ID = myUsername; Password = myPassword; - Using integrated security Data Source = orcl; Integrated Security = SSPI; - Using the Easy Connect Naming Method Data Source = username/password@//myserver:1521/my.server.com - Specifying Pooling parameters Data Source=myOracleDB; User Id=myUsername; Password=myPassword; Min Pool Size=10; Connection Lifetime=120; Connection Timeout=60; Incr Pool Size=5; Decr Pool Size=2; There are many variations of the connection strings, but the majority of connection strings are key value pairs delimited by semicolons. Attacks on connection strings are not new (see for example, this SANS White Paper on Securing SQL Connection String). Connection strings are vulnerable to injection attacks when dynamic string concatenation is used to build connection strings based on user input. When the user input is not validated or filtered, and malicious text or characters are not properly escaped, an attacker can potentially access sensitive data or resources. For a number of years now, vendors, including Oracle, have created connection string builder class tools to help developers generate valid connection strings and potentially prevent this kind of vulnerability. Unfortunately, not all application developers use these utilities because they are not aware of the danger posed by this kind of attacks. So how are Connection String parameter Pollution (CSPP) attacks different from traditional Connection String Injection attacks? First, let's look at what parameter pollution attacks are. Parameter pollution is a technique, which typically involves appending repeating parameters to the request strings to attack the receiving end. Much of the public attention around parameter pollution was initiated as a result of a presentation on HTTP Parameter Pollution attacks by Stefano Di Paola and Luca Carettoni delivered at the 2009 Appsec OWASP Conference in Poland. In HTTP Parameter Pollution attacks, an attacker submits additional parameters in HTTP GET/POST to a web application, and if these parameters have the same name as an existing parameter, the web application may react in different ways depends on how the web application and web server deal with multiple parameters with the same name. When applied to connections strings, the rule for the majority of database providers is the "last one wins" algorithm. If a KEYWORD=VALUE pair occurs more than once in the connection string, the value associated with the LAST occurrence is used. This opens the door to some serious attacks. By way of example, in a web application, a user enters username and password; a subsequent connection string is generated to connect to the back end database. Data Source = myDataSource; Initial Catalog = db; Integrated Security = no; User ID = myUsername; Password = XXX; In the password field, if the attacker enters "xxx; Integrated Security = true", the connection string becomes, Data Source = myDataSource; Initial Catalog = db; Integrated Security = no; User ID = myUsername; Password = XXX; Intergrated Security = true; Under the "last one wins" principle, the web application will then try to connect to the database using the operating system account under which the application is running to bypass normal authentication. CSPP poses serious risks for unprepared organizations. It can be particularly dangerous if an Enterprise Systems Management web front-end is compromised, because attackers can then gain access to control panels to configure databases, systems accounts, etc. Fortunately, organizations can take steps to prevent this kind of attacks. CSPP falls into the Injection category of attacks like Cross Site Scripting or SQL Injection, which are made possible when inputs from users are not properly escaped or sanitized. Escaping is a technique used to ensure that characters (mostly from user inputs) are treated as data, not as characters, that is relevant to the interpreter's parser. Software developers need to become aware of the danger of these attacks and learn about the defenses mechanism they need to introduce in their code. As well, software vendors need to provide templates or classes to facilitate coding and eliminate developers' guesswork for protecting against such vulnerabilities. Oracle has introduced the OracleConnectionStringBuilder class in Oracle Data Provider for .NET. Using this class, developers can employ a configuration file to provide the connection string and/or dynamically set the values through key/value pairs. It makes creating connection strings less error-prone and easier to manager, and ultimately using the OracleConnectionStringBuilder class provides better security against injection into connection strings. For More Information: - The OracleConnectionStringBuilder is located at http://download.oracle.com/docs/cd/B28359_01/win.111/b28375/OracleConnectionStringBuilderClass.htm - Oracle has developed a publicly available course on preventing SQL Injections. The Server Technologies Curriculum course "Defending Against SQL Injection Attacks!" is located at http://st-curriculum.oracle.com/tutorial/SQLInjection/index.htm - The OWASP web site also provides a number of useful resources. It is located at http://www.owasp.org/index.php/Main_Page

    Read the article

  • Optimizing AES modes on Solaris for Intel Westmere

    - by danx
    Optimizing AES modes on Solaris for Intel Westmere Review AES is a strong method of symmetric (secret-key) encryption. It is a U.S. FIPS-approved cryptographic algorithm (FIPS 197) that operates on 16-byte blocks. AES has been available since 2001 and is widely used. However, AES by itself has a weakness. AES encryption isn't usually used by itself because identical blocks of plaintext are always encrypted into identical blocks of ciphertext. This encryption can be easily attacked with "dictionaries" of common blocks of text and allows one to more-easily discern the content of the unknown cryptotext. This mode of encryption is called "Electronic Code Book" (ECB), because one in theory can keep a "code book" of all known cryptotext and plaintext results to cipher and decipher AES. In practice, a complete "code book" is not practical, even in electronic form, but large dictionaries of common plaintext blocks is still possible. Here's a diagram of encrypting input data using AES ECB mode: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ AESKey-->(AES Encryption) AESKey-->(AES Encryption) | | | | \/ \/ CipherTextOutput CipherTextOutput Block 1 Block 2 What's the solution to the same cleartext input producing the same ciphertext output? The solution is to further process the encrypted or decrypted text in such a way that the same text produces different output. This usually involves an Initialization Vector (IV) and XORing the decrypted or encrypted text. As an example, I'll illustrate CBC mode encryption: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ IV >----->(XOR) +------------->(XOR) +---> . . . . | | | | | | | | \/ | \/ | AESKey-->(AES Encryption) | AESKey-->(AES Encryption) | | | | | | | | | \/ | \/ | CipherTextOutput ------+ CipherTextOutput -------+ Block 1 Block 2 The steps for CBC encryption are: Start with a 16-byte Initialization Vector (IV), choosen randomly. XOR the IV with the first block of input plaintext Encrypt the result with AES using a user-provided key. The result is the first 16-bytes of output cryptotext. Use the cryptotext (instead of the IV) of the previous block to XOR with the next input block of plaintext Another mode besides CBC is Counter Mode (CTR). As with CBC mode, it also starts with a 16-byte IV. However, for subsequent blocks, the IV is just incremented by one. Also, the IV ix XORed with the AES encryption result (not the plain text input). Here's an illustration: Block 1 Block 2 PlainTextInput PlainTextInput | | | | \/ \/ AESKey-->(AES Encryption) AESKey-->(AES Encryption) | | | | \/ \/ IV >----->(XOR) IV + 1 >---->(XOR) IV + 2 ---> . . . . | | | | \/ \/ CipherTextOutput CipherTextOutput Block 1 Block 2 Optimization Which of these modes can be parallelized? ECB encryption/decryption can be parallelized because it does more than plain AES encryption and decryption, as mentioned above. CBC encryption can't be parallelized because it depends on the output of the previous block. However, CBC decryption can be parallelized because all the encrypted blocks are known at the beginning. CTR encryption and decryption can be parallelized because the input to each block is known--it's just the IV incremented by one for each subsequent block. So, in summary, for ECB, CBC, and CTR modes, encryption and decryption can be parallelized with the exception of CBC encryption. How do we parallelize encryption? By interleaving. Usually when reading and writing data there are pipeline "stalls" (idle processor cycles) that result from waiting for memory to be loaded or stored to or from CPU registers. Since the software is written to encrypt/decrypt the next data block where pipeline stalls usually occurs, we can avoid stalls and crypt with fewer cycles. This software processes 4 blocks at a time, which ensures virtually no waiting ("stalling") for reading or writing data in memory. Other Optimizations Besides interleaving, other optimizations performed are Loading the entire key schedule into the 128-bit %xmm registers. This is done once for per 4-block of data (since 4 blocks of data is processed, when present). The following is loaded: the entire "key schedule" (user input key preprocessed for encryption and decryption). This takes 11, 13, or 15 registers, for AES-128, AES-192, and AES-256, respectively The input data is loaded into another %xmm register The same register contains the output result after encrypting/decrypting Using SSSE 4 instructions (AESNI). Besides the aesenc, aesenclast, aesdec, aesdeclast, aeskeygenassist, and aesimc AESNI instructions, Intel has several other instructions that operate on the 128-bit %xmm registers. Some common instructions for encryption are: pxor exclusive or (very useful), movdqu load/store a %xmm register from/to memory, pshufb shuffle bytes for byte swapping, pclmulqdq carry-less multiply for GCM mode Combining AES encryption/decryption with CBC or CTR modes processing. Instead of loading input data twice (once for AES encryption/decryption, and again for modes (CTR or CBC, for example) processing, the input data is loaded once as both AES and modes operations occur at in the same function Performance Everyone likes pretty color charts, so here they are. I ran these on Solaris 11 running on a Piketon Platform system with a 4-core Intel Clarkdale processor @3.20GHz. Clarkdale which is part of the Westmere processor architecture family. The "before" case is Solaris 11, unmodified. Keep in mind that the "before" case already has been optimized with hand-coded Intel AESNI assembly. The "after" case has combined AES-NI and mode instructions, interleaved 4 blocks at-a-time. « For the first table, lower is better (milliseconds). The first table shows the performance improvement using the Solaris encrypt(1) and decrypt(1) CLI commands. I encrypted and decrypted a 1/2 GByte file on /tmp (swap tmpfs). Encryption improved by about 40% and decryption improved by about 80%. AES-128 is slighty faster than AES-256, as expected. The second table shows more detail timings for CBC, CTR, and ECB modes for the 3 AES key sizes and different data lengths. » The results shown are the percentage improvement as shown by an internal PKCS#11 microbenchmark. And keep in mind the previous baseline code already had optimized AESNI assembly! The keysize (AES-128, 192, or 256) makes little difference in relative percentage improvement (although, of course, AES-128 is faster than AES-256). Larger data sizes show better improvement than 128-byte data. Availability This software is in Solaris 11 FCS. It is available in the 64-bit libcrypto library and the "aes" Solaris kernel module. You must be running hardware that supports AESNI (for example, Intel Westmere and Sandy Bridge, microprocessor architectures). The easiest way to determine if AES-NI is available is with the isainfo(1) command. For example, $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu No special configuration or setup is needed to take advantage of this software. Solaris libraries and kernel automatically determine if it's running on AESNI-capable machines and execute the correctly-tuned software for the current microprocessor. Summary Maximum throughput of AES cipher modes can be achieved by combining AES encryption with modes processing, interleaving encryption of 4 blocks at a time, and using Intel's wide 128-bit %xmm registers and instructions. References "Block cipher modes of operation", Wikipedia Good overview of AES modes (ECB, CBC, CTR, etc.) "Advanced Encryption Standard", Wikipedia "Current Modes" describes NIST-approved block cipher modes (ECB,CBC, CFB, OFB, CCM, GCM)

    Read the article

  • Resizing text in an HTML 5 page using JQuery

    - by nikolaosk
    This is going to be the ninth post in a series of posts regarding HTML 5. You can find the other posts here, here , here , here, here , here , here and here.In this post I will demonstrate how to implement a very common feature found in websites today, enabling the visitor to increase or decrease the font size of a page. You can use the JQuery code I will write in this post for HTML pages which do not follow the HTML 5 standard. As I said earlier we need to write JavaScript to implement this functionality.I will use the very popular JQuery Library. Please download the library (minified version) from http://jquery.com/downloadIn this hands-on example I will be using Expression Web 4.0.This application is not a free application. You can use any HTML editor you like.You can use Visual Studio 2012 Express edition. You can download it here. The HTML markup for the page follows. <!DOCTYPE html><html lang="en">  <head>    <title>HTML 5, CSS3 and JQuery</title>        <meta http-equiv="Content-Type" content="text/html;charset=utf-8" >    <link rel="stylesheet" type="text/css" href="style.css">     <script type="text/javascript" src="jquery-1.8.2.min.js">        </script><script type="text/javascript">$(function() {    $('a').click(function() {        var getfont = $('p').css('font-size');        var mynum = parseFloat(getfont, 10);        var newmwasure = getfont.slice(-2);                $('p').css('font-size', mynum / 1.2 + newmwasure);                if(this.id == 'increase') {            $('p').css('font-size', mynum * 1.4 + newmwasure);        }     })    })</script>       </head>  <body>      <div id="header">      <h1>Learn cutting edge technologies</h1>      <h2>HTML 5, JQuery, CSS3</h2>    </div>    <div id="resize">    <a href="" id="increase">Increase Font</a>       |        <a href="" id="decrease">Decrease Font</a>        </div>        <div id="main">          <h2>HTML 5</h2>                        <article>          <p>            HTML5 is the latest version of HTML and XHTML. The HTML standard defines a single language that can be written in HTML and XML. It attempts to solve issues found in previous iterations of HTML and addresses the needs of Web Applications, an area previously not adequately covered by HTML.          </p>          </article>      </div>             </body>  </html>  There is nothing difficult or fancy in the HTML markup above. I have a link to the external JQuery library and the JQuery code is included inside the .html page.I have two links on this page that will increase/decrease the font size of the contents enclosed inside the <p></p> tags.Let me explain what the JQuery code does.When the user clicks on the link, I store in a variable the current font size of the <p> element that I get back from the CSS function. var getfont = $('p').css('font-size'); So now we have the original value. That will return a value like "16px" "1.2em".Then I need to get the unit of measurement (px,em).I use the slice() function. var newmwasure = getfont.slice(-2); Then I want to get only the numeric part of the returning value.I do that using the parseFloat() function.Have a look at the parseFloat() function.Finally with this bit of code I choose a ratio (I am devising a very simple algorithm for increasing and decreasing) and apply it to the <p> element. I still use the CSS function. You can get but also set the font size for a particular element with the CSS function.So I check for the id=increase and if this matches I will increase the font size of the <p> element.If it does not match we will decrease the font size.   $('p').css('font-size', mynum / 1.2 + newmwasure);                if(this.id == 'increase') {            $('p').css('font-size', mynum * 1.4 + newmwasure);  The code for the css file (style.css) followsbody{background-color:#eaeaea;}p{font-size:0.8em;font-family:Tahoma;}#resize{width:200px;background-color:#dadada;}#resize a {text-decoration:none;}The above CSS rules are very easy to understand. Now I save all my work.I view my page on the browser for the first time.Have a look at the picture below Now I increase the font size by clicking the respective linkHave a look at the picture below  Finally I decrease the font size by clicking on the respective linkHave a look at the picture below   Once more we see that the power and simplicity of JQuery library enables us to write less code but accomplish a lot at the same time. Hope it helps!!  

    Read the article

  • Movement prediction for non-shooters

    - by ShadowChaser
    I'm working on an isometric 2D game with moderate-scale multiplayer, approximately 20-30 players connected at once to a persistent server. I've had some difficulty getting a good movement prediction implementation in place. Physics/Movement The game doesn't have a true physics implementation, but uses the basic principles to implement movement. Rather than continually polling input, state changes (ie/ mouse down/up/move events) are used to change the state of the character entity the player is controlling. The player's direction (ie/ north-east) is combined with a constant speed and turned into a true 3D vector - the entity's velocity. In the main game loop, "Update" is called before "Draw". The update logic triggers a "physics update task" that tracks all entities with a non-zero velocity uses very basic integration to change the entities position. For example: entity.Position += entity.Velocity.Scale(ElapsedTime.Seconds) (where "Seconds" is a floating point value, but the same approach would work for millisecond integer values). The key point is that no interpolation is used for movement - the rudimentary physics engine has no concept of a "previous state" or "current state", only a position and velocity. State Change and Update Packets When the velocity of the character entity the player is controlling changes, a "move avatar" packet is sent to the server containing the entity's action type (stand, walk, run), direction (north-east), and current position. This is different from how 3D first person games work. In a 3D game the velocity (direction) can change frame to frame as the player moves around. Sending every state change would effectively transmit a packet per frame, which would be too expensive. Instead, 3D games seem to ignore state changes and send "state update" packets on a fixed interval - say, every 80-150ms. Since speed and direction updates occur much less frequently in my game, I can get away with sending every state change. Although all of the physics simulations occur at the same speed and are deterministic, latency is still an issue. For that reason, I send out routine position update packets (similar to a 3D game) but much less frequently - right now every 250ms, but I suspect with good prediction I can easily boost it towards 500ms. The biggest problem is that I've now deviated from the norm - all other documentation, guides, and samples online send routine updates and interpolate between the two states. It seems incompatible with my architecture, and I need to come up with a better movement prediction algorithm that is closer to a (very basic) "networked physics" architecture. The server then receives the packet and determines the players speed from it's movement type based on a script (Is the player able to run? Get the player's running speed). Once it has the speed, it combines it with the direction to get a vector - the entity's velocity. Some cheat detection and basic validation occurs, and the entity on the server side is updated with the current velocity, direction, and position. Basic throttling is also performed to prevent players from flooding the server with movement requests. After updating its own entity, the server broadcasts an "avatar position update" packet to all other players within range. The position update packet is used to update the client side physics simulations (world state) of the remote clients and perform prediction and lag compensation. Prediction and Lag Compensation As mentioned above, clients are authoritative for their own position. Except in cases of cheating or anomalies, the client's avatar will never be repositioned by the server. No extrapolation ("move now and correct later") is required for the client's avatar - what the player sees is correct. However, some sort of extrapolation or interpolation is required for all remote entities that are moving. Some sort of prediction and/or lag-compensation is clearly required within the client's local simulation / physics engine. Problems I've been struggling with various algorithms, and have a number of questions and problems: Should I be extrapolating, interpolating, or both? My "gut feeling" is that I should be using pure extrapolation based on velocity. State change is received by the client, client computes a "predicted" velocity that compensates for lag, and the regular physics system does the rest. However, it feels at odds to all other sample code and articles - they all seem to store a number of states and perform interpolation without a physics engine. When a packet arrives, I've tried interpolating the packet's position with the packet's velocity over a fixed time period (say, 200ms). I then take the difference between the interpolated position and the current "error" position to compute a new vector and place that on the entity instead of the velocity that was sent. However, the assumption is that another packet will arrive in that time interval, and it's incredibly difficult to "guess" when the next packet will arrive - especially since they don't all arrive on fixed intervals (ie/ state changes as well). Is the concept fundamentally flawed, or is it correct but needs some fixes / adjustments? What happens when a remote player stops? I can immediately stop the entity, but it will be positioned in the "wrong" spot until it moves again. If I estimate a vector or try to interpolate, I have an issue because I don't store the previous state - the physics engine has no way to say "you need to stop after you reach position X". It simply understands a velocity, nothing more complex. I'm reluctant to add the "packet movement state" information to the entities or physics engine, since it violates basic design principles and bleeds network code across the rest of the game engine. What should happen when entities collide? There are three scenarios - the controlling player collides locally, two entities collide on the server during a position update, or a remote entity update collides on the local client. In all cases I'm uncertain how to handle the collision - aside from cheating, both states are "correct" but at different time periods. In the case of a remote entity it doesn't make sense to draw it walking through a wall, so I perform collision detection on the local client and cause it to "stop". Based on point #2 above, I might compute a "corrected vector" that continually tries to move the entity "through the wall" which will never succeed - the remote avatar is stuck there until the error gets too high and it "snaps" into position. How do games work around this?

    Read the article

  • SortedDictionary and SortedList

    - by Simon Cooper
    Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

    Read the article

  • Premature-Optimization and Performance Anxiety

    - by James Michael Hare
    While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about.  After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations.  Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks.  In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process.  At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above.  I'm not talking about valid performance decisions.  There are decisions one should make when designing and writing an application that are valid performance decisions.  Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search).  But these in my mind are macro optimizations.  The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking.  This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it.  I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer.  I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered.  In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance.  This is why in many cases such early code-bases were very hard to maintain.  I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead.  Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications.  Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made.  But I would strongly advice against such optimization because of it's cost.  Many folks that are performing an optimization think it's always a win-win.  That they're simply adding speed to the application, what could possibly be wrong with that?  What they don't realize is the cost of their choice.  For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application.  It will become so fragile over time that maintenance will become a nightmare.  I've seen such applications in places I have worked.  There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance.  Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation).  To me this was almost a joke.  Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety.  The only time twice is really that much slower is when once was too slow to begin with.  Think about it.  2 minutes is slow as a response time because 1 minute is slow.  But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial!  The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations).  To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can.  I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious.  But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization.  I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true.  If you're a beginner, resist the urge to optimize at all costs.  And if you are an expert, delay that decision.  As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient.  Chances are it will be network, database, or disk hits that will be your slow-down, not your code.  As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact.  Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

    Read the article

  • tile_static, tile_barrier, and tiled matrix multiplication with C++ AMP

    - by Daniel Moth
    We ended the previous post with a mechanical transformation of the C++ AMP matrix multiplication example to the tiled model and in the process introduced tiled_index and tiled_grid. This is part 2. tile_static memory You all know that in regular CPU code, static variables have the same value regardless of which thread accesses the static variable. This is in contrast with non-static local variables, where each thread has its own copy. Back to C++ AMP, the same rules apply and each thread has its own value for local variables in your lambda, whereas all threads see the same global memory, which is the data they have access to via the array and array_view. In addition, on an accelerator like the GPU, there is a programmable cache, a third kind of memory type if you'd like to think of it that way (some call it shared memory, others call it scratchpad memory). Variables stored in that memory share the same value for every thread in the same tile. So, when you use the tiled model, you can have variables where each thread in the same tile sees the same value for that variable, that threads from other tiles do not. The new storage class for local variables introduced for this purpose is called tile_static. You can only use tile_static in restrict(direct3d) functions, and only when explicitly using the tiled model. What this looks like in code should be no surprise, but here is a snippet to confirm your mental image, using a good old regular C array // each tile of threads has its own copy of locA, // shared among the threads of the tile tile_static float locA[16][16]; Note that tile_static variables are scoped and have the lifetime of the tile, and they cannot have constructors or destructors. tile_barrier In amp.h one of the types introduced is tile_barrier. You cannot construct this object yourself (although if you had one, you could use a copy constructor to create another one). So how do you get one of these? You get it, from a tiled_index object. Beyond the 4 properties returning index objects, tiled_index has another property, barrier, that returns a tile_barrier object. The tile_barrier class exposes a single member, the method wait. 15: // Given a tiled_index object named t_idx 16: t_idx.barrier.wait(); 17: // more code …in the code above, all threads in the tile will reach line 16 before a single one progresses to line 17. Note that all threads must be able to reach the barrier, i.e. if you had branchy code in such a way which meant that there is a chance that not all threads could reach line 16, then the code above would be illegal. Tiled Matrix Multiplication Example – part 2 So now that we added to our understanding the concepts of tile_static and tile_barrier, let me obfuscate rewrite the matrix multiplication code so that it takes advantage of tiling. Before you start reading this, I suggest you get a cup of your favorite non-alcoholic beverage to enjoy while you try to fully understand the code. 01: void MatrixMultiplyTiled(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: static const int TS = 16; 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M,N,vC); 07: parallel_for_each(c.grid.tile< TS, TS >(), 08: [=] (tiled_index< TS, TS> t_idx) restrict(direct3d) 09: { 10: int row = t_idx.local[0]; int col = t_idx.local[1]; 11: float sum = 0.0f; 12: for (int i = 0; i < W; i += TS) { 13: tile_static float locA[TS][TS], locB[TS][TS]; 14: locA[row][col] = a(t_idx.global[0], col + i); 15: locB[row][col] = b(row + i, t_idx.global[1]); 16: t_idx.barrier.wait(); 17: for (int k = 0; k < TS; k++) 18: sum += locA[row][k] * locB[k][col]; 19: t_idx.barrier.wait(); 20: } 21: c[t_idx.global] = sum; 22: }); 23: } Notice that all the code up to line 9 is the same as per the changes we made in part 1 of tiling introduction. If you squint, the body of the lambda itself preserves the original algorithm on lines 10, 11, and 17, 18, and 21. The difference being that those lines use new indexing and the tile_static arrays; the tile_static arrays are declared and initialized on the brand new lines 13-15. On those lines we copy from the global memory represented by the array_view objects (a and b), to the tile_static vanilla arrays (locA and locB) – we are copying enough to fit a tile. Because in the code that follows on line 18 we expect the data for this tile to be in the tile_static storage, we need to synchronize the threads within each tile with a barrier, which we do on line 16 (to avoid accessing uninitialized memory on line 18). We also need to synchronize the threads within a tile on line 19, again to avoid the race between lines 14, 15 (retrieving the next set of data for each tile and overwriting the previous set) and line 18 (not being done processing the previous set of data). Luckily, as part of the awesome C++ AMP debugger in Visual Studio there is an option that helps you find such races, but that is a story for another blog post another time. May I suggest reading the next section, and then coming back to re-read and walk through this code with pen and paper to really grok what is going on, if you haven't already? Cool. Why would I introduce this tiling complexity into my code? Funny you should ask that, I was just about to tell you. There is only one reason we tiled our extent, had to deal with finding a good tile size, ensure the number of threads we schedule are correctly divisible with the tile size, had to use a tiled_index instead of a normal index, and had to understand tile_barrier and to figure out where we need to use it, and double the size of our lambda in terms of lines of code: the reason is to be able to use tile_static memory. Why do we want to use tile_static memory? Because accessing tile_static memory is around 10 times faster than accessing the global memory on an accelerator like the GPU, e.g. in the code above, if you can get 150GB/second accessing data from the array_view a, you can get 1500GB/second accessing the tile_static array locA. And since by definition you are dealing with really large data sets, the savings really pay off. We have seen tiled implementations being twice as fast as their non-tiled counterparts. Now, some algorithms will not have performance benefits from tiling (and in fact may deteriorate), e.g. algorithms that require you to go only once to global memory will not benefit from tiling, since with tiling you already have to fetch the data once from global memory! Other algorithms may benefit, but you may decide that you are happy with your code being 150 times faster than the serial-version you had, and you do not need to invest to make it 250 times faster. Also algorithms with more than 3 dimensions, which C++ AMP supports in the non-tiled model, cannot be tiled. Also note that in future releases, we may invest in making the non-tiled model, which already uses tiling under the covers, go the extra step and use tile_static memory on your behalf, but it is obviously way to early to commit to anything like that, and we certainly don't do any of that today. Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • Testing Workflows &ndash; Test-After

    - by Timothy Klenke
    Originally posted on: http://geekswithblogs.net/TimothyK/archive/2014/05/30/testing-workflows-ndash-test-after.aspxIn this post I’m going to outline a few common methods that can be used to increase the coverage of of your test suite.  This won’t be yet another post on why you should be doing testing; there are plenty of those types of posts already out there.  Assuming you know you should be testing, then comes the problem of how do I actual fit that into my day job.  When the opportunity to automate testing comes do you take it, or do you even recognize it? There are a lot of ways (workflows) to go about creating automated tests, just like there are many workflows to writing a program.  When writing a program you can do it from a top-down approach where you write the main skeleton of the algorithm and call out to dummy stub functions, or a bottom-up approach where the low level functionality is fully implement before it is quickly wired together at the end.  Both approaches are perfectly valid under certain contexts. Each approach you are skilled at applying is another tool in your tool belt.  The more vectors of attack you have on a problem – the better.  So here is a short, incomplete list of some of the workflows that can be applied to increasing the amount of automation in your testing and level of quality in general.  Think of each workflow as an opportunity that is available for you to take. Test workflows basically fall into 2 categories:  test first or test after.  Test first is the best approach.  However, this post isn’t about the one and only best approach.  I want to focus more on the lesser known, less ideal approaches that still provide an opportunity for adding tests.  In this post I’ll enumerate some test-after workflows.  In my next post I’ll cover test-first. Bug Reporting When someone calls you up or forwards you a email with a vague description of a bug its usually standard procedure to create or verify a reproduction plan for the bug via manual testing and log that in a bug tracking system.  This can be problematic.  Often reproduction plans when written down might skip a step that seemed obvious to the tester at the time or they might be missing some crucial environment setting. Instead of data entry into a bug tracking system, try opening up the test project and adding a failing unit test to prove the bug.  The test project guarantees that all aspects of the environment are setup properly and no steps are missing.  The language in the test project is much more precise than the English that goes into a bug tracking system. This workflow can easily be extended for Enhancement Requests as well as Bug Reporting. Exploratory Testing Exploratory testing comes in when you aren’t sure how the system will behave in a new scenario.  The scenario wasn’t planned for in the initial system requirements and there isn’t an existing test for it.  By definition the system behaviour is “undefined”. So write a new unit test to define that behaviour.  Add assertions to the tests to confirm your assumptions.  The new test becomes part of the living system specification that is kept up to date with the test suite. Examples This workflow is especially good when developing APIs.  When you are finally done your production API then comes the job of writing documentation on how to consume the API.  Good documentation will also include code examples.  Don’t let these code examples merely exist in some accompanying manual; implement them in a test suite. Example tests and documentation do not have to be created after the production API is complete.  It is best to write the example code (tests) as you go just before the production code. Smoke Tests Every system has a typical use case.  This represents the basic, core functionality of the system.  If this fails after an upgrade the end users will be hosed and they will be scratching their heads as to how it could be possible that an update got released with this core functionality broken. The tests for this core functionality are referred to as “smoke tests”.  It is a good idea to have them automated and run with each build in order to avoid extreme embarrassment and angry customers. Coverage Analysis Code coverage analysis is a tool that reports how much of the production code base is exercised by the test suite.  In Visual Studio this can be found under the Test main menu item. The tool will report a total number for the code coverage, which can be anywhere between 0 and 100%.  Coverage Analysis shouldn’t be used strictly for numbers reporting.  Companies shouldn’t set minimum coverage targets that mandate that all projects must have at least 80% or 100% test coverage.  These arbitrary requirements just invite gaming of the coverage analysis, which makes the numbers useless. The analysis tool will break down the coverage by the various classes and methods in projects.  Instead of focusing on the total number, drill down into this view and see which classes have high or low coverage.  It you are surprised by a low number on a class this is an opportunity to add tests. When drilling through the classes there will be generally two types of reaction to a surprising low test coverage number.  The first reaction type is a recognition that there is low hanging fruit to be picked.  There may be some classes or methods that aren’t being tested, which could easy be.  The other reaction type is “OMG”.  This were you find a critical piece of code that isn’t under test.  In both cases, go and add the missing tests. Test Refactoring The general theme of this post up to this point has been how to add more and more tests to a test suite.  I’ll step back from that a bit and remind that every line of code is a liability.  Each line of code has to be read and maintained, which costs money.  This is true regardless whether the code is production code or test code. Remember that the primary goal of the test suite is that it be easy to read so that people can easily determine the specifications of the system.  Make sure that adding more and more tests doesn’t interfere with this primary goal. Perform code reviews on the test suite as often as on production code.  Hold the test code up to the same high readability standards as the production code.  If the tests are hard to read then change them.  Look to remove duplication.  Duplicate setup code between two or more test methods that can be moved to a shared function.  Entire test methods can be removed if it is found that the scenario it tests is covered by other tests.  Its OK to delete a test that isn’t pulling its own weight anymore. Remember to only start refactoring when all the test are green.  Don’t refactor the tests and the production code at the same time.  An automated test suite can be thought of as a double entry book keeping system.  The unchanging, passing production code serves as the tests for the test suite while refactoring the tests. As with all refactoring, it is best to fit this into your regular work rather than asking for time later to get it done.  Fit this into the standard red-green-refactor cycle.  The refactor step no only applies to production code but also the tests, but not at the same time.  Perhaps the cycle should be called red-green-refactor production-refactor tests (not quite as catchy).   That about covers most of the test-after workflows I can think of.  In my next post I’ll get into test-first workflows.

    Read the article

  • What's new in Solaris 11.1?

    - by Karoly Vegh
    Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1.  Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all.  I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop )  V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples.   VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog.  VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg.  parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness.  per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature.  NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones.  VIII: Security got a lot of attention too:  Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits.  PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users  SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. government security accredited fashion.  SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed.  Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified.  IX. Networking, as one of the core strengths of Solaris 11, has been extended with:  Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly  X. Data management:  FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all.  The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%!  XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges.  The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract.  For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. Open up a Support Request, explain that this is an RFE, describe the feature you/your company desires to have in S11 implemented. The more SRs are collected for an RFE, the more chance it's got to get implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input.Feel free to comment about this post too. Except that it's too long ;)  wbr,charlie

    Read the article

  • how to do event checks for loops?

    - by yao jiang
    I am having some trouble getting the logic down for this. Currently, I have an app that animates the astar pathfinding algorithm. On start of the app, the ui will show the following: User can press "space" to randomly choose start/end coords, then the app will animate it. Or, user can choose the start/end by left-click/right-click. During the animation, the user can also left-click to generate blocks, or right-click to choose a new destiantion. Where I am stuck at is how to handle the events while the app is animating. Right now, I am checking events in the main loop, then when the app is animating, I do event checks again. While it works fine, I feel that I am probably doing it wrong. What is the proper way of setting up the main loop that will handle the events while the app is animating? In main loop, the app start animating once user choose start/end. In my draw function, I am putting another event checker in there. def clear(rows): for r in range(rows): for c in range(rows): if r%3 == 1 and c%3 == 1: color = brown; grid[r][c] = 1; buildCoor.append(r); buildCoor.append(c); else: color = white; grid[r][c] = 0; pick_image(screen, color, width*c, height*r); pygame.display.flip(); os.system('cls'); # draw out the grid def draw(start, end, grid, route_coord): # draw the end coords color = red; pick_image(screen, color, width*end[1],height*end[0]); pygame.display.flip(); # then draw the rest of the route for i in range(len(route_coord)): # pausing because we want animation time.sleep(speed); # get the x/y coords x,y = route_coord[i]; event_on = False; if grid[x][y] == 2: color = green; elif grid[x][y] == 3: color = blue; for event in pygame.event.get(): if event.type == pygame.MOUSEBUTTONDOWN: if event.button == 3: print "destination change detected, rerouting"; # get mouse position, px coords pos = pygame.mouse.get_pos(); # get grid coord c = pos[0] // width; r = pos[1] // height; grid[r][c] = 4; end = [r, c]; elif event.button == 1: print "user generated event"; pos = pygame.mouse.get_pos(); # get grid coord c = pos[0] // width; r = pos[1] // height; # mark it as a block for now grid[r][c] = 1; event_on = True; if check_events([x,y]) or event_on: # there is an event # mark it as a block for now grid[y][x] = 1; pick_image(screen, event_x, width*y, height*x); pygame.display.flip(); # then find a new route new_start = route_coord[i-1]; marked_grid, route_coord = find_route(new_start, end, grid); draw(new_start, end, grid, route_coord); return; # just end draw here so it wont throw the "index out of range" error elif grid[x][y] == 4: color = red; pick_image(screen, color, width*y, height*x); pygame.display.flip(); # clear route coord list, otherwise itll just add more unwanted coords route_coord_list[:] = []; clear(rows); # main loop while not done: # check the events for event in pygame.event.get(): # mouse events if event.type == pygame.MOUSEBUTTONDOWN: # get mouse position, px coords pos = pygame.mouse.get_pos(); # get grid coord c = pos[0] // width; r = pos[1] // height; # find which button pressed, highlight grid accordingly if event.button == 1: # left click, start coords if grid[r][c] == 2: grid[r][c] = 0; color = white; elif grid[r][c] == 0 or grid[r][c] == 4: grid[r][c] = 2; start = [r,c]; color = green; else: grid[r][c] = 1; color = brown; elif event.button == 3: # right click, end coords if grid[r][c] == 4: grid[r][c] = 0; color = white; elif grid[r][c] == 0 or grid[r][c] == 2: grid[r][c] = 4; end = [r,c]; color = red; else: grid[r][c] = 1; color = brown; pick_image(screen, color, width*c, height*r); # keyboard events elif event.type == pygame.KEYDOWN: clear(rows); # one way to quit program if event.key == pygame.K_ESCAPE: print "program will now exit."; done = True; # space key for random start/end elif event.key == pygame.K_SPACE: # first clear the ui clear(rows); # now choose random start/end coords buildLoc = zip(buildCoor,buildCoor[1:])[::2]; #print buildLoc; (start_x, start_y, end_x, end_y) = pick_point(); while (start_x, start_y) in buildLoc or (end_x, end_y) in buildLoc: (start_x, start_y, end_x, end_y) = pick_point(); clear(rows); print "chosen random start/end coords: ", (start_x, start_y, end_x, end_y); if (start_x, start_y) in buildLoc or (end_x, end_y) in buildLoc: print "error"; # draw the route marked_grid, route_coord = find_route([start_x,start_y],[end_x,end_y], grid); draw([start_x, start_y], [end_x, end_y], marked_grid, route_coord); # return key for user defined start/end elif event.key == pygame.K_RETURN: # first clear the ui clear(rows); # get the user defined start/end print "user defined start/end are: ", (start[0], start[1], end[0], end[1]); grid[start[0]][start[1]] = 1; grid[end[0]][end[1]] = 2; # draw the route marked_grid, route_coord = find_route(start, end, grid); draw(start, end, marked_grid, route_coord); # c to clear the screen elif event.key == pygame.K_c: print "clearing screen."; clear(rows); # go fullscreen elif event.key == pygame.K_f: if not full_sc: pygame.display.set_mode([1366, 768], pygame.FULLSCREEN); full_sc = True; rows = 15; clear(rows); else: pygame.display.set_mode(size); full_sc = False; # +/- key to change speed of animation elif event.key == pygame.K_LEFTBRACKET: if speed >= 0.1: print SPEED_UP; speed = speed_up(speed); print speed; else: print FASTEST; print speed; elif event.key == pygame.K_RIGHTBRACKET: if speed < 1.0: print SPEED_DOWN; speed = slow_down(speed); print speed; else: print SLOWEST print speed; # second method to quit program elif event.type == pygame.QUIT: print "program will now exit."; done = True; # limit to 20 fps clock.tick(20); # update the screen pygame.display.flip();

    Read the article

  • Quaternion Camera Orbiting around a Sphere

    - by jessejuicer
    Background: I'm trying to create a game where the camera is always rotating around a single sphere. I'm using the DirectX D3DX math functions in C++ on Windows. The Problem: I cannot get both the camera position and orientation both working properly at the same time. Either one works but not both together. Here's the code for my quaternion camera that revolves around a sphere, always looking at the centerpoint of the sphere, ... as far as I understand it (but which isn't working properly): (I'm only going to present rotation around the X axis here, to simplify this post) Whenever the UP key is pressed or held down, the camera should rotate around the X axis, while looking at the centerpoint of the sphere (which is at 0,0,0 in the world). So, I build a quaternion that represents a small angle of rotation around the x axis like this (where 'deltaAngle' is a small enough number for a slow rotation): D3DXVECTOR3 rotAxis; D3DXQUATERNION tempQuat; tempQuat.x = 0.0f; tempQuat.y = 0.0f; tempQuat.z = 0.0f; tempQuat.w = 1.0f; rotAxis.x = 1.0f; rotAxis.y = 0.0f; rotAxis.z = 0.0f; D3DXQuaternionRotationAxis(&tempQuat, &rotAxis, deltaAngle); ...and I accumulate the result into the camera's current orientation quat, like this: D3DXQuaternionMultiply(&cameraOrientationQuat, &cameraOrientationQuat, &tempQuat); ...which all works fine. Now I need to build a view matrix to pass to DirectX SetTransform function. So I build a rotation matrix from the camera orientation quat as follows: D3DXMATRIXA16 rotationMatrix; D3DXMatrixIdentity(&rotationMatrix); D3DXMatrixRotationQuaternion(&rotationMatrix, &cameraOrientationQuat); ...Now (as seen below) if I just transpose that rotationMatrix and plug it into the 3x3 section of the view matrix, then negate the camera's position and plug it into the translation section of the view matrix, the rotation magically works. Perfectly. (even when I add in rotations for all three axes). There's no gimbal lock, just a smooth rotation all around in any direction. BUT- this works even though I never change the camera's position. At all. Which sorta blows my mind. I even display the camera position and can watch it stay constant at it's starting point (0.0, 0.0, -4000.0). It never moves, but the rotation around the sphere is perfect. I don't understand that. For proper view rotation, the camera position should be revolving around the sphere. Here's the rest of building the view matrix (I'll talk about the commented code below). Note that the camera starts out at (0.0, 0.0, -4000.0) and m_camDistToTarget is 4000.0: /* D3DXVECTOR3 vec1; D3DXVECTOR4 vec2; vec1.x = 0.0f; vec1.y = 0.0f; vec1.z = -1.0f; D3DXVec3Transform(&vec2, &vec1, &rotationMatrix); g_cameraActor->pos.x = vec2.x * g_cameraActor->m_camDistToTarget; g_cameraActor->pos.y = vec2.y * g_cameraActor->m_camDistToTarget; g_cameraActor->pos.z = vec2.z * g_cameraActor->m_camDistToTarget; */ D3DXMatrixTranspose(&g_viewMatrix, &rotationMatrix); g_viewMatrix._41 = -g_cameraActor->pos.x; g_viewMatrix._42 = -g_cameraActor->pos.y; g_viewMatrix._43 = -g_cameraActor->pos.z; g_viewMatrix._44 = 1.0f; g_direct3DDevice9->SetTransform( D3DTS_VIEW, &g_viewMatrix ); ...(The world matrix is always an identity, and the perspective projection works fine). ...So, without the commented code being compiled, the rotation works fine. But to be proper, for obvious reasons, the camera position should be rotating around the sphere, which it currently is not. That's what the commented code is supposed to do. And when I add in that chunk of code to do that, and look at all the data as I hold the keys down (using UP, DOWN, LEFT, RIGHT to rotate different directions) all the values look correct! The camera position is rotating around the sphere just fine, and I can watch that happen visually too. The problem is that the camera orientation does not lookat the center of the sphere. It always looks straight forward down the z axis (toward positive z) as it revolves around the sphere. Yet the values of both the rotation matrix and the view matrix seem to be behaving correctly. (The view matrix orientation is the same as the rotation matrix, just transposed). For instance if I just hold down the key to spin around the x axis, I can watch the values of the three axes represented in the view matrix (x, y, and z axes)... view x-axis stays at (1.0, 0.0, 0.0), and view y-axis and z-axis both spin around the x axis just fine. All the numbers are changing as they should be... well, almost. As far as I can tell, the position of the view matrix is spinning around the sphere one direction (like clockwise), and the orientation (the axes in the view matrix) are spinning the opposite direction (like counter-clockwise). Which I guess explains why the orientation appears to stay straight ahead. I know the position is correct. It revolves properly. It's the orientation that's wrong. Can anyone see what am I doing wrong? Am I using these functions incorrectly? Or is my algorithm flawed? As usual I've been combing my code for simple mistakes for many hours. I'm willing to post the actual code, and a video of the behavior, but that will take much more effort. Thought I'd ask this way first.

    Read the article

  • Can Google Employees See My Saved Google Chrome Passwords?

    - by Jason Fitzpatrick
    Storing your passwords in your web browser seems like a great time saver, but are the passwords secure and inaccessible to others (even employees of the browser company) when squirreled away? Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites. The Question SuperUser reader MMA is curious if Google employees have (or could have) access to the passwords he stores in Google Chrome: I understand that we are really tempted to save our passwords in Google Chrome. The likely benefit is two fold, You don’t need to (memorize and) input those long and cryptic passwords. These are available wherever you are once you log in to your Google account. The last point sparked my doubt. Since the password is available anywhere, the storage must in some central location, and this should be at Google. Now, my simple question is, can a Google employee see my passwords? Searching over the Internet revealed several articles/messages. Do you save passwords in Chrome? Maybe you should reconsider: Talks about your passwords being stolen by someone who has access to your computer account. Nothing mentioned about the central storage security and vulnerability. There is even a response from Chrome browser security tech lead about the first issue. Chrome’s insane password security strategy: Mostly along the same line. You can steal password from somebody if you have access to the computer account. How to Steal Passwords Saved in Google Chrome in 5 Simple Steps: Teaches you how to actually perform the act mentioned in the previous two when you have access to somebody else’s account. There are many more (including this one at this site), mostly along the same line, points, counter-points, huge debates. I refrain from mentioning them here, simply carry a search if you want to find them. Coming back to my original query, can a Google employee see my password? Since I can view the password using a simple button, definitely they can be unhashed (decrypted) even if encrypted. This is very different from the passwords saved in Unix-like OS’s where the saved password can never be seen in plain text. They use a one-way encryption algorithm to encrypt your passwords. This encrypted password is then stored in the passwd or shadow file. When you attempt to login, the password you type in is encrypted again and compared with the entry in the file that stores your passwords. If they match, it must be the same password, and you are allowed access. Thus, a superuser can change my password, can block my account, but he can never see my password. So are his concerns well founded or will a little insight dispel his worry? The Answer SuperUser contributor Zeel helps put his mind at ease: Short answer: No* Passwords stored on your local machine can be decrypted by Chrome, as long as your OS user account is logged in. And then you can view those in plain text. At first this seems horrible, but how did you think auto-fill worked? When that password field gets filled in, Chrome must insert the real password into the HTML form element – or else the page wouldn’t work right, and you could not submit the form. And if the connection to the website is not over HTTPS, the plain text is then sent over the internet. In other words, if chrome can’t get the plain text passwords, then they are totally useless. A one way hash is no good, because we need to use them. Now the passwords are in fact encrypted, the only way to get them back to plain text is to have the decryption key. That key is your Google password, or a secondary key you can set up. When you sign into Chrome and sync the Google servers will transmit the encrypted passwords, settings, bookmarks, auto-fill, etc, to your local machine. Here Chrome will decrypt the information and be able to use it. On Google’s end all that info is stored in its encrpyted state, and they do not have the key to decrypt it. Your account password is checked against a hash to log in to Google, and even if you let chrome remember it, that encrypted version is hidden in the same bundle as the other passwords, impossible to access. So an employee could probably grab a dump of the encrypted data, but it wouldn’t do them any good, since they would have no way to use it.* So no, Google employees can not** access your passwords, since they are encrypted on their servers. * However, do not forget that any system that can be accessed by an authorized user can be accessed by an unauthorized user. Some systems are easier to break than other, but none are fail-proof. . . That being said, I think I will trust Google and the millions they spend on security systems, over any other password storage solution. And heck, I’m a wimpy nerd, it would be easier to beat the passwords out of me than break Google’s encryption. ** I am also assuming that there isn’t a person who just happens to work for Google gaining access to your local machine. In that case you are screwed, but employment at Google isn’t actually a factor any more. Moral: Hit Win + L before leaving machine. While we agree with zeel that it’s a pretty safe bet (as long as your computer is not compromised) that your passwords are in fact safe while stored in Chrome, we prefer to encrypt all our logins and passwords in a LastPass vault. Have something to add to the explanation? Sound off in the the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.     

    Read the article

  • Scheduling thread tiles with C++ AMP

    - by Daniel Moth
    This post assumes you are totally comfortable with, what some of us call, the simple model of C++ AMP, i.e. you could write your own matrix multiplication. We are now ready to explore the tiled model, which builds on top of the non-tiled one. Tiling the extent We know that when we pass a grid (which is just an extent under the covers) to the parallel_for_each call, it determines the number of threads to schedule and their index values (including dimensionality). For the single-, two-, and three- dimensional cases you can go a step further and subdivide the threads into what we call tiles of threads (others may call them thread groups). So here is a single-dimensional example: extent<1> e(20); // 20 units in a single dimension with indices from 0-19 grid<1> g(e);      // same as extent tiled_grid<4> tg = g.tile<4>(); …on the 3rd line we subdivided the single-dimensional space into 5 single-dimensional tiles each having 4 elements, and we captured that result in a concurrency::tiled_grid (a new class in amp.h). Let's move on swiftly to another example, in pictures, this time 2-dimensional: So we start on the left with a grid of a 2-dimensional extent which has 8*6=48 threads. We then have two different examples of tiling. In the first case, in the middle, we subdivide the 48 threads into tiles where each has 4*3=12 threads, hence we have 2*2=4 tiles. In the second example, on the right, we subdivide the original input into tiles where each has 2*2=4 threads, hence we have 4*3=12 tiles. Notice how you can play with the tile size and achieve different number of tiles. The numbers you pick must be such that the original total number of threads (in our example 48), remains the same, and every tile must have the same size. Of course, you still have no clue why you would do that, but stick with me. First, we should see how we can use this tiled_grid, since the parallel_for_each function that we know expects a grid. Tiled parallel_for_each and tiled_index It turns out that we have additional overloads of parallel_for_each that accept a tiled_grid instead of a grid. However, those overloads, also expect that the lambda you pass in accepts a concurrency::tiled_index (new in amp.h), not an index<N>. So how is a tiled_index different to an index? A tiled_index object, can have only 1 or 2 or 3 dimensions (matching exactly the tiled_grid), and consists of 4 index objects that are accessible via properties: global, local, tile_origin, and tile. The global index is the same as the index we know and love: the global thread ID. The local index is the local thread ID within the tile. The tile_origin index returns the global index of the thread that is at position 0,0 of this tile, and the tile index is the position of the tile in relation to the overall grid. Confused? Here is an example accompanied by a picture that hopefully clarifies things: array_view<int, 2> data(8, 6, p_my_data); parallel_for_each(data.grid.tile<2,2>(), [=] (tiled_index<2,2> t_idx) restrict(direct3d) { /* todo */ }); Given the code above and the picture on the right, what are the values of each of the 4 index objects that the t_idx variables exposes, when the lambda is executed by T (highlighted in the picture on the right)? If you can't work it out yourselves, the solution follows: t_idx.global       = index<2> (6,3) t_idx.local          = index<2> (0,1) t_idx.tile_origin = index<2> (6,2) t_idx.tile             = index<2> (3,1) Don't move on until you are comfortable with this… the picture really helps, so use it. Tiled Matrix Multiplication Example – part 1 Let's paste here the C++ AMP matrix multiplication example, bolding the lines we are going to change (can you guess what the changes will be?) 01: void MatrixMultiplyTiled_Part1(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M, N, vC); 07: parallel_for_each(c.grid, 08: [=](index<2> idx) restrict(direct3d) { 09: 10: int row = idx[0]; int col = idx[1]; 11: float sum = 0.0f; 12: for(int i = 0; i < W; i++) 13: sum += a(row, i) * b(i, col); 14: c[idx] = sum; 15: }); 16: } To turn this into a tiled example, first we need to decide our tile size. Let's say we want each tile to be 16*16 (which assumes that we'll have at least 256 threads to process, and that c.grid.extent.size() is divisible by 256, and moreover that c.grid.extent[0] and c.grid.extent[1] are divisible by 16). So we insert at line 03 the tile size (which must be a compile time constant). 03: static const int TS = 16; ...then we need to tile the grid to have tiles where each one has 16*16 threads, so we change line 07 to be as follows 07: parallel_for_each(c.grid.tile<TS,TS>(), ...that means that our index now has to be a tiled_index with the same characteristics as the tiled_grid, so we change line 08 08: [=](tiled_index<TS, TS> t_idx) restrict(direct3d) { ...which means, without changing our core algorithm, we need to be using the global index that the tiled_index gives us access to, so we insert line 09 as follows 09: index<2> idx = t_idx.global; ...and now this code just works and it is tiled! Closing thoughts on part 1 The process we followed just shows the mechanical transformation that can take place from the simple model to the tiled model (think of this as step 1). In fact, when we wrote the matrix multiplication example originally, the compiler was doing this mechanical transformation under the covers for us (and it has additional smarts to deal with the cases where the total number of threads scheduled cannot be divisible by the tile size). The point is that the thread scheduling is always tiled, even when you use the non-tiled model. But with this mechanical transformation, we haven't gained anything… Hint: our goal with explicitly using the tiled model is to gain even more performance. In the next post, we'll evolve this further (beyond what the compiler can automatically do for us, in this first release), so you can see the full usage of the tiled model and its benefits… Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • Master-slave vs. peer-to-peer archictecture: benefits and problems

    - by Ashok_Ora
    Normal 0 false false false EN-US X-NONE X-NONE Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes, resulting in data corruption. In a nutshell, here’s the issue we were trying to address. In everyday life, traffic lights serve the same purpose. They ensure that traffic flows smoothly and when everyone follows the rules, there are no accidents at intersections. As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean. The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held, by detecting whether there were one or more transactions that needed conflicting eyou could get by without holding any lock at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost. We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency. The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness. So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded, or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the changes synchronously or asynchronously to multiple copies (for availability) of the data. Since the change can be propagated asynchronously, during some interval in time, it will be the case that some copies have received the update, and others haven’t. What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies. So what’s the problem with this? There are two major issues: First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s a pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high. Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the out-dated copies. This means that the application program has to handle discrepancies in the different versions of the data item and resolve the issue (which can further add to cost and operation latency). Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might land up with different versions of the data item in different copies. In this case, the application program also has to resolve conflicts and then propagate the correct value to the copies that are out-dated or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine have 100s of millions of records in your database – how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay for ensuring that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was to read a data item. Wouldn’t it be simpler to avoid this problem in the first place? Master-slave architectures like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and client drivers are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is. This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness of data (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request, and route the request to the most efficient copy. This results in a dramatic simplification in application logic and also minimizes network requests (the driver will only send the request to exactl the right replica, not many). So, back to my original point. A well designed and well architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient and high performance system. If you’ve every programmed an Oracle NoSQL Database application, you’ll know the difference! /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

    Read the article

  • Who could ask for more with LESS CSS? (Part 3 of 3&ndash;Clrizr)

    - by ToString(theory);
    Welcome back!  In the first two posts in this series, I covered some of the awesome features in CSS precompilers such as SASS and LESS, as well as how to get an initial project setup up and running in ASP.Net MVC 4. In this post, I will cover an actual advanced example of using LESS in a project, and show some of the great productivity features we gain from its usage. Introduction In the first post, I mentioned two subjects that I will be using in this example – constants, and color functions.  I’ve always enjoyed using online color scheme utilities such as Adobe Kuler or Color Scheme Designer to come up with a scheme based off of one primary color.  Using these tools, and requesting a complementary scheme you can get a couple of shades of your primary color, and a couple of shades of a complementary/accent color to display. Because there is no way in regular css to do color operations or store variables, there was no way to accomplish something like defining a primary color, and have a site theme cascade off of that.  However with tools such as LESS, that impossibility becomes a reality!  So, if you haven’t guessed it by now, this post is on the creation of a plugin/module/less file to drop into your project, plugin one color, and have your primary theme cascade from it.  I only went through the trouble of creating a module for getting Complementary colors.  However, it wouldn’t be too much trouble to go through other options such as Triad or Monochromatic to get a module that you could use off of that. Step 1 – Analysis I decided to mimic Adobe Kuler’s Complementary theme algorithm as I liked its simplicity and aesthetics.  Color Scheme Designer is great, but I do believe it can give you too many color options, which can lead to chaos and overload.  The first thing I had to check was if the complementary values for the color schemes were actually hues rotated by 180 degrees at all times – they aren’t.  Apparently Adobe applies some variance to the complementary colors to get colors that are actually more aesthetically appealing to users.  So, I opened up Excel and began to plot complementary hues based on rotation in increments of 10: Long story short, I completed the same calculations for Hue, Saturation, and Lightness.  For Hue, I only had to record the Complementary hue values, however for saturation and lightness, I had to record the values for ALL of the shades.  Since the functions were too complicated to put into LESS since they aren’t constant/linear, but rather interval functions, I instead opted to extrapolate the HSL values using the trendline function for each major interval, onto intervals of spacing 1. For example, using the hue extraction, I got the following values: Interval Function 0-60 60-140 140-270 270-360 Saturation and Lightness were much worse, but in the end, I finally had functions for all of the intervals, and then went the route of just grabbing each shades value in intervals of 1.  Step 2 – Mapping I declared variable names for each of these sections as something that shouldn’t ever conflict with a variable someone would define in their own file.  After I had each of the values, I extracted the values and put them into files of their own for hue variables, saturation variables, and lightness variables…  Example: /*HUE CONVERSIONS*/@clrizr-hue-source-0deg: 133.43;@clrizr-hue-source-1deg: 135.601;@clrizr-hue-source-2deg: 137.772;@clrizr-hue-source-3deg: 139.943;@clrizr-hue-source-4deg: 142.114;.../*SATURATION CONVERSIONS*/@clrizr-saturation-s2SV0px: 0;@clrizr-saturation-s2SV1px: 0;@clrizr-saturation-s2SV2px: 0;@clrizr-saturation-s2SV3px: 0;@clrizr-saturation-s2SV4px: 0;.../*LIGHTNESS CONVERSIONS*/@clrizr-lightness-s2LV0px: 30;@clrizr-lightness-s2LV1px: 31;@clrizr-lightness-s2LV2px: 32;@clrizr-lightness-s2LV3px: 33;@clrizr-lightness-s2LV4px: 34;...   In the end, I have 973 lines of mapping/conversion from source HSL to shade HSL for two extra primary shades, and two complementary shades. The last bit of the work was the file to compose each of the shades from these mappings. Step 3 – Clrizr Mapper The final step was the hardest to overcome as I was still trying to understand LESS to its fullest extent.  Imports As mentioned previously, I had separated the HSL mappings into different files, so the first necessary step is to import those for use into the Clrizr plugin: @import url("hue.less");@import url("saturation.less");@import url("lightness.less"); Extract Component Values For Each Shade Next, I extracted the necessary information for each shade HSL before shade composition: @clrizr-input-saturation: 1px+floor(saturation(@clrizr-input))-1;@clrizr-input-lightness: 1px+floor(lightness(@clrizr-input))-1; @clrizr-complementary-hue: formatstring("clrizr-hue-source-{0}", ceil(hue(@clrizr-input))); @clrizr-primary-2-saturation: formatstring("clrizr-saturation-s2SV{0}",@clrizr-input-saturation);@clrizr-primary-1-saturation: formatstring("clrizr-saturation-s1SV{0}",@clrizr-input-saturation);@clrizr-complementary-1-saturation: formatstring("clrizr-saturation-c1SV{0}",@clrizr-input-saturation); @clrizr-primary-2-lightness: formatstring("clrizr-lightness-s2LV{0}",@clrizr-input-lightness);@clrizr-primary-1-lightness: formatstring("clrizr-lightness-s1LV{0}",@clrizr-input-lightness);@clrizr-complementary-1-lightness: formatstring("clrizr-lightness-c1LV{0}",@clrizr-input-lightness); Here, you can see a couple of odd things…  On the first line, I am using operations to add units to the saturation and lightness.  This is due to some limitations in the operations that would give me saturation or lightness in %, which can’t be in a variable name.  So, I use first add 1px to it, which casts the result of the following functions as px instead of %, and then at the end, I remove that pixel.  You can also see here the formatstring method which is exactly what it sounds like – something like String.Format(string str, params object[] obj). Get Primary & Complementary Shades Now that I have components for each of the different shades, I can now compose them into each of their pieces.  For this, I use the @@ operator which will look for a variable with the name specified in a string, and then call that variable: @clrizr-primary-2: hsl(hue(@clrizr-input), @@clrizr-primary-2-saturation, @@clrizr-primary-2-lightness);@clrizr-primary-1: hsl(hue(@clrizr-input), @@clrizr-primary-1-saturation, @@clrizr-primary-1-lightness);@clrizr-primary: @clrizr-input;@clrizr-complementary-1: hsl(@@clrizr-complementary-hue, @@clrizr-complementary-1-saturation, @@clrizr-complementary-1-lightness);@clrizr-complementary-2: hsl(@@clrizr-complementary-hue, saturation(@clrizr-input), lightness(@clrizr-input)); That’s is it, for the most part.  These variables now hold the theme for the one input color – @clrizr-input.  However, I have one last addition… Perceptive Luminance Well, after I got the colors, I decided I wanted to also get the best font color that would go on top of it.  Black or white depending on light or dark color.  Now I couldn’t just go with checking the lightness, as that is half the story.  You see, the human eye doesn’t see ALL colors equally well but rather has more cells for interpreting green light compared to blue or red.  So, using the ratio, we can calculate the perceptive luminance of each of the shades, and get the font color that best matches it! @clrizr-perceptive-luminance-ps2: round(1 - ( (0.299 * red(@clrizr-primary-2) ) + ( 0.587 * green(@clrizr-primary-2) ) + (0.114 * blue(@clrizr-primary-2)))/255)*255;@clrizr-perceptive-luminance-ps1: round(1 - ( (0.299 * red(@clrizr-primary-1) ) + ( 0.587 * green(@clrizr-primary-1) ) + (0.114 * blue(@clrizr-primary-1)))/255)*255;@clrizr-perceptive-luminance-ps: round(1 - ( (0.299 * red(@clrizr-primary) ) + ( 0.587 * green(@clrizr-primary) ) + (0.114 * blue(@clrizr-primary)))/255)*255;@clrizr-perceptive-luminance-pc1: round(1 - ( (0.299 * red(@clrizr-complementary-1)) + ( 0.587 * green(@clrizr-complementary-1)) + (0.114 * blue(@clrizr-complementary-1)))/255)*255;@clrizr-perceptive-luminance-pc2: round(1 - ( (0.299 * red(@clrizr-complementary-2)) + ( 0.587 * green(@clrizr-complementary-2)) + (0.114 * blue(@clrizr-complementary-2)))/255)*255; @clrizr-col-font-on-primary-2: rgb(@clrizr-perceptive-luminance-ps2, @clrizr-perceptive-luminance-ps2, @clrizr-perceptive-luminance-ps2);@clrizr-col-font-on-primary-1: rgb(@clrizr-perceptive-luminance-ps1, @clrizr-perceptive-luminance-ps1, @clrizr-perceptive-luminance-ps1);@clrizr-col-font-on-primary: rgb(@clrizr-perceptive-luminance-ps, @clrizr-perceptive-luminance-ps, @clrizr-perceptive-luminance-ps);@clrizr-col-font-on-complementary-1: rgb(@clrizr-perceptive-luminance-pc1, @clrizr-perceptive-luminance-pc1, @clrizr-perceptive-luminance-pc1);@clrizr-col-font-on-complementary-2: rgb(@clrizr-perceptive-luminance-pc2, @clrizr-perceptive-luminance-pc2, @clrizr-perceptive-luminance-pc2); Conclusion That’s it!  I have posted a project on clrizr.codePlex.com for this, and included a testing page for you to test out how it works.  Feel free to use it in your own project, and if you have any questions, comments or suggestions, please feel free to leave them here as a comment, or on the contact page!

    Read the article

  • CodePlex Daily Summary for Monday, August 18, 2014

    CodePlex Daily Summary for Monday, August 18, 2014Popular ReleasesMagick.NET: Magick.NET 7.0.0.0001: Magick.NET linked with ImageMagick 7-Beta.CMake Tools for Visual Studio: CMake Tools for Visual Studio 1.2: This release adds the following new features and bug fixes from CMake Tools for Visual Studio 1.1: Added support for CMake 3.0. Added support for word completion. Added IntelliSense support for the CMAKEHOSTSYSTEM_INFORMATION command. Fixed syntax highlighting for tokens beginning with escape sequences. Fixed issue uninstalling CMake Tools for Visual Studio after Visual Studio has been uninstalled.GW2 Personal Assistant Overlay: GW2 Personal Assistant Overlay 1.1: Overview1.1 is the second 'stable' release of the GW2 Personal Assistant Overlay. This version includes just a couple of very minor features and some minor bug fixes. For details regarding installation, setup, and general use, see Documentation. Note: If you were using a previous version, you will probably want to copy over the following user settings files: GW2PAO.DungeonSettings.xml GW2PAO.EventSettings.xml GW2PAO.WvWSettings.xml GW2PAO.ZoneCompletionSettings.xml New FeaturesAdded new "No...WallSwitch: WallSwitch 1.2.5: Version 1.2.5 Changes: Added support for sequential order in collage mode. Added option to display multiple images per switch in collage mode. Fixed bug where border width wasn't being loaded properly, and was reverting to default values. Fixed bug where sequential order was repeating images on multiple monitors. Decreased likelihood of random images being repeated.OpenCppCoverage: OpenCppCoverage 0.9.1: - Add Jenkins support. - Command line argument can be placed inside a config file. If you do not have Visual Studio C++ 2013 you need to download redistributable packages: http://www.microsoft.com/en-us/download/details.aspx?id=40784Easy Backup Windows Service: Release 2.0 with CU: Fix log error when "To" directory not exist in fyle system. Force run program as administrator by default. Add 'everyday' schedule element. Update solution to VS 2013.Easy Backup Application: Release 2.0 with CU: Fix log error when "To" directory not exist in fyle system. Fix app location initialization. Force run program as administrator by default. Update solution to VS 2013.TEBookConverter: 1.5: Added: Turkish and French translations Added: A few interface changes Removed: SkinDynamulet: Dynamulet v0.1: DynamoDB Transaction Server v0.1Console parallel nunit tests runner: ConsoleUnitTestsRunner 1.03: bugfixingFluentx: Fluentx v1.5.3: Added few more extension methods.fastJSON: v2.1.2: 2.1.2 - bug fix circular referencesJPush.NET: JPush Server SDK 1.2.1 (For JPush V3): Assembly: 1.2.1.24728 JPush REST API Version: v3 JPush Documentation Reference .NET framework: v4.0 or above. Sample: class: JPushClientV3 2014 Augest 15th.SEToolbox: SEToolbox 01.043.008 Release 1: Changed ship/station names to use new DisplayName instead of Beacon/Antenna. Fixed issue with updated SE binaries 01.043.018 using new Voxel Material definitions.Google .Net API: Drive.Sample: Google .NET Client API – Drive.SampleInstructions for the Google .NET Client API – Drive.Sample</h2> http://code.google.com/p/google-api-dotnet-client/source/browse/?repo=samples#hg%2FDrive.SampleBrowse Source, or main file http://code.google.com/p/google-api-dotnet-client/source/browse/Drive.Sample/Program.cs?repo=samplesProgram.cs <h3>1. Checkout Instructions</h3> <p><b>Prerequisites:</b> Install Visual Studio, and <a href="http://mercurial.selenic.com/">Mercurial</a>.</p> ...FineUI - jQuery / ExtJS based ASP.NET Controls: FineUI v4.1.1: -??Form??????????????(???-5929)。 -?TemplateField??ExpandOnDoubleClick、ExpandOnEnter、ExpandToSelectRow????(LZOM-5932)。 -BodyPadding???????,??“5”“5 10”,???????????“5px”“5px 10px”。 -??TriggerBox?EnableEdit=false????,??????????????(Jango_Jing-5450)。 -???????????DataKeyNames???????????(yygy-6002)。 -????????????????????????(Gnid-6018)。 -??PageManager???AutoSizePanelID????,??????????????????(yygy-6008)。 -?FState???????????????,????????????????(????-5925)。 -??????OnClientClick???return?????????(FineU...DNN CMS Platform: 07.03.02: Major Highlights Fixed backwards compatibility issue with 3rd party control panels Fixed issue in the drag and drop functionality of the File Uploader in IE 11 and Safari Fixed issue where users were able to create pages with the same name Fixed issue that affected older versions of DNN that do not include the maxAllowedContentLength during upgrade Fixed issue that stopped some skins from being upgraded to newer versions Fixed issue that randomly showed an unexpected error during us...WordMat: WordMat for Mac: WordMat for Mac has a few limitations compared to the Windows version - Graph is not supported (Gnuplot, GeoGebra and Excel works) - Units are not supported yet (Coming up) The Mac version is yet as tested as the windows version.MFCMAPI: August 2014 Release: Build: 15.0.0.1042 Full release notes at SGriffin's blog. If you just want to run the MFCMAPI or MrMAPI, get the executables. If you want to debug them, get the symbol files and the source. The 64 bit builds will only work on a machine with Outlook 2010/2013 64 bit installed. All other machines should use the 32 bit builds, regardless of the operating system. Facebook BadgeEWSEditor: EwsEditor 1.10 Release: • Export and import of items as a full fidelity steam works - without proxy classes! - I used raw EWS POSTs. • Turned off word wrap for EWS request field in EWS POST windows. • Several windows with scrolling texts boxes were limiting content to 32k - I removed this restriction. • Split server timezone info off to separate menu item from the timezone info windows so that the timezone info window could be used without logging into a mailbox. • Lots of updates to the TimeZone window. • UserAgen...New Projectsballmon: ballmonExchange Database Recovery With and Without Log Files is Possible: This segments giving an overview of Exchange Server transaction log files. It describes process how users can recover their database with & without log filesFabs.Net: Ego tatmini ve gelisme amaçli yaptigim bir projedir.JacoChat: JacoChat is a simple chatting interface that uses my personal webserver as a "wall" for people to chat on.ManagedWin32: ManagedWin32 is a library that exposes the Win32 API to .NET applications.Open XML Extensions: The project provides additions to the Open XML SDK and related projects (e.g., PowerTools for Open XML), starting with MemoryStreams for Open XML Documents.orntic: Project for insurace companyTBOX: The Treasure Box Library: TBOX is a mutli-platform c library for unix, windows, mac, ios, android, etc. It includes asio, stream, container, algorithm, xml and other library modules.WeatherTS: Typescript weather application.?????@/????: ??????????????:????,????,????,???????,????????,??????:????????,?????! ?????????: ????????????????????,????????:??、??、???,?????????????????????! ????-??: ??????????????,????,???????????????。

    Read the article

  • The code works but when using printf it gives me a weird answer. Help please [closed]

    - by user71458
    //Programmer-William Chen //Seventh Period Computer Science II //Problem Statement - First get the elapsed times and the program will find the //split times for the user to see. // //Algorithm- First the programmer makes the prototype and calls them in the //main function. The programmer then asks the user to input lap time data. //Secondly, you convert the splits into seconds and subtract them so you can //find the splits. Then the average is all the lap time's in seconds. Finally, //the programmer printf all the results for the user to see. #include <iostream> #include <stdlib.h> #include <math.h> #include <conio.h> #include <stdio.h> using namespace std; void thisgetsElapsedTimes( int &m1, int &m2, int &m3, int &m4, int &m5, int &s1, int &s2, int &s3, int &s4, int &s5); //this is prototype void thisconvertstoseconds ( int &m1, int &m2, int &m3, int &m4, int &m5, int &s1, int &s2, int &s3, int &s4, int &s5, int &split1, int &split2, int &split3, int &split4, int &split5);//this too void thisfindsSplits(int &m1, int &m2, int &m3, int &m4, int &m5, int &split1, int &split2, int &split3, int &split4, int &split5, int &split6, int &split7, int &split8, int &split9, int &split10);// this is part of prototype void thisisthesecondconversation (int &split1M, int &split2M, int &split3M, int &split4M, int &split5M, int &split1S,int &split2S, int &split3S, int &split4S, int &split5S, int &split1, int &split2, int &split3, int &split4, int &split5);//this gets a value void thisfindstheaverage(double &average, int &split1, int &split2, int &split3, int &split4, int &split5);//and this void thisprintsstuff( int &split1M, int &split2M, int &split3M, int &split4M, int &split5M, int &split1S, int &split2S, int &split3S, int &split4S, int &split5S, double &average); //this prints int main(int argc, char *argv[]) { int m1, m2, m3, m4, m5, s1, s2, s3, s4, s5, split1, split2, split3, split4, split5, split1M, split2M, split3M, split4M, split5M, split1S, split2S, split3S, split4S, split5S; int split6, split7, split8, split9, split10; double average; char thistakescolon; thisgetsElapsedTimes ( m1, m2, m3, m4, m5, s1, s2, s3, s4, s5); thisconvertstoseconds ( m1, m2, m3, m4, m5, s1, s2, s3, s4, s5, split1, split2, split3, split4, split5); thisfindsSplits ( m1, m2, m3, m4, m5, split1, split2, split3, split4, split5, split6, split7, split8, split9, split10); thisisthesecondconversation ( split1M, split2M, split3M, split4M, split5M, split1S, split2S, split3S, split4S, split5S, split1, split2, split3, split4, split5); thisfindstheaverage ( average, split1, split2, split3, split4, split5); thisprintsstuff ( split1M, split2M, split3M, split4M, split5M, split1S, split2S, split3S, split4S, split5S, average); // these are calling statements and they call from the main function to the other functions. system("PAUSE"); return 0; } void thisgetsElapsedTimes(int &m1, int &m2, int &m3, int &m4, int &m5, int &s1, int &s2, int &s3, int &s4, int &s5) { char thistakescolon; cout << "Enter the elapsed time:" << endl; cout << " Kilometer 1 "; cin m1 thistakescolon s1; cout << " Kilometer 2 "; cin m2 thistakescolon s2; cout << " Kilometer 3 " ; cin m3 thistakescolon s3; cout << " Kilometer 4 "; cin m4 thistakescolon s4; cout << " Kilometer 5 "; cin m5 thistakescolon s5; // this gets the data required to get the results needed for the user to see // . } void thisconvertstoseconds (int &m1, int &m2, int &m3, int &m4, int &m5, int &s1, int &s2, int &s3, int &s4, int &s5, int &split1, int &split2, int &split3, int &split4, int &split5) { split1 = (m1 * 60) + s1;//this converts for minutes to seconds for m1 split2 = (m2 * 60) + s2;//this converts for minutes to seconds for m2 split3 = (m3 * 60) + s3;//this converts for minutes to seconds for m3 split4 = (m4 * 60) + s4;//this converts for minutes to seconds for m4 split5 = (m5 * 60) + s5;//this converts for minutes to seconds for m5 } void thisfindsSplits (int &m1, int &m2, int &m3, int &m4, int &m5,int &split1, int &split2, int &split3, int &split4, int &split5, int &split6, int &split7, int &split8, int &split9, int &split10)//this is function heading { split6 = split1; //this is split for the first lap. split7 = split2 - split1;//this is split for the second lap. split8 = split3 - split2;//this is split for the third lap. split9 = split4 - split3;//this is split for the fourth lap. split10 = split5 - split4;//this is split for the fifth lap. } void thisfindstheaverage(double &average, int &split1, int &split2, int &split3, int &split4, int &split5) { average = (split1 + split2 + split3 + split4 + split5)/5; // this finds the average from all the splits in seconds } void thisisthesecondconversation (int &split1M, int &split2M, int &split3M, int &split4M, int &split5M, int &split1S,int &split2S, int &split3S, int &split4S, int &split5S, int &split1, int &split2, int &split3, int &split4, int &split5) { split1M = split1 * 60; //this finds the split times split1S = split1M - split1 * 60; //then this finds split2M = split2 * 60; //and all of this split2S = split2M - split2 * 60; //does basically split3M = split3 * 60; //the same thing split3S = split3M - split3 * 60; //all of it split4M = split4 * 60; //it's also a split4S = split4M - split4 * 60; //function split5M = split5 * 60; //and it finds the splits split5S = split5M - split5 * 60; //for each lap. } void thisprintsstuff (int &split1M, int &split2M, int &split3M, int &split4M, int &split5M, int &split1S, int &split2S, int &split3S, int &split4S, int &split5S, double &average)// this is function heading { printf("\n kilometer 1 %d" , ":02%d",'split1M','split1S'); printf("\n kilometer 2 %d" , ":02%d",'split2M','split2S'); printf("\n kilometer 3 %d" , ":02%d",'split3M','split3S'); printf("\n kilometer 4 %d" , ":02%d",'split4M','split4S'); printf("\n kilometer 5 %d" , ":02%d",'split5M','split5S'); printf("\n your average pace is ",'average',"per kilometer \n", "William Chen\n"); // this printf so the programmer // can allow the user to see // the results from the data gathered. }

    Read the article

< Previous Page | 199 200 201 202 203 204 205 206 207 208 209 210  | Next Page >