Search Results

Search found 3828 results on 154 pages for 'mathematical optimization'.

Page 112/154 | < Previous Page | 108 109 110 111 112 113 114 115 116 117 118 119  | Next Page >

  • A Guided Tour of Complexity

    - by JoshReuben
    I just re-read Complexity – A Guided Tour by Melanie Mitchell , protégé of Douglas Hofstadter ( author of “Gödel, Escher, Bach”) http://www.amazon.com/Complexity-Guided-Tour-Melanie-Mitchell/dp/0199798109/ref=sr_1_1?ie=UTF8&qid=1339744329&sr=8-1 here are some notes and links:   Evolved from Cybernetics, General Systems Theory, Synergetics some interesting transdisciplinary fields to investigate: Chaos Theory - http://en.wikipedia.org/wiki/Chaos_theory – small differences in initial conditions (such as those due to rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible. System Dynamics / Cybernetics - http://en.wikipedia.org/wiki/System_Dynamics – study of how feedback changes system behavior Network Theory - http://en.wikipedia.org/wiki/Network_theory – leverage Graph Theory to analyze symmetric  / asymmetric relations between discrete objects Algebraic Topology - http://en.wikipedia.org/wiki/Algebraic_topology – leverage abstract algebra to analyze topological spaces There are limits to deterministic systems & to computation. Chaos Theory definitely applies to training an ANN (artificial neural network) – different weights will emerge depending upon the random selection of the training set. In recursive Non-Linear systems http://en.wikipedia.org/wiki/Nonlinear_system – output is not directly inferable from input. E.g. a Logistic map: Xt+1 = R Xt(1-Xt) Different types of bifurcations, attractor states and oscillations may occur – e.g. a Lorenz Attractor http://en.wikipedia.org/wiki/Lorenz_system Feigenbaum Constants http://en.wikipedia.org/wiki/Feigenbaum_constants express ratios in a bifurcation diagram for a non-linear map – the convergent limit of R (the rate of period-doubling bifurcations) is 4.6692016 Maxwell’s Demon - http://en.wikipedia.org/wiki/Maxwell%27s_demon - the Second Law of Thermodynamics has only a statistical certainty – the universe (and thus information) tends towards entropy. While any computation can theoretically be done without expending energy, with finite memory, the act of erasing memory is permanent and increases entropy. Life & thought is a counter-example to the universe’s tendency towards entropy. Leo Szilard and later Claude Shannon came up with the Information Theory of Entropy - http://en.wikipedia.org/wiki/Entropy_(information_theory) whereby Shannon entropy quantifies the expected value of a message’s information in bits in order to determine channel capacity and leverage Coding Theory (compression analysis). Ludwig Boltzmann came up with Statistical Mechanics - http://en.wikipedia.org/wiki/Statistical_mechanics – whereby our Newtonian perception of continuous reality is a probabilistic and statistical aggregate of many discrete quantum microstates. This is relevant for Quantum Information Theory http://en.wikipedia.org/wiki/Quantum_information and the Physics of Information - http://en.wikipedia.org/wiki/Physical_information. Hilbert’s Problems http://en.wikipedia.org/wiki/Hilbert's_problems pondered whether mathematics is complete, consistent, and decidable (the Decision Problem – http://en.wikipedia.org/wiki/Entscheidungsproblem – is there always an algorithm that can determine whether a statement is true).  Godel’s Incompleteness Theorems http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems  proved that mathematics cannot be both complete and consistent (e.g. “This statement is not provable”). Turing through the use of Turing Machines (http://en.wikipedia.org/wiki/Turing_machine symbol processors that can prove mathematical statements) and Universal Turing Machines (http://en.wikipedia.org/wiki/Universal_Turing_machine Turing Machines that can emulate other any Turing Machine via accepting programs as well as data as input symbols) that computation is limited by demonstrating the Halting Problem http://en.wikipedia.org/wiki/Halting_problem (is is not possible to know when a program will complete – you cannot build an infinite loop detector). You may be used to thinking of 1 / 2 / 3 dimensional systems, but Fractal http://en.wikipedia.org/wiki/Fractal systems are defined by self-similarity & have non-integer Hausdorff Dimensions !!!  http://en.wikipedia.org/wiki/List_of_fractals_by_Hausdorff_dimension – the fractal dimension quantifies the number of copies of a self similar object at each level of detail – eg Koch Snowflake - http://en.wikipedia.org/wiki/Koch_snowflake Definitions of complexity: size, Shannon entropy, Algorithmic Information Content (http://en.wikipedia.org/wiki/Algorithmic_information_theory - size of shortest program that can generate a description of an object) Logical depth (amount of info processed), thermodynamic depth (resources required). Complexity is statistical and fractal. John Von Neumann’s other machine was the Self-Reproducing Automaton http://en.wikipedia.org/wiki/Self-replicating_machine  . Cellular Automata http://en.wikipedia.org/wiki/Cellular_automaton are alternative form of Universal Turing machine to traditional Von Neumann machines where grid cells are locally synchronized with their neighbors according to a rule. Conway’s Game of Life http://en.wikipedia.org/wiki/Conway's_Game_of_Life demonstrates various emergent constructs such as “Glider Guns” and “Spaceships”. Cellular Automatons are not practical because logical ops require a large number of cells – wasteful & inefficient. There are no compilers or general program languages available for Cellular Automatons (as far as I am aware). Random Boolean Networks http://en.wikipedia.org/wiki/Boolean_network are extensions of cellular automata where nodes are connected at random (not to spatial neighbors) and each node has its own rule –> they demonstrate the emergence of complex  & self organized behavior. Stephen Wolfram’s (creator of Mathematica, so give him the benefit of the doubt) New Kind of Science http://en.wikipedia.org/wiki/A_New_Kind_of_Science proposes the universe may be a discrete Finite State Automata http://en.wikipedia.org/wiki/Finite-state_machine whereby reality emerges from simple rules. I am 2/3 through this book. It is feasible that the universe is quantum discrete at the plank scale and that it computes itself – Digital Physics: http://en.wikipedia.org/wiki/Digital_physics – a simulated reality? Anyway, all behavior is supposedly derived from simple algorithmic rules & falls into 4 patterns: uniform , nested / cyclical, random (Rule 30 http://en.wikipedia.org/wiki/Rule_30) & mixed (Rule 110 - http://en.wikipedia.org/wiki/Rule_110 localized structures – it is this that is interesting). interaction between colliding propagating signal inputs is then information processing. Wolfram proposes the Principle of Computational Equivalence - http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html - all processes that are not obviously simple can be viewed as computations of equivalent sophistication. Meaning in information may emerge from analogy & conceptual slippages – see the CopyCat program: http://cognitrn.psych.indiana.edu/rgoldsto/courses/concepts/copycat.pdf Scale Free Networks http://en.wikipedia.org/wiki/Scale-free_network have a distribution governed by a Power Law (http://en.wikipedia.org/wiki/Power_law - much more common than Normal Distribution). They are characterized by hubs (resilience to random deletion of nodes), heterogeneity of degree values, self similarity, & small world structure. They grow via preferential attachment http://en.wikipedia.org/wiki/Preferential_attachment – tipping points triggered by positive feedback loops. 2 theories of cascading system failures in complex systems are Self-Organized Criticality http://en.wikipedia.org/wiki/Self-organized_criticality and Highly Optimized Tolerance http://en.wikipedia.org/wiki/Highly_optimized_tolerance. Computational Mechanics http://en.wikipedia.org/wiki/Computational_mechanics – use of computational methods to study phenomena governed by the principles of mechanics. This book is a great intuition pump, but does not cover the more mathematical subject of Computational Complexity Theory – http://en.wikipedia.org/wiki/Computational_complexity_theory I am currently reading this book on this subject: http://www.amazon.com/Computational-Complexity-Christos-H-Papadimitriou/dp/0201530821/ref=pd_sim_b_1   stay tuned for that review!

    Read the article

  • Heaps of Trouble?

    - by Paul White NZ
    If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material.  In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all.  The problem is, predictably, that performance is not very good.  The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found.  Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables.  A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings).  The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan.  The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate.  Every join in the plan shown above estimates that it will produce just a single row as output.  Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows.  It should not surprise you that this has rather dire consequences for the remainder of the query plan.  In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables.  Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each.  The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints.  A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query.  The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement!  As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there.  (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B).  Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear.  The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools.  If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on.  Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID.  The term ‘eager’ means that the spool consumes all of its input rows when it starts up.  The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index.  From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table.  The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself.  The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation.  Nested loops join preserves the order of rows on its outer input, so sorting early is safe.  (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input.  It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation.  It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins.  You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins.  The answer, again, is down to the poor information available to the optimizer.  Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs.  SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk.  The join process then continues, and may again run out of memory.  This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow.  The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion.  You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this).  Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it?  We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs.  The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk.  Even so, the query runs to completion in three or four seconds.  That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information.  We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world.  It’s there primarily for its educational and entertainment value.  I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details.  My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • C#: String Concatenation vs Format vs StringBuilder

    - by James Michael Hare
    I was looking through my groups’ C# coding standards the other day and there were a couple of legacy items in there that caught my eye.  They had been passed down from committee to committee so many times that no one even thought to second guess and try them for a long time.  It’s yet another example of how micro-optimizations can often get the best of us and cause us to write code that is not as maintainable as it could be for the sake of squeezing an extra ounce of performance out of our software. So the two standards in question were these, in paraphrase: Prefer StringBuilder or string.Format() to string concatenation. Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals(). Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in. The first test was a pretty standard one.  When concattenating strings, what is the best choice: StringBuilder, string concattenation, or string.Format()? So before we being I read in a number of iterations from the console and a length of each string to generate.  Then I generate that many random strings of the given length and an array to hold the results.  Why am I so keen to keep the results?  Because I want to be able to snapshot the memory and don’t want garbage collection to collect the strings, hence the array to keep hold of them.  I also didn’t want the random strings to be part of the allocation, so I pre-allocate them and the array up front before the snapshot.  So in the code snippets below: num – Number of iterations. strings – Array of randomly generated strings. results – Array to hold the results of the concatenation tests. timer – A System.Diagnostics.Stopwatch() instance to time code execution. start – Beginning memory size. stop – Ending memory size. after – Memory size after final GC. So first, let’s look at the concatenation loop: 1: // build num strings using concattenation. 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = "This is test #" + i + " with a result of " + strings[i]; 5: } Pretty standard, right?  Next for string.Format(): 1: // build strings using string.Format() 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]); 5: }   Finally, StringBuilder: 1: // build strings using StringBuilder 2: for (int i = 0; i < num; i++) 3: { 4: var builder = new StringBuilder(); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: } So I take each of these loops, and time them by using a block like this: 1: // get the total amount of memory used, true tells it to run GC first. 2: start = System.GC.GetTotalMemory(true); 3:  4: // restart the timer 5: timer.Reset(); 6: timer.Start(); 7:  8: // *** code to time and measure goes here. *** 9:  10: // get the current amount of memory, stop the timer, then get memory after GC. 11: stop = System.GC.GetTotalMemory(false); 12: timer.Stop(); 13: other = System.GC.GetTotalMemory(true); So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations: 1: Operator + - Time: 547, Memory: 56104540/55595960 - 500000 2: string.Format() - Time: 749, Memory: 57295812/55595960 - 500000 3: StringBuilder - Time: 608, Memory: 55312888/55595960 – 500000   Egad!  string.Format brings up the rear and + triumphs, well, at least in terms of speed.  The concat burns more memory than StringBuilder but less than string.Format().  This shows two main things: StringBuilder is not always the panacea many think it is. The difference between any of the three is miniscule! The second point is extremely important!  You will often here people who will grasp at results and say, “look, operator + is 10% faster than StringBuilder so always use StringBuilder.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have preallocated my StringBuffer like so:   1: for (int i = 0; i < num; i++) 2: { 3: // pre-declare StringBuilder to have 100 char buffer. 4: var builder = new StringBuilder(100); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: }   Now let’s look at the times: 1: Operator + - Time: 551, Memory: 56104412/55595960 - 500000 2: string.Format() - Time: 753, Memory: 57296484/55595960 - 500000 3: StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000   Whoa!  All of the sudden StringBuilder is back on top again!  But notice, it takes more memory now.  This makes perfect sense if you examine the IL behind the scenes.  Whenever you do a string concat (+) in your code, it examines the lengths of the arguments and creates a StringBuilder behind the scenes of the appropriate size for you. But even IF we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take into account both readability and performance.  After all, consider all these timings are over 500,000 iterations.   That’s at best  0.0004 ms difference per call which is neglidgable at best.  The key is to pick the best tool for the job.  What do I mean?  Consider these awesome words of wisdom: Concatenate (+) is best at concatenating.  StringBuilder is best when you need to building. Format is best at formatting. Totally Earth-shattering, right!  But if you consider it carefully, it actually has a lot of beauty in it’s simplicity.  Remember, there is no magic bullet.  If one of these always beat the others we’d only have one and not three choices. The fact is, the concattenation operator (+) has been optimized for speed and looks the cleanest for joining together a known set of strings in the simplest manner possible. StringBuilder, on the other hand, excels when you need to build a string of inderterminant length.  Use it in those times when you are looping till you hit a stop condition and building a result and it won’t steer you wrong. String.Format seems to be the looser from the stats, but consider which of these is more readable.  Yes, ignore the fact that you could do this with ToString() on a DateTime.  1: // build a date via concatenation 2: var date1 = (month < 10 ? string.Empty : "0") + month + '/' 3: + (day < 10 ? string.Empty : "0") + '/' + year; 4:  5: // build a date via string builder 6: var builder = new StringBuilder(10); 7: if (month < 10) builder.Append('0'); 8: builder.Append(month); 9: builder.Append('/'); 10: if (day < 10) builder.Append('0'); 11: builder.Append(day); 12: builder.Append('/'); 13: builder.Append(year); 14: var date2 = builder.ToString(); 15:  16: // build a date via string.Format 17: var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year); 18:  So the strength in string.Format is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format does. So my lesson is, don’t look for the silver bullet!  Choose the best tool.  Micro-optimization almost always bites you in the end because you’re sacrificing readability for performance, which is almost exactly the wrong choice 90% of the time. I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them: For Beginners: Do not optimize. For Experts: Do not optimize yet. It’s so true.  Most of the time on today’s modern hardware, a micro-second optimization at the sake of readability will net you nothing because it won’t be your bottleneck.  Code for readability, choose the best tool for the job which will usually be the most readable and maintainable as well.  Then, and only then, if you need that extra performance boost after profiling your code and exhausting all other options… then you can start to think about optimizing.

    Read the article

  • CodePlex Daily Summary for Sunday, March 06, 2011

    CodePlex Daily Summary for Sunday, March 06, 2011Popular ReleasesIIS Tuner: IIS Tuner 1.0: IIS and ASP.NET performance optimization toolMinemapper: Minemapper v0.1.6: Once again supports biomes, thanks to an updated Minecraft Biome Extractor, which added support for the new Minecraft beta v1.3 map format. Updated mcmap to support new biome format.CRM 2011 OData Query Designer: CRM 2011 OData Query Designer: The CRM 2011 OData Query Designer is a Silverlight 4 application that is packaged as a Managed CRM 2011 Solution. This tool allows you to build OData queries by selecting filter criteria, select attributes and order by attributes. The tool also allows you to Execute the query and view the ATOM and JSON data returned. The look and feel of this component will improve and new functionality will be added in the near future so please provide feedback on your experience. Import this solution int...AutoLoL: AutoLoL v1.6.4: It is now possible to run the clicker anyway when it can't detect the Masteries Window Fixed a critical bug in the open file dialog Removed the resize button Some UI changes 3D camera movement is now more intuitive (Trackball rotation) When an error occurs on the clicker it will attempt to focus AutoLoLYAF.NET (aka Yet Another Forum.NET): v1.9.5.5 RTW: YAF v1.9.5.5 RTM (Date: 3/4/2011 Rev: 4742) Official Discussion Thread here: http://forum.yetanotherforum.net/yaf_postsm47149_v1-9-5-5-RTW--Date-3-4-2011-Rev-4742.aspx Changes in v1.9.5.5 Rev. #4661 - Added "Copy" function to forum administration -- Now instead of having to manually re-enter all the access masks, etc, you can just duplicate an existing forum and modify after the fact. Rev. #4642 - New Setting to Enable/Disable Last Unread posts links Rev. #4641 - Added Arabic Language t...Snippet Designer: Snippet Designer 1.3.1: Snippet Designer 1.3.1 for Visual Studio 2010This is a bug fix release. Change logFixed bug where Snippet Designer would fail if you had the most recent Productivity Power Tools installed Fixed bug where "Export as Snippet" was failing in non-english locales Fixed bug where opening a new .snippet file would fail in non-english localesChiave File Encryption: Chiave 1.0: Final Relase for Chave 1.0 Stable: Application for file encryption and decryption using 512 Bit rijndael encyrption algorithm with simple to use UI. Its written in C# and compiled in .Net version 3.5. It incorporates features of Windows 7 like Jumplists, Taskbar progress and Aero Glass. Now with added support to Windows XP! Change Log from 0.9.2 to 1.0: ==================== Added: > Added Icon Overlay for Windows 7 Taskbar Icon. >Added Thumbnail Toolbar buttons to make the navigation easier...DotNetNuke® Community Edition: 05.06.02 Beta: Major HighlightsFixed issue where "My Folder" was not available in the URL control and the Telerik HTML Editor Fixed issue where HTML Editor dialogs were not displaying correctly in alternate languages Fixed issue with Regex for email validation Fixed race condition in the core scheduler Fixed issue where editing Host page settings would result in broken host menu Fixed issue where "Apply to All Modules" setting was not propogating settings correctly. Fixed issue where browser lan...DirectQ: Release 1.8.7 (RC1): Release candidate 1 of 1.8.7GoogleTrail: TrailMap Beta 1: Trailmap beta 1 release Now we have updated custom map builder. Now we have complete gpx file editor. Now we have elevation data update service for any gpx file. (currently supports only google only).Chirpy - VS Add In For Handling Js, Css, DotLess, and T4 Files: Margogype Chirpy (ver 2.0): Chirpy loves Americans. Chirpy hates Americanos.ASP.NET: Sprite and Image Optimization Preview 3: The ASP.NET Sprite and Image Optimization framework is designed to decrease the amount of time required to request and display a page from a web server by performing a variety of optimizations on the page’s images. This is the third preview of the feature and works with ASP.NET Web Forms 4, ASP.NET MVC 3, and ASP.NET Web Pages (Razor) projects. The binaries are also available via NuGet: AspNetSprites-Core AspNetSprites-WebFormsControl AspNetSprites-MvcAndRazorHelper It includes the foll...Sandcastle Help File Builder: SHFB v1.9.2.0 Release: This release supports the Sandcastle June 2010 Release (v2.6.10621.1). It includes full support for generating, installing, and removing MS Help Viewer files. This new release is compiled under .NET 4.0, supports Visual Studio 2010 solutions and projects as documentation sources, and adds support for projects targeting the Silverlight Framework. NOTE: The included help file and the online help have not been completely updated to reflect all changes in this release. A refresh will be issue...Network Monitor Open Source Parsers: Microsoft Network Monitor Parsers 3.4.2554: The Network Monitor Parsers packages contain parsers for more than 400 network protocols, including RFC based public protocols and protocols for Microsoft products defined in the Microsoft Open Specifications for Windows and SQL Server. NetworkMonitor_Parsers.msi is the base parser package which defines parsers for commonly used public protocols and protocols for Microsoft Windows. In this release, we have added 4 new protocol parsers and updated 79 existing parsers in the NetworkMonitor_Pa...Image Resizer for Windows: Image Resizer 3 Preview 1: Prepare to have your minds blown. This is the first preview of what will eventually become 39613. There are still a lot of rough edges and plenty of areas still under construction, but for your basic needs, it should be relativly stable. Note: You will need the .NET Framework 4 installed to use this version. Below is a status report of where this release is in terms of the overall goal for version 3. If you're feeling a bit technically ambitious and want to check out some of the features th...JSON Toolkit: JSON Toolkit 1.1: updated GetAllJsonObjects() method and GetAllProperties() methods to JsonObject and Properties propertiesFacebook Graph Toolkit: Facebook Graph Toolkit 1.0: Refer to http://computerbeacon.net for Documentation and Tutorial New features:added FQL support added Expires property to Api object added support for publishing to a user's friend / Facebook Page added support for posting and removing comments on posts added support for adding and removing likes on posts and comments added static methods for Page class added support for Iframe Application Tab of Facebook Page added support for obtaining the user's country, locale and age in If...ASP.NET MVC Project Awesome, jQuery Ajax helpers (controls): 1.7.1: A rich set of helpers (controls) that you can use to build highly responsive and interactive Ajax-enabled Web applications. These helpers include Autocomplete, AjaxDropdown, Lookup, Confirm Dialog, Popup Form, Popup and Pager small improvements for some helpers and AjaxDropdown has Data like the Lookup except it's value gets reset and list refilled if any element from data gets changedManaged Extensibility Framework: MEF 2 Preview 3: This release aims .net 4.0 and Silverlight 4.0. Accordingly, there are two solutions files. The assemblies are named System.ComponentModel.Composition.Codeplex.dll as a way to avoid clashing with the version shipped with the 4th version of the framework. Introduced CompositionOptions to container instantiation CompositionOptions.DisableSilentRejection makes MEF throw an exception on composition errors. Useful for diagnostics Support for open generics Support for attribute-less registr...PHPExcel: PHPExcel 1.7.6 Production: DonationsDonate via PayPal via PayPal. If you want to, we can also add your name / company on our Donation Acknowledgements page. PEAR channelWe now also have a full PEAR channel! Here's how to use it: New installation: pear channel-discover pear.pearplex.net pear install pearplex/PHPExcel Or if you've already installed PHPExcel before: pear upgrade pearplex/PHPExcel The official page can be found at http://pearplex.net. Want to contribute?Please refer the Contribute page.New Projectsasp.net mvc 3 simple cms: asp.net mvc3 cms for learling purpose.C++ Mini Framework: C++ Mini Framework is a simple and easy to use class library in source format to quickly do things you commonly need to do in native projects with the purpose to get you started specifically targeting new C++ developers hopping you will be productive from the very start.Community Megaphone Helpers: Community Megaphone Helpers is a project intended as a means of sharing and accepting contributions for reusable Razor helper modules for functionality used in the Community Megaphone events web application, including Bing maps and more. Supports Microsoft WebMatrix & MVC 3CRM 2011 OData Query Designer: The CRM 2011 OData Query Designer is a Silverlight 4 application that is packaged as a Managed CRM 2011 Solution. The tool allows you to build OData REST queries by selecting filter criteria, select attributes and order by attributes. The tool also allows you to Execute the queryFaceted Search: Implementation of faceted (advanced) search with composite client side UI. Abstraction interfaces for intagration with different server side technologies, implementation for ASP.NET MVC. FileRenamePro: FileRenamePro makes it easier for users to rename files using advanced rules and regular expressions. It's developed in C#.GT5 Mobile: Gran-Turismo Remote Racing mobile site wrapper.KAT: KAT - Knowledge Assessment Tool is a Solution from IndERP in order to automate Performance and Process Management for a Technology/Job Oriented Companies. monopoly game: Monopoly is an open source project for educational purposes. The project will incorporate XNA, Silverlight, WCF technologies. The project will also show good design patterns considerations, and integration into Facebook App. The project will be written in C#.MuDB: MuDB is an embedded object-oriented database for the .NET Micro Framework which provides a simple yet useful interface for managing data.NDataStructure: A library providing a handful of useful data structures omitted from the .NET framework.Set NuGet version number: A simple command line tool that makes it easy to set the version number within a NuGet .nuspec package configuration file. This is useful for when you want to automatically update and publish a NuGet package from your build system.Sightreader: Small application to aid in the wrote learning of basic musical notation.SSAS Operations Templates: SSAS Operations Templates includes SSIS packages, scripts and code samples for automating maintenance of SSAS in a production environment. Includes operations such as backup the current state of cube designs in production, scripting paritition creation, etc.Team Run Log: Team Run LogTEDHelper: Download TED movie's subtitle. ?? TED ?????。testprojectit339: project339Ultimate Resume Repository: A class library and application for storing resumes of multiple people with the ability to export a targeted resume in various, configurable formats. Further additions may include cover letters, browser add-ons to populate applications, job search engine integration, etc...Umbraco: Inspired DataTypes: New datatypes that are not in the default install to make Umbraco have some new controls such as Content/Media Treeview, Content/Media Drop Down List with Treeview. Controls have options to restrict DocumentTypes or MediaType and the start location to retreive fromUsing different schemas in the same Orchestration Receive Port: Using different schemas in the same Orchestration Receive PortWF4Host: Examples in re-hosting Workflow 4 designer.WMP Hotkeys: WMP Hotkeys is a windows media player plugin that enable users to use VLC player like keyboard shortcuts(e.g SPACE to play/pause) in Windows Media Player.

    Read the article

  • CodePlex Daily Summary for Tuesday, December 11, 2012

    CodePlex Daily Summary for Tuesday, December 11, 2012Popular ReleasesDirectX Tool Kit: December 11, 2012: December 11, 2012 Ex versions of DDSTextureLoader and WICTextureLoader Removed use of ATL's CComPtr in favor of WRL's ComPtr for all platforms to support VS Express editions Updated VS 2010 project for official 'property sheet' integration for Windows 8.0 SDK Minor fix to CommonStates for Feature Level 9.1 Tweaked AlphaTestEffect.cpp to work around ARM NEON compiler codegen bug Added dxguid.lib as a default library for Debug builds to resolve GUID link issuesCleverBobCat: CleverBobCat 1.1.3: Fixed: - Cart code now works in vab, wheels retract correctly and no more anchors visible - No more exceptions on VAB Added: - ResourceTransfer beam now has a wide array of configurable options for beam material, size and colors - Added an optional offset to resource transfer, if specificed the "beam" will start from part center + the specified offsetSharpCompress - a fully native C# library for RAR, 7Zip, Zip, Tar, GZip, BZip2: SharpCompress 0.8.2: This release just contains some fixes that have been done since the last release. Plus, this is strong named as well. I apologize for the lack of updates but my free time is less these days.re-linq: 1.13.178.0: This is build 1.13.178.0 of re-linq. Find the complete release notes for the build here: Release NotesMedia Companion: MediaCompanion3.511b release: Two more bug fixes: - General Preferences were not getting restored - Fanart and poster image files were being locked, preventing changes to themVodigi Open Source Interactive Digital Signage: Vodigi Release 5.5: The following enhancements and fixes are included in Vodigi 5.5. Vodigi Administrator - Manage Music Files - Add Music Files to Image Slide Shows - Manage System Messages - Display System Messages to Users During Login - Ported to Visual Studio 2012 and MVC 4 - Added New Vodigi Administrator User Guide Vodigi Player - Improved Login/Schedule Startup Procedure - Startup Using Last Known Schedule when Disconnected on Startup - Improved Check for Schedule Changes - Now Every 15 Minutes - Pla...Secretary Tool: Secretary Tool v1.1.0: I'm still considering this version a beta version because, while it seems to work well for me, I haven't received any feedback and I certainly don't want anyone relying solely on this tool for calculations and such until its correct functioning is verified by someone. This version includes several bug fixes, including a rather major one with Emergency Contact Information not saving. Also, reporting is completed. There may be some tweaking to the reporting engine, but it is good enough to rel...VidCoder: 1.4.10 Beta: Added progress percent to the title bar/task bar icon. Added MPLS information to Blu-ray titles. Fixed the following display issues in Windows 8: Uncentered text in textbox controls Disabled controls not having gray text making them hard to identify as disabled Drop-down menus having hard-to distinguish white on light-blue text Added more logging to proxy disconnect issues and increased timeout on initial call to help prevent timeouts. Fixed encoding window showing the built-in pre...WPF Application Framework (WAF): WPF Application Framework (WAF) 2.5.0.400: Version 2.5.0.400 (Release): This release contains the source code of the WPF Application Framework (WAF) and the sample applications. Requirements .NET Framework 4.0 (The package contains a solution file for Visual Studio 2010) The unit test projects require Visual Studio 2010 Professional Changelog Legend: [B] Breaking change; [O] Marked member as obsolete Update the documentation. InfoMan: Write the documentation. Other Downloads Downloads OverviewHome Access Plus+: v8.5: v8.5.1211.1240 Fixed: changed to using the thumbnailPhoto attribute instead of jpegPhoto v8.5.1208.1500 This is a point release, for the other parts of HAP+ see the v8.3 release. Fixed: #me#me issue with the Edit Profile Link Updated: 8.5.1208 release Updated: Documentation with hidden booking system feature Added: Room Drop Down to the Booking System (no control panel interface), can be Resource Specific Fixed: Recursive AD Group Membership Lookup Fixed: User.IsInRole with recursive lookup...Http Explorer: httpExplorer-1.1: httpExplorer now has the ability to connect to http server via web proxies. The proxy may be explicitly specified by hostname or IP address. Or it may be specified via the Internet Options settings of Windows. You may also specify credentials to pass to the proxy if the proxy requires them. These credentials may be NTLM or basic authentication (clear text username and password).Bee OPOA Platform: Bee OPOA Demo V1.0.001: Initial version.Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.78: Fix for issue #18924 - using -pretty option left in ///#DEBUG blocks. Fix for issue #18980 - bad += optimization caused bug in resulting code. Optimization has been removed pending further review.Dynamics Crm 2011 Solution Manager: Crm Solution Manager 1.0: First Version.Lava JS: Lava JS 1.0: Sourcemp3player-xslt-plugin: mp3player module 1.0: This is a mp3 player module that you can install on your umbraco project. It can display the list of the media and you can click the button to play or stop.STeaL : stealed functionarities from STL: STeaL 0.4.1(prelerease): + fill / fill-nMi-DevEnv: Development 0.5: Ran a new test scenario where I created a DC, then created a web server, blew it away, then re-created the web server. The purpose of the test was to see how the installers would handle information existing in AD, like the OU for CRM. This revealed yet another silly issue in retrieving details of existing OU's, where I had the syntax wrong. I should read exception details more closely :) Apologies also, I've just now started labelling the solution according to these "snapshots". Sorry for...Yahoo! UI Library: YUI Compressor for .Net: Version 2.2.0.0 - Epee: New : Web Optimization package! Cleaned up the nuget packages BugFix: minifying lots of files will now be faster because of a recent regression in some code. (We were instantiating something far too many times).DtPad - .NET Framework text editor: DtPad 2.9.0.40: http://dtpad.diariotraduttore.com/files/images/flag-eng.png English + A new built-in editor for the management of CSV files, including the edit of cells, deleting and adding new rows, replacement of delimiter character and much more (issue #1137) + The limit of rows allowed before the decommissioning of their side panel has been raised (new default: 1.000) (issue #1155, only partially solved) + Pressing CTRL+TAB now DtPad opens a screen that shows the list of opened tabs (issue #1143) + Note...New ProjectsAmazon S3 Upload with Uploadify: ASP.MVC upload file from client browser direct to Amazon S3Ameoto DEV: Bin for all of our software sources :DAutonomic Computing Library: This framework will simplify the development of Autonomic Computing Systems. BIUUITest: BIUUITestBotvaBot: This project should help tothe Botva.Ru gamers to automize their game proces. It is a set of libraries for simple sending of the HTTP requests with list of the main requests to the botva.ru web-site. Also project contains the clients forsimple using this request umulators.Cartellino: Scopo del progetto è la realizzazione di un software in grado di rilevare i dati dai rilevatori 3Tec (www.3tec.it) e stampare i cartellini presenza dei dipendenticmsforbeginner: in this project explain how to edit, create multiple webpages to a single website with less knowledge on html ,.net languagesContact Us Module: This is an Umbraco based module for feedback. You can upload the contact us module in the dashboard and then display those module anywhere in your site.CrmXpress Security Roles Helper For Microsoft Dynamics CRM 2011: CrmXpress Security Roles Helper is a tool/utility. It lets you compare the Security Roles for their differences at the privilege level with their scopes.DarkSky Helpdesk: DarkSky Helpdesk is an Orchard module that turns your Orchard application into a Helpdesk system.DataEntity: A Framework that help us to make persistences and materializations from any SQLServer DataBase Tables to Entity classes. Better performance with native ADO.NET.Edisfera DNN Training Task Module: This is a training project to familiarize with codeplex and dnn module creation.eyoung events description language: Eyoung is a event driven language. It can be used in many fields, such as intrusion action definition.FastBuild: Fast build VS2010 C++ project with the options OpenCV, QT, Boost and OpenGL.Find Changeset By Comment: This extension is used to find changesets that contain a specific phrase in a comment.Gatepass System: This application under development is my first application on codeplex. It is meant primarily as a learning tool while developing a Gatepass System. This GateHandle Template Library (HTL): Handle Template Library (HTL) is a C++ library for developing Windows applications and services. It provides a set of classes for files, threads, events, and more.iDoklad API: iDoklad API Demo je Open Source projekt spolecnosti Cígler Software. Ukázková aplikace demonstruje napojení na fakturacní službu iDokladIronBefunge: IronBefunge is an interpretor (written in .NET) for Befunge programs.Iterator Tasks: Iterator Tasks is a iterator-based coroutine class library.Javacc-Compile-N14: Javacc - compile - N14 is a students' compiler project and we use javacc to compiler XYZ-2 languageMessenger Game - Starter Kit: Kom godt i gang med at lave spil til Messenger med dette komplette Starter Kit. Indeholder et komplet netværksspil lavet med Messenger Activity API og Silverlight.MOBZcript: MOBZcript is a simple batch file editor for cmd scripts. It saves and runs your scripts as if you started them from a command line.NET.Library: The NET.Library will be available in different versions to run on the .Net 2, 3 and 4 frameworks and consists of various .Net programming utilities. proxificator: Proxificator is component for getting the appropriate proxyRetailManagement: abcStudentManage: testWebQQ Interface: A library that enable programmers to access QQ much easier via codeZytonic Screenshot: Upload Images From your Computer Harddrive or Screen With Ease!

    Read the article

  • How do I get confidence intervals without inverting a singular Hessian matrix?

    - by AmalieNot
    Hello. I recently posted this to reddit and it was suggested I come here, so here I am. I'm a student working on an epidemiology model in R, using maximum likelihood methods. I created my negative log likelihood function. It's sort of gross looking, but here it is: NLLdiff = function(v1, CV1, v2, CV2, st1 = (czI01 - czV01), st2 = (czI02 - czV02), st01 = czI01, st02 = czI02, tt1 = czT01, tt2 = czT02) { prob1 = (1 + v1 * CV1 * tt1)^(-1/CV1) prob2 = ( 1 + v2 * CV2 * tt2)^(-1/CV2) -(sum(dbinom(st1, st01, prob1, log = T)) + sum(dbinom(st2, st02, prob2, log = T))) } The reason the first line looks so awful is because most of the data it takes is inputted there. czI01, for example, is already declared. I did this simply so that my later calls to the function don't all have to have awful vectors in them. I then optimized for CV1, CV2, v1 and v2 using mle2 (library bbmle). That's also a bit gross looking, and looks like: ml.cz.diff = mle2 (NLLdiff, start=list(v1 = vguess, CV1 = cguess, v2 = vguess, CV2 = cguess), method="L-BFGS-B", lower = 0.0001) Now, everything works fine up until here. ml.cz.diff gives me values that I can turn into a plot that reasonably fits my data. I also have several different models, and can get AICc values to compare them. However, when I try to get confidence intervals around v1, CV1, v2 and CV2 I have problems. Basically, I get a negative bound on CV1, which is impossible as it actually represents a square number in the biological model as well as some warnings. The warnings are this: http://i.imgur.com/B3H2l.png . Is there a better way to get confidence intervals? Or, really, a way to get confidence intervals that make sense here? What I see happening is that, by coincidence, my hessian matrix is singular for some values in the optimization space. But, since I'm optimizing over 4 variables and don't have overly extensive programming knowledge, I can't come up with a good method of optimization that doesn't rely on the hessian. I have googled the problem - it suggested that my model's bad, but I'm reconstructing some work done before which suggests that my model's really not awful (the plots I make using the ml.cz.diff look like the plots of the original work). I have also read the relevant parts of the manual as well as Bolker's book Ecological Models in R. I have also tried different optimization methods, which resulted in a longer run time but the same errors. The "SANN" method didn't finish running within an hour, so I didn't wait around to see the result. tl;dr : my confidence intervals are bad, is there a relatively straightforward way to fix them in R. My vectors are: czT01 = c(5, 5, 5, 5, 5, 5, 5, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50, 50) czT02 = c(5, 5, 5, 5, 5, 10, 10, 10, 10, 10, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 75, 75, 75, 75, 75) czI01 = c(25, 24, 22, 22, 26, 23, 25, 25, 25, 23, 25, 18, 21, 24, 22, 23, 25, 23, 25, 25, 25) czI02 = c(13, 16, 5, 18, 16, 13, 17, 22, 13, 15, 15, 22, 12, 12, 13, 13, 11, 19, 21, 13, 21, 18, 16, 15, 11) czV01 = c(1, 4, 5, 5, 2, 3, 4, 11, 8, 1, 11, 12, 10, 16, 5, 15, 18, 12, 23, 13, 22) czV02 = c(0, 3, 1, 5, 1, 6, 3, 4, 7, 12, 2, 8, 8, 5, 3, 6, 4, 6, 11, 5, 11, 1, 13, 9, 7) and I get my guesses by: v = -log((c(czI01, czI02) - c(czV01, czV02))/c(czI01, czI02))/c(czT01, czT02) vguess = mean(v) cguess = var(v)/vguess^2 It's also possible that I'm doing something else completely wrong, but my results seem reasonable so I haven't caught it.

    Read the article

  • Silverlight Cream for March 26, 2010 -- #821

    - by Dave Campbell
    In this Issue: Max Paulousky, Christian Schormann, John Papa, Phani Raj, David Anson(-2-, -3-), Brad Abrams(-2-), and Jeff Wilcox(-2-, -3-). Shoutouts: Jeff Wilcox posted his material from mix and some preview TestFramework bits: Unit Testing Silverlight & Windows Phone Applications – talk now online At MIX10, Jeff Wilcox demo'd an app called "Peppermint"... here's the bleeding edge demo: “Peppermint” MIX demo sources Erik Mork and Co. have put out their weekly This Week In Silverlight 3.25.2010 Brad Abrams has all his materials posted for his MIX10 session Mix2010: Search Engine Optimization (SEO) for Microsoft Silverlight... including play-by-play of the demo and all source. Do you use Rooler? Well you should! Watch a video Pete Brown did with Pete Blois on Expression Blend, Windows Phone, Rooler Interested in Silverlight and XNA for WP7? Me too! Michael Klucher has a post outlining the two: Silverlight and XNA Framework Game Development and Compatibility From SilverlightCream.com: Modularity in Silverlight Applications - An Issue With ModuleInitializeException Max Paulousky has a truly ugly error trace listed by way of not having a reference listed, and the obvious simple solution. Next time he'll talk about the difficult situations. Using SketchFlow to Prototype for Windows Phone Christian Schormann has a tutorial up on using Expression Blend to develop for WP7 ... who better than Christian for that task?? Silverlight TV 18: WCF RIA Services Validation John Papa held forth with Nikhil Kothari on WCF RIA Services and validation just prior to MIX10, and was posted yesterday. Building SL3 applications using OData client Library with Vs 2010 RC Phani Raj walks through building an OData consumer in SL3, the first problem you're going to hit, and the easy solution to it. Tip: When creating a DependencyProperty, follow the handy convention of "wrapper+register+static+virtual" David Anson has a couple more of his 'Tips' up... this first is about Dependency Properties again... having a good foundation for all your Dependency Properties is a great way to avoid problems. Tip: Do not assign DependencyProperty values in a constructor; it prevents users from overriding them In the next post, David Anson talks about not assigning Dependency Property values in a constructor and gives one of the two ways to get around doing so. Tip: Set DependencyProperty default values in a class's default style if it's more convenient In his latest post, David Anson gives the second way to avoid setting a Dependency Property value in the constructor. Silverlight 4 + RIA Services - Ready for Business: Search Engine Optimization (SEO) Brad Abrams Abrams adds SEO to the tutorial series he's doing. He begins with his PDC09 session material on the subject and then takes off on a great detailed tutorial all with source. Silverlight 4 + RIA Services - Ready for Business: Localizing Business Application Brad Abrams then discusses localization and Silverlight in another detailed tutorial with all code included. Silverlight Toolkit and the Windows Phone: WrapPanel, and a few others Jeff Wilcox has a few WP7 posts I'm going to push today. This first is from earlier this week and is about using the Toolkit in WP7 and better than that, he includes the bits you need if all you want is the WrapPanel Data binding user settings in Windows Phone applications In the next one from yesterday, Jeff Wilcox demonstrates saving some user info in Isolated Storage to improve the user experience, and shares all the necessary plumbing files, and other external links as well. Displaying 2D QR barcodes in Windows Phone applications In a post from today, Jeff Wilcox ported his Silverlight 2D QR Barcode app from last year into WP7 ... just very cool... get the source and display your Microsoft Tag. Stay in the 'Light! Twitter SilverlightNews | Twitter WynApse | WynApse.com | Tagged Posts | SilverlightCream Join me @ SilverlightCream | Phoenix Silverlight User Group Technorati Tags: Silverlight    Silverlight 3    Silverlight 4    Windows Phone    MIX10

    Read the article

  • Willy Rotstein on Supply Chain Planning

    - by sarah.taylor(at)oracle.com
    Each time a merchandiser, buyer or planner in Retail makes a business decision around assortment, inventory, pricing and promotions there is an opportunity to improve both Profitability and Customer Service. Improving decision making, however, has always been a tricky business for retailers.  I have worked in this space for more than 15 years. I began my career as an academic, at Imperial College London, and then broadened this interest with Retailers, aiming to optimize their merchandising and supply chain decisions. Planning the business and optimizing profit is a complex process. The complexity arises from the variety of people involved, the large number of decisions to take across all business processes, the uncertainty intrinsic to the retail environment as well as the volume of data available for analysis.  Things are not getting any easier either. The advent of multi-channel, social media and mobile is taking these complexities to a new level and presenting additional opportunities for those willing to exploit them. I guess it is due to the complexities of the decision making process that, over the last couple of years working with Oracle Retail, I have witnessed a clear trend around the deployment of planning systems. Retailers are aiming to simplify their decision making processes. They want to use one joined up planning platform across the business and enhance it with "actionable" data mining and optimization techniques. At Oracle Retail, we have a vibrant community of international retailers who regularly come together to discuss the big issues in retail planning. It is a combination of fashion, grocery and speciality retailers, all sharing their best practice vision for planning and optimizing merchandise decisions. As part of the Retail Exchange program, at the recent National Retail Federation event in New York, I jointly hosted a Planning dinner with Peter Fitzgerald from Google UK, Retail Division. Those retailers from our international planning community who were in New York for the annual NRF event were able to attend. The group comprised some of Europe's great International Retail brands.  All sectors were represented by organisations like Mango, LVMH, Ahold, Morrisons, Shop Direct and River Island. They confirmed the current importance of engaging with Planning and Optimization issues. In particular the impact of the internet was a key topic. We had a great debate about new retail initiatives.  Peter highlighted how mobility is changing retail - in particular with the new "local availability search" initiative. We also had an exciting discussion around the opportunities to improve merchandising using the new data that is becoming available from search, social media and ecommerce sites. It will be our focus to continue to help retailers translate this data into better results while keeping their business operations simple. New developments in "actionable" analytics and computing capacity make this a very exciting area today. Watch this space for my contributions on these topics which will be made available through this blog. Oracle Retail has a strong Planning community. if you are a category manager, a planner, a buyer, a merchandiser, a retail supplier or any retail executive with a keen interest in planning then you would be very welcome to join Oracle Retail's Planning Community. As part of our community you will be able to join our in-person and virtual events, download topical white papers and best practice information specifically tailored to your area of interest.  If anyone would like to register their interest in joining our community of retailers discussing planning then please contact me at [email protected]   Willy Rotstein, Oracle Retail

    Read the article

  • Low level programming - what's in it for me?

    - by back2dos
    For years I have considered digging into what I consider "low level" languages. For me this means C and assembly. However I had no time for this yet, nor has it EVER been neccessary. Now because I don't see any neccessity arising, I feel like I should either just schedule some point in time when I will study the subject or drop the plan forever. My Position For the past 4 years I have focused on "web technologies", which may change, and I am an application developer, which is unlikely to change. In application development, I think usability is the most important thing. You write applications to be "consumed" by users. The more usable those applications are, the more value you have produced. In order to achieve good usability, I believe the following things are viable Good design: Well-thought-out features accessible through a well-thought-out user interface. Correctness: The best design isn't worth anything, if not implemented correctly. Flexibility: An application A should constantly evolve, so that its users need not switch to a different application B, that has new features, that A could implement. Applications addressing the same problem should not differ in features but in philosophy. Performance: Performance contributes to a good user experience. An application is ideally always responsive and performs its tasks reasonably fast (based on their frequency). The value of performance optimization beyond the point where it is noticeable by the user is questionable. I think low level programming is not going to help me with that, except for performance. But writing a whole app in a low level language for the sake of performance is premature optimization to me. My Question What could low level programming teach me, what other languages wouldn't teach me? Am I missing something, or is it just a skill, that is of very little use for application development? Please understand, that I am not questioning the value of C and assembly. It's just that in my everyday life, I am quite happy that all the intricacies of that world are abstracted away and managed for me (mostly by layers written in C/C++ and assembly themselves). I just don't see any concepts, that could be new to me, only details I would have to stuff my head with. So what's in it for me? My Conclusion Thanks to everyone for their answers. I must say, nobody really surprised me, but at least now I am quite sure I will drop this area of interest until any need for it arises. To my understanding, writing assembly these days for processors as they are in use in today's CPUs is not only unneccesarily complicated, but risks to result in poorer runtime performance than a C counterpart. Optimizing by hand is nearly impossible due to OOE, while you do not get all kinds of optimizations a compiler can do automatically. Also, the code is either portable, because it uses a small subset of available commands, or it is optimized, but then it probably works on one architecture only. Writing C is not nearly as neccessary anymore, as it was in the past. If I were to write an application in C, I would just as much use tested and established libraries and frameworks, that would spare me implementing string copy routines, sorting algorithms and other kind of stuff serving as exercise at university. My own code would execute faster at the cost of type safety. I am neither keen on reeinventing the wheel in the course of normal app development, nor trying to debug by looking at core dumps :D I am currently experimenting with languages and interpreters, so if there is anything I would like to publish, I suppose I'd port a working concept to C, although C++ might just as well do the trick. Again, thanks to everyone for your answers and your insight.

    Read the article

  • .Net Reflector 6.5 EAP now available

    - by CliveT
    With the release of CLR 4 being so close, we’ve been working hard on getting the new C# and VB language features implemented inside Reflector. The work isn’t complete yet, but we have some of the features working. Most importantly, there are going to be changes to the Reflector object model, and we though it would be useful for people to see the changes and have an opportunity to comment on them. Before going any further, we should tell you what the EAP contains that’s different from the released version. A number of bugs have been fixed, mainly bugs that were raised via the forum. This is slightly offset by the fact that this EAP hasn’t had a whole lot of testing and there may have been new bugs introduced during the development work we’ve been doing. The C# language writer has been changed to display in and out co- and contra-variance markers on interfaces and delegates, and to display default values for optional parameters in method definitions. We also concisely display values passed by reference into COM calls. However, we do not change callsites to display calls using named parameters; this looks like hard work to get right. The forthcoming version of the C# language introduces dynamic types and dynamic calls. The new version of Reflector should display a dynamic call rather than the generated C#: dynamic target = MyTestObject(); target.Hello("Mum"); We have a few bugs in this area where we are not casting to dynamic when necessary. These have been fixed on a branch and should make their way into the next EAP. To support the dynamic features, we’ve added the types IDynamicMethodReferenceExpression, IDynamicPropertyIndexerExpression, and IDynamicPropertyReferenceExpression to the object model. These types, based on the versions without “Dynamic” in the name, reflect the fact that we don’t have full information about the method that is going to be called, but only have its name (as a string). These interfaces are going to change – in an internal version, they have been extended to include information about which parameter positions use runtime types and which use compile time types. There’s also the interface, IDynamicVariableDeclaration, that can be used to determine if a particular variable is used at dynamic call sites as a target. A couple of these language changes have also been added to the Visual Basic language writer. The new features are exposed only when the optimization level is set to .NET 4. When the level is set this high, the other standard language writers will simply display a message to say that they do not handle such an optimization level. Reflector Pro now has 4.0 as an optional compilation target and we have done some work to get the pdb generation right for these new features. The EAP version of Reflector no longer installs the add-in on startup. The first time you run the EAP, it displays the integration options dialog. You can use the checkboxes to select the versions of Visual Studio into which you want to install the EAP version. Note that you can only have one version of Reflector Pro installed in Visual Studio; if you install into a Visual Studio that has another version installed, the previous version will be removed. Please try it out and send your feedback to the EAP forum.

    Read the article

  • Clever memory usage through the years

    - by Ben Emmett
    A friend and I were recently talking about the really clever tricks people have used to get the most out of memory. I thought I’d share my favorites, and would love to hear yours too! Interleaving on drum memory Back in the ye olde days before I’d been born (we’re talking the 50s / 60s here), working memory commonly took the form of rotating magnetic drums. These would spin at a constant speed, and a fixed head would read from memory when the correct part of the drum passed it by, a bit like a primitive platter disk. Because each revolution took a few milliseconds, programmers took to manually arranging information non-sequentially on the drum, timing when an instruction or memory address would need to be accessed, then spacing information accordingly around the edge of the drum, thus reducing the access delay. Similar techniques were still used on hard disks and floppy disks into the 90s, but have become irrelevant with modern disk technologies. The Hashlife algorithm Conway’s Game of Life has attracted numerous implementations over the years, but Bill Gosper’s Hashlife algorithm is particularly impressive. Taking advantage of the repetitive nature of many cellular automata, it uses a quadtree structure to store the hashes of pieces of the overall grid. Over time there are fewer and fewer new structures which need to be evaluated, so it starts to run faster with larger grids, drastically outperforming other algorithms both in terms of speed and the size of grid which can be simulated. The actual amount of memory used is huge, but it’s used in a clever way, so makes the list . Elite’s procedural generation Ok, so this isn’t exactly a memory optimization – more a storage optimization – but it gets an honorable mention anyway. When writing Elite, David Braben and Ian Bell wanted to build a rich world which gamers could explore, but their 22K memory was something of a limitation (for comparison that’s about the size of my avatar picture at the top of this page). They procedurally generated all the characteristics of the 2048 planets in their virtual universe, including the names, which were stitched together using a lookup table of parts of names. In fact the original plans were for 2^52 planets, but it was decided that that was probably too many. Oh, and they did that all in assembly language. Other games of the time used similar techniques too – The Sentinel’s landscape generation algorithm being another example. Modern Garbage Collectors Garbage collection in managed languages like Java and .NET ensures that most of the time, developers stop needing to care about how they use and clean up memory as the garbage collector handles it automatically. Achieving this without killing performance is a near-miraculous feet of software engineering. Much like when learning chemistry, you find that every time you think you understand how the garbage collector works, it turns out to be a mere simplification; that there are yet more complexities and heuristics to help it run efficiently. Of course introducing memory problems is still possible (and there are tools like our memory profiler to help if that happens to you) but they’re much, much rarer. A cautionary note In the examples above, there were good and well understood reasons for the optimizations, but cunningly optimized code has usually had to trade away readability and maintainability to achieve its gains. Trying to optimize memory usage without being pretty confident that there’s actually a problem is doing it wrong. So what have I missed? Tell me about the ingenious (or stupid) tricks you’ve seen people use. Ben

    Read the article

  • Detect Unicode Usage in SQL Column

    One optimization you can make to a SQL table that is overly large is to change from nvarchar (or nchar) to varchar (or char).  Doing so will cut the size used by the data in half, from 2 bytes per character (+ 2 bytes of overhead for varchar) to only 1 byte per character.  However, you will lose the ability to store Unicode characters, such as those used by many non-English alphabets.  If the tables are storing user-input, and your application is or might one day be used internationally, its likely that using Unicode for your characters is a good thing.  However, if instead the data is being generated by your application itself or your development team (such as lookup data), and you can be certain that Unicode character sets are not required, then switching such columns to varchar/char can be an easy improvement to make. Avoid Premature Optimization If you are working with a lookup table that has a small number of rows, and is only ever referenced in the application by its numeric ID column, then you wont see any benefit to using varchar vs. nvarchar.  More generally, for small tables, you wont see any significant benefit.  Thus, if you have a general policy in place to use nvarchar/nchar because it offers more flexibility, do not take this post as a recommendation to go against this policy anywhere you can.  You really only want to act on measurable evidence that suggests that using Unicode is resulting in a problem, and that you wont lose anything by switching to varchar/char. Obviously the main reason to make this change is to reduce the amount of space required by each row.  This in turn affects how many rows SQL Server can page through at a time, and can also impact index size and how much disk I/O is required to respond to queries, etc.  If for example you have a table with 100 million records in it and this table has a column of type nchar(5), this column will use 5 * 2 = 10 bytes per row, and with 100M rows that works out to 10 bytes * 100 million = 1000 MBytes or 1GB.  If it turns out that this column only ever stores ASCII characters, then changing it to char(5) would reduce this to 5*1 = 5 bytes per row, and only 500MB.  Of course, if it turns out that it only ever stores the values true and false then you could go further and replace it with a bit data type which uses only 1 byte per row (100MB  total). Detecting Whether Unicode Is In Use So by now you think that you have a problem and that it might be alleviated by switching some columns from nvarchar/nchar to varchar/char but youre not sure whether youre currently using Unicode in these columns.  By definition, you should only be thinking about this for a column that has a lot of rows in it, since the benefits just arent there for a small table, so you cant just eyeball it and look for any non-ASCII characters.  Instead, you need a query.  Its actually very simple: SELECT DISTINCT(CategoryName)FROM CategoriesWHERE CategoryName <> CONVERT(varchar, CategoryName) Summary Gregg Stark for the tip. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Investigating on xVelocity (VertiPaq) column size

    - by Marco Russo (SQLBI)
      In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)” but using xVelocity and VertiPaq when we talk about Analysis Services has the same meaning. In this post I’ll show how to investigate on columns size of an existing Tabular database so that you can find the most important columns to be optimized. A first approach can be looking in the DataDir of Analysis Services and look for the folder containing the database. Then, look for the biggest files in all subfolders and you will find the name of a file that contains the name of the most expensive column. However, this heuristic process is not very optimized. A better approach is using a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested to and execute it) you will see all database objects sorted by used size in a descending way. SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS ORDER BY used_size DESC You can look at the first rows in order to understand what are the most expensive columns in your tabular model. The interesting data provided are: TABLE_ID: it is the name of the object – it can be also a dictionary or an index COLUMN_ID: it is the column name the object belongs to – you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes RECORDS_COUNT: it is the number of rows in the column USED_SIZE: it is the used memory for the object By looking at the ration between USED_SIZE and RECORDS_COUNT you can understand what you can do in order to optimize your tabular model. Your options are: Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (i.e. the temperature), round the number to the nearest decimal is relevant to you. Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in VertiPaq optimization article. Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a few number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio. After the optimization you should be able to reduce the used size and improve the count/size ration you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.

    Read the article

  • Investigating on xVelocity (VertiPaq) column size

    - by Marco Russo (SQLBI)
      In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)” but using xVelocity and VertiPaq when we talk about Analysis Services has the same meaning. In this post I’ll show how to investigate on columns size of an existing Tabular database so that you can find the most important columns to be optimized. A first approach can be looking in the DataDir of Analysis Services and look for the folder containing the database. Then, look for the biggest files in all subfolders and you will find the name of a file that contains the name of the most expensive column. However, this heuristic process is not very optimized. A better approach is using a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested to and execute it) you will see all database objects sorted by used size in a descending way. SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS ORDER BY used_size DESC You can look at the first rows in order to understand what are the most expensive columns in your tabular model. The interesting data provided are: TABLE_ID: it is the name of the object – it can be also a dictionary or an index COLUMN_ID: it is the column name the object belongs to – you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes RECORDS_COUNT: it is the number of rows in the column USED_SIZE: it is the used memory for the object By looking at the ration between USED_SIZE and RECORDS_COUNT you can understand what you can do in order to optimize your tabular model. Your options are: Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (i.e. the temperature), round the number to the nearest decimal is relevant to you. Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in VertiPaq optimization article. Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a few number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio. After the optimization you should be able to reduce the used size and improve the count/size ration you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.

    Read the article

  • ADO and Two Way Storage Tiering

    - by Andy-Oracle
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 We get asked the following question about Automatic Data Optimization (ADO) storage tiering quite a bit. Can you tier back to the original location if the data gets hot again? The answer is yes but not with standard Automatic Data Optimization policies, at least not reliably. That's not how ADO is meant to operate. ADO is meant to mirror a traditional view of Information Lifecycle Management (ILM) where data will be very volatile when first created, will become less active or cool, and then will eventually cease to be accessed at all (i.e. cold). I think the reason this question gets asked is because customers realize that many of their business processes are cyclical and the thinking goes that those segments that only get used during month end or year-end cycles could sit on lower cost storage when not being used. Unfortunately this doesn't fit very well with the ADO storage tiering model. ADO storage tiering is based on the amount of free and used space in the source tablespace. There are two parameters that control this behavior, TBS_PERCENT_USED and TBS_PERCENT_FREE. When the space in the tablespace exceeds the TBS_PERCENT_USED value then segments specified in storage tiering clause(s) can be moved until the percent of free space reaches the TBS_PERCENT_FREE value. It is worth mentioning that no checks are made for available space in the target tablespace. Now, it is certainly possible to create custom functions to control storage tiering, but this can get complicated. The biggest problem is insuring that there is enough space to move the segment back to tier 1 storage, assuming that that's the goal. This isn't as much of a problem when moving from tier 1 to tier 2 storage because there is typically more tier 2 storage available. At least that's the premise since it is supposed to be less costly, lower performing and higher capacity storage. In either case though, if there isn't enough space then the operation fails. In the case of a customized function, the question becomes do you attempt to free the space so the move can be made or do you just stop and return false so that the move cannot take place? This is really the crux of the issue. Once you cross into this territory you're really going to have to implement two-way hierarchical storage and the whole point of ADO was to provide automatic storage tiering. You're probably better off using heat map and/or business access requirements and building your own hierarchical storage management infrastructure if you really want two way storage tiering. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • Rethinking Oracle Optimizer Statistics for P6 Part 2

    - by Brian Diehl
    In the previous post (Part 1), I tried to draw some key insights about the relationship between P6 and Oracle Optimizer Statistics.  The first is that average cardinality has the greatest impact on query optimization and that the particular queries generated by P6 are more likely to use this average during calculations. The second is that these are statistics that are unlikely to change greatly over the life of the application. Ultimately, our goal is to get the best query optimization possible.  Or is it? Stability No application administrator wants to get the call at 9am that their application users cannot get there work done because everything is running slow. This is a possibility with a regularly scheduled nightly collection of statistics. It may not just be slow performance, but a complete loss of service because one or more queries are optimized poorly. Ideally, this should not be the case. The database optimizer should make better decisions with more up-to-date data. Better statistics may give incremental performance benefit. However, this benefit must be balanced against the potential cost of system down time.  It is stability that we ultimately desire and not absolute optimal performance. We do want the benefit from more accurate statistics and better query plans, but not at the risk of an unusable system. As a result, I've developed the following methodology around managing database statistics for the P6 database.  1. No Automatic Re-Gathering - The daily, weekly, or other interval of statistic gathering is unlikely to be beneficial. Quite the opposite. It is more likely to cause problems. 2. Smart Re-Gathering - The time to collect statistics is when things have changed significantly. For a new installation of P6, this is happening more often because the data is growing from a few rows to thousands and more. But for a mature system, the data is not changing significantly from week-to-week. There are times to collect statistics: New releases of the application Changes in the underlying hardware or software versions (ex. new Oracle RDBMS version) When additional user groups are added. The new groups may use the software in significantly different ways. After significant changes in the data. This may be monthly, quarterly or yearly.  3. Always Test - If you take away one thing from this post, it would be to always have a plan to test after changing statistics. In reality, statistics can be collected as often as you desire provided there are tests in place to verify that performance is the same or better. These might be automated tests or simply a manual script of application functions. 4. Have a Way Out - Never change the statistics without a way to return to the previous set. Think of the statistics as one part of the overall application code that also includes the source code--both application and RDBMS. It would be foolish to change to the new code without a way to get back to the previous version. In the final post, I will talk about the actual script I created for P6 PMDB and possible future direction for managing query performance. 

    Read the article

  • Can you use gzip over SSL? And Connection: Keep-Alive headers

    - by magenta
    I'm evaluating the front end performance of a secure (SSL) web app here at work and I'm wondering if it's possible to compress text files (html/css/javascript) over SSL. I've done some googling around but haven't found anything specifically related to SSL. If it's possible, is it even worth the extra CPU cycles since responses are also being encrypted? Would compressing responses hurt performance? Also, I'm wanting to make sure we're keeping the SSL connection alive so we're not making SSL handshakes over and over. I'm not seeing Connection: Keep-Alive in the response headers. I do see Keep-Alive: 115 in the request headers but that's only keeping the connection alive for 115 milliseconds (seems like the app server is closing the connection after a single request is processed?) Wouldn't you want the server to be setting that response header for as long as the session inactivity timeout is? I understand browsers don't cache SSL content to disk so we're serving the same files over and over and over on subsequent visits even though nothing has changed. The main optimization recommendations are reducing the number of http requests, minification, moving scripts to bottom, image optimization, possible domain sharding (though need to weigh the cost of another SSL handshake), things of that nature.

    Read the article

  • Using Traveling Salesman Solver to Decide Hamiltonian Path

    - by Firas Assaad
    This is for a project where I'm asked to implement a heuristic for the traveling salesman optimization problem and also the Hamiltonian path or cycle decision problem. I don't need help with the implementation itself, but have a question on the direction I'm going in. I already have a TSP heuristic based on a genetic algorithm: it assumes a complete graph, starts with a set of random solutions as a population, and works to improve the population for a number of generations. Can I also use it to solve the Hamiltonian path or cycle problems? Instead of optimizing to get the shortest path, I just want to check if there is a path. Now any complete graph will have a Hamiltonian path in it, so the TSP heuristic would have to be extended to any graph. This could be done by setting the edges to some infinity value if there is no path between two cities, and returning the first path that is a valid Hamiltonian path. Is that the right way to approach it? Or should I use a different heuristic for Hamiltonian path? My main concern is whether it's a viable approach since I can be somewhat sure that TSP optimization works (because you start with solutions and improve them) but not if a Hamiltonian path decider would find any path in a fixed number of generations. I assume the best approach would be to test it myself, but I'm constrained by time and thought I'd ask before going down this route... (I could find a different heuristic for Hamiltonian path instead)

    Read the article

  • Iterating through String word at a time in Python

    - by AlgoMan
    I have a string buffer of a huge text file. I have to search a given words/phrases in the string buffer. Whats the efficient way to do it ? I tried using re module matches. But As i have a huge text corpus that i have to search through. This is taking large amount of time. Given a Dictionary of words and Phrases. I iterate through the each file, read that into string , search all the words and phrases in the dictionary and increment the count in the dictionary if the keys are found. One small optimization that we thought was to sort the dictionary of phrases/words with the max number of words to lowest. And then compare each word start position from the string buffer and compare the list of words. If one phrase is found, we don search for the other phrases (as it matched the longest phrase ,which is what we want) Can some one suggest how to go about word by word in the string buffer. (Iterate string buffer word by word) ? Also, Is there any other optimization that can be done on this ?

    Read the article

  • efficiently determining if a polynomial has a root in the interval [0,T]

    - by user168715
    I have polynomials of nontrivial degree (4+) and need to robustly and efficiently determine whether or not they have a root in the interval [0,T]. The precise location or number of roots don't concern me, I just need to know if there is at least one. Right now I'm using interval arithmetic as a quick check to see if I can prove that no roots can exist. If I can't, I'm using Jenkins-Traub to solve for all of the polynomial roots. This is obviously inefficient since it's checking for all real roots and finding their exact positions, information I don't end up needing. Is there a standard algorithm I should be using? If not, are there any other efficient checks I could do before doing a full Jenkins-Traub solve for all roots? For example, one optimization I could do is to check if my polynomial f(t) has the same sign at 0 and T. If not, there is obviously a root in the interval. If so, I can solve for the roots of f'(t) and evaluate f at all roots of f' in the interval [0,T]. f(t) has no root in that interval if and only if all of these evaluations have the same sign as f(0) and f(T). This reduces the degree of the polynomial I have to root-find by one. Not a huge optimization, but perhaps better than nothing.

    Read the article

  • Moving inserted container element if possible

    - by doublep
    I'm trying to achieve the following optimization in my container library: when inserting an lvalue-referenced element, copy it to internal storage; but when inserting rvalue-referenced element, move it if supported. The optimization is supposed to be useful e.g. if contained element type is something like std::vector, where moving if possible would give substantial speedup. However, so far I was unable to devise any working scheme for this. My container is quite complicated, so I can't just duplicate insert() code several times: it is large. I want to keep all "real" code in some inner helper, say do_insert() (may be templated) and various insert()-like functions would just call that with different arguments. My best bet code for this (a prototype, of course, without doing anything real): #include <iostream> #include <utility> struct element { element () { }; element (element&&) { std::cerr << "moving\n"; } }; struct container { void insert (const element& value) { do_insert (value); } void insert (element&& value) { do_insert (std::move (value)); } private: template <typename Arg> void do_insert (Arg arg) { element x (arg); } }; int main () { { // Shouldn't move. container c; element x; c.insert (x); } { // Should move. container c; c.insert (element ()); } } However, this doesn't work at least with GCC 4.4 and 4.5: it never prints "moving" on stderr. Or is what I want impossible to achieve and that's why emplace()-like functions exist in the first place?

    Read the article

  • VS 2008 C++ build output?

    - by STingRaySC
    Why when I watch the build output from a VC++ project in VS do I see: 1Compiling... 1a.cpp 1b.cpp 1c.cpp 1d.cpp 1e.cpp [etc...] 1Generating code... 1x.cpp 1y.cpp [etc...] The output looks as though several compilation units are being handled before any code is generated. Is this really going on? I'm trying to improve build times, and by using pre-compiled headers, I've gotten great speedups for each ".cpp" file, but there is a relatively long pause during the "Generating Code..." message. I do not have "Whole Program Optimization" nor "Link Time Code Generation" turned on. If this is the case, then why? Why doesn't VC++ compile each ".cpp" individually (which would include the code generation phase)? If this isn't just an illusion of the output, is there cross-compilation-unit optimization potentially going on here? There don't appear to be any compiler options to control that behavior (I know about WPO and LTCG, as mentioned above). EDIT: The build log just shows the ".obj" files in the output directory, one per line. There is no indication of "Compiling..." vs. "Generating code..." steps. EDIT: I have confirmed that this behavior has nothing to do with the "maximum number of parallel project builds" setting in Tools - Options - Projects and Solutions - Build and Run. Nor is it related to the MSBuild project build output verbosity setting. Indeed if I cancel the build before the "Generating code..." step, the ".obj" files will not exist for the most recent set of "compiled" files. E.g., if I cancel the build during "c.cpp" above, I will see only "a.obj" and "b.obj".

    Read the article

  • Thread management advice - Is TPL a good idea?

    - by Ian
    I'm hoping to get some advice on the use of thread managment and hopefully the task parallel library, because I'm not sure I've been going down the correct route. Probably best is that I give an outline of what I'm trying to do. Given a Problem I need to generate a Solution using a heuristic based algorithm. I start of by calculating a base solution, this operation I don't think can be parallelised so we don't need to worry about. Once the inital solution has been generated, I want to trigger n threads, which attempt to find a better solution. These threads need to do a couple of things: They need to be initalized with a different 'optimization metric'. In other words they are attempting to optimize different things, with a precedence level set within code. This means they all run slightly different calculation engines. I'm not sure if I can do this with the TPL.. If one of the threads finds a better solution that the currently best known solution (which needs to be shared across all threads) then it needs to update the best solution, and force a number of other threads to restart (again this depends on precedence levels of the optimization metrics). I may also wish to combine certain calculations across threads (e.g. keep a union of probabilities for a certain approach to the problem). This is probably more optional though. The whole system needs to be thread safe obviously and I want it to be running as fast as possible. I tried quite an implementation that involved managing my own threads and shutting them down etc, but it started getting quite complicated, and I'm now wondering if the TPL might be better. I'm wondering if anyone can offer any general guidance? Thanks...

    Read the article

  • In sync query calls, one query causing other query to run slower. Why?

    - by Irchi
    Sorry for the long question, but I think this is an interesting situation and I couldn't find any explanations for it: I was involved in optimization of an application that performed a large number of sequential SELECT and INSERT statements on a single dedicated SQL Server database. The process needs to INSERT a large number of records into a table, but for each of them there should be some value mappings, which performed using SELECT statements on another table in the same database. For a specific execution, it took 90 minutes to run. I used a profiler (JProfiler - the application is Java-based) to determine how much time does each part of the application take. It yields that 60% of the time was spent on INSERT method calls, and almost 20% on SELECT calls (the rest distributed in other parts). After some trials, I came to this situation: I commented out the INSERT query that took 60% of the time. I was expecting for the total run time to be around 35 minutes, as I have removed 60% of the 90 minutes. But the whole process took the same 90 minutes (doing only SELECTs and nothing else), but each SELECT took longer this time! Everything was running sync, there were no async calls. And there was only one single thread of execution. SELECT and INSERT queries are very simple, and don't have anything special, and they are on different tables, but on the same DB. I tested with both the DB on the application machine, and on a remote network machine. I can't think of any explanation for this, as the Profiler (Application profiler, not SQL Profiler) reported the changes in the method call times, and by removing INSERT statements SELECT statements took longer to run. Can anyone give me some kind of explanation of what could have happened? (there can't be cache / query optimization stuff, because the queries were run in sync, and in a single thread, and it was far from affecting the cache this much) I should note that the bottleneck of the speed was in SQL server, using most of the CPU time.

    Read the article

  • Best CMS to handle short snippets of text?

    - by Federico Poloni
    I have to install a CMS to manage a set of mathematical problems, i.e., our main content will be short (~3 lines) snippets of text. We need the ability to add comments and categories/tags, possibly with a powerful search function combining different constraints on the categories. A crucial ability is the possibility to combine the results of a search in the same page to produce a (printable) problem sheet: not many CMS's seem to be able to do so, and it is difficult for me to test every one for this specific function. Do you guys know of a CMS that is capable to return formatted search result in this fashion? Thanks in advance!

    Read the article

< Previous Page | 108 109 110 111 112 113 114 115 116 117 118 119  | Next Page >