Search Results

Search found 1504 results on 61 pages for 'pros cons'.

Page 60/61 | < Previous Page | 56 57 58 59 60 61 | Next Page >

C#/.NET Little Wonders: Skip() and Take()

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. I’ve covered many valuable methods from System.Linq class library before, so you already know it’s packed with extension-method goodness. Today I’d like to cover two small families I’ve neglected to mention before: Skip() and Take(). While these methods seem so simple, they are an easy way to create sub-sequences for IEnumerable<T>, much the way GetRange() creates sub-lists for List<T>. Skip() and SkipWhile() The Skip() family of methods is used to ignore items in a sequence until either a certain number are passed, or until a certain condition becomes false. This makes the methods great for starting a sequence at a point possibly other than the first item of the original sequence. The Skip() family of methods contains the following methods (shown below in extension method syntax): Skip(int count) Ignores the specified number of items and returns a sequence starting at the item after the last skipped item (if any). SkipWhile(Func<T, bool> predicate) Ignores items as long as the predicate returns true and returns a sequence starting with the first item to invalidate the predicate (if any). SkipWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item. For example: 1: var list = new[] { 3.14, 2.72, 42.0, 9.9, 13.0, 101.0 }; 2: 3: // sequence contains { 2.72, 42.0, 9.9, 13.0, 101.0 } 4: var afterSecond = list.Skip(1); 5: Console.WriteLine(string.Join(", ", afterSecond)); 6: 7: // sequence contains { 42.0, 9.9, 13.0, 101.0 } 8: var afterFirstDoubleDigit = list.SkipWhile(v => v < 10.0); 9: Console.WriteLine(string.Join(", ", afterFirstDoubleDigit)); Note that the SkipWhile() stops skipping at the first item that returns false and returns from there to the rest of the sequence, even if further items in that sequence also would satisfy the predicate (otherwise, you’d probably be using Where() instead, of course). If you do use the form of SkipWhile() which also passes an index into the predicate, then you should keep in mind that this is the index of the item in the sequence you are calling SkipWhile() from, not the index in the original collection. That is, consider the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // Get all items < 10, then 4: var whatAmI = list 5: .Skip(2) 6: .SkipWhile((i, x) => i > x); For this example the result above is 2.4, and not 1.2, 2.2, 2.3, 2.4 as some might expect. The key is knowing what the index is that’s passed to the predicate in SkipWhile(). In the code above, because Skip(2) skips 1.0 and 1.1, the sequence passed to SkipWhile() begins at 1.2 and thus it considers the “index” of 1.2 to be 0 and not 2. This same logic applies when using any of the extension methods that have an overload that allows you to pass an index into the delegate, such as SkipWhile(), TakeWhile(), Select(), Where(), etc. It should also be noted, that it’s fine to Skip() more items than exist in the sequence (an empty sequence is the result), or even to Skip(0) which results in the full sequence. So why would it ever be useful to return Skip(0) deliberately? One reason might be to return a List<T> as an immutable sequence. Consider this class: 1: public class MyClass 2: { 3: private List<int> _myList = new List<int>(); 4: 5: // works on surface, but one can cast back to List<int> and mutate the original... 6: public IEnumerable<int> OneWay 7: { 8: get { return _myList; } 9: } 10: 11: // works, but still has Add() etc which throw at runtime if accidentally called 12: public ReadOnlyCollection<int> AnotherWay 13: { 14: get { return new ReadOnlyCollection<int>(_myList); } 15: } 16: 17: // immutable, can't be cast back to List<int>, doesn't have methods that throw at runtime 18: public IEnumerable<int> YetAnotherWay 19: { 20: get { return _myList.Skip(0); } 21: } 22: } This code snippet shows three (among many) ways to return an internal sequence in varying levels of immutability. Obviously if you just try to return as IEnumerable<T> without doing anything more, there’s always the danger the caller could cast back to List<T> and mutate your internal structure. You could also return a ReadOnlyCollection<T>, but this still has the mutating methods, they just throw at runtime when called instead of giving compiler errors. Finally, you can return the internal list as a sequence using Skip(0) which skips no items and just runs an iterator through the list. The result is an iterator, which cannot be cast back to List<T>. Of course, there’s many ways to do this (including just cloning the list, etc.) but the point is it illustrates a potential use of using an explicit Skip(0). Take() and TakeWhile() The Take() and TakeWhile() methods can be though of as somewhat of the inverse of Skip() and SkipWhile(). That is, while Skip() ignores the first X items and returns the rest, Take() returns a sequence of the first X items and ignores the rest. Since they are somewhat of an inverse of each other, it makes sense that their calling signatures are identical (beyond the method name obviously): Take(int count) Returns a sequence containing up to the specified number of items. Anything after the count is ignored. TakeWhile(Func<T, bool> predicate) Returns a sequence containing items as long as the predicate returns true. Anything from the point the predicate returns false and beyond is ignored. TakeWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item. So, for example, we could do the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // sequence contains 1.0 and 1.1 4: var firstTwo = list.Take(2); 5: 6: // sequence contains 1.0, 1.1, 1.2 7: var underTwo = list.TakeWhile(i => i < 2.0); The same considerations for SkipWhile() with index apply to TakeWhile() with index, of course. Using Skip() and Take() for sub-sequences A few weeks back, I talked about The List<T> Range Methods and showed how they could be used to get a sub-list of a List<T>. This works well if you’re dealing with List<T>, or don’t mind converting to List<T>. But if you have a simple IEnumerable<T> sequence and want to get a sub-sequence, you can also use Skip() and Take() to much the same effect: 1: var list = new List<double> { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // results in List<T> containing { 1.2, 2.2, 2.3 } 4: var subList = list.GetRange(2, 3); 5: 6: // results in sequence containing { 1.2, 2.2, 2.3 } 7: var subSequence = list.Skip(2).Take(3); I say “much the same effect” because there are some differences. First of all GetRange() will throw if the starting index or the count are greater than the number of items in the list, but Skip() and Take() do not. Also GetRange() is a method off of List<T>, thus it can use direct indexing to get to the items much more efficiently, whereas Skip() and Take() operate on sequences and may actually have to walk through the items they skip to create the resulting sequence. So each has their pros and cons. My general rule of thumb is if I’m already working with a List<T> I’ll use GetRange(), but for any plain IEnumerable<T> sequence I’ll tend to prefer Skip() and Take() instead. Summary The Skip() and Take() families of LINQ extension methods are handy for producing sub-sequences from any IEnumerable<T> sequence. Skip() will ignore the specified number of items and return the rest of the sequence, whereas Take() will return the specified number of items and ignore the rest of the sequence. Similarly, the SkipWhile() and TakeWhile() methods can be used to skip or take items, respectively, until a given predicate returns false. Technorati Tags: C#, CSharp, .NET, LINQ, IEnumerable<T>, Skip, Take, SkipWhile, TakeWhile

Read the article
CodePlex Daily Summary for Thursday, May 31, 2012

CodePlex Daily Summary for Thursday, May 31, 2012Popular ReleasesNaked Objects: Naked Objects Release 4.1.0: Corresponds to the packaged version 4.1.0 available via NuGet. Note that the versioning has moved to SemVer (http://semver.org/) This is a bug fix release with no new functionality. Please note that the easiest way to install and run the Naked Objects Framework is via the NuGet package manager: just search the Official NuGet Package Source for 'nakedobjects'. It is only necessary to download the source code (from here) if you wish to modify or re-build the framework yourself. If you do wi...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.54: Fix for issue #18161: pretty-printing CSS @media rule throws an exception due to mismatched Indent/Unindent pair.OMS.Ice - T4 Text Template Generator: OMS.Ice - T4 Text Template Generator v1.4.0.14110: Issue 601 - Template file name cannot contain characters that are not allowed in C#/VB identifiers Issue 625 - Last line will be ignored by the parser Issue 626 - Usage of environment variables and macrosSilverlight Toolkit: Silverlight 5 Toolkit Source - May 2012: Source code for December 2011 Silverlight 5 Toolkit release.totalem: version 2012.05.30.1: Beta version added function for mass renaming files and foldersAudio Pitch & Shift: Audio Pitch and Shift 4.4.0: Tracklist added on main window Improved performances with tracklist Some other fixesJson.NET: Json.NET 4.5 Release 6: New feature - Added IgnoreDataMemberAttribute support New feature - Added GetResolvedPropertyName to DefaultContractResolver New feature - Added CheckAdditionalContent to JsonSerializer Change - Metro build now always uses late bound reflection Change - JsonTextReader no longer returns no content after consecutive underlying content read failures Fix - Fixed bad JSON in an array with error handling creating an infinite loop Fix - Fixed deserializing objects with a non-default cons...Indent Guides for Visual Studio: Indent Guides v12.1: Version History Changed in v12.1: Fixed crash when unable to start asynchronous analysis Fixed upgrade from v11 Changed in v12: background document analysis new options dialog with Quick Set selections for behavior new "glow" style for guides new menu icon in VS 11 preview control now uses editor theming highlighting can be customised on each line fixed issues with collapsed code blocks improved behaviour around left-aligned pragma/preprocessor commands (C#/C++) new setting...DotNetNuke® Community Edition CMS: 06.02.00: Major Highlights Fixed issue in the Site Settings when single quotes were being treated as escape characters Fixed issue loading the Mobile Premium Data after upgrading from CE to PE Fixed errors logged when updating folder provider settings Fixed the order of the mobile device capabilities in the Site Redirection Management UI The User Profile page was completely rebuilt. We needed User Profiles to have multiple child pages. This would allow for the most flexibility by still f...Thales Simulator Library: Version 0.9.6: The Thales Simulator Library is an implementation of a software emulation of the Thales (formerly Zaxus & Racal) Hardware Security Module cryptographic device. This release fixes a problem with the FK command and a bug in the implementation of PIN block 05 format deconstruction. A new 0.9.6.Binaries file has been posted. This includes executable programs without an installer, including the GUI and console simulators, the key manager and the PVV clashing demo. Please note that you will need ...????: ????2.0.1: 1、?????。WiX Toolset: WiX v3.6 RC: WiX v3.6 RC (3.6.2928.0) provides feature complete Burn with VS11 support. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/5/28/WiX-v3.6-Release-Candidate-availableJavascript .NET: Javascript .NET v0.7: SetParameter() reverts to its old behaviour of allowing JavaScript code to add new properties to wrapped C# objects. The behavior added briefly in 0.6 (throws an exception) can be had via the new SetParameterOptions.RejectUnknownProperties. TerminateExecution now uses its isolate to terminate the correct context automatically. Added support for converting all C# integral types, decimal and enums to JavaScript numbers. (Previously only the common types were handled properly.) Bug fixe...callisto: callisto 2.0.29: Added DNS functionality to scripting. See documentation section for details of how to incorporate this into your scripts.Phalanger - The PHP Language Compiler for the .NET Framework: 3.0 (May 2012): Fixes: unserialize() of negative float numbers fix pcre possesive quantifiers and character class containing ()[] array deserilization when the array contains a reference to ISerializable parsing lambda function fix round() reimplemented as it is in PHP to avoid .NET rounding errors filesize bypass for FileInfo.Length bug in Mono New features: Time zones reimplemented, uses Windows/Linux databaseSharePoint Euro 2012 - UEFA European Football Predictor: havivi.euro2012.wsp (1.1): New fetures:Admin enable / disable match Hide/Show Euro 2012 SharePoint lists (3 lists) Installing SharePoint Euro 2012 PredictorSharePoint Euro 2012 Predictor has been developed as a SharePoint Sandbox solution to support SharePoint Online (Office 365) Download the solution havivi.euro2012.wsp from the download page: Downloads Upload this solution to your Site Collection via the solutions area. Click on Activate to make the web parts in the solution available for use in the Site C...ScreenShot: InstallScreenShot: This is the current stable release.????SDK for .Net 4.0+(OAuth2.0+??V2?API): ??V2?SDK???: ?????????API?? ???????OAuth2.0?? ????：????????????，??????????“SOURCE CODE”?????????Changeset，http://weibosdk.codeplex.com/SourceControl/list/changesets ???：????????，DEMO??AppKey????????????????，?????AppKey，????AppKey???????????，?????“????>????>????>??????”.Net Code Samples: Code Samples: Code samples (SLNs).LINQ_Koans: LinqKoans v.02: Cleaned up a bitNew ProjectsAAABBBCCC: hauhuhaAutomação e Robótica: Sistema de automação e robótica de periféricos. todo o projeto consiste em integrar interface de controle pelo software do hardware para realização de comando e acionamentos eletricos e eletromecânicos.Barium Live! API Client: Simple test client for the Barium Live! REST API Contact Barium Live! support on http://www.bariumlive.com/support to get an API key for development. Please note that the test client has a dependency to the Newtonsoft Json.NET library that has to be downloaded separately from http://json.codeplex.com/BizTalk Host Restarter: This is a quick tool that I wrote to more quickly Stop, Start and Restart the BizTalk Host instances. It's command line, with full source code, I use it in my build scripts and deploy scripts, works a treat. MUCH faster than the BizTalk Admin Console can do the same task. CNAF Portal: CNAFPortalConsulta_Productos: Nos permite ingresar un código del 1 al 10..... y al dar clic en buscar nos aparecerá un mensaje con el nombre del producto de dicho código.Deployment Tool: Deployement is a small piece in our IT lifecycle for many small projects. But, considering that you are working for a large project, it has many servers, softwares, configurations, databases and so on. The problem emits, even you have upgrades and version controls. The deploymentool is a solution which can help you resolve the difficulties described above.Devmil MVVM: A toolset for MVVM development with focus on INotifyPropertyChanged and dependenciesEAL: no summaryEasyCLI: EasyCLI is a command shell for the Commodore 64 computer. EasyCLI is packaged as an EasyFlash cartridge.ExcelSharePointListSync: This tool help is Getting Data Form SharePoint to Excel 2010 , Following are the silent feature : 1) Maintain a List connection Offline 2) Choose to get data Form a View of Form Columns 3) Sync the data Back to SharePoint GH: Some summaryicontrolpcv: icontrolpvcLotteryVoteMVC: LotteryVote use mvcMail API for Windows Phone 7: Library that allows developers to add some email functionality to their application, like sending and reading email messages, with the capability of adding attachments to the messages. Matchy - Fashion Website: Fashion website for Match Brand, a fashion brand of BOOM JSC.,MIracLE Guild Admin System: asdOrchard Content Picker: Orchard Content Picker enables content editor to select Content Items.ProSoft Mail Services: This project implements a mail server that allows sending mail with SMTP and access the mailbox via POP. Later on, the server also get a MAPI interface and an interface for SPAM defense.Silver Sugar Web Browser: This is a simple web browser named "Silver Sugar". It has a back button, forward button, home button, url text box and a go button.social sdk for windows phone: Windows Phone Social Share Lib. (Including Weibo, Tencent Weibo & Renren)SolutionX: dfsdf asdfds asdfsdf asdfsdfspUtils: This project aims to facilitate easy to use methods that interact with SharePoint's Client OM/JSOM.Streambolics Library: A library of C# classes providing robust tools for real-time data monitoringTestExpression: TestExpressionTFS Helper package: VS2010 is well integrated to TFS and has all the features required on day to day basis. Most common feature any developer use is managing workitems. Creating work items is very easy as is copying work items, but what I need is to copy workitem between projects retaining links and attachments. I am sure this is not Ideal and against sprint projects.Ubelsoft: Sistema de Registro Eletrônico de Ponto - SREPZombieSim: Simulate the spread of a zombie virusZYWater: ......................

Read the article
CodePlex Daily Summary for Wednesday, May 30, 2012

CodePlex Daily Summary for Wednesday, May 30, 2012Popular ReleasesOMS.Ice - T4 Text Template Generator: OMS.Ice - T4 Text Template Generator v1.4.0.14110: Issue 601 - Template file name cannot contain characters that are not allowed in C#/VB identifiers Issue 625 - Last line will be ignored by the parser Issue 626 - Usage of environment variables and macrosSilverlight Toolkit: Silverlight 5 Toolkit Source - May 2012: Source code for December 2011 Silverlight 5 Toolkit release.totalem: version 2012.05.30.1: Beta version added function for mass renaming files and foldersAudio Pitch & Shift: Audio Pitch and Shift 4.4.0: Tracklist added on main window Improved performances with tracklist Some other fixesJson.NET: Json.NET 4.5 Release 6: New feature - Added IgnoreDataMemberAttribute support New feature - Added GetResolvedPropertyName to DefaultContractResolver New feature - Added CheckAdditionalContent to JsonSerializer Change - Metro build now always uses late bound reflection Change - JsonTextReader no longer returns no content after consecutive underlying content read failures Fix - Fixed bad JSON in an array with error handling creating an infinite loop Fix - Fixed deserializing objects with a non-default cons...DBScripterCmd - A command line tool to script database objects to seperate files: DBScripterCmd Source v1.0.2.zip: Add support for SQL Server 2005Indent Guides for Visual Studio: Indent Guides v12.1: Version History Changed in v12.1: Fixed crash when unable to start asynchronous analysis Fixed upgrade from v11 Changed in v12: background document analysis new options dialog with Quick Set selections for behavior new "glow" style for guides new menu icon in VS 11 preview control now uses editor theming highlighting can be customised on each line fixed issues with collapsed code blocks improved behaviour around left-aligned pragma/preprocessor commands (C#/C++) new setting...DotNetNuke® Community Edition CMS: 06.02.00: Major Highlights Fixed issue in the Site Settings when single quotes were being treated as escape characters Fixed issue loading the Mobile Premium Data after upgrading from CE to PE Fixed errors logged when updating folder provider settings Fixed the order of the mobile device capabilities in the Site Redirection Management UI The User Profile page was completely rebuilt. We needed User Profiles to have multiple child pages. This would allow for the most flexibility by still f...StarTrinity Face Recognition Library: Version 1.2: Much better accuracy????: ????2.0.1: 1、?????。WiX Toolset: WiX v3.6 RC: WiX v3.6 RC (3.6.2928.0) provides feature complete Burn with VS11 support. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/5/28/WiX-v3.6-Release-Candidate-availableJavascript .NET: Javascript .NET v0.7: SetParameter() reverts to its old behaviour of allowing JavaScript code to add new properties to wrapped C# objects. The behavior added briefly in 0.6 (throws an exception) can be had via the new SetParameterOptions.RejectUnknownProperties. TerminateExecution now uses its isolate to terminate the correct context automatically. Added support for converting all C# integral types, decimal and enums to JavaScript numbers. (Previously only the common types were handled properly.) Bug fixe...callisto: callisto 2.0.29: Added DNS functionality to scripting. See documentation section for details of how to incorporate this into your scripts.Phalanger - The PHP Language Compiler for the .NET Framework: 3.0 (May 2012): Fixes: unserialize() of negative float numbers fix pcre possesive quantifiers and character class containing ()[] array deserilization when the array contains a reference to ISerializable parsing lambda function fix round() reimplemented as it is in PHP to avoid .NET rounding errors filesize bypass for FileInfo.Length bug in Mono New features: Time zones reimplemented, uses Windows/Linux databaseSharePoint Euro 2012 - UEFA European Football Predictor: havivi.euro2012.wsp (1.1): New fetures:Admin enable / disable match Hide/Show Euro 2012 SharePoint lists (3 lists) Installing SharePoint Euro 2012 PredictorSharePoint Euro 2012 Predictor has been developed as a SharePoint Sandbox solution to support SharePoint Online (Office 365) Download the solution havivi.euro2012.wsp from the download page: Downloads Upload this solution to your Site Collection via the solutions area. Click on Activate to make the web parts in the solution available for use in the Site C...????SDK for .Net 4.0+(OAuth2.0+??V2?API): ??V2?SDK???: ????SDK for .Net 4.X???????PHP?SDK???OAuth??API???Client???。 ??????API?? ???????OAuth2.0???? ???：????????，DEMO??AppKey????????????????，?????AppKey，????AppKey???????????，?????“????>????>????>??????”.Net Code Samples: Code Samples: Code samples (SLNs).LINQ_Koans: LinqKoans v.02: Cleaned up a bitCommonLibrary.NET: CommonLibrary.NET 0.9.8 - Final Release: A collection of very reusable code and components in C# 4.0 ranging from ActiveRecord, Csv, Command Line Parsing, Configuration, Holiday Calendars, Logging, Authentication, and much more. FluentscriptCommonLibrary.NET 0.9.8 contains a scripting language called FluentScript. Application: FluentScript Version: 0.9.8 Build: 0.9.8.4 Changeset: 75050 ( CommonLibrary.NET ) Release date: May 24, 2012 Binaries: CommonLibrary.dll Namespace: ComLib.Lang Project site: http://fluentscript.codeplex.com...JayData - The cross-platform HTML5 data-management library for JavaScript: JayData 1.0 RC1 Refresh 2: JayData is a unified data access library for JavaScript developers to query and update data from different sources like webSQL, indexedDB, OData, Facebook or YQL. See it in action in this 6 minutes video: http://www.youtube.com/watch?v=LlJHgj1y0CU RC1 R2 Release highlights Knockout.js integrationUsing the Knockout.js module, your UI can be automatically refreshed when the data model changes, so you can develop the front-end of your data manager app even faster. Querying 1:N relations in W...New Projects5Widgets: 5Widgets is a framework for building HTML5 canvas interfaces. Written in JavaScript, 5Widgets consists of a library of widgets and a controller that implements the MVC pattern. Though the HTML5 standard is gaining popularity, there is no framework like this at the moment. Yet, as a professional developer, I know many, including myself, would really find such a library useful. I have uploaded my initial code, which can definitely be improved since I have not had the time to work on it fu...Azure Trace Listener: Simple Trace Listener outputting trace data directly to Windows Azure Queue or Table Storage. Unlike the Windows Azure Diagnostics Listener (WAD), logging happens immediately and does not rely on (scheduled or manually triggered) Log Transfer mechanism. A simple Reader application shows how to read trace entries and can be used as is or as base for more advanced scenarios.CodeSample2012: Code Sample is a windows tool for saving pieces of codeEncryption: The goal of the Encryption project is to provide solid, high quality functionality that aims at taking the complexity out of using the System.Security.Cryptography namespace. The first pass of this library provides a very strong password encryption system. It uses variable length random salt bytes with secure SHA512 cryptographic hashing functions to allow you to provide a high level of security to the users. Entity Framework Code-First Automatic Database Migration: The Entity Framework Code-First Automatic Database Migration tool was designed to help developers easily update their database schema while preserving their data when they change their POCO objects. This is not meant to take the place of Code-First Migrations. This project is simply designed to ease the development burden of database changes. It grew out of the desire to not have to delete, recreated, and seed the database every time I made an object model change. Function Point Plugin: Function Point Tracability Mapper VSIX for Visual Studio 2010/TFS 2010+FunkOS: dcpu16 operating systemGit for WebMatrix: This is a WebMatrix Extension that allows users to access Git Source Control functions.Groupon Houses: the groupon site for housesLiquifier - Complete serialisation/deserialisation for complex object graphs: Liquifier is a serialisation/deserialisation library for preserving object graphs with references intact. Liquifier uses attributes and interfaces to allow the user to control how a type is serialised - the aim is to free the user from having to write code to serialise and deserialise objects, especially large or complex graphs in which this is a significant undertaking.MTACompCommEx032012: lak lak lakMVC Essentials: MVC Essentials is aimed to have all my learning in the projects that I have worked.MyWireProject: This project manages wireless networks.Peulot Heshbon - Hebrew: This program is for teaching young students math, until 6th grade. The program gives questions for the user. The user needs to answer the question. After 10 questions the user gets his mark. The marks are saved and can be viewed from every run. PlusOne: A .NET Extension and Utility Library: PlusOne is a library of extension and utility methods for C#.Project Support: This project is a simple project management solution. This will allow you to manage your clients, track bug reports, request additional features to projects that you are currently working on and much more. A client will be allowed to have multiple users, so that you can track who has made reports etc and provide them feedback. The solution is set-up so that if you require you can modify the styling to fit your companies needs with ease, you can even have multiple styles that can be set ...SharePoint 2010 Slide Menu Control: Navigation control for building SharePoint slide menuSIGO: 1 person following this project (follow) Projeto SiGO O Projeto SiGO (Sistema de Gerenciamento Odontologico) tem um Escorpo Complexo com varios programas e rotinas, compondo o modulo de SAC e CRM, Finanças, Estoque e outro itens. Coordenador Heitor F Neto : O Projeto SiGo desenvolvido aqui no CodePlex e open source, sera apenas um Prototipo, assim que desenvolmemos os modulos basicos iremos migrar para um servidor Pago e com segurança dos Codigo Fonte e Banco de Dados. Pessoa...STEM123: Windows Phone 7 application to help people find and create STEM Topic details.TIL: Text Injection and Templating Library: An advanced alternative to string.Format() that encapsulates templates in objects, uses named tokens, and lets you define extra serializing/processing steps before injecting input into the template (think, join an array, serialize an object, etc).UAH Exchange: Ukrainian hrivna currency exchangeuberBook: uberBook ist eine Kontakt-Verwaltung für´s Tray. Das Programm syncronisiert sämtliche Kontakte mit dem Internet und sucht automatisch nach Social-Network Profilen Ihrer KontakteWPF Animated GIF: A simple library to display animated GIF images in WPF, usable in XAML or in code.

Read the article
Parallel Classloading Revisited: Fully Concurrent Loading

- by davidholmes

Java 7 introduced support for parallel classloading. A description of that project and its goals can be found here: http://openjdk.java.net/groups/core-libs/ClassLoaderProposal.html The solution for parallel classloading was to add to each class loader a ConcurrentHashMap, referenced through a new field, parallelLockMap. This contains a mapping from class names to Objects to use as a classloading lock for that class name. This was then used in the following way: protected Class loadClass(String name, boolean resolve) throws ClassNotFoundException { synchronized (getClassLoadingLock(name)) { // First, check if the class has already been loaded Class c = findLoadedClass(name); if (c == null) { long t0 = System.nanoTime(); try { if (parent != null) { c = parent.loadClass(name, false); } else { c = findBootstrapClassOrNull(name); } } catch (ClassNotFoundException e) { // ClassNotFoundException thrown if class not found // from the non-null parent class loader } if (c == null) { // If still not found, then invoke findClass in order // to find the class. long t1 = System.nanoTime(); c = findClass(name); // this is the defining class loader; record the stats sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0); sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1); sun.misc.PerfCounter.getFindClasses().increment(); } } if (resolve) { resolveClass(c); } return c; } } Where getClassLoadingLock simply does: protected Object getClassLoadingLock(String className) { Object lock = this; if (parallelLockMap != null) { Object newLock = new Object(); lock = parallelLockMap.putIfAbsent(className, newLock); if (lock == null) { lock = newLock; } } return lock; } This approach is very inefficient in terms of the space used per map and the number of maps. First, there is a map per-classloader. As per the code above under normal delegation the current classloader creates and acquires a lock for the given class, checks if it is already loaded, then asks its parent to load it; the parent in turn creates another lock in its own map, checks if the class is already loaded and then delegates to its parent and so on till the boot loader is invoked for which there is no map and no lock. So even in the simplest of applications, you will have two maps (in the system and extensions loaders) for every class that has to be loaded transitively from the application's main class. If you knew before hand which loader would actually load the class the locking would only need to be performed in that loader. As it stands the locking is completely unnecessary for all classes loaded by the boot loader. Secondly, once loading has completed and findClass will return the class, the lock and the map entry is completely unnecessary. But as it stands, the lock objects and their associated entries are never removed from the map. It is worth understanding exactly what the locking is intended to achieve, as this will help us understand potential remedies to the above inefficiencies. Given this is the support for parallel classloading, the class loader itself is unlikely to need to guard against concurrent load attempts - and if that were not the case it is likely that the classloader would need a different means to protect itself rather than a lock per class. Ultimately when a class file is located and the class has to be loaded, defineClass is called which calls into the VM - the VM does not require any locking at the Java level and uses its own mutexes for guarding its internal data structures (such as the system dictionary). The classloader locking is primarily needed to address the following situation: if two threads attempt to load the same class, one will initiate the request through the appropriate loader and eventually cause defineClass to be invoked. Meanwhile the second attempt will block trying to acquire the lock. Once the class is loaded the first thread will release the lock, allowing the second to acquire it. The second thread then sees that the class has now been loaded and will return that class. Neither thread can tell which did the loading and they both continue successfully. Consider if no lock was acquired in the classloader. Both threads will eventually locate the file for the class, read in the bytecodes and call defineClass to actually load the class. In this case the first to call defineClass will succeed, while the second will encounter an exception due to an attempted redefinition of an existing class. It is solely for this error condition that the lock has to be used. (Note that parallel capable classloaders should not need to be doing old deadlock-avoidance tricks like doing a wait() on the lock object\!). There are a number of obvious things we can try to solve this problem and they basically take three forms: Remove the need for locking. This might be achieved by having a new version of defineClass which acts like defineClassIfNotPresent - simply returning an existing Class rather than triggering an exception. Increase the coarseness of locking to reduce the number of lock objects and/or maps. For example, using a single shared lockMap instead of a per-loader lockMap. Reduce the lifetime of lock objects so that entries are removed from the map when no longer needed (eg remove after loading, use weak references to the lock objects and cleanup the map periodically). There are pros and cons to each of these approaches. Unfortunately a significant "con" is that the API introduced in Java 7 to support parallel classloading has essentially mandated that these locks do in fact exist, and they are accessible to the application code (indirectly through the classloader if it exposes them - which a custom loader might do - and regardless they are accessible to custom classloaders). So while we can reason that we could do parallel classloading with no locking, we can not implement this without breaking the specification for parallel classloading that was put in place for Java 7. Similarly we might reason that we can remove a mapping (and the lock object) because the class is already loaded, but this would again violate the specification because it can be reasoned that the following assertion should hold true: Object lock1 = loader.getClassLoadingLock(name); loader.loadClass(name); Object lock2 = loader.getClassLoadingLock(name); assert lock1 == lock2; Without modifying the specification, or at least doing some creative wordsmithing on it, options 1 and 3 are precluded. Even then there are caveats, for example if findLoadedClass is not atomic with respect to defineClass, then you can have concurrent calls to findLoadedClass from different threads and that could be expensive (this is also an argument against moving findLoadedClass outside the locked region - it may speed up the common case where the class is already loaded, but the cost of re-executing after acquiring the lock could be prohibitive. Even option 2 might need some wordsmithing on the specification because the specification for getClassLoadingLock states "returns a dedicated object associated with the specified class name". The question is, what does "dedicated" mean here? Does it mean unique in the sense that the returned object is only associated with the given class in the current loader? Or can the object actually guard loading of multiple classes, possibly across different class loaders? So it seems that changing the specification will be inevitable if we wish to do something here. In which case lets go for something that more cleanly defines what we want to be doing: fully concurrent class-loading. Note: defineClassIfNotPresent is already implemented in the VM as find_or_define_class. It is only used if the AllowParallelDefineClass flag is set. This gives us an easy hook into existing VM mechanics. Proposal: Fully Concurrent ClassLoaders The proposal is that we expand on the notion of a parallel capable class loader and define a "fully concurrent parallel capable class loader" or fully concurrent loader, for short. A fully concurrent loader uses no synchronization in loadClass and the VM uses the "parallel define class" mechanism. For a fully concurrent loader getClassLoadingLock() can return null (or perhaps not - it doesn't matter as we won't use the result anyway). At present we have not made any changes to this method. All the parallel capable JDK classloaders become fully concurrent loaders. This doesn't require any code re-design as none of the mechanisms implemented rely on the per-name locking provided by the parallelLockMap. This seems to give us a path to remove all locking at the Java level during classloading, while retaining full compatibility with Java 7 parallel capable loaders. Fully concurrent loaders will still encounter the performance penalty associated with concurrent attempts to find and prepare a class's bytecode for definition by the VM. What this penalty is depends on the number of concurrent load attempts possible (a function of the number of threads and the application logic, and dependent on the number of processors), and the costs associated with finding and preparing the bytecodes. This obviously has to be measured across a range of applications. Preliminary webrevs: http://cr.openjdk.java.net/~dholmes/concurrent-loaders/webrev.hotspot/ http://cr.openjdk.java.net/~dholmes/concurrent-loaders/webrev.jdk/ Please direct all comments to the mailing list [email protected].

Read the article
Masters vs. PhD - long [closed]

- by Sterling

I'm 21 years old and a first year master's computer science student. Whether or not to continue with my PhD has been plaguing me for the past few months. I can't stop thinking about it and am extremely torn on the issue. I have read http://www.cs.unc.edu/~azuma/hitch4.html and many, many other masters vs phd articles on the web. Unfortunately, I have not yet come to a conclusion. I was hoping that I could post my ideas about the issue on here in hopes to 1) get some extra insight on the issue and 2) make sure that I am correct in my assumptions. Hopefully having people who have experience in the respective fields can tell me if I am wrong so I don't make my decision based on false ideas. Okay, to get this topic out of the way - money. Money isn't the most important thing to me, but it is still important. It's always been a goal of mine to make 6 figures, but I realize that will probably take me a long time with either path. According to most online salary calculating sites, the average starting salary for a software engineer is ~60-70k. The PhD program here is 5 years, so that's about 300k I am missing out on by not going into the workforce with a masters. I have only ever had ~1k at one time in my life so 300k is something I can't even really accurately imagine. I know that I wouldn't have at once obviously, but just to know I would be earning that is kinda crazy to me. I feel like I would be living quite comfortably by the time I'm 30 years old (but risk being too content too soon). I would definitely love to have at least a few years of my 20s to spend with that kind of money before I have a family to spend it all on. I haven't grown up very financially stable so it would be so nice to just spend some money…get a nice car, buy a new guitar or two, eat some good food, and just be financially comfortable. I have always felt like I deserved to make good money in my life, even as a kid growing up, and I just want to have it be a reality. I know that either path I take will make good money by the time I'm ~40-45 years old, but I guess I'm just sick of not making money and am getting impatient about it. However, a big idea pushing me towards a PhD is that I feel the masters path would give me a feeling of selling out if I have the capability to solve real questions in the computer science world. (pretty straight-forward - not much to elaborate on, but this is a big deal) Now onto other aspects of the decision. I originally got into computer science because of programming. I started in high school and knew very soon that it was what I wanted to do for a career. I feel like getting a masters and being a software engineer in the industry gives me much more time to program in my career. In research, I feel like I would spend more time reading, writing, trying to get grant money, etc than I would coding. A guy I work with in the lab just recently published a paper. He showed it to me and I was shocked by it. The first two pages was littered with equations and formulas. Then the next page or so was followed by more equations and formulas that he derived from the previous ones. That was his work - breaking down and creating all of these formulas for robotic arm movement. And whenever I read computer science papers, they all seem to follow this pattern. I always pictured myself coding all day long…not proving equations and things of that nature. I know that's only one part of computer science research, but that part bores me. A couple cons on each side - Phd - I don't really enjoy writing or feel like I'm that great at technical writing. Whenever I'm in groups to make something, I'm always the one who does the large majority of the work and then give it to my team members to write up a report. Presenting is different though - I don't mind presenting at all as long as I have a good grasp on what I am presenting. But writing papers seems like such a chore to me. And because of this, the "publish or perish" phrase really turns me off from research. Another bad thing - I feel like if I am doing research, most of it would be done alone. I work best in small groups. I like to have at least one person to bounce ideas off of when I am brainstorming. The idea of being a part of some small elite group to build things sounds ideal to me. So being able to work in small groups for the majority of my career is a definite plus. I don't feel like I can get this doing research. Masters - I read a lot online that most people come in as engineers and eventually move into management positions. As of now, I don't see myself wanting to be a part of management. Lets say my company wanted to make some new product or system - I would get much more pride, enjoyment, and overall satisfaction to say "I made this" rather than "I managed a group of people that made this." I want to be a big part of the development process. I want to make things. I think it would be great to be more specialized than other people. I would rather know everything about something than something about everything. I always have been that way - was a great pitcher during my baseball years, but not so good at everything else, great at certain classes in school, but not so good at others, etc. To think that my career would be the same way sounds okay to me. Getting a PhD would point me in this direction. It would be great to be some guy who is someone that people look towards and come to ask for help because of being such an important contributor to a very specific field, such as artificial neural networks or robotic haptic perception. From what I gather about the software industry, being specialized can be a very bad thing because of the speed of the new technology. I When it comes to being employed, I have pretty conservative views. I don't want to change companies every 5 years. Maybe this is something everyone wishes, but I would love to just be an important person in one company for 10+ (maybe 20-25+ if I'm lucky!) years if the working conditions were acceptable. I feel like that is more possible as a PhD though, being a professor or researcher. The more I read about people in the software industry, the more it seems like most software engineers bounce from company to company at rapid paces. Some even work like a hired gun from project to project which is NOT what I want AT ALL. But finding a place to make great and important software would be great if that actually happens in the real world. I'm a very competitive person. I thrive on competition. I don't really know why, but I have always been that way even as a kid growing up. Competition always gave me a reason to practice that little extra every night, always push my limits, etc. It seems to me like there is no competition in the research world. It seems like everyone is very relaxed as long as research is being conducted. The only competition is if someone is researching the same thing as you and its whoever can finish and publish first (but everyone seems to careful to check that circumstance). The only noticeable competition to me is just with yourself and your own discipline. I like the idea that in the industry, there is real competition between companies to put out the best product or be put out of business. I feel like this would constantly be pushing me to be better at what I do. One thing that is really pushing me towards a PhD is the lifetime of the things you make. I feel like if you make something truly innovative in the industry…just some really great new application or system…there is a shelf-life of about 5-10 years before someone just does it faster and more efficiently. But with research work, you could create an idea or algorithm that last decades. For instance, the A* search algorithm was described in 1968 and is still widely used today. That is amazing to me. In the words of Palahniuk, "The goal isn't to live forever, its to create something that will." Over anything, I just want to do something that matters. I want my work to help and progress society. Seriously, if I'm stuck programming GUIs for the next 40 years…I might shoot myself in the face. But then again, I hate the idea that less than 1% of the population will come into contact with my work and even less understand its importance. So if anything I have said is false then please inform me. If you think I come off as a masters or PhD, inform me. If you want to give me some extra insight or add on to any point I made, please do. Thank you so much to anyone for any help.

Read the article
Explicit method tables in C# instead of OO - good? bad?

- by FunctorSalad

Hi! I hope the title doesn't sound too subjective; I absolutely do not mean to start a debate on OO in general. I'd merely like to discuss the basic pros and cons for different ways of solving the following sort of problem. Let's take this minimal example: you want to express an abstract datatype T with functions that may take T as input, output, or both: f1 : Takes a T, returns an int f2 : Takes a string, returns a T f3 : Takes a T and a double, returns another T I'd like to avoid downcasting and any other dynamic typing. I'd also like to avoid mutation whenever possible. 1: Abstract-class-based attempt abstract class T { abstract int f1(); // We can't have abstract constructors, so the best we can do, as I see it, is: abstract void f2(string s); // The convention would be that you'd replace calls to the original f2 by invocation of the nullary constructor of the implementing type, followed by invocation of f2. f2 would need to have side-effects to be of any use. // f3 is a problem too: abstract T f3(double d); // This doesn't express that the return value is of the *same* type as the object whose method is invoked; it just expresses that the return value is *some* T. } 2: Parametric polymorphism and an auxilliary class (all implementing classes of TImpl will be singleton classes): abstract class TImpl<T> { abstract int f1(T t); abstract T f2(string s); abstract T f3(T t, double d); } We no longer express that some concrete type actually implements our original spec -- an implementation is simply a type Foo for which we happen to have an instance of TImpl. This doesn't seem to be a problem: If you want a function that works on arbitrary implementations, you just do something like: // Say we want to return a Bar given an arbitrary implementation of our abstract type Bar bar<T>(TImpl<T> ti, T t); At this point, one might as well skip inheritance and singletons altogether and use a 3 First-class function table class /* or struct, even */ TDictT<T> { readonly Func<T,int> f1; readonly Func<string,T> f2; readonly Func<T,double,T> f3; TDict( ... ) { this.f1 = f1; this.f2 = f2; this.f3 = f3; } } Bar bar<T>(TDict<T> td; T t); Though I don't see much practical difference between #2 and #3. Example Implementation class MyT { /* raw data structure goes here; this class needn't have any methods */ } // It doesn't matter where we put the following; could be a static method of MyT, or some static class collecting dictionaries static readonly TDict<MyT> MyTDict = new TDict<MyT>( (t) => /* body of f1 goes here */ , // f2 (s) => /* body of f2 goes here */, // f3 (t,d) => /* body of f3 goes here */ ); Thoughts? #3 is unidiomatic, but it seems rather safe and clean. One question is whether there are any performance concerns with it. I don't usually need dynamic dispatch, and I'd prefer if these function bodies get statically inlined in places where the concrete implementing type is known statically. Is #2 better in that regard?

Read the article
what persistence layer (xml or mysql) should i use for this xml data?

- by fayer

i wonder how i could store a xml structure in a persistence layer. cause the relational data looks like: <entity id="1000070"> <name>apple</name> <entities> <entity id="7002870"> <name>mac</name> <entities> <entity id="7002907"> <name>leopard</name> <entities> <entity id="7024080"> <name>safari</name> </entity> <entity id="7024701"> <name>finder</name> </entity> </entities> </entity> </entities> </entity> <entity id="7024080"> <name>iphone</name> <entities> <entity id="7024080"> <name>3g</name> </entity> <entity id="7024701"> <name>3gs</name> </entity> </entities> </entity> <entity id="7024080"> <name>ipad</name> </entity> </entities> </entity> as you can see, it has no static structure but a dynamical one. mac got 2 descendant levels while iphone got 1 and ipad got 0. i wonder how i could store this data the best way? what are my options. cause it seems impossible to store it in a mysql database due to this dynamical structure. is the only way to store it as a xml file then? is the speed of getting information (xpath/xquery/simplexml) from a xml file worse or greater than from mysql? what are the pros and cons? do i have other options? is storing information in xml files, suited for a lot of users accessing it at the same time? would be great with feedbacks!! thanks! EDIT: now i noticed that i could use something called xml database to store xml data. could someone shed a light on this issue? cause apparently its not as simple as just store data in a xml file?

Read the article
database design help for game / user levels / progress

- by sprugman

Sorry this got long and all prose-y. I'm creating my first truly gamified web app and could use some help thinking about how to structure the data. The Set-up Users need to accomplish tasks in each of several categories before they can move up a level. I've got my Users, Tasks, and Categories tables, and a UserTasks table which joins the three. ("User 3 has added Task 42 in Category 8. Now they've completed it.") That's all fine and working wonderfully. The Challenge I'm not sure of the best way to track the progress in the individual categories toward each level. The "business" rules are: You have to achieve a certain number of points in each category to move up. If you get the number of points needed in Cat 8, but still have other work to do to complete the level, any new Cat 8 points count toward your overall score, but don't "roll over" into the next level. The number of Categories is small (five currently) and unlikely to change often, but by no means absolutely fixed. The number of points needed to level-up will vary per level, probably by a formula, or perhaps a lookup table. So the challenge is to track each user's progress toward the next level in each category. I've thought of a few potential approaches: Possible Solutions Add a column to the users table for each category and reset them all to zero each time a user levels-up. Have a separate UserProgress table with a row for each category for each user and the number of points they have. (Basically a Many-to-Many version of #1.) Add a userLevel column to the UserTasks table and use that to derive their progress with some kind of SUM statement. Their current level will be a simple int in the User table. Pros & Cons (1) seems like by far the most straightforward, but it's also the least flexible. Perhaps I could use a naming convention based on the category ids to help overcome some of that. (With code like "select cats; for each cat, get the value from Users.progress_{cat.id}.") It's also the one where I lose the most data -- I won't know which points counted toward leveling up. I don't have a need in mind for that, so maybe I don't care about that. (2) seems complicated: every time I add or subtract a user or a category, I have to maintain the other table. I foresee synchronization challenges. (3) Is somewhere in between -- cleaner than #2, but less intuitive than #1. In order to find out where a user is, I'd have mildly complex SQL like: SELECT categoryId, SUM(points) from UserTasks WHERE userId={user.id} & countsTowardLevel={user.level} groupBy categoryId Hmm... that doesn't seem so bad. I think I'm talking myself into #3 here, but would love any input, advice or other ideas.

Read the article
CLSF & CLK 2013 Trip Report by Jeff Liu

- by jamesmorris

This is a contributed post from Jeff Liu, lead XFS developer for the Oracle mainline Linux kernel team. Recently, I attended both the China Linux Storage and Filesystem workshop (CLSF), and the China Linux Kernel conference (CLK), which were held in Shanghai. Here are the highlights for both events. CLSF - 17th October XFS update (led by Jeff Liu) XFS keeps rapid progress with a lot of changes, especially focused on the infrastructure/performance improvements as well as new feature development. This can be reflected with a sample statistics among XFS/Ext4+JBD2/Btrfs via: # git diff --stat --minimal -C -M v3.7..v3.12-rc4 -- fs/xfs|fs/ext4+fs/jbd2|fs/btrfs XFS: 141 files changed, 27598 insertions(+), 19113 deletions(-) Ext4+JBD2: 39 files changed, 10487 insertions(+), 5454 deletions(-) Btrfs: 70 files changed, 19875 insertions(+), 8130 deletions(-) What made up those changes in XFS? Self-describing metadata(CRC32c). This is a new feature and it contributed about 70% code changes, it can be enabled via `mkfs.xfs -m crc=1 /dev/xxx` for v5 superblock. Transaction log space reservation improvements. With this change, we can calculate the log space reservation at mount time rather than runtime to reduce the the CPU overhead. User namespace support. So both XFS and USERNS can be enabled on kernel configuration begin from Linux 3.10. Thanks Dwight Engen's efforts for this thing. Split project/group quota inodes. Originally, project quota can not be enabled with group quota at the same time because they were share the same quota file inode, now it works but only for v5 super block. i.e, CRC enabled. CONFIG_XFS_WARN, an new lightweight runtime debugger which can be deployed in production environment. Readahead log object recovery, this change can speed up the log replay progress significantly. Speculative preallocation inode tracking, clearing and throttling. The main purpose is to deal with inodes with post-EOF space due to speculative preallocation, support improved quota management to free up a significant amount of unwritten space when at or near EDQUOT. It support backgroup scanning which occurs on a longish interval(5 mins by default, tunable), and on-demand scanning/trimming via ioctl(2). Bitter arguments ensued from this session, especially for the comparison between Ext4 and Btrfs in different areas, I have to spent a whole morning of the 1st day answering those questions. We basically agreed on XFS is the best choice in Linux nowadays because: Stable, XFS has a good record in stability in the past 10 years. Fengguang Wu who lead the 0-day kernel test project also said that he has observed less error than other filesystems in the past 1+ years, I own it to the XFS upstream code reviewer, they always performing serious code review as well as testing. Good performance for large/small files, XFS does not works very well for small files has already been an old story for years. Best choice (maybe) for distributed PB filesystems. e.g, Ceph recommends delopy OSD daemon on XFS because Ext4 has limited xattr size. Best choice for large storage (>16TB). Ext4 does not support a single file more than around 15.95TB. Scalability, any objection to XFS is best in this point? :) XFS is better to deal with transaction concurrency than Ext4, why? The maximum size of the log in XFS is 2038MB compare to 128MB in Ext4. Misc. Ext4 is widely used and it has been proved fast/stable in various loads and scenarios, XFS just need more customers, and Btrfs is still on the road to be a manhood. Ceph Introduction (Led by Li Wang) This a hot topic. Li gave us a nice introduction about the design as well as their current works. Actually, Ceph client has been included in Linux kernel since 2.6.34 and supported by Openstack since Folsom but it seems that it has not yet been widely deployment in production environment. Their major work is focus on the inline data support to separate the metadata and data storage, reduce the file access time, i.e, a file access need communication twice, fetch the metadata from MDS and then get data from OSD, and also, the small file access is limited by the network latency. The solution is, for the small files they would like to store the data at metadata so that when accessing a small file, the metadata server can push both metadata and data to the client at the same time. In this way, they can reduce the overhead of calculating the data offset and save the communication to OSD. For this feature, they have only run some small scale testing but really saw noticeable improvements. Test environment: Intel 2 CPU 12 Core, 64GB RAM, Ubuntu 12.04, Ceph 0.56.6 with 200GB SATA disk, 15 OSD, 1 MDS, 1 MON. The sequence read performance for 1K size files improved about 50%. I have asked Li and Zheng Yan (the core developer of Ceph, who also worked on Btrfs) whether Ceph is really stable and can be deployed at production environment for large scale PB level storage, but they can not give a positive answer, looks Ceph even does not spread over Dreamhost (subject to confirmation). From Li, they only deployed Ceph for a small scale storage(32 nodes) although they'd like to try 6000 nodes in the future. Improve Linux swap for Flash storage (led by Shaohua Li) Because of high density, low power and low price, flash storage (SSD) is a good candidate to partially replace DRAM. A quick answer for this is using SSD as swap. But Linux swap is designed for slow hard disk storage, so there are a lot of challenges to efficiently use SSD for swap. SWAPOUT swap_map scan swap_map is the in-memory data structure to track swap disk usage, but it is a slow linear scan. It will become a bottleneck while finding many adjacent pages in the use of SSD. Shaohua Li have changed it to a cluster(128K) list, resulting in O(1) algorithm. However, this apporoach needs restrictive cluster alignment and only enabled for SSD. IO pattern In most cases, the swap io is in interleaved pattern because of mutiple reclaimers or a free cluster is shared by all reclaimers. Even though block layer can merge interleaved IO to some extent, but we cannot count on it completely. Hence the per-cpu cluster is added base on the previous change, it can help reclaimer do sequential IO and the block layer will be easier to merge IO. TLB flush: If we're reclaiming one active page, we should first move the page from active lru list to inactive lru list, and then reclaim the page from inactive lru to swap it out. During the process, we need to clear PTE twice: first is 'A'(ACCESS) bit, second is 'P'(PRESENT) bit. Processors need to send lots of ipi which make the TLB flush really expensive. Some works have been done to improve this, including rework smp_call_functiom_many() or remove the first TLB flush in x86, but there still have some arguments here and only parts of works have been pushed to mainline. SWAPIN: Page fault does iodepth=1 sync io, but it's a little waste if only issue a page size's IO. The obvious solution is doing swap readahead. But the current in-kernel swap readahead is arbitary(always 8 pages), and it always doesn't perform well for both random and sequential access workload. Shaohua introduced a new flag for madvise(MADV_WILLNEED) to do swap prefetch, so the changes happen in userspace API and leave the in-kernel readahead unchanged(but I think some improvement can also be done here). SWAP discard As we know, discard is important for SSD write throughout, but the current swap discard implementation is synchronous. He changed it to async discard which allow discard and write run in the same time. Meanwhile, the unit of discard is also optimized to cluster. Misc: lock contention For many concurrent swapout and swapin , the lock contention such as anon_vma or swap_lock is high, so he changed the swap_lock to a per-swap lock. But there still have some lock contention in very high speed SSD because of swapcache address_space lock. Zproject (led by Bob Liu) Bob gave us a very nice introduction about the current memory compression status. Now there are 3 projects(zswap/zram/zcache) which all aim at smooth swap IO storm and promote performance, but they all have their own pros and cons. ZSWAP It is implemented based on frontswap API and it uses a dynamic allocater named Zbud to allocate free pages. Zbud means pairs of zpages are "buddied" and it can only store at most two compressed pages in one page frame, so the max compress ratio is 50%. Each page frame is lru-linked and can do shink in memory pressure. If the compressed memory pool reach its limitation, shink or reclaim happens. It decompress the page frame into two new allocated pages and then write them to real swap device, but it can fail when allocating the two pages. ZRAM Acts as a compressed ramdisk and used as swap device, and it use zsmalloc as its allocator which has high density but may have fragmentation issues. Besides, page reclaim is hard since it will need more pages to uncompress and free just one page. ZRAM is preferred by embedded system which may not have any real swap device. Now both ZRAM and ZSWAP are in driver/staging tree, and in the mm community there are some disscussions of merging ZRAM into ZSWAP or viceversa, but no agreement yet. ZCACHE Handles file page compression but it is removed out of staging recently. From industry (led by Tang Jie, LSI) An LSI engineer introduced several new produces to us. The first is raid5/6 cards that it use full stripe writes to improve performance. The 2nd one he introduced is SandForce flash controller, who can understand data file types (data entropy) to reduce write amplification (WA) for nearly all writes. It's called DuraWrite and typical WA is 0.5. What's more, if enable its Dynamic Logical Capacity function module, the controller can do data compression which is transparent to upper layer. LSI testing shows that with this virtual capacity enables 1x TB drive can support up to 2x TB capacity, but the application must monitor free flash space to maintain optimal performance and to guard against free flash space exhaustion. He said the most useful application is for datebase. Another thing I think it's worth to mention is that a NV-DRAM memory in NMR/Raptor which is directly exposed to host system. Applications can directly access the NV-DRAM via a memory address - using standard system call mmap(). He said that it is very useful for database logging now. This kind of NVM produces are beginning to appear in recent years, and it is said that Samsung is building a research center in China for related produces. IMHO, NVM will bring an effect to current os layer especially on file system, e.g. its journaling may need to redesign to fully utilize these nonvolatile memory. OCFS2 (led by Canquan Shen) Without a doubt, HuaWei is the biggest contributor to OCFS2 in the past two years. They have posted 46 upstream patches and 39 patches have been merged. Their current project is based on 32/64 nodes cluster, but they also tried 128 nodes at the experimental stage. The major work they are working is to support ATS (atomic test and set), it can be works with DLM at the same time. Looks this idea is inspired by the vmware VMFS locking, i.e, http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html CLK - 18th October 2013 Improving Linux Development with Better Tools (Andi Kleen) This talk focused on how to find/solve bugs along with the Linux complexity growing. Generally, we can do this with the following kind of tools: Static code checkers tools. e.g, sparse, smatch, coccinelle, clang checker, checkpatch, gcc -W/LTO, stanse. This can help check a lot of things, simple mistakes, complex problems, but the challenges are: some are very slow, false positives, may need a concentrated effort to get false positives down. Especially, no static checker I found can follow indirect calls (“OO in C”, common in kernel): struct foo_ops { int (*do_foo)(struct foo *obj); } foo->do_foo(foo); Dynamic runtime checkers, e.g, thread checkers, kmemcheck, lockdep. Ideally all kernel code would come with a test suite, then someone could run all the dynamic checkers. Fuzzers/test suites. e.g, Trinity is a great tool, it finds many bugs, but needs manual model for each syscall. Modern fuzzers around using automatic feedback, but notfor kernel yet: http://taviso.decsystem.org/making_software_dumber.pdf Debuggers/Tracers to understand code, e.g, ftrace, can dump on events/oops/custom triggers, but still too much overhead in many cases to run always during debug. Tools to read/understand source, e.g, grep/cscope work great for many cases, but do not understand indirect pointers (OO in C model used in kernel), give us all “do_foo” instances: struct foo_ops { int (*do_foo)(struct foo *obj); } = { .do_foo = my_foo }; foo>do_foo(foo); That would be great to have a cscope like tool that understands this based on types/initializers XFS: The High Performance Enterprise File System (Jeff Liu) [slides] I gave a talk for introducing the disk layout, unique features, as well as the recent changes. The slides include some charts to reflect the performances between XFS/Btrfs/Ext4 for small files. About a dozen users raised their hands when I asking who has experienced with XFS. I remembered that when I asked the same question in LinuxCon/Japan, only 3 people raised their hands, but they are Chris Mason, Ric Wheeler, and another attendee. The attendee questions were mainly focused on stability, and comparison with other file systems. Linux Containers (Feng Gao) The speaker introduced us that the purpose for those kind of namespaces, include mount/UTS/IPC/Network/Pid/User, as well as the system API/ABI. For the userspace tools, He mainly focus on the Libvirt LXC rather than us(LXC). Libvirt LXC is another userspace container management tool, implemented as one type of libvirt driver, it can manage containers, create namespace, create private filesystem layout for container, Create devices for container and setup resources controller via cgroup. In this talk, Feng also mentioned another two possible new namespaces in the future, the 1st is the audit, but not sure if it should be assigned to user namespace or not. Another is about syslog, but the question is do we really need it? In-memory Compression (Bob Liu) Same as CLSF, a nice introduction that I have already mentioned above. Misc There were some other talks related to ACPI based memory hotplug, smart wake-affinity in scheduler etc., but my head is not big enough to record all those things. -- Jeff Liu

Read the article
Laissez les bon temps rouler! (Microsoft BI Conference 2010)

- by smisner

"Laissez les bons temps rouler" is a Cajun phrase that I heard frequently when I lived in New Orleans in the mid-1990s. It means "Let the good times roll!" and encapsulates a feeling of happy expectation. As I met with many of my peers and new acquaintances at the Microsoft BI Conference last week, this phrase kept running through my mind as people spoke about their plans in their respective businesses, the benefits and opportunities that the recent releases in the BI stack are providing, and their expectations about the future of the BI stack. Notwithstanding some jabs here and there to point out the platform is neither perfect now nor will be anytime soon (along with admissions that the competitors are also not perfect), and notwithstanding several missteps by the event organizers (which I don't care to enumerate), the overarching mood at the conference was positive. It was a refreshing change from the doom and gloom hovering over several conferences that I attended in 2009. Although many people expect economic hardships to continue over the coming year or so, everyone I know in the BI field is busier than ever and expects to stay busy for quite a while. Self-Service BI Self-service was definitely a theme of the BI conference. In the keynote, Ted Kummert opened with a look back to a fairy tale vision of self-service BI that he told in 2008. At that time, the fairy tale future was a time when "every end user was able to use BI technologies within their job in order to move forward more effectively" and transitioned to the present time in which SQL Server 2008 R2, Office 2010, and SharePoint 2010 are available to deliver managed self-service BI. This set of technologies is presumably poised to address the needs of the 80% of users that Kummert said do not use BI today. He proceeded to outline a series of activities that users ought to be able to do themselves--from simple changes to a report like formatting or an addtional data visualization to integration of an additional data source. The keynote then continued with a series of demonstrations of both current and future technology in support of self-service BI. Some highlights that interested me: PowerPivot, of course, is the flagship product for self-service BI in the Microsoft BI stack. In the TechEd keynote, which was open to the BI conference attendees, Amir Netz (twitter) impressed the audience by demonstrating interactivity with a workbook containing 100 million rows. He upped the ante at the BI keynote with his demonstration of a future-state PowerPivot workbook containing over 2 billion records. It's important to note that this volume of data is being processed by a server engine, and not in the PowerPivot client engine. (Yes, I think it's impressive, but none of my clients are typically wrangling with 2 billion records at a time. Maybe they're thinking too small. This ability to work quickly with large data sets has greater implications for BI solutions than for self-service BI, in my opinion.) Amir also demonstrated KPIs for the future PowerPivot, which appeared to be easier to implement than in any other Microsoft product that supports KPIs, apart from simple KPIs in SharePoint. (My initial reaction is that we have one more place to build KPIs. Great. It's confusing enough. I haven't seen how well those KPIs integrate with other BI tools, which will be important for adoption.) One more PowerPivot feature that Amir showed was a graphical display of the lineage for calculations. (This is hugely practical, especially if you build up calculations incrementally. You can more easily follow the logic from calculation to calculation. Furthermore, if you need to make a change to one calculation, you can assess the impact on other calculations.) Another product demonstration will be available within the next 30 days--Pivot for Reporting Services. If you haven't seen this technology yet, check it out at www.getpivot.com. (It definitely has a wow factor, but I'm skeptical about its practicality. However, I'm looking forward to trying it out with data that I understand.) Michael Tejedor (twitter) demonstrated a feature that I think is really interesting and not emphasized nearly enough--overshadowed by PowerPivot, no doubt. That feature is the Microsoft Business Intelligence Indexing Connector, which enables search of the content of Excel workbooks and Reporting Services reports. (This capability existed in MOSS 2007, but was more cumbersome to implement. The search results in SharePoint 2010 are not only cooler, but more useful by describing whether the content is found in a table or a chart, for example.) This may yet be the dawning of the age of self-service BI - a phrase I've heard repeated from time to time over the last decade - but I think BI professionals are likely to stay busy for a long while, and need not start looking for a new line of work. Kummert repeatedly referenced strategic BI solutions in contrast to self-service BI to emphasize that self-service BI is not a replacement for the services that BI professionals provide. After all, self-service BI does not appear magically on user desktops (or whatever device they want to use). A supporting infrastructure is necessary, and grows in complexity in proportion to the need to simplify BI for users. It's one thing to hear the party line touted by Microsoft employees at the BI keynote, but it's another to hear from the people who are responsible for implementing and supporting it within an organization. Rob Collie (blog | twitter), Kasper de Jonge (blog | twitter), Vidas Matelis (site | twitter), and I were invited to join Andrew Brust (blog | twitter) as he led a Birds of a Feather session at TechEd entitled "PowerPivot: Is It the BI Deal-Changer for Developers and IT Pros?" I would single out the prevailing concern in this session as the issue of control. On one side of this issue were those who were concerned that they would lose control once PowerPivot is implemented. On the other side were those who believed that data should be freely accessible to users in PowerPivot, and even acknowledgment that users would get the data they want even if it meant they would have to manually enter into a workbook to have it ready for analysis. For another viewpoint on how PowerPivot played out at the conference, see Rob Collie's observations. Collaborative BI I have been intrigued by the notion of collaborative BI for a very long time. Before I discovered BI, I was a Lotus Notes developer and later a manager of developers, working in a software company that enabled collaboration in the legal industry. Not only did I help create collaborative systems for our clients, I created a complete project management from the ground up to collaboratively manage our custom development work. In that case, collaboration involved my team, my client contacts, and me. I was also able to produce my own BI from that system as well, but didn't know that's what I was doing at the time. Only in recent years has SharePoint begun to catch up with the capabilities that I had with Lotus Notes more than a decade ago. Eventually, I had the opportunity at that job to formally investigate BI as another product offering for our software, and the rest - as they say - is history. I built my first data warehouse with Scott Cameron (who has also ventured into the authoring world by writing Analysis Services 2008 Step by Step and was at the BI Conference last week where I got to reminisce with him for a bit) and that began a career that I never imagined at the time. Fast forward to 2010, and I'm still lauding the virtues of collaborative BI, if only the tools will catch up to my vision! Thus, I was anxious to see what Donald Farmer (blog | twitter) and Rita Sallam of Gartner had to say on the subject in their session "Collaborative Decision Making." As I suspected, the tools aren't quite there yet, but the vendors are moving in the right direction. One thing I liked about this session was a non-Microsoft perspective of the state of the industry with regard to collaborative BI. In addition, this session included a better demonstration of SharePoint collaborative BI capabilities than appeared in the BI keynote. Check out the video in the link to the session to see the demonstration. One of the use cases that was demonstrated was linking from information to a person, because, as Donald put it, "People don't trust data, they trust people." The Microsoft BI Stack in General A question I hear all the time from students when I'm teaching is how to know what tools to use when there is overlap between products in the BI stack. I've never taken the time to codify my thoughts on the subject, but saw that my friend Dan Bulos provided good insight on this topic from a variety of perspectives in his session, "So Many BI Tools, So Little Time." I thought one of his best points was that ideally you should be able to design in your tool of choice, and then deploy to your tool of choice. Unfortunately, the ideal is yet to become real across the platform. The closest we come is with the RDL in Reporting Services which can be produced from two different tools (Report Builder or Business Intelligence Development Studio's Report Designer), manually, or by a third-party or custom application. I have touted the idea for years (and publicly said so about 5 years ago) that eventually more products would be RDL producers or consumers, but we aren't there yet. Maybe in another 5 years. Another interesting session that covered the BI stack against a backdrop of competitive products was delivered by Andrew Brust. Andrew did a marvelous job of consolidating a lot of information in a way that clearly communicated how various vendors' offerings compared to the Microsoft BI stack. He also made a particularly compelling argument about how the existence of an ecosystem around the Microsoft BI stack provided innovation and opportunities lacking for other vendors. Check out his presentation, "How Does the Microsoft BI Stack...Stack Up?" Expo Hall I had planned to spend more time in the Expo Hall to see who was doing new things with the BI stack, but didn't manage to get very far. Each time I set out on an exploratory mission, I got caught up in some fascinating conversations with one or more of my peers. I find interacting with people that I meet at conferences just as important as attending sessions to learn something new. There were a couple of items that really caught me eye, however, that I'll share here. Pragmatic Works. Whether you develop SSIS packages, build SSAS cubes, or author SSRS reports (or all of the above), you really must take a look at BI Documenter. Brian Knight (twitter) walked me through the key features, and I must say I was impressed. Once you've seen what this product can do, you won't want to document your BI projects any other way. You can download a free single-user database edition, or choose from more feature-rich standard or professional editions. Microsoft Press ebooks. I also stopped by the O'Reilly Media booth to meet some folks that one of my acquisitions editors at Microsoft Press recommended. In case you haven't heard, Microsoft Press has partnered with O'Reilly Media for distribution and publishing. Apart from my interest in learning more about O'Reilly Media as an author, an advertisement in their booth caught me eye which I think is a really great move. When you buy Microsoft Press ebooks through the O'Reilly web site, you can receive it in any (or all) of the following formats where possible: PDF, epub, .mobi for Kindle and .apk for Android. You also have lifetime DRM-free access to the ebooks. As someone who is an avid collector of books, I fnd myself running out of room for storage. In addition, I travel a lot, and it's hard to lug my reference library with me. Today's e-reader options make the move to digital books a more viable way to grow my library. Having a variety of formats means I am not limited to a single device, and lifetime access means I don't have to worry about keeping track of where I've stored my files. Because the e-books are DRM-free, I can copy and paste when I'm compiling notes, and I can print pages when necessary. That's a winning combination in my mind! Overall, I was pleased with the BI conference. There were many more sessions that I couldn't attend, either because the room was full when I got there or there were multiple sessions running concurrently that I wanted to see. Fortunately, many of the sessions are accessible for viewing online at http://www.msteched.com/2010/NorthAmerica along with the TechEd sessions. You can spot the BI sessions by the yellow skyline on the title slide of the presentation as shown below. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
C#/.NET Fundamentals: Choosing the Right Collection Class

- by James Michael Hare

The .NET Base Class Library (BCL) has a wide array of collection classes at your disposal which make it easy to manage collections of objects. While it's great to have so many classes available, it can be daunting to choose the right collection to use for any given situation. As hard as it may be, choosing the right collection can be absolutely key to the performance and maintainability of your application! This post will look at breaking down any confusion between each collection and the situations in which they excel. We will be spending most of our time looking at the System.Collections.Generic namespace, which is the recommended set of collections. The Generic Collections: System.Collections.Generic namespace The generic collections were introduced in .NET 2.0 in the System.Collections.Generic namespace. This is the main body of collections you should tend to focus on first, as they will tend to suit 99% of your needs right up front. It is important to note that the generic collections are unsynchronized. This decision was made for performance reasons because depending on how you are using the collections its completely possible that synchronization may not be required or may be needed on a higher level than simple method-level synchronization. Furthermore, concurrent read access (all writes done at beginning and never again) is always safe, but for concurrent mixed access you should either synchronize the collection or use one of the concurrent collections. So let's look at each of the collections in turn and its various pros and cons, at the end we'll summarize with a table to help make it easier to compare and contrast the different collections. The Associative Collection Classes Associative collections store a value in the collection by providing a key that is used to add/remove/lookup the item. Hence, the container associates the value with the key. These collections are most useful when you need to lookup/manipulate a collection using a key value. For example, if you wanted to look up an order in a collection of orders by an order id, you might have an associative collection where they key is the order id and the value is the order. The Dictionary<TKey,TVale> is probably the most used associative container class. The Dictionary<TKey,TValue> is the fastest class for associative lookups/inserts/deletes because it uses a hash table under the covers. Because the keys are hashed, the key type should correctly implement GetHashCode() and Equals() appropriately or you should provide an external IEqualityComparer to the dictionary on construction. The insert/delete/lookup time of items in the dictionary is amortized constant time - O(1) - which means no matter how big the dictionary gets, the time it takes to find something remains relatively constant. This is highly desirable for high-speed lookups. The only downside is that the dictionary, by nature of using a hash table, is unordered, so you cannot easily traverse the items in a Dictionary in order. The SortedDictionary<TKey,TValue> is similar to the Dictionary<TKey,TValue> in usage but very different in implementation. The SortedDictionary<TKey,TValye> uses a binary tree under the covers to maintain the items in order by the key. As a consequence of sorting, the type used for the key must correctly implement IComparable<TKey> so that the keys can be correctly sorted. The sorted dictionary trades a little bit of lookup time for the ability to maintain the items in order, thus insert/delete/lookup times in a sorted dictionary are logarithmic - O(log n). Generally speaking, with logarithmic time, you can double the size of the collection and it only has to perform one extra comparison to find the item. Use the SortedDictionary<TKey,TValue> when you want fast lookups but also want to be able to maintain the collection in order by the key. The SortedList<TKey,TValue> is the other ordered associative container class in the generic containers. Once again SortedList<TKey,TValue>, like SortedDictionary<TKey,TValue>, uses a key to sort key-value pairs. Unlike SortedDictionary, however, items in a SortedList are stored as an ordered array of items. This means that insertions and deletions are linear - O(n) - because deleting or adding an item may involve shifting all items up or down in the list. Lookup time, however is O(log n) because the SortedList can use a binary search to find any item in the list by its key. So why would you ever want to do this? Well, the answer is that if you are going to load the SortedList up-front, the insertions will be slower, but because array indexing is faster than following object links, lookups are marginally faster than a SortedDictionary. Once again I'd use this in situations where you want fast lookups and want to maintain the collection in order by the key, and where insertions and deletions are rare. The Non-Associative Containers The other container classes are non-associative. They don't use keys to manipulate the collection but rely on the object itself being stored or some other means (such as index) to manipulate the collection. The List<T> is a basic contiguous storage container. Some people may call this a vector or dynamic array. Essentially it is an array of items that grow once its current capacity is exceeded. Because the items are stored contiguously as an array, you can access items in the List<T> by index very quickly. However inserting and removing in the beginning or middle of the List<T> are very costly because you must shift all the items up or down as you delete or insert respectively. However, adding and removing at the end of a List<T> is an amortized constant operation - O(1). Typically List<T> is the standard go-to collection when you don't have any other constraints, and typically we favor a List<T> even over arrays unless we are sure the size will remain absolutely fixed. The LinkedList<T> is a basic implementation of a doubly-linked list. This means that you can add or remove items in the middle of a linked list very quickly (because there's no items to move up or down in contiguous memory), but you also lose the ability to index items by position quickly. Most of the time we tend to favor List<T> over LinkedList<T> unless you are doing a lot of adding and removing from the collection, in which case a LinkedList<T> may make more sense. The HashSet<T> is an unordered collection of unique items. This means that the collection cannot have duplicates and no order is maintained. Logically, this is very similar to having a Dictionary<TKey,TValue> where the TKey and TValue both refer to the same object. This collection is very useful for maintaining a collection of items you wish to check membership against. For example, if you receive an order for a given vendor code, you may want to check to make sure the vendor code belongs to the set of vendor codes you handle. In these cases a HashSet<T> is useful for super-quick lookups where order is not important. Once again, like in Dictionary, the type T should have a valid implementation of GetHashCode() and Equals(), or you should provide an appropriate IEqualityComparer<T> to the HashSet<T> on construction. The SortedSet<T> is to HashSet<T> what the SortedDictionary<TKey,TValue> is to Dictionary<TKey,TValue>. That is, the SortedSet<T> is a binary tree where the key and value are the same object. This once again means that adding/removing/lookups are logarithmic - O(log n) - but you gain the ability to iterate over the items in order. For this collection to be effective, type T must implement IComparable<T> or you need to supply an external IComparer<T>. Finally, the Stack<T> and Queue<T> are two very specific collections that allow you to handle a sequential collection of objects in very specific ways. The Stack<T> is a last-in-first-out (LIFO) container where items are added and removed from the top of the stack. Typically this is useful in situations where you want to stack actions and then be able to undo those actions in reverse order as needed. The Queue<T> on the other hand is a first-in-first-out container which adds items at the end of the queue and removes items from the front. This is useful for situations where you need to process items in the order in which they came, such as a print spooler or waiting lines. So that's the basic collections. Let's summarize what we've learned in a quick reference table. Collection Ordered? Contiguous Storage? Direct Access? Lookup Efficiency Manipulate Efficiency Notes Dictionary No Yes Via Key Key: O(1) O(1) Best for high performance lookups. SortedDictionary Yes No Via Key Key: O(log n) O(log n) Compromise of Dictionary speed and ordering, uses binary search tree. SortedList Yes Yes Via Key Key: O(log n) O(n) Very similar to SortedDictionary, except tree is implemented in an array, so has faster lookup on preloaded data, but slower loads. List No Yes Via Index Index: O(1) Value: O(n) O(n) Best for smaller lists where direct access required and no ordering. LinkedList No No No Value: O(n) O(1) Best for lists where inserting/deleting in middle is common and no direct access required. HashSet No Yes Via Key Key: O(1) O(1) Unique unordered collection, like a Dictionary except key and value are same object. SortedSet Yes No Via Key Key: O(log n) O(log n) Unique ordered collection, like SortedDictionary except key and value are same object. Stack No Yes Only Top Top: O(1) O(1)* Essentially same as List<T> except only process as LIFO Queue No Yes Only Front Front: O(1) O(1) Essentially same as List<T> except only process as FIFO The Original Collections: System.Collections namespace The original collection classes are largely considered deprecated by developers and by Microsoft itself. In fact they indicate that for the most part you should always favor the generic or concurrent collections, and only use the original collections when you are dealing with legacy .NET code. Because these collections are out of vogue, let's just briefly mention the original collection and their generic equivalents: ArrayList A dynamic, contiguous collection of objects. Favor the generic collection List<T> instead. Hashtable Associative, unordered collection of key-value pairs of objects. Favor the generic collection Dictionary<TKey,TValue> instead. Queue First-in-first-out (FIFO) collection of objects. Favor the generic collection Queue<T> instead. SortedList Associative, ordered collection of key-value pairs of objects. Favor the generic collection SortedList<T> instead. Stack Last-in-first-out (LIFO) collection of objects. Favor the generic collection Stack<T> instead. In general, the older collections are non-type-safe and in some cases less performant than their generic counterparts. Once again, the only reason you should fall back on these older collections is for backward compatibility with legacy code and libraries only. The Concurrent Collections: System.Collections.Concurrent namespace The concurrent collections are new as of .NET 4.0 and are included in the System.Collections.Concurrent namespace. These collections are optimized for use in situations where multi-threaded read and write access of a collection is desired. The concurrent queue, stack, and dictionary work much as you'd expect. The bag and blocking collection are more unique. Below is the summary of each with a link to a blog post I did on each of them. ConcurrentQueue Thread-safe version of a queue (FIFO). For more information see: C#/.NET Little Wonders: The ConcurrentStack and ConcurrentQueue ConcurrentStack Thread-safe version of a stack (LIFO). For more information see: C#/.NET Little Wonders: The ConcurrentStack and ConcurrentQueue ConcurrentBag Thread-safe unordered collection of objects. Optimized for situations where a thread may be bother reader and writer. For more information see: C#/.NET Little Wonders: The ConcurrentBag and BlockingCollection ConcurrentDictionary Thread-safe version of a dictionary. Optimized for multiple readers (allows multiple readers under same lock). For more information see C#/.NET Little Wonders: The ConcurrentDictionary BlockingCollection Wrapper collection that implement producers & consumers paradigm. Readers can block until items are available to read. Writers can block until space is available to write (if bounded). For more information see C#/.NET Little Wonders: The ConcurrentBag and BlockingCollection Summary The .NET BCL has lots of collections built in to help you store and manipulate collections of data. Understanding how these collections work and knowing in which situations each container is best is one of the key skills necessary to build more performant code. Choosing the wrong collection for the job can make your code much slower or even harder to maintain if you choose one that doesn’t perform as well or otherwise doesn’t exactly fit the situation. Remember to avoid the original collections and stick with the generic collections. If you need concurrent access, you can use the generic collections if the data is read-only, or consider the concurrent collections for mixed-access if you are running on .NET 4.0 or higher. Tweet Technorati Tags: C#,.NET,Collecitons,Generic,Concurrent,Dictionary,List,Stack,Queue,SortedList,SortedDictionary,HashSet,SortedSet

Read the article
CodePlex Daily Summary for Sunday, June 03, 2012

CodePlex Daily Summary for Sunday, June 03, 2012Popular ReleasesLiveChat Starter Kit: LCSK v1.5.2: New features: Visitor location (City - Country) from geo-location Pass configuration via javascript for the chat box New visitor identification (no more using the IP address as visitor identification) To update from 1.5.1 Run the /src/1.5.2-sql-updates.txt SQL script to update your database tables. If you have it installed via NuGet, simply update your package and the file will be included so you can run the update script. New installation The easiest way to add LCSK to your app is by...Prime Factorization: Prime Factorization V2: Download the installation package and run it, to install prime factorization on your computer.DNN Content Localization Tools: CLTools v0.4 (Beta4): 3th Beta release for DNN 6.1 and obove Bug corrections : - Copy module work againNTemplates: NTemplates full source code and examples: This release includes the following changes: - NTemplates code. More enhacements and bug fixing. Nested scans seems to be working ok now. New event: ScanStart. Very usuful for calculating totals (see the example) - Examples. 2 new examples on nested scans. One of them very simple I did just for debugging. The other one is a report of invoices grouped by vendor, including totals calculations. Planned Roadmap: - Work on fixing performance bottlenecek: Try to compile the expression...ZXMAK2: Version 2.6.2.3: - add support for ZIP files created on UNIX system; - improve WAV support (fixed PCM24, FLOAT32; added PCM32, FLOAT64); - fix drag-n-drop on modal dialogs; - tape AutoPlay feature (thanks to Woody for algorithm).Net Code Samples: Full WCF Duplex Service Example: Full WCF Duplex Service ExampleKendo UI ASP.NET Sample Applications: Sample Applications (2012-06-01): Sample application(s) demonstrating the use of Kendo UI in ASP.NET applications.Better Explorer: Better Explorer Beta 1: Finally, the first Beta is here! There were a lot of changes, including: Translations into 10 different languages (the translations are not complete and will be updated soon) Conditional Select new tools for managing archives Folder Tools tab new search bar and Search Tab new image editing tools update function many bug fixes, stability fixes, and memory leak fixes other new features as well! Please check it out and if there are any problems, let us know. :) Also, do not forge...myManga: myManga v1.0.0.3: Will include MangaPanda as a default option. ChangeLog Updating from Previous Version: Extract contents of Release - myManga v1.0.0.3.zip to previous version's folder. Replaces: myManga.exe BakaBox.dll CoreMangaClasses.dll Manga.dll Plugins/MangaReader.manga.dll Plugins/MangaFox.manga.dll Plugins/MangaHere.manga.dll Plugins/MangaPanda.manga.dllPlayer Framework by Microsoft: Player Framework for Windows 8 Metro (Preview 3): Player Framework for HTML/JavaScript and XAML/C# Metro Style Applications. Additional DownloadsIIS Smooth Streaming Client SDK for Windows 8 Microsoft PlayReady Client SDK for Metro Style Apps Release notes:Support for Windows 8 Release Preview (released 5/31/12) Advertising support (VAST, MAST, VPAID, & clips) Miscellaneous improvements and bug fixesMicrosoft Ajax Minifier: Microsoft Ajax Minifier 4.54: Fix for issue #18161: pretty-printing CSS @media rule throws an exception due to mismatched Indent/Unindent pair.Silverlight Toolkit: Silverlight 5 Toolkit Source - May 2012: Source code for December 2011 Silverlight 5 Toolkit release.Windows 8 Metro RSS Reader: RSS Reader release 6: Changed background and foreground colors Used VariableSizeGrid layout to wrap blog posts with images Sort items with Images first, text-only last Enabled Caching to improve navigation between framesJson.NET: Json.NET 4.5 Release 6: New feature - Added IgnoreDataMemberAttribute support New feature - Added GetResolvedPropertyName to DefaultContractResolver New feature - Added CheckAdditionalContent to JsonSerializer Change - Metro build now always uses late bound reflection Change - JsonTextReader no longer returns no content after consecutive underlying content read failures Fix - Fixed bad JSON in an array with error handling creating an infinite loop Fix - Fixed deserializing objects with a non-default cons...DotNetNuke® Community Edition CMS: 06.02.00: Major Highlights Fixed issue in the Site Settings when single quotes were being treated as escape characters Fixed issue loading the Mobile Premium Data after upgrading from CE to PE Fixed errors logged when updating folder provider settings Fixed the order of the mobile device capabilities in the Site Redirection Management UI The User Profile page was completely rebuilt. We needed User Profiles to have multiple child pages. This would allow for the most flexibility by still f...????: ????2.0.1: 1、?????。WiX Toolset: WiX v3.6 RC: WiX v3.6 RC (3.6.2928.0) provides feature complete Burn with VS11 support. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/5/28/WiX-v3.6-Release-Candidate-availableJavascript .NET: Javascript .NET v0.7: SetParameter() reverts to its old behaviour of allowing JavaScript code to add new properties to wrapped C# objects. The behavior added briefly in 0.6 (throws an exception) can be had via the new SetParameterOptions.RejectUnknownProperties. TerminateExecution now uses its isolate to terminate the correct context automatically. Added support for converting all C# integral types, decimal and enums to JavaScript numbers. (Previously only the common types were handled properly.) Bug fixe...Phalanger - The PHP Language Compiler for the .NET Framework: 3.0 (May 2012): Fixes: unserialize() of negative float numbers fix pcre possesive quantifiers and character class containing ()[] array deserilization when the array contains a reference to ISerializable parsing lambda function fix round() reimplemented as it is in PHP to avoid .NET rounding errors filesize bypass for FileInfo.Length bug in Mono New features: Time zones reimplemented, uses Windows/Linux databaseSharePoint Euro 2012 - UEFA European Football Predictor: havivi.euro2012.wsp (1.1): New fetures:Admin enable / disable match Hide/Show Euro 2012 SharePoint lists (3 lists) Installing SharePoint Euro 2012 PredictorSharePoint Euro 2012 Predictor has been developed as a SharePoint Sandbox solution to support SharePoint Online (Office 365) Download the solution havivi.euro2012.wsp from the download page: Downloads Upload this solution to your Site Collection via the solutions area. Click on Activate to make the web parts in the solution available for use in the Site C...New ProjectsAFS.PhonePusherConnectorVX: pusher for phone vxApache: this is the Apache project.Apple: this is the Apple project.CUARTOAZZJL: HURRA!!Designing Windows 8 Applications with C# and XAML: This project hosts the source code used in the example projects for the book, Designing Windows 8 Metro Applications with C# and XAML.Easy Internet: CyberWeb ist ein einfacher Webbrowser für PC Neulinge. Ideal für Leute, die noch keine PC-Erfahrung haben.Easy Outlook Backup: Beschreibung: Easy Outlook Backup ist ein Programm das alle Daten von Outlook sichert. Ideal für Leute, die noch keine PC-Erfahrung haben. Easy Realtime Start: Beschreibung: Easy Realtime Start ist ein Programm das einen Prozess in höchster Priorität startet. Und das ganz bequem Per drag & drop. Ideal für Leute, die aufwendige Programme starten müssen. (zB.: “Blender” Free 3D Designer)eStock: eStock ist ein Verwaltungstool für Elektroniker um Bauteile im "Lager" sowie Projekte zu verwalten. Es bietet eine Möglichkeit festzulegen, um welche Art von Bauteil es sich handelt und wo sich dieses im Lager bzw. Regal befindet. Die Projektverwaltung ermöglicht es, Bauteile einem Projekt hinzuzufügen und eine Bestellliste / Einkaufsliste von Bauteilen, die nicht mehr im Lager vorhanden sind, zu erstellen. FuTTY: FireEgl's PuTTY -- FuTTY! FuTTY is a fork of PuTTY and PuTTYTray.GeometryWorld: To Do...Google: this is the Google project.Google Advance Search: An easy way to create documents search at Google and read your emails and Much moregoogle maps viewer for dynamics crm 2011: Easy google maps viewer for dynamics CRM 2011GPS Status - (GPS tool für GPS-Dongles und Mäuse) - GPS-Empfänger: Mein Programm verbindet sich mit dem externen GPS über einen Com-Port und bietet verschiedene Tools.Harmony Text Editor: Harmony is a ridiculously simple text editor for code and poetry.Hi! Football: .Goal / Objective -> To help friends gather for enjoying watching football together. .How it works -> To basically choose your favorite team, choose one of the matches fetched for his team, our app will generate a list of popular restraunts, cafes where he can watch the chosen match, the user can select one of the generated locations around his area, and create an event inviting his friends to join him and he can also join other friends' events.Java: this is the Java project.LevelUp Serializer: LevelUp Serializer is a small and simple serialize library.It can help developer to serialize and deserialize data more convenient. Feature: - Ease of use - Supports almost all serializer, like Binary、Xml、Soap、Json、DataContract. - Support serialize to file、serialize to stream、deserialize from file、deserialize from stream. - Support Xml encryption. - Support accelerated through the XML the serialization assemble.LFormatConvert: ????Linux???????????????????????，??ffmpeg????Machine QA Manager: Machine QA Manager is intended to save and help trend results from radiation therapy equipment testing. The program will be made as generic as possible from a initial setup to enable it's use for other types of routine testing activities (for example factory equipment) but preconfigured templates for radiation therapy will be supplied for the convenience of people working in that domain.maven-asbuild-plugin: maven-asbuild-plugin incorporates the adobe flash/flex based artifacts like swc or swf into the maven methology.Midnight Peach - C# framework generator for LINQ: C# framework generator for LINQMiku???????: ????????，????????！?????????????，??????????。?????????????????！MIKU????????????，??！????????????、????。 This program used to detect music beat.You can listen to music while press button,and it can display the BPM of the song.Miku will wave to you.MyTestingStudy: my personal testing studyNameless Sprite Editor: Nameless Sprite Editor is a tool used to thoroughly edit the graphics in ALL Game Boy Advance games. [ UNDER CONSTRUCTION ]NeoModulus Business Rules Builder: A windows form application that allows non-programmers to build strict definitions of a business domain. Once the definition is complete the program will build out object oriented C# files and a .net DLL. My test business domain is the open SRD, basically Dungeons and Dragons 3.5 edition.NodeJs: this is the NodeJs projectNoSQL: this is the Nosql project.OmniKassa for NopCommerce: OmniKassa payment module plugin for nopCommerceOn-Line Therapy: testOpen School: This project is about to create an open platform for all the academic institutions, so that they can manage all of their work. Our efforts will be for every kind of institution who are currently struggling with different kind of systems in place which are not collaborating with each other. This project will provide a common platform to all these kind of systems and provide them a better solution which actually works. Oracle: this is the Oracle project.Orchard Web Services: RESTful web services to expose interaction with Orchard content management.Personal Social Network using asp.net mvc and mongodb: FirstRooster is a network platform that let user create their own social network of interest to connect and share with like minded people anywhere.peshop: E-Commerce application , separated by DAl,BLL and Presentation layersPHP: this is the PHP project.Prime Factorization: Factoring trinomials using the ac method can be made easier through the use of Prime Factorization. Prime Factorization is a program that can assist you in the factoring of numbers in Algebra, namely trinomials using the AC Method. It can also find all the factors of any number.PromedioNotas: El programa trata de promediar 3 nostas y mostrar y si pasaba de año o no por medio de un mensajeProyecto Tarea: Este proyecto esta hecho con el objetivo de aprender sobre TFSProyectotarea1: Sotfware de terminal aéreo de Guayaquil, donde se encuentran el nombre de las aerolíneas y las rutas de vuelo a nivel nacional.Python: this is the Python project.Ruby: this is the Ruby project.tedplay: tedplay is your media player of choice for playing Commodore 264 music format files similar to SIDplay. It is basically a stripped down Commodore plus/4 emulator without video output and peripherals based on the SDL build of the Commodore 264 family emulator YAPE. tedplay is released under version 2 of the GNU Generic Public License and can be built for both Windows and Unix or actually any platform that has a C++ compiler and SDL support.test1: This is a test projectwin-x264: A port of the x264-codebase into a VisualStudio-project. Compilation requires Intel-compiler and Yasm.XDA ROM Hub: Xperia 2011 line toolkit.znvicente_cuartoc: Poyecto Vicente Eduardo Zambrano Navarrete??Win7?????: ??????? *.theme ???????（??，??，??，???）。????VSB???*.msstyles??，????????，????????????????????????！????????????，?????????????。???，????????！ ?????????，?????????，????????，??????！?????????????，?????????????????，?????????，???Aero??，??????~?????????????！??，?????????theme??，?????????，??????，????，??????。 This program used to create .theme file and the relevant documents (wallpaper, pointer, ICONS, sounds, etc.). As long as you use VSB ready . msstyles files, chosen the icon wallpaper, etc, an...

Read the article
Running SSIS packages from C#

- by Piotr Rodak

Most of the developers and DBAs know about two ways of deploying packages: You can deploy them to database server and run them using SQL Server Agent job or you can deploy the packages to file system and run them using dtexec.exe utility. Both approaches have their pros and cons. However I would like to show you that there is a third way (sort of) that is often overlooked, and it can give you capabilities the ‘traditional’ approaches can’t. I have been working for a few years with applications that run packages from host applications that are implemented in .NET. As you know, SSIS provides programming model that you can use to implement more flexible solutions. SSIS applications are usually thought to be batch oriented, with fairly rigid architecture and processing model, with fixed timeframes when the packages are executed to process data. It doesn’t to be the case, you don’t have to limit yourself to batch oriented architecture. I have very good experiences with service oriented architectures processing large amounts of data. These applications are more complex than what I would like to show here, but the principle stays the same: you can execute packages as a service, on ad-hoc basis. You can also implement and schedule various signals, HTTP calls, file drops, time schedules, Tibco messages and other to run the packages. You can implement event handler that will trigger execution of SSIS when a certain event occurs in StreamInsight stream. This post is just a small example of how you can use the API and other features to create a service that can run SSIS packages on demand. I thought it might be a good idea to implement a restful service that would listen to requests and execute appropriate actions. As it turns out, it is trivial in C#. The application is implemented as console application for the ease of debugging and running. In reality, you might want to implement the application as Windows service. To begin, you have to reference namespace System.ServiceModel.Web and then add a few lines of code: Uri baseAddress = new Uri("http://localhost:8011/"); WebServiceHost svcHost = new WebServiceHost(typeof(PackRunner), baseAddress); try { svcHost.Open(); Console.WriteLine("Service is running"); Console.WriteLine("Press enter to stop the service."); Console.ReadLine(); svcHost.Close(); } catch (CommunicationException cex) { Console.WriteLine("An exception occurred: {0}", cex.Message); svcHost.Abort(); } The interesting lines are 3, 7 and 13. In line 3 you create a WebServiceHost object. In line 7 you start listening on the defined URL and then in line 13 you shut down the service. As you have noticed, the WebServiceHost constructor is accepting type of an object (here: PackRunner) that will be instantiated as singleton and subsequently used to process the requests. This is the class where you put your logic, but to tell WebServiceHost how to use it, the class must implement an interface which declares methods to be used by the host. The interface itself must be ornamented with attribute ServiceContract. [ServiceContract] public interface IPackRunner { [OperationContract] [WebGet(UriTemplate = "runpack?package={name}")] string RunPackage1(string name); [OperationContract] [WebGet(UriTemplate = "runpackwithparams?package={name}&rows={rows}")] string RunPackage2(string name, int rows); } Each method that is going to be used by WebServiceHost has to have attribute OperationContract, as well as WebGet or WebInvoke attribute. The detailed discussion of the available options is outside of scope of this post. I also recommend using more descriptive names to methods . Then, you have to provide the implementation of the interface: public class PackRunner : IPackRunner { ... There are two methods defined in this class. I think that since the full code is attached to the post, I will show only the more interesting method, the RunPackage2. /// <summary> /// Runs package and sets some of its variables. /// </summary> /// <param name="name">Name of the package</param> /// <param name="rows">Number of rows to export</param> /// <returns></returns> public string RunPackage2(string name, int rows) { try { string pkgLocation = ConfigurationManager.AppSettings["PackagePath"]; pkgLocation = Path.Combine(pkgLocation, name.Replace("\"", "")); Console.WriteLine(); Console.WriteLine("Calling package {0} with parameter {1}.", name, rows); Application app = new Application(); Package pkg = app.LoadPackage(pkgLocation, null); pkg.Variables["User::ExportRows"].Value = rows; DTSExecResult pkgResults = pkg.Execute(); Console.WriteLine(); Console.WriteLine(pkgResults.ToString()); if (pkgResults == DTSExecResult.Failure) { Console.WriteLine(); Console.WriteLine("Errors occured during execution of the package:"); foreach (DtsError er in pkg.Errors) Console.WriteLine("{0}: {1}", er.ErrorCode, er.Description); Console.WriteLine(); return "Errors occured during execution. Contact your support."; } Console.WriteLine(); Console.WriteLine(); return "OK"; } catch (Exception ex) { Console.WriteLine(ex); return ex.ToString(); } } The method accepts package name and number of rows to export. The packages are deployed to the file system. The path to the packages is configured in the application configuration file. This way, you can implement multiple services on the same machine, provided you also configure the URL for each instance appropriately. To run a package, you have to reference Microsoft.SqlServer.Dts.Runtime namespace. This namespace is implemented in Microsoft.SQLServer.ManagedDTS.dll which in my case was installed in the folder “C:\Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies”. Once you have done it, you can create an instance of Microsoft.SqlServer.Dts.Runtime.Application as in line 18 in the above snippet. It may be a good idea to create the Application object in the constructor of the PackRunner class, to avoid necessity of recreating it each time the service is invoked. Then, in line 19 you see that an instance of Microsoft.SqlServer.Dts.Runtime.Package is created. The method LoadPackage in its simplest form just takes package file name as the first parameter. Before you run the package, you can set its variables to certain values. This is a great way of configuring your packages without all the hassle with dtsConfig files. In the above code sample, variable “User:ExportRows” is set to value of the parameter “rows” of the method. Eventually, you execute the package. The method doesn’t throw exceptions, you have to test the result of execution yourself. If the execution wasn’t successful, you can examine collection of errors exposed by the package. These are the familiar errors you often see during development and debugging of the package. I you run the package from the code, you have opportunity to persist them or log them using your favourite logging framework. The package itself is very simple; it connects to my AdventureWorks database and saves number of rows specified in variable “User::ExportRows” to a file. You should know that before you run the package, you can change its connection strings, logging, events and many more. I attach solution with the test service, as well as a project with two test packages. To test the service, you have to run it and wait for the message saying that the host is started. Then, just type (or copy and paste) the below command to your browser. http://localhost:8011/runpackwithparams?package=%22ExportEmployees.dtsx%22&rows=12 When everything works fine, and you modified the package to point to your AdventureWorks database, you should see "OK” wrapped in xml: I stopped the database service to simulate invalid connection string situation. The output of the request is different now: And the service console window shows more information: As you see, implementing service oriented ETL framework is not a very difficult task. You have ability to configure the packages before you run them, you can implement logging that is consistent with the rest of your system. In application I have worked with we also have resource monitoring and execution control. We don’t allow to run more than certain number of packages to run simultaneously. This ensures we don’t strain the server and we use memory and CPUs efficiently. The attached zip file contains two projects. One is the package runner. It has to be executed with administrative privileges as it registers HTTP namespace. The other project contains two simple packages. This is really a cool thing, you should check it out!

Read the article
Laissez les bon temps rouler! (Microsoft BI Conference 2010)

- by smisner

Laissez les bons temps rouler" is a Cajun phrase that I heard frequently when I lived in New Orleans in the mid-1990s. It means "Let the good times roll!" and encapsulates a feeling of happy expectation. As I met with many of my peers and new acquaintances at the Microsoft BI Conference last week, this phrase kept running through my mind as people spoke about their plans in their respective businesses, the benefits and opportunities that the recent releases in the BI stack are providing, and their expectations about the future of the BI stack.Notwithstanding some jabs here and there to point out the platform is neither perfect now nor will be anytime soon (along with admissions that the competitors are also not perfect), and notwithstanding several missteps by the event organizers (which I don't care to enumerate), the overarching mood at the conference was positive. It was a refreshing change from the doom and gloom hovering over several conferences that I attended in 2009. Although many people expect economic hardships to continue over the coming year or so, everyone I know in the BI field is busier than ever and expects to stay busy for quite a while.Self-Service BISelf-service was definitely a theme of the BI conference. In the keynote, Ted Kummert opened with a look back to a fairy tale vision of self-service BI that he told in 2008. At that time, the fairy tale future was a time when "every end user was able to use BI technologies within their job in order to move forward more effectively" and transitioned to the present time in which SQL Server 2008 R2, Office 2010, and SharePoint 2010 are available to deliver managed self-service BI.This set of technologies is presumably poised to address the needs of the 80% of users that Kummert said do not use BI today. He proceeded to outline a series of activities that users ought to be able to do themselves--from simple changes to a report like formatting or an addtional data visualization to integration of an additional data source. The keynote then continued with a series of demonstrations of both current and future technology in support of self-service BI. Some highlights that interested me:PowerPivot, of course, is the flagship product for self-service BI in the Microsoft BI stack. In the TechEd keynote, which was open to the BI conference attendees, Amir Netz (twitter) impressed the audience by demonstrating interactivity with a workbook containing 100 million rows. He upped the ante at the BI keynote with his demonstration of a future-state PowerPivot workbook containing over 2 billion records. It's important to note that this volume of data is being processed by a server engine, and not in the PowerPivot client engine. (Yes, I think it's impressive, but none of my clients are typically wrangling with 2 billion records at a time. Maybe they're thinking too small. This ability to work quickly with large data sets has greater implications for BI solutions than for self-service BI, in my opinion.)Amir also demonstrated KPIs for the future PowerPivot, which appeared to be easier to implement than in any other Microsoft product that supports KPIs, apart from simple KPIs in SharePoint. (My initial reaction is that we have one more place to build KPIs. Great. It's confusing enough. I haven't seen how well those KPIs integrate with other BI tools, which will be important for adoption.)One more PowerPivot feature that Amir showed was a graphical display of the lineage for calculations. (This is hugely practical, especially if you build up calculations incrementally. You can more easily follow the logic from calculation to calculation. Furthermore, if you need to make a change to one calculation, you can assess the impact on other calculations.)Another product demonstration will be available within the next 30 days--Pivot for Reporting Services. If you haven't seen this technology yet, check it out at www.getpivot.com. (It definitely has a wow factor, but I'm skeptical about its practicality. However, I'm looking forward to trying it out with data that I understand.)Michael Tejedor (twitter) demonstrated a feature that I think is really interesting and not emphasized nearly enough--overshadowed by PowerPivot, no doubt. That feature is the Microsoft Business Intelligence Indexing Connector, which enables search of the content of Excel workbooks and Reporting Services reports. (This capability existed in MOSS 2007, but was more cumbersome to implement. The search results in SharePoint 2010 are not only cooler, but more useful by describing whether the content is found in a table or a chart, for example.)This may yet be the dawning of the age of self-service BI - a phrase I've heard repeated from time to time over the last decade - but I think BI professionals are likely to stay busy for a long while, and need not start looking for a new line of work. Kummert repeatedly referenced strategic BI solutions in contrast to self-service BI to emphasize that self-service BI is not a replacement for the services that BI professionals provide. After all, self-service BI does not appear magically on user desktops (or whatever device they want to use). A supporting infrastructure is necessary, and grows in complexity in proportion to the need to simplify BI for users.It's one thing to hear the party line touted by Microsoft employees at the BI keynote, but it's another to hear from the people who are responsible for implementing and supporting it within an organization. Rob Collie (blog | twitter), Kasper de Jonge (blog | twitter), Vidas Matelis (site | twitter), and I were invited to join Andrew Brust (blog | twitter) as he led a Birds of a Feather session at TechEd entitled "PowerPivot: Is It the BI Deal-Changer for Developers and IT Pros?" I would single out the prevailing concern in this session as the issue of control. On one side of this issue were those who were concerned that they would lose control once PowerPivot is implemented. On the other side were those who believed that data should be freely accessible to users in PowerPivot, and even acknowledgment that users would get the data they want even if it meant they would have to manually enter into a workbook to have it ready for analysis. For another viewpoint on how PowerPivot played out at the conference, see Rob Collie's observations.Collaborative BII have been intrigued by the notion of collaborative BI for a very long time. Before I discovered BI, I was a Lotus Notes developer and later a manager of developers, working in a software company that enabled collaboration in the legal industry. Not only did I help create collaborative systems for our clients, I created a complete project management from the ground up to collaboratively manage our custom development work. In that case, collaboration involved my team, my client contacts, and me. I was also able to produce my own BI from that system as well, but didn't know that's what I was doing at the time. Only in recent years has SharePoint begun to catch up with the capabilities that I had with Lotus Notes more than a decade ago. Eventually, I had the opportunity at that job to formally investigate BI as another product offering for our software, and the rest - as they say - is history. I built my first data warehouse with Scott Cameron (who has also ventured into the authoring world by writing Analysis Services 2008 Step by Step and was at the BI Conference last week where I got to reminisce with him for a bit) and that began a career that I never imagined at the time.Fast forward to 2010, and I'm still lauding the virtues of collaborative BI, if only the tools will catch up to my vision! Thus, I was anxious to see what Donald Farmer (blog | twitter) and Rita Sallam of Gartner had to say on the subject in their session "Collaborative Decision Making." As I suspected, the tools aren't quite there yet, but the vendors are moving in the right direction. One thing I liked about this session was a non-Microsoft perspective of the state of the industry with regard to collaborative BI. In addition, this session included a better demonstration of SharePoint collaborative BI capabilities than appeared in the BI keynote. Check out the video in the link to the session to see the demonstration. One of the use cases that was demonstrated was linking from information to a person, because, as Donald put it, "People don't trust data, they trust people."The Microsoft BI Stack in GeneralA question I hear all the time from students when I'm teaching is how to know what tools to use when there is overlap between products in the BI stack. I've never taken the time to codify my thoughts on the subject, but saw that my friend Dan Bulos provided good insight on this topic from a variety of perspectives in his session, "So Many BI Tools, So Little Time." I thought one of his best points was that ideally you should be able to design in your tool of choice, and then deploy to your tool of choice. Unfortunately, the ideal is yet to become real across the platform. The closest we come is with the RDL in Reporting Services which can be produced from two different tools (Report Builder or Business Intelligence Development Studio's Report Designer), manually, or by a third-party or custom application. I have touted the idea for years (and publicly said so about 5 years ago) that eventually more products would be RDL producers or consumers, but we aren't there yet. Maybe in another 5 years.Another interesting session that covered the BI stack against a backdrop of competitive products was delivered by Andrew Brust. Andrew did a marvelous job of consolidating a lot of information in a way that clearly communicated how various vendors' offerings compared to the Microsoft BI stack. He also made a particularly compelling argument about how the existence of an ecosystem around the Microsoft BI stack provided innovation and opportunities lacking for other vendors. Check out his presentation, "How Does the Microsoft BI Stack...Stack Up?"Expo HallI had planned to spend more time in the Expo Hall to see who was doing new things with the BI stack, but didn't manage to get very far. Each time I set out on an exploratory mission, I got caught up in some fascinating conversations with one or more of my peers. I find interacting with people that I meet at conferences just as important as attending sessions to learn something new. There were a couple of items that really caught me eye, however, that I'll share here.Pragmatic Works. Whether you develop SSIS packages, build SSAS cubes, or author SSRS reports (or all of the above), you really must take a look at BI Documenter. Brian Knight (twitter) walked me through the key features, and I must say I was impressed. Once you've seen what this product can do, you won't want to document your BI projects any other way. You can download a free single-user database edition, or choose from more feature-rich standard or professional editions.Microsoft Press ebooks. I also stopped by the O'Reilly Media booth to meet some folks that one of my acquisitions editors at Microsoft Press recommended. In case you haven't heard, Microsoft Press has partnered with O'Reilly Media for distribution and publishing. Apart from my interest in learning more about O'Reilly Media as an author, an advertisement in their booth caught me eye which I think is a really great move. When you buy Microsoft Press ebooks through the O'Reilly web site, you can receive it in any (or all) of the following formats where possible: PDF, epub, .mobi for Kindle and .apk for Android. You also have lifetime DRM-free access to the ebooks. As someone who is an avid collector of books, I fnd myself running out of room for storage. In addition, I travel a lot, and it's hard to lug my reference library with me. Today's e-reader options make the move to digital books a more viable way to grow my library. Having a variety of formats means I am not limited to a single device, and lifetime access means I don't have to worry about keeping track of where I've stored my files. Because the e-books are DRM-free, I can copy and paste when I'm compiling notes, and I can print pages when necessary. That's a winning combination in my mind!Overall, I was pleased with the BI conference. There were many more sessions that I couldn't attend, either because the room was full when I got there or there were multiple sessions running concurrently that I wanted to see. Fortunately, many of the sessions are accessible for viewing online at http://www.msteched.com/2010/NorthAmerica along with the TechEd sessions. You can spot the BI sessions by the yellow skyline on the title slide of the presentation as shown below. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
CodePlex Daily Summary for Friday, June 01, 2012

CodePlex Daily Summary for Friday, June 01, 2012Popular ReleasesASP.Net Client Dependency Framework: v1.5: This release brings you many bug fixes and some new features Install via Nuget:Install-Package ClientDependency Install-Package ClientDependency-Mvc New featuresNew PlaceHolderProvider for webforms which will now let you specify exactly where the CSS and JS is rendered, so you can now separate them Better API support for runtime changes & registration Allows for custom formatting of composite file URLs new config option: pathUrlFormat="{dependencyId}/{version}/{type}" to have full contr...Silverlight 5 Multi-Window Controls: May 2012: This release introduces a new context menu type for desktop apps that can overflow the parent window: http://trelford.com/ContextMenu_SL5_Native.png Code snippet: <TextBlock Text="Right click on me to show the context menu"> <multiwindow:ContextMenuService.ContextMenu> <multiwindow:ContextMenuWindow> <multiwindow:MenuItem Header="Menu Item"/> </multiwindow:ContextMenuWindow> </multiwindow:ContextMenuService.Co...Better Explorer: Better Explorer Beta 1: Finally, the first Beta is here! There were a lot of changes, including: Translations into 10 different languages (the translations are not complete and will be updated soon) Conditional Select new tools for managing archives Folder Tools tab new search bar and Search Tab new image editing tools update function many bug fixes, stability fixes, and memory leak fixes other new features as well! Please check it out and if there are any problems, let us know. :) Also, do not forge...myManga: myManga v1.0.0.3: Will include MangaPanda as a default option. ChangeLog Updating from Previous Version: Extract contents of Release - myManga v1.0.0.3.zip to previous version's folder. Replaces: myManga.exe BakaBox.dll CoreMangaClasses.dll Manga.dll Plugins/MangaReader.manga.dll Plugins/MangaFox.manga.dll Plugins/MangaHere.manga.dll Plugins/MangaPanda.manga.dllPlayer Framework by Microsoft: Player Framework for Windows 8 Metro (Preview 3): Player Framework for HTML/JavaScript and XAML/C# Metro Style Applications. Additional DownloadsIIS Smooth Streaming Client SDK for Windows 8 Microsoft PlayReady Client SDK for Metro Style Apps Release notes:Support for Windows 8 Release Preview (released 5/31/12) Advertising support (VAST, MAST, VPAID, & clips) Miscellaneous improvements and bug fixesConfuser: Confuser 1.8: Changelog: +New UI...again. +New project system, replacing the previous declarative obfuscation and XML configuration. *Improve the protection strength... *Improve the compatibility. Now Confuser can obfuscate itself and even some real-life application like Paint.NET and ILSpy! (of course with some small adjustment)Naked Objects: Naked Objects Release 4.1.0: Corresponds to the packaged version 4.1.0 available via NuGet. Note that the versioning has moved to SemVer (http://semver.org/) This is a bug fix release with no new functionality. Please note that the easiest way to install and run the Naked Objects Framework is via the NuGet package manager: just search the Official NuGet Package Source for 'nakedobjects'. It is only necessary to download the source code (from here) if you wish to modify or re-build the framework yourself. If you do wi...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.54: Fix for issue #18161: pretty-printing CSS @media rule throws an exception due to mismatched Indent/Unindent pair.Silverlight Toolkit: Silverlight 5 Toolkit Source - May 2012: Source code for December 2011 Silverlight 5 Toolkit release.Json.NET: Json.NET 4.5 Release 6: New feature - Added IgnoreDataMemberAttribute support New feature - Added GetResolvedPropertyName to DefaultContractResolver New feature - Added CheckAdditionalContent to JsonSerializer Change - Metro build now always uses late bound reflection Change - JsonTextReader no longer returns no content after consecutive underlying content read failures Fix - Fixed bad JSON in an array with error handling creating an infinite loop Fix - Fixed deserializing objects with a non-default cons...DotNetNuke® Community Edition CMS: 06.02.00: Major Highlights Fixed issue in the Site Settings when single quotes were being treated as escape characters Fixed issue loading the Mobile Premium Data after upgrading from CE to PE Fixed errors logged when updating folder provider settings Fixed the order of the mobile device capabilities in the Site Redirection Management UI The User Profile page was completely rebuilt. We needed User Profiles to have multiple child pages. This would allow for the most flexibility by still f...Thales Simulator Library: Version 0.9.6: The Thales Simulator Library is an implementation of a software emulation of the Thales (formerly Zaxus & Racal) Hardware Security Module cryptographic device. This release fixes a problem with the FK command and a bug in the implementation of PIN block 05 format deconstruction. A new 0.9.6.Binaries file has been posted. This includes executable programs without an installer, including the GUI and console simulators, the key manager and the PVV clashing demo. Please note that you will need ...????: ????2.0.1: 1、?????。WiX Toolset: WiX v3.6 RC: WiX v3.6 RC (3.6.2928.0) provides feature complete Burn with VS11 support. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/5/28/WiX-v3.6-Release-Candidate-availableJavascript .NET: Javascript .NET v0.7: SetParameter() reverts to its old behaviour of allowing JavaScript code to add new properties to wrapped C# objects. The behavior added briefly in 0.6 (throws an exception) can be had via the new SetParameterOptions.RejectUnknownProperties. TerminateExecution now uses its isolate to terminate the correct context automatically. Added support for converting all C# integral types, decimal and enums to JavaScript numbers. (Previously only the common types were handled properly.) Bug fixe...Phalanger - The PHP Language Compiler for the .NET Framework: 3.0 (May 2012): Fixes: unserialize() of negative float numbers fix pcre possesive quantifiers and character class containing ()[] array deserilization when the array contains a reference to ISerializable parsing lambda function fix round() reimplemented as it is in PHP to avoid .NET rounding errors filesize bypass for FileInfo.Length bug in Mono New features: Time zones reimplemented, uses Windows/Linux databaseSharePoint Euro 2012 - UEFA European Football Predictor: havivi.euro2012.wsp (1.1): New fetures:Admin enable / disable match Hide/Show Euro 2012 SharePoint lists (3 lists) Installing SharePoint Euro 2012 PredictorSharePoint Euro 2012 Predictor has been developed as a SharePoint Sandbox solution to support SharePoint Online (Office 365) Download the solution havivi.euro2012.wsp from the download page: Downloads Upload this solution to your Site Collection via the solutions area. Click on Activate to make the web parts in the solution available for use in the Site C...????SDK for .Net 4.0+(OAuth2.0+??V2?API): ??V2?SDK???: ?????????API?? ???????OAuth2.0?? ????：????????????，??????????“SOURCE CODE”?????????Changeset，http://weibosdk.codeplex.com/SourceControl/list/changesets ???：????????，DEMO??AppKey????????????????，?????AppKey，????AppKey???????????，?????“????>????>????>??????”.Net Code Samples: Code Samples: Code samples (SLNs).LINQ_Koans: LinqKoans v.02: Cleaned up a bitNew ProjectsAntiXSS Experimental: Welcome to AntiXSS Experimental. AntiXSS Experimental contains code for common encoders auto-generated using Microsoft Research's BEK project.atfone: atfoneBango Adobe Air Application Analytics SDK: Bango application analytics is an analytics solution for mobile applications. This SDK provides a framework you can use in your application to add analytics capabilities to your mobile applications. It's developed in Actionscript with Native Extensions to target iOS,Android and the Blackberry Playbook.bbinjest: bbinjestdevMobile.NET Library for Windows Phone 7.1: devMobile.NET Library intends to offer a set of commons and not so commons controls for developing Windows Phone 7.5 applications. It offers classic controls like both pie and column charts for simple scenarios as well as other not so classic controls like SignalAccuracy control for displaying, for instance, the GPS accuracy like WP7 built-in GPRS/3G signal coverage indicator does. It also provide a tag cloud control for displaying item in the way common web based tag clouds usually offer. ...dnnFiddle: dnnFiddle is a DotNetNuke module that aims to make it easier to add rich content to your DotNetNuke website.DotNetNuke Task Manager: Test Project to learn DotNetNuke and CodePlex IntegrationDSIB - TireService: Project for handling Tiresets/Wheels - looking up dimensions, loadindex, speedindex, etc. for wheelsflyabroad: flyabroadhphai: My ProjectIIS File Manager - Editor: IIS File Manager provides ability to upload files faster through HTTP and it requires no extra installation, just one website with windows authentication lets users upload files easily.KeypItSafe Password Vault: KeypItSafe Password Vault Easily and safely store your website passwords on your computer - or go mobile in just a few clicks! What is KeypItSafe? KeypItSafe is a free open source password manager that helps you store and manage all of your passwords securely on your computer or a USB/Removable Media drive. With only a few clicks you can transfer all of your saved passwords to a USB drive and immediately have access to them on virtually any computer. You only have to remember one mas...litwaredk: Sourcecode for the projects on www.litware.dkMVC Pattern Toolkit (Sample): This is a sample MVC pattern toolkit that helps web developers create ASP.NET MVC 2 web applications using advanced tooling and automation, with integrated guidance. This toolkit is provided as a *sample* soley to demonstrate how easy pattern toolkits can be created that provide custom automation and tooling in Visual Studio to speed development. *NOTE*: This pattern toolkit is not intended to demonstrate THE official way to build ASP.NET MVC applications. It is intended to demonstrate ...MyDeveloperCareer: willwymydevelopercarMyPomodoroWatch: My personal Pomodoro watch.NRails: NRailsPayment Gateways: Open source project for integrating payment gateways all over the world into .NET websites and desktop applications. Developers are requested to submit their code and check out the project to start contributing.Preactor Object Model: pom is a Preactor library which provides easy access and manipulation of Preactor data.RgC: RuCSecondPong: blablablaSharp Home: Sharphome is designed to run on Windows and Linux (via the Mono Project) and is designed to be useful in home automation and home security.SharpPTC: SharpPTC is a framebuffer library designed for creating retro games and applications, built on DirectDraw targeting the .NET platform. It provides a simple pixel buffer and methods to ease drawing (line, rectangle, clear etc). SharpPTC also comes with limited keyboard support.simplesocxs: simplesocxsSQL Process Viewer: View all of the processes (that you have security to see) currently running on a SQL database.testtom05312012git02: fdsfdText-To-Speech with Microsoft Translator Service: With this library, you can easily add Text-To-Speech capabilities to your .NET applications. It uses the Microsoft Translator Service to obtain streams of file speaking text in the desired language. At the moment of writing, there are 44 supported languages, including English, Italian, German, French, Spanish, Japanese and Chinese.ultsvn: The description.umbracoCssZenGarden: This is a learning package, It's not supposed to be a best practice for content management, just a fun test to see what a content managed cssZenGarden might be like.Visual Studio Solution Code Format AddIn: 0 people following this project (follow) VS???????? ????????：http://www.cnblogs.com/viter/addin ??Visual Studio 2008?2010，???????????，???????????"version“????。 ????namespace , class , struct , enum , property ，?????????(??????Function)??????Function，????????。 ???????、????。 ????Function????????????xml???，???????BUG，????Function???。 ??? class , struct , enum , property , Function??#region #endregion?????。 ????Property ? Function ???????，?Property????“?????? ”??。 ?????...Viz, Simple 3D Control inspired by Processing.orgVolunteerManager: Its a small app that manages volunteers in a volunteer organization.WebLearningFS: Dokan Based Web Learning File SystemWindowsPhonePusherSLService: Silverlight Pusher for WPhoneWindowsPhonePusherWcfService: for pushing same to wphone tooWPF Encryption: This develop a application for encryption/encode a string or checksum a fileXNAGameFeatures: XNAGameFeatures Project XNAGameFeatures is a XNA 4.0 library which permit to create and manage a XNA game easily. The project is separate in five part : BasicGame library Widget library Shapes library Input library Features library Each part is thinking to offer a easy way to the creation of a game in XNA 4.0

Read the article
How to Load Oracle Tables From Hadoop Tutorial (Part 5 - Leveraging Parallelism in OSCH)

- by Bob Hanckel

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Using OSCH: Beyond Hello World In the previous post we discussed a “Hello World” example for OSCH focusing on the mechanics of getting a toy end-to-end example working. In this post we are going to talk about how to make it work for big data loads. We will explain how to optimize an OSCH external table for load, paying particular attention to Oracle’s DOP (degree of parallelism), the number of external table location files we use, and the number of HDFS files that make up the payload. We will provide some rules that serve as best practices when using OSCH. The assumption is that you have read the previous post and have some end to end OSCH external tables working and now you want to ramp up the size of the loads. Using OSCH External Tables for Access and Loading OSCH external tables are no different from any other Oracle external tables. They can be used to access HDFS content using Oracle SQL: SELECT * FROM my_hdfs_external_table; or use the same SQL access to load a table in Oracle. INSERT INTO my_oracle_table SELECT * FROM my_hdfs_external_table; To speed up the load time, you will want to control the degree of parallelism (i.e. DOP) and add two SQL hints. ALTER SESSION FORCE PARALLEL DML PARALLEL 8; ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8; INSERT /*+ append pq_distribute(my_oracle_table, none) */ INTO my_oracle_table SELECT * FROM my_hdfs_external_table; There are various ways of either hinting at what level of DOP you want to use. The ALTER SESSION statements above force the issue assuming you (the user of the session) are allowed to assert the DOP (more on that in the next section). Alternatively you could embed additional parallel hints directly into the INSERT and SELECT clause respectively. /*+ parallel(my_oracle_table,8) *//*+ parallel(my_hdfs_external_table,8) */ Note that the "append" hint lets you load a target table by reserving space above a given "high watermark" in storage and uses Direct Path load. In other doesn't try to fill blocks that are already allocated and partially filled. It uses unallocated blocks. It is an optimized way of loading a table without incurring the typical resource overhead associated with run-of-the-mill inserts. The "pq_distribute" hint in this context unifies the INSERT and SELECT operators to make data flow during a load more efficient. Finally your target Oracle table should be defined with "NOLOGGING" and "PARALLEL" attributes. The combination of the "NOLOGGING" and use of the "append" hint disables REDO logging, and its overhead. The "PARALLEL" clause tells Oracle to try to use parallel execution when operating on the target table. Determine Your DOP It might feel natural to build your datasets in Hadoop, then afterwards figure out how to tune the OSCH external table definition, but you should start backwards. You should focus on Oracle database, specifically the DOP you want to use when loading (or accessing) HDFS content using external tables. The DOP in Oracle controls how many PQ slaves are launched in parallel when executing an external table. Typically the DOP is something you want to Oracle to control transparently, but for loading content from Hadoop with OSCH, it's something that you will want to control. Oracle computes the maximum DOP that can be used by an Oracle user. The maximum value that can be assigned is an integer value typically equal to the number of CPUs on your Oracle instances, times the number of cores per CPU, times the number of Oracle instances. For example, suppose you have a RAC environment with 2 Oracle instances. And suppose that each system has 2 CPUs with 32 cores. The maximum DOP would be 128 (i.e. 2*2*32). In point of fact if you are running on a production system, the maximum DOP you are allowed to use will be restricted by the Oracle DBA. This is because using a system maximum DOP can subsume all system resources on Oracle and starve anything else that is executing. Obviously on a production system where resources need to be shared 24x7, this can’t be allowed to happen. The use cases for being able to run OSCH with a maximum DOP are when you have exclusive access to all the resources on an Oracle system. This can be in situations when your are first seeding tables in a new Oracle database, or there is a time where normal activity in the production database can be safely taken off-line for a few hours to free up resources for a big incremental load. Using OSCH on high end machines (specifically Oracle Exadata and Oracle BDA cabled with Infiniband), this mode of operation can load up to 15TB per hour. The bottom line is that you should first figure out what DOP you will be allowed to run with by talking to the DBAs who manage the production system. You then use that number to derive the number of location files, and (optionally) the number of HDFS data files that you want to generate, assuming that is flexible. Rule 1: Find out the maximum DOP you will be allowed to use with OSCH on the target Oracle system Determining the Number of Location Files Let’s assume that the DBA told you that your maximum DOP was 8. You want the number of location files in your external table to be big enough to utilize all 8 PQ slaves, and you want them to represent equally balanced workloads. Remember location files in OSCH are metadata lists of HDFS files and are created using OSCH’s External Table tool. They also represent the workload size given to an individual Oracle PQ slave (i.e. a PQ slave is given one location file to process at a time, and only it will process the contents of the location file.) Rule 2: The size of the workload of a single location file (and the PQ slave that processes it) is the sum of the content size of the HDFS files it lists For example, if a location file lists 5 HDFS files which are each 100GB in size, the workload size for that location file is 500GB. The number of location files that you generate is something you control by providing a number as input to OSCH’s External Table tool. Rule 3: The number of location files chosen should be a small multiple of the DOP Each location file represents one workload for one PQ slave. So the goal is to keep all slaves busy and try to give them equivalent workloads. Obviously if you run with a DOP of 8 but have 5 location files, only five PQ slaves will have something to do and the other three will have nothing to do and will quietly exit. If you run with 9 location files, then the PQ slaves will pick up the first 8 location files, and assuming they have equal work loads, will finish up about the same time. But the first PQ slave to finish its job will then be rescheduled to process the ninth location file, potentially doubling the end to end processing time. So for this DOP using 8, 16, or 32 location files would be a good idea. Determining the Number of HDFS Files Let’s start with the next rule and then explain it: Rule 4: The number of HDFS files should try to be a multiple of the number of location files and try to be relatively the same size In our running example, the DOP is 8. This means that the number of location files should be a small multiple of 8. Remember that each location file represents a list of unique HDFS files to load, and that the sum of the files listed in each location file is a workload for one Oracle PQ slave. The OSCH External Table tool will look in an HDFS directory for a set of HDFS files to load. It will generate N number of location files (where N is the value you gave to the tool). It will then try to divvy up the HDFS files and do its best to make sure the workload across location files is as balanced as possible. (The tool uses a greedy algorithm that grabs the biggest HDFS file and delegates it to a particular location file. It then looks for the next biggest file and puts in some other location file, and so on). The tools ability to balance is reduced if HDFS file sizes are grossly out of balance or are too few. For example suppose my DOP is 8 and the number of location files is 8. Suppose I have only 8 HDFS files, where one file is 900GB and the others are 100GB. When the tool tries to balance the load it will be forced to put the singleton 900GB into one location file, and put each of the 100GB files in the 7 remaining location files. The load balance skew is 9 to 1. One PQ slave will be working overtime, while the slacker PQ slaves are off enjoying happy hour. If however the total payload (1600 GB) were broken up into smaller HDFS files, the OSCH External Table tool would have an easier time generating a list where each workload for each location file is relatively the same. Applying Rule 4 above to our DOP of 8, we could divide the workload into160 files that were approximately 10 GB in size. For this scenario the OSCH External Table tool would populate each location file with 20 HDFS file references, and all location files would have similar workloads (approximately 200GB per location file.) As a rule, when the OSCH External Table tool has to deal with more and smaller files it will be able to create more balanced loads. How small should HDFS files get? Not so small that the HDFS open and close file overhead starts having a substantial impact. For our performance test system (Exadata/BDA with Infiniband), I compared three OSCH loads of 1 TiB. One load had 128 HDFS files living in 64 location files where each HDFS file was about 8GB. I then did the same load with 12800 files where each HDFS file was about 80MB size. The end to end load time was virtually the same. However when I got ridiculously small (i.e. 128000 files at about 8MB per file), it started to make an impact and slow down the load time. What happens if you break rules 3 or 4 above? Nothing draconian, everything will still function. You just won’t be taking full advantage of the generous DOP that was allocated to you by your friendly DBA. The key point of the rules articulated above is this: if you know that HDFS content is ultimately going to be loaded into Oracle using OSCH, it makes sense to chop them up into the right number of files roughly the same size, derived from the DOP that you expect to use for loading. Next Steps So far we have talked about OLH and OSCH as alternative models for loading. That’s not quite the whole story. They can be used together in a way that provides for more efficient OSCH loads and allows one to be more flexible about scheduling on a Hadoop cluster and an Oracle Database to perform load operations. The next lesson will talk about Oracle Data Pump files generated by OLH, and loaded using OSCH. It will also outline the pros and cons of using various load methods. This will be followed up with a final tutorial lesson focusing on how to optimize OLH and OSCH for use on Oracle's engineered systems: specifically Exadata and the BDA. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article
Multiple routers, subnets, gateways etc

- by allentown

My current setup is: Cable modem dishes out 13 static IP's (/28), a GB switch is plugged into the cable modem, and has access to those 13 static IP's, I have about 6 "servers" in use right now. The cable modem is also a firewall, DHCP server, and 3 port 10/100 switch. I am using it as a firewall, but not currently as a DHCP server. I have plugged into the cable modem, two network cables, one which goes to the WAN port of a Linksys Dual Band Wireless 10/100/1000 router/switch. Into the linksys are a few workstations, a few printers, and some laptops connecting to wifi. I set the Linksys to use take static IP, and enabled DHCP for the workstations, printers, etc in 192.168.1.1/24. The network for the Linksys is mostly self contained, backups go to a SAN, on that network, it all happens through that switch, over GB. But I also get internet access from it as well via the cable modem using one static IP. This all works, however, I can not "see" the static IP machines when I am on the Linksys. I can get to them via ssh and other protocols, and if I want to from "outside", I open holes, like 80, 25, 587, 143, 22, etc. The second wire, from the cable modem/fireall/switch just uplinks to the managed GB switch. What are the pros and cons of this? I do not like giving up the static IP to the Linksys. I basically have a mixed network of public servers, and internal workstations. I want the public servers on public IP's because I do not want to mess with port forwarding and mappings. Is it correct also, that if someone breaches the Linksys wifi, they still would have a hard time getting to the static IP range, just by nature of the network topology? Today, just for a test, I toggled on the DHCP in the firewall/cable modem at 10.1.10.1/24 range, the Linksys is n the 192.168.1.100/24 range. At that point, all the static IP machines still had in and out access, but Linksys was unreachable. The cable modem only has 10/100 ports, so I will not plug anything but the network drop into it, which is 50Mb/10Mb. Which makes me think this could be less than ideal, as transfers from the workstation network to the server network will be bottlenecked at 100Mb when I have 1000Mb available. I may not need to solve that, if isolation is better though. I do not move a lot of data, if any, from Linsys network to server network, so for it to pretend to be remote is ok. Should I approach this any different? I could enable DHCP on the cable modem/firewall, it should still send out the statics to the GB switch, but will also be a DHCP in 10.1.10.1/24 range? I can then plug the Linksys into the GB switch, which is now picking up statics and the 10.1.10.1/24 ranges, tell the Linksys to use 10.1.10.5 or so. Now, do I disable DHCP on the Linksys, and the cable modem/firewall will pass through the statics and 10.0.10.1/24 ranges as well? Or, could I open a second DHCP pool on the Linksys? I guess doing so gives me network isolation again, but it is just the reverse of what I have now. But I get out of the bottleneck, not that the Linksys could ever really touch real GB speeds anyway, but the managed switch certainly can. This is all because 13 statics are not that many. Right now, 6 "servers", the Linksys, a managed switch, a few SSL certs, and I am running out. I do not want to waste a static IP on the managed GB switch, or the Linksys, unless it provides me some type of benefit. Final question, under my current setup, if I am on a workstation, sitting at 192.168.1.109, the Linksys, with GB, and I send a file over ssh to the static IP machine, is that literally leaving the internet, and coming back in, or does it stay local? To me it seems like: Workstation (192.168.1.109) -> Linksys DHCP -> Linksys Static IP -> Cable Modem -> Server ( and it hits the 10/100 ports on the cable modem, slowing me down. But does it round trip the network, leave and come back in, limiting me to the 50/10 internet speeds? *These are all made up numbers, I do not use default router IP's as I will one day add a VPN, and do not want collisions. I need some recommendations, do I want one big network, or two isolated ones. Printers these days need an IP, everything does, I can not get autoconf/bonjour to be reliable on most printers. but I am also not sure I want the "server" side of my operation to be polluted by the workstation side of my operation. Unless there is some magic subetting I have not learned yet, here is what I am thinking: Cable modem 10/100, has 13 static IP, publicly accessible -> Enable DHCP on the cable modem -> Cable modem plugs into managed switch -> Managed switch gets 10.1.10.1 ssh, telnet, https admin management address -> Managed switch sends static IP's to to servers -> Plug Linksys into managed switch, giving it 10.1.10.2 static internally in Linksys admin -> Linksys gets assigned 10.1.10.x as its DHCP sending range -> Local printers, workstations, iPhones etc, connect to this -> ( Do I enable DHCP or disable it on the Linksys, just define a non over lapping range, or create an entirely new DHCP at 10.1.50.0/24, I think I am back isolated again with that method too? ) Thank you for any suggestions. This is the first time I have had to deal with less than a /24, and most are larger than that, but it is just a drop to a cabinet. Otherwise, it's a router, a few repeaters, and soho stuff that is simple, with one IP. I know a few may suggest going all DHCP on the servers, and I may one day, just not now, there has been too much moving of gear for me to be interested in that, and I would want something in the Catalyst series to deal with that.

Read the article
Scrum Master Stephen Forte Teaches Agile Development, Silverlight and BI at GIDS 2010

- by rajesh ahuja

Great Indian Developer Summit 2010 – Gold Standard for India's Software Developer Ecosystem Bangalore, March 25, 2010: The author of several books on application and database development including Programming SQL Server 2008 and certified Scrum Master Stephen Forte is coming this summer to India's biggest summit for the developer ecosystem - Great Indian Developer Summit. At the summit, Stephen will conduct a workshop guaranteed to give attendees a jump start in taking a certified scrum master exam. Scrum, one of the most popular Agile project management and development methods, which is starting to be adopted at major corporations and on very large projects. After an introduction to the basics of Scrum like project planning and estimation, the Scrum Master, team, product owner and burn down, and of course the daily Scrum, Stephen will show many real world applications of the methodology drawn from his own experience as a Scrum Master. Negotiating with the business, estimation and team dynamics are all discussed as well as how to use Scrum in small organizations, large enterprise environments and consulting environments. Stephen will also discuss using Scrum with virtual teams and an off-shoring environment. He will then take a look at the tools we will use for Agile development, including planning poker, unit testing, and much more. On 20th April at the GIDS.NET Conference, Stephen will also conduct a series of sessions on Microsoft computing technologies. He will teach how to build data driven, n-tier Rich Internet Applications (RIA) with Silverlight 4.0. Line of business applications (LOB) in Silverlight 4.0 are easy by tapping the power of WCF RIA Services, the Silverlight Toolkit, and elevated out of browser support. Stephen's demo centric session will walk you through an example of building a LOB application with Silverlight 4.0. See how Silverlight and WCF RIA Services support domain logic, services, data binding, validation, server based paging, authentication, authorization and much more. Silverlight 4.0 means business. Silverlight runs C# and Visual Basic code, and so it seems natural that a business application might share some code between the Silverlight client and its ASP.NET Web server. You may want to run some code client-side for interactivity, but re-run that code on the server for security or reliability. This is possible, and there are several techniques you can use to accomplish this goal. In Stephen's second talk learn about the various techniques and their pros and cons. Some techniques work better in C#, others in VB. Still others are simpler with a little extra tooling or code-generation. Any serious Silverlight business application will almost certainly face this issue, and this session gets you going fast. In the third talk, Stephen will explain how to properly architect and deploy a BI application using a mix of some exciting new tools and some old familiar ones. He will start with a traditional relational transaction centric database (OLTP) and explore ways to build a data warehouse (OLAP), looking at the star and snowflake schemas. Next he will look at the process of extraction, transformation, and loading (ETL) your OLTP data into your data warehouse. Different techniques for ETL will be described and the various tradeoffs will be discussed. Then he will look at using the warehouse for reporting, drill down, and data analysis in Microsoft Excel's PowerPivot 2010. The session will round off by showing how to properly build a cube and build a data analysis application on top of that cube, and conclude by looking at some tools to help with the data visualization process. Every year, GIDS is a game changer for several thousands of IT professionals, providing them with a competitive edge over their peers, enlightening them with bleeding-edge information most useful in their daily jobs, helping them network with world-class experts and visionaries, and providing them with a much needed thrust in their careers. Attend Great Indian Developer Summit to gain the information, education and solutions you seek. From post-conference workshops, breakout sessions by expert instructors, keynotes by industry heavyweights, enhanced networking opportunities, and more. About Great Indian Developer Summit Great Indian Developer Summit is the gold standard for India's software developer ecosystem for gaining exposure to and evaluating new projects, tools, services, platforms, languages, software and standards. Packed with premium knowledge, action plans and advise from been-there-done-it veterans, creators, and visionaries, the 2010 edition of Great Indian Developer Summit features focused sessions, case studies, workshops and power panels that will transform you into a force to reckon with. Featuring 3 co-located conferences: GIDS.NET, GIDS.Web, GIDS.Java and an exclusive day of in-depth tutorials - GIDS.Workshops, from 20 April to 24 April at the IISc campus in Bangalore. At GIDS you'll participate in hundreds of sessions encompassing the full range of Microsoft computing, Java, Agile, RIA, Rich Web, open source/standards, languages, frameworks and platforms, practical tutorials that deep dive into technical skill and best practices, inspirational keynote presentations, an Expo Hall featuring dozens of the latest projects and products activities, engaging networking events, and the interact with the best and brightest of speakers from around the world. For further information on GIDS 2010, please visit the summit on the web http://www.developersummit.com/ A Saltmarch Media Press Release E: [email protected] Ph: +91 80 4005 1000

Read the article
What's the best name for a non-mutating "add" method on an immutable collection?

- by Jon Skeet

Sorry for the waffly title - if I could come up with a concise title, I wouldn't have to ask the question. Suppose I have an immutable list type. It has an operation Foo(x) which returns a new immutable list with the specified argument as an extra element at the end. So to build up a list of strings with values "Hello", "immutable", "world" you could write: var empty = new ImmutableList<string>(); var list1 = empty.Foo("Hello"); var list2 = list1.Foo("immutable"); var list3 = list2.Foo("word"); (This is C# code, and I'm most interested in a C# suggestion if you feel the language is important. It's not fundamentally a language question, but the idioms of the language may be important.) The important thing is that the existing lists are not altered by Foo - so empty.Count would still return 0. Another (more idiomatic) way of getting to the end result would be: var list = new ImmutableList<string>().Foo("Hello"); .Foo("immutable"); .Foo("word"); My question is: what's the best name for Foo? EDIT 3: As I reveal later on, the name of the type might not actually be ImmutableList<T>, which makes the position clear. Imagine instead that it's TestSuite and that it's immutable because the whole of the framework it's a part of is immutable... (End of edit 3) Options I've come up with so far: Add: common in .NET, but implies mutation of the original list Cons: I believe this is the normal name in functional languages, but meaningless to those without experience in such languages Plus: my favourite so far, it doesn't imply mutation to me. Apparently this is also used in Haskell but with slightly different expectations (a Haskell programmer might expect it to add two lists together rather than adding a single value to the other list). With: consistent with some other immutable conventions, but doesn't have quite the same "additionness" to it IMO. And: not very descriptive. Operator overload for + : I really don't like this much; I generally think operators should only be applied to lower level types. I'm willing to be persuaded though! The criteria I'm using for choosing are: Gives the correct impression of the result of the method call (i.e. that it's the original list with an extra element) Makes it as clear as possible that it doesn't mutate the existing list Sounds reasonable when chained together as in the second example above Please ask for more details if I'm not making myself clear enough... EDIT 1: Here's my reasoning for preferring Plus to Add. Consider these two lines of code: list.Add(foo); list.Plus(foo); In my view (and this is a personal thing) the latter is clearly buggy - it's like writing "x + 5;" as a statement on its own. The first line looks like it's okay, until you remember that it's immutable. In fact, the way that the plus operator on its own doesn't mutate its operands is another reason why Plus is my favourite. Without the slight ickiness of operator overloading, it still gives the same connotations, which include (for me) not mutating the operands (or method target in this case). EDIT 2: Reasons for not liking Add. Various answers are effectively: "Go with Add. That's what DateTime does, and String has Replace methods etc which don't make the immutability obvious." I agree - there's precedence here. However, I've seen plenty of people call DateTime.Add or String.Replace and expect mutation. There are loads of newsgroup questions (and probably SO ones if I dig around) which are answered by "You're ignoring the return value of String.Replace; strings are immutable, a new string gets returned." Now, I should reveal a subtlety to the question - the type might not actually be an immutable list, but a different immutable type. In particular, I'm working on a benchmarking framework where you add tests to a suite, and that creates a new suite. It might be obvious that: var list = new ImmutableList<string>(); list.Add("foo"); isn't going to accomplish anything, but it becomes a lot murkier when you change it to: var suite = new TestSuite<string, int>(); suite.Add(x => x.Length); That looks like it should be okay. Whereas this, to me, makes the mistake clearer: var suite = new TestSuite<string, int>(); suite.Plus(x => x.Length); That's just begging to be: var suite = new TestSuite<string, int>().Plus(x => x.Length); Ideally, I would like my users not to have to be told that the test suite is immutable. I want them to fall into the pit of success. This may not be possible, but I'd like to try. I apologise for over-simplifying the original question by talking only about an immutable list type. Not all collections are quite as self-descriptive as ImmutableList<T> :)

Read the article
Java style FOR loop in a clojure interpeter ?

- by Kevin

I have a basic interpreter in clojure. Now i need to implement for (initialisation; finish-test; loop-update) { statements } inside my interpreter. I will attach my interpreter code I got so far. Any help is appreciated. Interpreter (declare interpret make-env) ;; (def do-trace false) ;; ;; simple utilities (def third ; return third item in a list (fn [a-list] (second (rest a-list)))) (def fourth ; return fourth item in a list (fn [a-list] (third (rest a-list)))) (def run ; make it easy to test the interpreter (fn [e] (println "Processing: " e) (println "=> " (interpret e (make-env))))) ;; for the environment (def make-env (fn [] '())) (def add-var (fn [env var val] (cons (list var val) env))) (def lookup-var (fn [env var] (cond (empty? env) 'error (= (first (first env)) var) (second (first env)) :else (lookup-var (rest env) var)))) ;; -- define numbers (def is-number? (fn [expn] (number? expn))) (def interpret-number (fn [expn env] expn)) ;; -- define symbols (def is-symbol? (fn [expn] (symbol? expn))) (def interpret-symbol (fn [expn env] (lookup-var env expn))) ;; -- define boolean (def is-boolean? (fn [expn] (or (= expn 'true) (= expn 'false)))) (def interpret-boolean (fn [expn env] expn)) ;; -- define functions (def is-function? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'lambda (first expn))))) (def interpret-function (fn [expn env] expn)) ;; -- define addition (def is-plus? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '+ (first expn))))) (def interpret-plus (fn [expn env] (+ (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define subtraction (def is-minus? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '- (first expn))))) (def interpret-minus (fn [expn env] (- (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define multiplication (def is-times? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '* (first expn))))) (def interpret-times (fn [expn env] (* (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define division (def is-divides? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '/ (first expn))))) (def interpret-divides (fn [expn env] (/ (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define equals test (def is-equals? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '= (first expn))))) (def interpret-equals (fn [expn env] (= (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define greater-than test (def is-greater-than? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '> (first expn))))) (def interpret-greater-than (fn [expn env] (> (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define not (def is-not? (fn [expn] (and (list? expn) (= 2 (count expn)) (= 'not (first expn))))) (def interpret-not (fn [expn env] (not (interpret (second expn) env)))) ;; -- define or (def is-or? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'or (first expn))))) (def interpret-or (fn [expn env] (or (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define and (def is-and? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'and (first expn))))) (def interpret-and (fn [expn env] (and (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define with (def is-with? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'with (first expn))))) (def interpret-with (fn [expn env] (interpret (third expn) (add-var env (first (second expn)) (interpret (second (second expn)) env))))) ;; -- define if (def is-if? (fn [expn] (and (list? expn) (= 4 (count expn)) (= 'if (first expn))))) (def interpret-if (fn [expn env] (cond (interpret (second expn) env) (interpret (third expn) env) :else (interpret (fourth expn) env)))) ;; -- define function-application (def is-function-application? (fn [expn env] (and (list? expn) (= 2 (count expn)) (is-function? (interpret (first expn) env))))) (def interpret-function-application (fn [expn env] (let [function (interpret (first expn) env)] (interpret (third function) (add-var env (first (second function)) (interpret (second expn) env)))))) ;; the interpreter itself (def interpret (fn [expn env] (cond do-trace (println "Interpret is processing: " expn)) (cond ; basic values (is-number? expn) (interpret-number expn env) (is-symbol? expn) (interpret-symbol expn env) (is-boolean? expn) (interpret-boolean expn env) (is-function? expn) (interpret-function expn env) ; built-in functions (is-plus? expn) (interpret-plus expn env) (is-minus? expn) (interpret-minus expn env) (is-times? expn) (interpret-times expn env) (is-divides? expn) (interpret-divides expn env) (is-equals? expn) (interpret-equals expn env) (is-greater-than? expn) (interpret-greater-than expn env) (is-not? expn) (interpret-not expn env) (is-or? expn) (interpret-or expn env) (is-and? expn) (interpret-and expn env) ; special syntax (is-with? expn) (interpret-with expn env) (is-if? expn) (interpret-if expn env) ; functions (is-function-application? expn env) (interpret-function-application expn env) :else 'error)))

Read the article
another question about OpenGL ES rendering to texture

- by ensoreus

Hello, pros and gurus! Here is another question about rendering to texture. The whole stuff is all about saving texture between passing image into different filters. Maybe all iPhone developers knows about Apple's sample code with OpenGL processing where they used GL filters(functions), but pass into them the same source image. I need to edit an image by passing it sequentelly with saving the state of the image to edit. I am very noob in OpenGL, so I spent increadibly a lot of to solve the issue. So, I desided to create 2 FBO's and attach source image and temporary image as a textures to render in. Here is my init routine: glEnableClientState(GL_VERTEX_ARRAY); glEnableClientState(GL_TEXTURE_COORD_ARRAY); glEnable(GL_TEXTURE_2D); glPixelStorei(GL_UNPACK_ALIGNMENT, 1); glGetIntegerv(GL_FRAMEBUFFER_BINDING_OES, (GLint *)&SystemFBO); glImage = [self loadTexture:preparedImage]; //source image for (int i = 0; i < 4; i++) { fullquad[i].s *= glImage->s; fullquad[i].t *= glImage->t; flipquad[i].s *= glImage->s; flipquad[i].t *= glImage->t; } tmpImage = [self loadEmptyTexture]; //editing image glGenFramebuffersOES(1, &tmpImageFBO); glBindFramebufferOES(GL_FRAMEBUFFER_OES, tmpImageFBO); glFramebufferTexture2DOES(GL_FRAMEBUFFER_OES, GL_COLOR_ATTACHMENT0_OES, GL_TEXTURE_2D, tmpImage->texID, 0); GLenum status = glCheckFramebufferStatusOES(GL_FRAMEBUFFER_OES); if(status != GL_FRAMEBUFFER_COMPLETE_OES) { NSLog(@"failed to make complete tmp framebuffer object %x", status); } glBindTexture(GL_TEXTURE_2D, 0); glBindFramebufferOES(GL_FRAMEBUFFER_OES, 0); glGenRenderbuffersOES(1, &glImageFBO); glBindFramebufferOES(GL_FRAMEBUFFER_OES, glImageFBO); glFramebufferTexture2DOES(GL_FRAMEBUFFER_OES, GL_COLOR_ATTACHMENT0_OES, GL_TEXTURE_2D, glImage->texID, 0); status = glCheckFramebufferStatusOES(GL_FRAMEBUFFER_OES) ; if(status != GL_FRAMEBUFFER_COMPLETE_OES) { NSLog(@"failed to make complete cur framebuffer object %x", status); } glBindTexture(GL_TEXTURE_2D, 0); glBindFramebufferOES(GL_FRAMEBUFFER_OES, 0); When user drag the slider, this routine invokes to apply changes -(void)setContrast:(CGFloat)value{ contrast = value; if(flag!=mfContrast){ NSLog(@"contrast: dumped"); flag = mfContrast; glBindFramebufferOES(GL_FRAMEBUFFER_OES, glImageFBO); glClearColor(1,1,1,1); glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT); glMatrixMode(GL_PROJECTION); glLoadIdentity(); glOrthof(0, 512, 0, 512, -1, 1); glMatrixMode(GL_MODELVIEW); glLoadIdentity(); glScalef(512, 512, 1); glBindTexture(GL_TEXTURE_2D, tmpImage->texID); glViewport(0, 0, 512, 512); glVertexPointer(2, GL_FLOAT, sizeof(V2fT2f), &fullquad[0].x); glTexCoordPointer(2, GL_FLOAT, sizeof(V2fT2f), &fullquad[0].s); glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); glBindFramebufferOES(GL_FRAMEBUFFER_OES, 0); } glBindFramebufferOES(GL_FRAMEBUFFER_OES,tmpImageFBO); glClearColor(0,0,0,1); glClear(GL_COLOR_BUFFER_BIT); glEnable(GL_TEXTURE_2D); glActiveTexture(GL_TEXTURE0); glMatrixMode(GL_PROJECTION); glLoadIdentity(); glOrthof(0, 512, 0, 512, -1, 1); glMatrixMode(GL_MODELVIEW); glLoadIdentity(); glScalef(512, 512, 1); glBindTexture(GL_TEXTURE_2D, glImage->texID); glViewport(0, 0, 512, 512); [self contrastProc:fullquad value:contrast]; glBindFramebufferOES(GL_FRAMEBUFFER_OES, 0); [self redraw]; } Here are two cases: if it is the same filter(edit mode) to use, I bind tmpFBO to draw into tmpImage texture and edit glImage texture. contrastProc is a pure routine from Apples's sample. If it is another mode, than I save edited image by drawing tmpImage texture in source texture glImage, binded with glImageFBO. After that I call redraw: glBindFramebufferOES(GL_FRAMEBUFFER_OES, SystemFBO); glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); glMatrixMode(GL_PROJECTION); glLoadIdentity(); glOrthof(0, kTexWidth, 0, kTexHeight, -1, 1); glMatrixMode(GL_MODELVIEW); glLoadIdentity(); glScalef(kTexWidth, kTexHeight, 1); glBindTexture(GL_TEXTURE_2D, glImage->texID); glViewport(0, 0, kTexWidth, kTexHeight); glVertexPointer(2, GL_FLOAT, sizeof(V2fT2f), &flipquad[0].x); glTexCoordPointer(2, GL_FLOAT, sizeof(V2fT2f), &flipquad[0].s); glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); glBindFramebufferOES(GL_FRAMEBUFFER_OES, 0); And here it binds visual framebuffer and dispose glImage texture. So, the result is VERY aggresive filtering. Increasing contrast volume by just 0.2 brings image to state that comparable with 0.9 contrast volume in Apple's sample code project. I miss something obvious, I guess. Interesting, if I disabple line glBindTexture(GL_TEXTURE_2D, glImage->texID); in setContrast routine it brings no effect. At all. If I replace tmpImageFBO with SystemFBO to draw glImage directly on display(and disabling redraw invoking line), all works fine. Please, HELP ME!!! :(

Read the article
ANTS CLR and Memory Profiler In Depth Review (Part 1 of 2 – CLR Profiler)

- by ToStringTheory

One of the things that people might not know about me, is my obsession to make my code as efficient as possible. Many people might not realize how much of a task or undertaking that this might be, but it is surely a task as monumental as climbing Mount Everest, except this time it is a challenge for the mind… In trying to make code efficient, there are many different factors that play a part – size of project or solution, tiers, language used, experience and training of the programmer, technologies used, maintainability of the code – the list can go on for quite some time. I spend quite a bit of time when developing trying to determine what is the best way to implement a feature to accomplish the efficiency that I look to achieve. One program that I have recently come to learn about – Red Gate ANTS Performance (CLR) and Memory profiler gives me tools to accomplish that job more efficiently as well. In this review, I am going to cover some of the features of the ANTS profiler set by compiling some hideous example code to test against. Notice As a member of the Geeks With Blogs Influencers program, one of the perks is the ability to review products, in exchange for a free license to the program. I have not let this affect my opinions of the product in any way, and Red Gate nor Geeks With Blogs has tried to influence my opinion regarding this product in any way. Introduction The ANTS Profiler pack provided by Red Gate was something that I had not heard of before receiving an email regarding an offer to review it for a license. Since I look to make my code efficient, it was a no brainer for me to try it out! One thing that I have to say took me by surprise is that upon downloading the program and installing it you fill out a form for your usual contact information. Sure enough within 2 hours, I received an email from a sales representative at Red Gate asking if she could help me to achieve the most out of my trial time so it wouldn’t go to waste. After replying to her and explaining that I was looking to review its feature set, she put me in contact with someone that setup a demo session to give me a quick rundown of its features via an online meeting. After having dealt with a massive ordeal with one of my utility companies and their complete lack of customer service, Red Gates friendly and helpful representatives were a breath of fresh air, and something I was thankful for. ANTS CLR Profiler The ANTS CLR profiler is the thing I want to focus on the most in this post, so I am going to dive right in now. Install was simple and took no time at all. It installed both the profiler for the CLR and Memory, but also visual studio extensions to facilitate the usage of the profilers (click any images for full size images): The Visual Studio menu options (under ANTS menu) Starting the CLR Performance Profiler from the start menu yields this window If you follow the instructions after launching the program from the start menu (Click File > New Profiling Session to start a new project), you are given a dialog with plenty of options for profiling: The New Session dialog. Lots of options. One thing I noticed is that the buttons in the lower right were half-covered by the panel of the application. If I had to guess, I would imagine that this is caused by my DPI settings being set to 125%. This is a problem I have seen in other applications as well that don’t scale well to different dpi scales. The profiler options give you the ability to profile: .NET Executable ASP.NET web application (hosted in IIS) ASP.NET web application (hosted in IIS express) ASP.NET web application (hosted in Cassini Web Development Server) SharePoint web application (hosted in IIS) Silverlight 4+ application Windows Service COM+ server XBAP (local XAML browser application) Attach to an already running .NET 4 process Choosing each option provides a varying set of other variables/options that one can set including options such as application arguments, operating path, record I/O performance performance counters to record (43 counters in all!), etc… All in all, they give you the ability to profile many different .Net project types, and make it simple to do so. In most cases of my using this application, I would be using the built in Visual Studio extensions, as they automatically start a new profiling project in ANTS with the options setup, and start your program, however RedGate has made it easy enough to profile outside of Visual Studio as well. On the flip side of this, as someone who lives most of their work life in Visual Studio, one thing I do wish is that instead of opening an entirely separate application/gui to perform profiling after launching, that instead they would provide a Visual Studio panel with the information, and integrate more of the profiling project information into Visual Studio. So, now that we have an idea of what options that the profiler gives us, its time to test its abilities and features. Horrendous Example Code – Prime Number Generator One of my interests besides development, is Physics and Math – what I went to college for. I have especially always been interested in prime numbers, as they are something of a mystery… So, I decided that I would go ahead and to test the abilities of the profiler, I would write a small program, website, and library to generate prime numbers in the quantity that you ask for. I am going to start off with some terrible code, and show how I would see the profiler being used as a development tool. First off, the IPrimes interface (all code is downloadable at the end of the post): interface IPrimes { IEnumerable<int> GetPrimes(int retrieve); } Simple enough, right? Anything that implements the interface will (hopefully) provide an IEnumerable of int, with the quantity specified in the parameter argument. Next, I am going to implement this interface in the most basic way: public class DumbPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _analyzing = 4; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; //start dividing at 2 //divide until number is reached, or determined not prime for (int i = 2; i < _analyzing && isPrime; i++) { //if (i) goes into _analyzing without a remainder, //_analyzing is NOT prime if (_analyzing % i == 0) isPrime = false; } //if it is prime, add to found list if (isPrime) _foundPrimes.Add(_analyzing); //increment number to analyze next _analyzing++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } This is the simplest way to get primes in my opinion. Checking each number by the straight definition of a prime – is it divisible by anything besides 1 and itself. I have included this code in a base class library for my solution, as I am going to use it to demonstrate a couple of features of ANTS. This class library is consumed by a simple non-MVVM WPF application, and a simple MVC4 website. I will not post the WPF code here inline, as it is simply an ObservableCollection<int>, a label, two textbox’s, and a button. Starting a new Profiling Session So, in Visual Studio, I have just completed my first stint developing the GUI and DumbPrimes IPrimes class, so now I want to check my codes efficiency by profiling it. All I have to do is build the solution (surprised initiating a profiling session doesn’t do this, but I suppose I can understand it), and then click the ANTS menu, followed by Profile Performance. I am then greeted by the profiler starting up and already monitoring my program live: You are provided with a realtime graph at the top, and a pane at the bottom giving you information on how to proceed. I am going to start by asking my program to show me the first 15000 primes: After the program finally began responding again (I did all the work on the main UI thread – how bad!), I stopped the profiler, which did kill the process of my program too. One important thing to note, is that the profiler by default wants to give you a lot of detail about the operation – line hit counts, time per line, percent time per line, etc… The important thing to remember is that this itself takes a lot of time. When running my program without the profiler attached, it can generate the 15000 primes in 5.18 seconds, compared to 74.5 seconds – almost a 1500 percent increase. While this may seem like a lot, remember that there is a trade off. It may be WAY more inefficient, however, I am able to drill down and make improvements to specific problem areas, and then decrease execution time all around. Analyzing the Profiling Session After clicking ‘Stop Profiling’, the process running my application stopped, and the entire execution time was automatically selected by ANTS, and the results shown below: Now there are a number of interesting things going on here, I am going to cover each in a section of its own: Real Time Performance Counter Bar (top of screen) At the top of the screen, is the real time performance bar. As your application is running, this will constantly update with the currently selected performance counters status. A couple of cool things to note are the fact that you can drag a selection around specific time periods to drill down the detail views in the lower 2 panels to information pertaining to only that period. After selecting a time period, you can bookmark a section and name it, so that it is easy to find later, or after reloaded at a later time. You can also zoom in, out, or fit the graph to the space provided – useful for drilling down. It may be hard to see, but at the top of the processor time graph below the time ticks, but above the red usage graph, there is a green bar. This bar shows at what times a method that is selected in the ‘Call tree’ panel is called. Very cool to be able to click on a method and see at what times it made an impact. As I said before, ANTS provides 43 different performance counters you can hook into. Click the arrow next to the Performance tab at the top will allow you to change between different counters if you have them selected: Method Call Tree, ADO.Net Database Calls, File IO – Detail Panel Red Gate really hit the mark here I think. When you select a section of the run with the graph, the call tree populates to fill a hierarchical tree of method calls, with information regarding each of the methods. By default, methods are hidden where the source is not provided (framework type code), however, Red Gate has integrated Reflector into ANTS, so even if you don’t have source for something, you can select a method and get the source if you want. Methods are also hidden where the impact is seen as insignificant – methods that are only executed for 1% of the time of the overall calling methods time; in other words, working on making them better is not where your efforts should be focused. – Smart! Source Panel – Detail Panel The source panel is where you can see line level information on your code, showing the code for the currently selected method from the Method Call Tree. If the code is not available, Reflector takes care of it and shows the code anyways! As you can notice, there does seem to be a problem with how ANTS determines what line is the actual line that a call is completed on. I have suspicions that this may be due to some of the inline code optimizations that the CLR applies upon compilation of the assembly. In a method with comments, the problem is much more severe: As you can see here, apparently the most offending code in my base library was a comment – *gasp*! Removing the comments does help quite a bit, however I hope that Red Gate works on their counter algorithm soon to improve the logic on positioning for statistics: I did a small test just to demonstrate the lines are correct without comments. For me, it isn’t a deal breaker, as I can usually determine the correct placements by looking at the application code in the region and determining what makes sense, but it is something that would probably build up some irritation with time. Feature – Suggest Method for Optimization A neat feature to really help those in need of a pointer, is the menu option under tools to automatically suggest methods to optimize/improve: Nice feature – clicking it filters the call tree and stars methods that it thinks are good candidates for optimization. I do wish that they would have made it more visible for those of use who aren’t great on sight: Process Integration I do think that this could have a place in my process. After experimenting with the profiler, I do think it would be a great benefit to do some development, testing, and then after all the bugs are worked out, use the profiler to check on things to make sure nothing seems like it is hogging more than its fair share. For example, with this program, I would have developed it, ran it, tested it – it works, but slowly. After looking at the profiler, and seeing the massive amount of time spent in 1 method, I might go ahead and try to re-implement IPrimes (I actually would probably rewrite the offending code, but so that I can distribute both sets of code easily, I’m just going to make another implementation of IPrimes). Using two pieces of knowledge about prime numbers can make this method MUCH more efficient – prime numbers fall into two buckets 6k+/-1 , and a number is prime if it is not divisible by any other primes before it: public class SmartPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _k = 1; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; int potentialPrime; //analyze 6k-1 //assign the value to potential potentialPrime = 6 * _k - 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); if (_foundPrimes.Count() == retrieve) break; //analyze 6k+1 //assign the value to potential potentialPrime = 6 * _k + 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); //increment k to analyze next _k++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } Now there are definitely more things I can do to help make this more efficient, but for the scope of this example, I think this is fine (but still hideous)! Profiling this now yields a happy surprise 27 seconds to generate the 15000 primes with the profiler attached, and only 1.43 seconds without. One important thing I wanted to call out though was the performance graph now: Notice anything odd? The %Processor time is above 100%. This is because there is now more than 1 core in the operation. A better label for the chart in my mind would have been %Core time, but to each their own. Another odd thing I noticed was that the profiler seemed to be spot on this time in my DumbPrimes class with line details in source, even with comments.. Odd. Profiling Web Applications The last thing that I wanted to cover, that means a lot to me as a web developer, is the great amount of work that Red Gate put into the profiler when profiling web applications. In my solution, I have a simple MVC4 application setup with 1 page, a single input form, that will output prime values as my WPF app did. Launching the profiler from Visual Studio as before, nothing is really different in the profiler window, however I did receive a UAC prompt for a Red Gate helper app to integrate with the web server without notification. After requesting 500, 1000, 2000, and 5000 primes, and looking at the profiler session, things are slightly different from before: As you can see, there are 4 spikes of activity in the processor time graph, but there is also something new in the call tree: That’s right – ANTS will actually group method calls by get/post operations, so it is easier to find out what action/page is giving the largest problems… Pretty cool in my mind! Overview Overall, I think that Red Gate ANTS CLR Profiler has a lot to offer, however I think it also has a long ways to go. 3 Biggest Pros: Ability to easily drill down from time graph, to method calls, to source code Wide variety of counters to choose from when profiling your application Excellent integration/grouping of methods being called from web applications by request – BRILLIANT! 3 Biggest Cons: Issue regarding line details in source view Nit pick – Processor time vs. Core time Nit pick – Lack of full integration with Visual Studio Ratings Ease of Use (7/10) – I marked down here because of the problems with the line level details and the extra work that that entails, and the lack of better integration with Visual Studio. Effectiveness (10/10) – I believe that the profiler does EXACTLY what it purports to do. Especially with its large variety of performance counters, a definite plus! Features (9/10) – Besides the real time performance monitoring, and the drill downs that I’ve shown here, ANTS also has great integration with ADO.Net, with the ability to show database queries run by your application in the profiler. This, with the line level details, the web request grouping, reflector integration, and various options to customize your profiling session I think create a great set of features! Customer Service (10/10) – My entire experience with Red Gate personnel has been nothing but good. their people are friendly, helpful, and happy! UI / UX (8/10) – The interface is very easy to get around, and all of the options are easy to find. With a little bit of poking around, you’ll be optimizing Hello World in no time flat! Overall (8/10) – Overall, I am happy with the Performance Profiler and its features, as well as with the service I received when working with the Red Gate personnel. I WOULD recommend you trying the application and seeing if it would fit into your process, BUT, remember there are still some kinks in it to hopefully be worked out. My next post will definitely be shorter (hopefully), but thank you for reading up to here, or skipping ahead! Please, if you do try the product, drop me a message and let me know what you think! I would love to hear any opinions you may have on the product. Code Feel free to download the code I used above – download via DropBox

Read the article
Towards Database Continuous Delivery – What Next after Continuous Integration? A Checklist

- by Ben Rees

.dbd-banner p{ font-size:0.75em; padding:0 0 10px; margin:0 } .dbd-banner p span{ color:#675C6D; } .dbd-banner p:last-child{ padding:0; } @media ALL and (max-width:640px){ .dbd-banner{ background:#f0f0f0; padding:5px; color:#333; margin-top: 5px; } } -- Database delivery patterns & practices STAGE 4 AUTOMATED DEPLOYMENT If you’ve been fortunate enough to get to the stage where you’ve implemented some sort of continuous integration process for your database updates, then hopefully you’re seeing the benefits of that investment – constant feedback on changes your devs are making, advanced warning of data loss (prior to the production release on Saturday night!), a nice suite of automated tests to check business logic, so you know it’s going to work when it goes live, and so on. But what next? What can you do to improve your delivery process further, moving towards a full continuous delivery process for your database? In this article I describe some of the issues you might need to tackle on the next stage of this journey, and how to plan to overcome those obstacles before they appear. Our Database Delivery Learning Program consists of four stages, really three – source controlling a database, running continuous integration processes, then how to set up automated deployment (the middle stage is split in two – basic and advanced continuous integration, making four stages in total). If you’ve managed to work through the first three of these stages – source control, basic, then advanced CI, then you should have a solid change management process set up where, every time one of your team checks in a change to your database (whether schema or static reference data), this change gets fully tested automatically by your CI server. But this is only part of the story. Great, we know that our updates work, that the upgrade process works, that the upgrade isn’t going to wipe our 4Tb of production data with a single DROP TABLE. But – how do you get this (fully tested) release live? Continuous delivery means being always ready to release your software at any point in time. There’s a significant gap between your latest version being tested, and it being easily releasable. Just a quick note on terminology – there’s a nice piece here from Atlassian on the difference between continuous integration, continuous delivery and continuous deployment. This piece also gives a nice description of the benefits of continuous delivery. These benefits have been summed up by Jez Humble at Thoughtworks as: “Continuous delivery is a set of principles and practices to reduce the cost, time, and risk of delivering incremental changes to users” There’s another really useful piece here on Simple-Talk about the need for continuous delivery and how it applies to the database written by Phil Factor – specifically the extra needs and complexities of implementing a full CD solution for the database (compared to just implementing CD for, say, a web app). So, hopefully you’re convinced of moving on the the next stage! The next step after CI is to get some sort of automated deployment (or “release management”) process set up. But what should I do next? What do I need to plan and think about for getting my automated database deployment process set up? Can’t I just install one of the many release management tools available and hey presto, I’m ready! If only it were that simple. Below I list some of the areas that it’s worth spending a little time on, where a little planning and prep could go a long way. It’s also worth pointing out, that this should really be an evolving process. Depending on your starting point of course, it can be a long journey from your current setup to a full continuous delivery pipeline. If you’ve got a CI mechanism in place, you’re certainly a long way down that path. Nevertheless, we’d recommend evolving your process incrementally. Pages 157 and 129-141 of the book on Continuous Delivery (by Jez Humble and Dave Farley) have some great guidance on building up a pipeline incrementally: http://www.amazon.com/Continuous-Delivery-Deployment-Automation-Addison-Wesley/dp/0321601912 For now, in this post, we’ll look at the following areas for your checklist: You and Your Team Environments The Deployment Process Rollback and Recovery Development Practices You and Your Team It’s a cliché in the DevOps community that “It’s not all about processes and tools, really it’s all about a culture”. As stated in this DevOps report from Puppet Labs: “DevOps processes and tooling contribute to high performance, but these practices alone aren’t enough to achieve organizational success. The most common barriers to DevOps adoption are cultural: lack of manager or team buy-in, or the value of DevOps isn’t understood outside of a specific group”. Like most clichés, there’s truth in there – if you want to set up a database continuous delivery process, you need to get your boss, your department, your company (if relevant) onside. Why? Because it’s an investment with the benefits coming way down the line. But the benefits are huge – for HP, in the book A Practical Approach to Large-Scale Agile Development: How HP Transformed LaserJet FutureSmart Firmware, these are summarized as: -2008 to present: overall development costs reduced by 40% -Number of programs under development increased by 140% -Development costs per program down 78% -Firmware resources now driving innovation increased by a factor of 8 (from 5% working on new features to 40% But what does this mean? It means that, when moving to the next stage, to make that extra investment in automating your deployment process, it helps a lot if everyone is convinced that this is a good thing. That they understand the benefits of automated deployment and are willing to make the effort to transform to a new way of working. Incidentally, if you’re ever struggling to convince someone of the value I’d strongly recommend just buying them a copy of this book – a great read, and a very practical guide to how it can really work at a large org. I’ve spoken to many customers who have implemented database CI who describe their deployment process as “The point where automation breaks down. Up to that point, the CI process runs, untouched by human hand, but as soon as that’s finished we revert to manual.” This deployment process can involve, for example, a DBA manually comparing an environment (say, QA) to production, creating the upgrade scripts, reading through them, checking them against an Excel document emailed to him/her the night before, turning to page 29 in his/her notebook to double-check how replication is switched off and on for deployments, and so on and so on. Painful, error-prone and lengthy. But the point is, if this is something like your deployment process, telling your DBA “We’re changing everything you do and your toolset next week, to automate most of your role – that’s okay isn’t it?” isn’t likely to go down well. There’s some work here to bring him/her onside – to explain what you’re doing, why there will still be control of the deployment process and so on. Or of course, if you’re the DBA looking after this process, you have to do a similar job in reverse. You may have researched and worked out how you’d like to change your methodology to start automating your painful release process, but do the dev team know this? What if they have to start producing different artifacts for you? Will they be happy with this? Worth talking to them, to find out. As well as talking to your DBA/dev team, the other group to get involved before implementation is your manager. And possibly your manager’s manager too. As mentioned, unless there’s buy-in “from the top”, you’re going to hit problems when the implementation starts to get rocky (and what tool/process implementations don’t get rocky?!). You need to have support from someone senior in your organisation – someone you can turn to when you need help with a delayed implementation, lack of resources or lack of progress. Actions: Get your DBA involved (or whoever looks after live deployments) and discuss what you’re planning to do or, if you’re the DBA yourself, get the dev team up-to-speed with your plans, Get your boss involved too and make sure he/she is bought in to the investment. Environments Where are you going to deploy to? And really this question is – what environments do you want set up for your deployment pipeline? Assume everyone has “Production”, but do you have a QA environment? Dedicated development environments for each dev? Proper pre-production? I’ve seen every setup under the sun, and there is often a big difference between “What we want, to do continuous delivery properly” and “What we’re currently stuck with”. Some of these differences are: What we want What we’ve got Each developer with their own dedicated database environment A single shared “development” environment, used by everyone at once An Integration box used to test the integration of all check-ins via the CI process, along with a full suite of unit-tests running on that machine In fact if you have a CI process running, you’re likely to have some sort of integration server running (even if you don’t call it that!). Whether you have a full suite of unit tests running is a different question… Separate QA environment used explicitly for manual testing prior to release “We just test on the dev environments, or maybe pre-production” A proper pre-production (or “staging”) box that matches production as closely as possible Hopefully a pre-production box of some sort. But does it match production closely!? A production environment reproducible from source control A production box which has drifted significantly from anything in source control The big question is – how much time and effort are you going to invest in fixing these issues? In reality this just involves figuring out which new databases you’re going to create and where they’ll be hosted – VMs? Cloud-based? What about size/data issues – what data are you going to include on dev environments? Does it need to be masked to protect access to production data? And often the amount of work here really depends on whether you’re working on a new, greenfield project, or trying to update an existing, brownfield application. There’s a world if difference between starting from scratch with 4 or 5 clean environments (reproducible from source control of course!), and trying to re-purpose and tweak a set of existing databases, with all of their surrounding processes and quirks. But for a proper release management process, ideally you have: Dedicated development databases, An Integration server used for testing continuous integration and running unit tests. [NB: This is the point at which deployments are automatic, without human intervention. Each deployment after this point is a one-click (but human) action], QA – QA engineers use a one-click deployment process to automatically* deploy chosen releases to QA for testing, Pre-production. The environment you use to test the production release process, Production. * A note on the use of the word “automatic” – when carrying out automated deployments this does not mean that the deployment is happening without human intervention (i.e. that something is just deploying over and over again). It means that the process of carrying out the deployment is automatic in that it’s not a person manually running through a checklist or set of actions. The deployment still requires a single-click from a user. Actions: Get your environments set up and ready, Set access permissions appropriately, Make sure everyone understands what the environments will be used for (it’s not a “free-for-all” with all environments to be accessed, played with and changed by development). The Deployment Process As described earlier, most existing database deployment processes are pretty manual. The following is a description of a process we hear very often when we ask customers “How do your database changes get live? How does your manual process work?” Check pre-production matches production (use a schema compare tool, like SQL Compare). Sometimes done by taking a backup from production and restoring in to pre-prod, Again, use a schema compare tool to find the differences between the latest version of the database ready to go live (i.e. what the team have been developing). This generates a script, User (generally, the DBA), reviews the script. This often involves manually checking updates against a spreadsheet or similar, Run the script on pre-production, and check there are no errors (i.e. it upgrades pre-production to what you hoped), If all working, run the script on production.* * this assumes there’s no problem with production drifting away from pre-production in the interim time period (i.e. someone has hacked something in to the production box without going through the proper change management process). This difference could undermine the validity of your pre-production deployment test. Red Gate is currently working on a free tool to detect this problem – sign up here at www.sqllighthouse.com, if you’re interested in testing early versions. There are several variations on this process – some better, some much worse! How do you automate this? In particular, step 3 – surely you can’t automate a DBA checking through a script, that everything is in order!? The key point here is to plan what you want in your new deployment process. There are so many options. At one extreme, pure continuous deployment – whenever a dev checks something in to source control, the CI process runs (including extensive and thorough testing!), before the deployment process keys in and automatically deploys that change to the live box. Not for the faint hearted – and really not something we recommend. At the other extreme, you might be more comfortable with a semi-automated process – the pre-production/production matching process is automated (with an error thrown if these environments don’t match), followed by a manual intervention, allowing for script approval by the DBA. One he/she clicks “Okay, I’m happy for that to go live”, the latter stages automatically take the script through to live. And anything in between of course – and other variations. But we’d strongly recommended sitting down with a whiteboard and your team, and spending a couple of hours mapping out “What do we do now?”, “What do we actually want?”, “What will satisfy our needs for continuous delivery, but still maintaining some sort of continuous control over the process?” NB: Most of what we’re discussing here is about production deployments. It’s important to note that you will also need to map out a deployment process for earlier environments (for example QA). However, these are likely to be less onerous, and many customers opt for a much more automated process for these boxes. Actions: Sit down with your team and a whiteboard, and draw out the answers to the questions above for your production deployments – “What do we do now?”, “What do we actually want?”, “What will satisfy our needs for continuous delivery, but still maintaining some sort of continuous control over the process?” Repeat for earlier environments (QA and so on). Rollback and Recovery If only every deployment went according to plan! Unfortunately they don’t – and when things go wrong, you need a rollback or recovery plan for what you’re going to do in that situation. Once you move in to a more automated database deployment process, you’re far more likely to be deploying more frequently than before. No longer once every 6 months, maybe now once per week, or even daily. Hence the need for a quick rollback or recovery process becomes paramount, and should be planned for. NB: These are mainly scenarios for handling rollbacks after the transaction has been committed. If a failure is detected during the transaction, the whole transaction can just be rolled back, no problem. There are various options, which we’ll explore in subsequent articles, things like: Immediately restore from backup, Have a pre-tested rollback script (remembering that really this is a “roll-forward” script – there’s not really such a thing as a rollback script for a database!) Have fallback environments – for example, using a blue-green deployment pattern. Different options have pros and cons – some are easier to set up, some require more investment in infrastructure; and of course some work better than others (the key issue with using backups, is loss of the interim transaction data that has been added between the failed deployment and the restore). The best mechanism will be primarily dependent on how your application works and how much you need a cast-iron failsafe mechanism. Actions: Work out an appropriate rollback strategy based on how your application and business works, your appetite for investment and requirements for a completely failsafe process. Development Practices This is perhaps the more difficult area for people to tackle. The process by which you can deploy database updates is actually intrinsically linked with the patterns and practices used to develop that database and linked application. So you need to decide whether you want to implement some changes to the way your developers actually develop the database (particularly schema changes) to make the deployment process easier. A good example is the pattern “Branch by abstraction”. Explained nicely here, by Martin Fowler, this is a process that can be used to make significant database changes (e.g. splitting a table) in a step-wise manner so that you can always roll back, without data loss – by making incremental updates to the database backward compatible. Slides 103-108 of the following slidedeck, from Niek Bartholomeus explain the process: https://speakerdeck.com/niekbartho/orchestration-in-meatspace As these slides show, by making a significant schema change in multiple steps – where each step can be rolled back without any loss of new data – this affords the release team the opportunity to have zero-downtime deployments with considerably less stress (because if an increment goes wrong, they can roll back easily). There are plenty more great patterns that can be implemented – the book Refactoring Databases, by Scott Ambler and Pramod Sadalage is a great read, if this is a direction you want to go in: http://www.amazon.com/Refactoring-Databases-Evolutionary-paperback-Addison-Wesley/dp/0321774515 But the question is – how much of this investment are you willing to make? How often are you making significant schema changes that would require these best practices? Again, there’s a difference here between migrating old projects and starting afresh – with the latter it’s much easier to instigate best practice from the start. Actions: For your business, work out how far down the path you want to go, amending your database development patterns to “best practice”. It’s a trade-off between implementing quality processes, and the necessity to do so (depending on how often you make complex changes). Socialise these changes with your development group. No-one likes having “best practice” changes imposed on them, so good to introduce these ideas and the rationale behind them early. Summary The next stages of implementing a continuous delivery pipeline for your database changes (once you have CI up and running) require a little pre-planning, if you want to get the most out of the work, and for the implementation to go smoothly. We’ve covered some of the checklist of areas to consider – mainly in the areas of “Getting the team ready for the changes that are coming” and “Planning our your pipeline, environments, patterns and practices for development”, though there will be more detail, depending on where you’re coming from – and where you want to get to. This article is part of our database delivery patterns & practices series on Simple Talk. Find more articles for version control, automated testing, continuous integration & deployment.

Read the article
Custom language - FOR loop in a clojure interpeter?

- by Mark

I have a basic interpreter in clojure. Now i need to implement for (initialisation; finish-test; loop-update) { statements } Implement a similar for-loop for the interpreted language. The pattern will be: (for variable-declarations end-test loop-update do statement) The variable-declarations will set up initial values for variables.The end-test returns a boolean, and the loop will end if end-test returns false. The statement is interpreted followed by the loop-update for each pass of the loop. Examples of use are: (run ’(for ((i 0)) (< i 10) (set i (+ 1 i)) do (println i))) (run ’(for ((i 0) (j 0)) (< i 10) (seq (set i (+ 1 i)) (set j (+ j (* 2 i)))) do (println j))) inside my interpreter. I will attach my interpreter code I got so far. Any help is appreciated. Interpreter (declare interpret make-env) ;; needed as language terms call out to 'interpret' (def do-trace false) ;; change to 'true' to show calls to 'interpret' ;; simple utilities (def third ; return third item in a list (fn [a-list] (second (rest a-list)))) (def fourth ; return fourth item in a list (fn [a-list] (third (rest a-list)))) (def run ; make it easy to test the interpreter (fn [e] (println "Processing: " e) (println "=> " (interpret e (make-env))))) ;; for the environment (def make-env (fn [] '())) (def add-var (fn [env var val] (cons (list var val) env))) (def lookup-var (fn [env var] (cond (empty? env) 'error (= (first (first env)) var) (second (first env)) :else (lookup-var (rest env) var)))) ;; for terms in language ;; -- define numbers (def is-number? (fn [expn] (number? expn))) (def interpret-number (fn [expn env] expn)) ;; -- define symbols (def is-symbol? (fn [expn] (symbol? expn))) (def interpret-symbol (fn [expn env] (lookup-var env expn))) ;; -- define boolean (def is-boolean? (fn [expn] (or (= expn 'true) (= expn 'false)))) (def interpret-boolean (fn [expn env] expn)) ;; -- define functions (def is-function? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'lambda (first expn))))) (def interpret-function ; keep function definitions as they are written (fn [expn env] expn)) ;; -- define addition (def is-plus? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '+ (first expn))))) (def interpret-plus (fn [expn env] (+ (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define subtraction (def is-minus? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '- (first expn))))) (def interpret-minus (fn [expn env] (- (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define multiplication (def is-times? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '* (first expn))))) (def interpret-times (fn [expn env] (* (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define division (def is-divides? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '/ (first expn))))) (def interpret-divides (fn [expn env] (/ (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define equals test (def is-equals? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '= (first expn))))) (def interpret-equals (fn [expn env] (= (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define greater-than test (def is-greater-than? (fn [expn] (and (list? expn) (= 3 (count expn)) (= '> (first expn))))) (def interpret-greater-than (fn [expn env] (> (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define not (def is-not? (fn [expn] (and (list? expn) (= 2 (count expn)) (= 'not (first expn))))) (def interpret-not (fn [expn env] (not (interpret (second expn) env)))) ;; -- define or (def is-or? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'or (first expn))))) (def interpret-or (fn [expn env] (or (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define and (def is-and? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'and (first expn))))) (def interpret-and (fn [expn env] (and (interpret (second expn) env) (interpret (third expn) env)))) ;; -- define print (def is-print? (fn [expn] (and (list? expn) (= 2 (count expn)) (= 'println (first expn))))) (def interpret-print (fn [expn env] (println (interpret (second expn) env)))) ;; -- define with (def is-with? (fn [expn] (and (list? expn) (= 3 (count expn)) (= 'with (first expn))))) (def interpret-with (fn [expn env] (interpret (third expn) (add-var env (first (second expn)) (interpret (second (second expn)) env))))) ;; -- define if (def is-if? (fn [expn] (and (list? expn) (= 4 (count expn)) (= 'if (first expn))))) (def interpret-if (fn [expn env] (cond (interpret (second expn) env) (interpret (third expn) env) :else (interpret (fourth expn) env)))) ;; -- define function-application (def is-function-application? (fn [expn env] (and (list? expn) (= 2 (count expn)) (is-function? (interpret (first expn) env))))) (def interpret-function-application (fn [expn env] (let [function (interpret (first expn) env)] (interpret (third function) (add-var env (first (second function)) (interpret (second expn) env)))))) ;; the interpreter itself (def interpret (fn [expn env] (cond do-trace (println "Interpret is processing: " expn)) (cond ; basic values (is-number? expn) (interpret-number expn env) (is-symbol? expn) (interpret-symbol expn env) (is-boolean? expn) (interpret-boolean expn env) (is-function? expn) (interpret-function expn env) ; built-in functions (is-plus? expn) (interpret-plus expn env) (is-minus? expn) (interpret-minus expn env) (is-times? expn) (interpret-times expn env) (is-divides? expn) (interpret-divides expn env) (is-equals? expn) (interpret-equals expn env) (is-greater-than? expn) (interpret-greater-than expn env) (is-not? expn) (interpret-not expn env) (is-or? expn) (interpret-or expn env) (is-and? expn) (interpret-and expn env) (is-print? expn) (interpret-print expn env) ; special syntax (is-with? expn) (interpret-with expn env) (is-if? expn) (interpret-if expn env) ; functions (is-function-application? expn env) (interpret-function-application expn env) :else 'error))) ;; tests of using environment (println "Environment tests:") (println (add-var (make-env) 'x 1)) (println (add-var (add-var (add-var (make-env) 'x 1) 'y 2) 'x 3)) (println (lookup-var '() 'x)) (println (lookup-var '((x 1)) 'x)) (println (lookup-var '((x 1) (y 2)) 'x)) (println (lookup-var '((x 1) (y 2)) 'y)) (println (lookup-var '((x 3) (y 2) (x 1)) 'x)) ;; examples of using interpreter (println "Interpreter examples:") (run '1) (run '2) (run '(+ 1 2)) (run '(/ (* (+ 4 5) (- 2 4)) 2)) (run '(with (x 1) x)) (run '(with (x 1) (with (y 2) (+ x y)))) (run '(with (x (+ 2 4)) x)) (run 'false) (run '(not false)) (run '(with (x true) (with (y false) (or x y)))) (run '(or (= 3 4) (> 4 3))) (run '(with (x 1) (if (= x 1) 2 3))) (run '(with (x 2) (if (= x 1) 2 3))) (run '((lambda (n) (* 2 n)) 4)) (run '(with (double (lambda (n) (* 2 n))) (double 4))) (run '(with (sum-to (lambda (n) (if (= n 0) 0 (+ n (sum-to (- n 1)))))) (sum-to 100))) (run '(with (x 1) (with (f (lambda (n) (+ n x))) (with (x 2) (println (f 3))))))

Read the article
Odd behavior when recursively building a return type for variadic functions

- by Dennis Zickefoose

This is probably going to be a really simple explanation, but I'm going to give as much backstory as possible in case I'm wrong. Advanced apologies for being so verbose. I'm using gcc4.5, and I realize the c++0x support is still somewhat experimental, but I'm going to act on the assumption that there's a non-bug related reason for the behavior I'm seeing. I'm experimenting with variadic function templates. The end goal was to build a cons-list out of std::pair. It wasn't meant to be a custom type, just a string of pair objects. The function that constructs the list would have to be in some way recursive, with the ultimate return value being dependent on the result of the recursive calls. As an added twist, successive parameters are added together before being inserted into the list. So if I pass [1, 2, 3, 4, 5, 6] the end result should be {1+2, {3+4, 5+6}}. My initial attempt was fairly naive. A function, Build, with two overloads. One took two identical parameters and simply returned their sum. The other took two parameters and a parameter pack. The return value was a pair consisting of the sum of the two set parameters, and the recursive call. In retrospect, this was obviously a flawed strategy, because the function isn't declared when I try to figure out its return type, so it has no choice but to resolve to the non-recursive version. That I understand. Where I got confused was the second iteration. I decided to make those functions static members of a template class. The function calls themselves are not parameterized, but instead the entire class is. My assumption was that when the recursive function attempts to generate its return type, it would instantiate a whole new version of the structure with its own static function, and everything would work itself out. The result was: "error: no matching function for call to BuildStruct<double, double, char, char>::Go(const char&, const char&)" The offending code: static auto Go(const Type& t0, const Type& t1, const Types&... rest) -> std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> My confusion comes from the fact that the parameters to BuildStruct should always be the same types as the arguments sent to BuildStruct::Go, but in the error code Go is missing the initial two double parameters. What am I missing here? If my initial assumption about how the static functions would be chosen was incorrect, why is it trying to call the wrong function rather than just not finding a function at all? It seems to just be mixing types willy-nilly, and I just can't come up with an explanation as to why. If I add additional parameters to the initial call, it always burrows down to that last step before failing, so presumably the recursion itself is at least partially working. This is in direct contrast to the initial attempt, which always failed to find a function call right away. Ultimately, I've gotten past the problem, with a fairly elegant solution that hardly resembles either of the first two attempts. So I know how to do what I want to do. I'm looking for an explanation for the failure I saw. Full code to follow since I'm sure my verbal description was insufficient. First some boilerplate, if you feel compelled to execute the code and see it for yourself. Then the initial attempt, which failed reasonably, then the second attempt, which did not. #include <iostream> using std::cout; using std::endl; #include <utility> template<typename T1, typename T2> std::ostream& operator <<(std::ostream& str, const std::pair<T1, T2>& p) { return str << "[" << p.first << ", " << p.second << "]"; } //Insert code here int main() { Execute(5, 6, 4.3, 2.2, 'c', 'd'); Execute(5, 6, 4.3, 2.2); Execute(5, 6); return 0; } Non-struct solution: template<typename Type> Type BuildFunction(const Type& t0, const Type& t1) { return t0 + t1; } template<typename Type, typename... Rest> auto BuildFunction(const Type& t0, const Type& t1, const Rest&... rest) -> std::pair<Type, decltype(BuildFunction(rest...))> { return std::pair<Type, decltype(BuildFunction(rest...))> (t0 + t1, BuildFunction(rest...)); } template<typename... Types> void Execute(const Types&... t) { cout << BuildFunction(t...) << endl; } Resulting errors: test.cpp: In function 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]': test.cpp:33:35: instantiated from here test.cpp:28:3: error: no matching function for call to 'BuildFunction(const int&, const int&, const double&, const double&, const char&, const char&)' Struct solution: template<typename... Types> struct BuildStruct; template<typename Type> struct BuildStruct<Type, Type> { static Type Go(const Type& t0, const Type& t1) { return t0 + t1; } }; template<typename Type, typename... Types> struct BuildStruct<Type, Type, Types...> { static auto Go(const Type& t0, const Type& t1, const Types&... rest) -> std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> { return std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> (t0 + t1, BuildStruct<Types...>::Go(rest...)); } }; template<typename... Types> void Execute(const Types&... t) { cout << BuildStruct<Types...>::Go(t...) << endl; } Resulting errors: test.cpp: In instantiation of 'BuildStruct<int, int, double, double, char, char>': test.cpp:33:3: instantiated from 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]' test.cpp:38:41: instantiated from here test.cpp:24:15: error: no matching function for call to 'BuildStruct<double, double, char, char>::Go(const char&, const char&)' test.cpp:24:15: note: candidate is: static std::pair<Type, decltype (BuildStruct<Types ...>::Go(BuildStruct<Type, Type, Types ...>::Go::rest ...))> BuildStruct<Type, Type, Types ...>::Go(const Type&, const Type&, const Types& ...) [with Type = double, Types = {char, char}, decltype (BuildStruct<Types ...>::Go(BuildStruct<Type, Type, Types ...>::Go::rest ...)) = char] test.cpp: In function 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]': test.cpp:38:41: instantiated from here test.cpp:33:3: error: 'Go' is not a member of 'BuildStruct<int, int, double, double, char, char>'

Read the article

Search Results

Search found 1504 results on 61 pages for 'pros cons'.

Page 60/61 | < Previous Page | 56 57 58 59 60 61 | Next Page >

- by James Michael Hare

- by davidholmes

- by Sterling

- by FunctorSalad

- by fayer

- by sprugman

- by jamesmorris

- by smisner

- by James Michael Hare

- by Piotr Rodak

- by smisner

- by Bob Hanckel

- by allentown

- by rajesh ahuja

- by Jon Skeet

- by Kevin

- by ensoreus

- by ToStringTheory

- by Ben Rees

- by Mark

- by Dennis Zickefoose

< Previous Page | 56 57 58 59 60 61 | Next Page >