Search Results

Search found 235 results on 10 pages for 'reduction'.

Page 10/10 | < Previous Page | 6 7 8 9 10

CodePlex Daily Summary for Thursday, November 22, 2012

CodePlex Daily Summary for Thursday, November 22, 2012Popular ReleasesStackBuilder: StackBuilder 1.0.11.0: +Added Box/Case analysis...VidCoder: 1.4.7 Beta: Added view modes to the Preview window. Now you can see the image in 1:1 or in "Corners" mode to show a close-up of cropping results. Added the ability to set a custom completion sound. Gave the encoding settings command bar a more distinctive background color and extended it to the whole width of the window. Added the preview button to the command bar. Rearranged UI in Video tab and added back the section headers. Added the "Most" choice for the advanced x264 analysis option. Updat...ServiceMon - Extensible Real-time, Service Monitoring Utility for Windows: ServiceMon Release 1.2.0.58: Auto-uploaded from build serverImapX 2: ImapX 2.0.0.6: An updated release of the ImapX 2 library, containing many bugfixes for both, the library and the sample application.WiX Toolset: WiX v3.7 RC: WiX v3.7 RC (3.7.1119.0) provides feature complete Bundle update and reference tracking plus several bug fixes. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/11/20/WiX-v3.7-Release-Candidate-availablePicturethrill: Version 2.11.20.0: Fixed up Bing image provider on Windows 8Excel AddIn to reset the last worksheet cell: XSFormatCleaner.xla: Modified the commandbar code to use CommandBar IDs instead of English names.Json.NET: Json.NET 4.5 Release 11: New feature - Added ITraceWriter, MemoryTraceWriter, DiagnosticsTraceWriter New feature - Added StringEscapeHandling with options to escape HTML and non-ASCII characters New feature - Added non-generic JToken.ToObject methods New feature - Deserialize ISet<T> properties as HashSet<T> New feature - Added implicit conversions for Uri, TimeSpan, Guid New feature - Missing byte, char, Guid, TimeSpan and Uri explicit conversion operators added to JToken New feature - Special case...EntitiesToDTOs - Entity Framework DTO Generator: EntitiesToDTOs.v3.0: DTOs and Assemblers can be generated inside project folders! Choose the types you want to generate! Support for Visual Studio 2012 !!! Support for new Entity Framework EDMX (format used by VS2012) ! Support for Enum Types! Optional automatic check for updates! Added the following methods to Assemblers! IEnumerable<DTO>.ToEntities() : ICollection<Entity> IEnumerable<Entity>.ToDTOs() : ICollection<DTO> Indicate class identifier for DTOs and Assemblers! Cleaner Assemblers code....mojoPortal: 2.3.9.4: see release notes on mojoportal.com http://www.mojoportal.com/mojoportal-2394-released Note that we have separate deployment packages for .NET 3.5 and .NET 4.0, but we recommend you to use .NET 4, we will probably drop support for .NET 3.5 once .NET 4.5 is available The deployment package downloads on this page are pre-compiled and ready for production deployment, they contain no C# source code and are not intended for use in Visual Studio. To download the source code see getting the lates...DotNetNuke® Store: 03.01.07: What's New in this release? IMPORTANT: this version requires DotNetNuke 04.06.02 or higher! DO NOT REPORT BUGS HERE IN THE ISSUE TRACKER, INSTEAD USE THE DotNetNuke Store Forum! Bugs corrected: - Replaced some hard coded references to the default address provider classes by the corresponding interfaces to allow the creation of another address provider with a different name. New Features: - Added the 'pickup' delivery option at checkout. - Added the 'no delivery' option in the Store Admin ...Bundle Transformer - a modular extension for ASP.NET Web Optimization Framework: Bundle Transformer 1.6.10: Version: 1.6.10 Published: 11/18/2012 Now almost all of the Bundle Transformer's assemblies is signed (except BundleTransformer.Yui.dll); In BundleTransformer.SassAndScss the SassAndCoffee.Ruby library was replaced by my own implementation of the Sass- and SCSS-compiler (based on code of the SassAndCoffee.Ruby library version 2.0.2.0); In BundleTransformer.CoffeeScript added support of CoffeeScript version 1.4.0-3; In BundleTransformer.TypeScript added support of TypeScript version 0....ExtJS based ASP.NET 2.0 Controls: FineUI v3.2.0: +2012-11-18 v3.2.0 -?????????????????SelectedValueArray????????（◇?◆:）。 -???????????????????RecoverPropertiesFromJObject????（〓?〓、????、??、Vian_Pan）。 -????????????，?????????????，???SelectedValueArray???????（sam.chang）。 -??Alert.Show???????????（swtseaman）。 -???????????????，??Icon??IconUrl????（swtseaman）。 -?????????TimePicker（??）。 -?????????，??/res.axd?css=blue.css&v=1。 -????????，?????????????，???????。 -????MenuCheckBox（???????）。 -?RadioButton??AutoPostBack??。 -???????FCKEditor?????????...BugNET Issue Tracker: BugNET 1.2: Please read our release notes for BugNET 1.2: http://blog.bugnetproject.com/bugnet-1-2-has-been-released Please do not post questions as reviews. Questions should be posted in the Discussions tab, where they will usually get promptly responded to. If you post a question as a review, you will pollute the rating, and you won't get an answer.Paint.NET PSD Plugin: 2.2.0: Changes: Layer group visibility is now applied to all layers within the group. This greatly improves the visual fidelity of complex PSD files that have hidden layer groups. Layer group names are prefixed so that users can get an indication of the layer group hierarchy. (Paint.NET has a flat list of layers, so the hierarchy is flattened out on load.) The progress bar now reports status when saving PSD files, instead of showing an indeterminate rolling bar. Performance improvement of 1...CRM 2011 Visual Ribbon Editor: Visual Ribbon Editor (1.3.1116.7): [IMPROVED] Detailed error message descriptions for FaultException [FIX] Fixed bug in rule CrmOfflineAccessStateRule which had incorrect State attribute name [FIX] Fixed bug in rule EntityPropertyRule which was missing PropertyValue attribute [FIX] Current connection information was not displayed in status bar while refreshing list of entitiesSuper Metroid Randomizer: Super Metroid Randomizer v5: v5 -Added command line functionality for automation purposes. -Implented Krankdud's change to randomize the Etecoon's item. NOTE: this version will not accept seeds from a previous version. The seed format has changed by necessity. v4 -Started putting version numbers at the top of the form. -Added a warning when suitless Maridia is required in a parsed seed. v3 -Changed seed to only generate filename-legal characters. Using old seeds will still work exactly the same. -Files can now be saved...Caliburn Micro: WPF, Silverlight, WP7 and WinRT/Metro made easy.: Caliburn.Micro v1.4: Changes This version includes many bug fixes across all platforms, improvements to nuget support and...the biggest news of all...full support for both WinRT and WP8. Download Contents Debug and Release Assemblies Samples Readme.txt License.txt Packages Available on Nuget Caliburn.Micro – The full framework compiled into an assembly. Caliburn.Micro.Start - Includes Caliburn.Micro plus a starting bootstrapper, view model and view. Caliburn.Micro.Container – The Caliburn.Micro invers...DirectX Tool Kit: November 15, 2012: November 15, 2012 Added support for WIC2 when available on Windows 8 and Windows 7 with KB 2670838 Cleaned up warning level 4 warningsDotNetNuke® Community Edition CMS: 06.02.05: Major Highlights Updated the system so that it supports nested folders in the App_Code folder Updated the Global Error Handling so that when errors within the global.asax handler happen, they are caught and shown in a page displaying the original HTTP error code Fixed issue that stopped users from specifying Link URLs that open on a new window Security FixesFixed issue in the Member Directory module that could show members to non authenticated users Fixed issue in the Lists modul...New Projects1122case1325: It is a codeplex project1122case1327: Never be so greedy Accommodation Portal: This is a skeleton web site for holiday home owners, who wishes to rent their holiday accommodations to visitors from around the world. Analog Clock: This is project contains analog clock made in win forms.Android Socket Plus: ????Android??????。???????Android???????Socket???，?????????（?PC?Windows????????）??????Socket???，??，????????????；??????????????Socket???，??????????????Socket???。Bancosol: PFCBig Data Twitter Demo: This demo analyzes tweets in real-time, even including a dashboard. The tweets are also archived in Azure DB/Blob and Hadoop where Excel can be used for BI!Cloud For Science: This project serves as issue tracker for C4S components, and is used by participants of the C4S project. cwt: cwtDiary Application for Elementary Student: A Simple Diary Application Made for Elementary Student. Dice Dreams: Dice Dreams Dice Dreams is a dice game , the player must get 1.000.000 point to win, you start with 100.000 point. Dynamic Entity Framework Filtering: Generates linq to Entity Framework Queries by examining the EF data model, thus allowing for code reduction for queries to commonly requested entities.EazyErp: Enterprise Resource Planning System of Vf AsiaGoPlay: Tooke - add a project description here...HMyBlog: myblogHumanitarian Toolbox: This project is the publication site for bits built by the Humanitarian Toolbox ( http://humanitariantoolbox.net/).Impulse Media Player: Impulse Media Player is the ultimate media player for WindowsInspiration.Web: Description: A simple (but entertaining) ASP.NET MVC (C#) project to suggest random code names for projects. Intended audience: People who need a name (any name) to get started with their projects. Application written during Webcamp Singapore (4th Jun 2010 - 5th Jun 2010).iPictureUploader: Simple tool to upload images onto WEB and share via blogs/forums/etc lugionline: test projectMosaic Snake 3D: A clone of the popular Snake game for Windows 8. It contains a simple 3D engine based on SharpDX and is completely written in C# and Xaml.Outage Display: A simple web-based outage displayPrestaShop free Electronic Brown Shop Template - ModuleBazaar: Check out Latest and Best Featured PrestaShop Templates, Modules Magento Extensions, Opencart Extensions, Clone scripts from ModuleBazaarRackspace Cloud Files Manager: A simple Rackspace Cloud Files manager for static websitesRoll The Dice: Roll The Dice is a simple game developed by Erika Enggar Savitri and Queen Anugerah Aguslia who are currently studying Information System at Ma Chung UniversityRussian IDM: Russian IDM is free IdM solutionshootout: Comparing the speed of different languages and the constructs available within those languagessmart messaging connector: Sending fax from a PrintDocument or a file. Sending broadcast fax. Managing (Pause/Resume/Restart) current fax job. Managing configuration of the fax serverTeen Diary: Teen Diary Software - FREEWARE SOFTWARE. - SIMPLE, LIGHTWEIGHT, PORTABLE, COMPATIBLE. - made from .NET FRAMEWORK languange with XML data. Tekapo: Tekapo is a wizard style application that will help you manage your digital photos. Most digital cameras will store images as JPEG files. Information about the camera and how the camera has taken the photo is placed into the JPEG file along with the image data. One of the pieces of information stored is the date and time that the photo was taken. Tekapo uses the picture taken date to organise the photos.The Byte Kitchen's Open Sources: This project is related to The Byte Kitchen Blog (at thebytekitchen.com). It typically deals with Windows 8 apps, DirectX, and the Kinect for Windows.TheDiary: daily-self journalWCFsample: WCFsampleXNA Game Editor: This project will has familiar features like Unity Engine Editor.

Read the article
????ASMM

- by Liu Maclean(???)

???Oracle??????????????SGA/PGA???,????10g????????????ASMM????,????????ASMM?????????Oracle??????????,?ASMM??????DBA????????????;????????ASMM???????????????DBA???:????????????DB,?????????????DBA?????????????????????????????????,ASMM??????????,???????????,??????????,??????????????????;?10g release 1?10.2??????ASMM?????????????,???????ASMM????????ASMM?????startup???????????ASMM??AMM??,????????DBA????SGA/PGA?????????”??”??”???”???,???????????DBA????chemist(???????1??2??????????????)? ?????????????????ASMM?????,?????????????…… Oracle?SGA???????9i???????????,????: Buffer Cache ????????????,??????????????? Default Pool ??????,???DB_CACHE_SIZE?? Keep Pool ??????,???DB_KEEP_CACHE_SIZE?? Non standard pool ???????,???DB_nK_cache_size?? Recycle pool ???,???db_recycle_cache_size?? Shared Pool ???,???shared_pool_size?? Library cache ?????? Row cache ???,?????? Java Pool java?,???Java_pool_size?? Large Pool ??,???Large_pool_size?? Fixed SGA ???SGA??,???Oracle???????,?????????granule? ?9i?????ASMM,???????????SGA,??????MSMM??9i???buffer cache??????????,?????????????????????????,???9i?????????????,?????????????????????????? ????SGA?????: ?????shared pool?default buffer pool????????,??????????? ?9i???????????(advisor),?????????? ??????????????? ?????????,?????? ?????,?????ORA-04031?????????? ASMM?????: ?????????? ???????????????? ???????sga_target?? ???????????,??????????? ??MSMM???????: ???? ???? ?????? ???? ??????????,??????????? ??????????????????,??????????ORA-04031??? ASMM???????????:1.??????sga_target???????2.???????,???:????(memory component),????(memory broker)???????(memory mechanism)3.????(memory advisor) ASMM????????????(Automatically set),??????:shared_pool_size?db_cache_size?java_pool_size?large_pool _size?streams_pool_size;?????????????????,???:db_keep_cache_size?db_recycle_cache_size?db_nk_cache_size?log_buffer????SGA?????,????????????????,??log_buffer?fixed sga??????????????? ??ASMM?????????sga_target??,???????ASMM??????????????????db_cache_size?java_pool_size???,?????????????????????,????????????????????(???)????????,Oracle?????????(granule,?SGA<1GB?granule???4M,?SGA>1GB?granule???16M)???????,??????????????buffer cache,??????????????????(granule)??????????????????????sga_target??,???????????????????(dism,???????)???ASMM?????????????statistics_level?????typical?ALL,?????BASIC??MMON????(Memory Monitor is a background process that gathers memory statistics (snapshots) stores this information in the AWR (automatic workload repository). MMON is also responsible for issuing alerts for metrics that exceed their thresholds)?????????????????????ASMM?????,???????????sga_target?????statistics_level?BASIC: SQL> show parameter sga NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ lock_sga boolean FALSE pre_page_sga boolean FALSE sga_max_size big integer 2000M sga_target big integer 2000M SQL> show parameter sga_target NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ sga_target big integer 2000M SQL> alter system set statistics_level=BASIC; alter system set statistics_level=BASIC * ERROR at line 1: ORA-02097: parameter cannot be modified because specified value is invalid ORA-00830: cannot set statistics_level to BASIC with auto-tune SGA enabled ?????server parameter file?spfile??,ASMM????shutdown??????????????(Oracle???????,????????)???spfile?,?????strings?????spfile????????????????????,?: G10R2.__db_cache_size=973078528 G10R2.__java_pool_size=16777216 G10R2.__large_pool_size=16777216 G10R2.__shared_pool_size=1006632960 G10R2.__streams_pool_size=67108864 ???spfile?????????????????,???????????”???”?????,??????????”??”?? ?ASMM?????????????? ?????(tunable):????????????????????????????buffer cache?????????,cache????????????????,?????????? IO????????????????????????????Library cache????? subheap????,?????????????????????????????????(open cursors)?????????client??????????????buffer cache???????,???????????pin??buffer???(???????) ?????(Un-tunable):???????????????????,?????????????????,?????????????????????????large pool?????? ??????(Fixed Size):???????????,??????????????????????????????????????? ????????????????(memory resize request)?????????,?????: ??????(Immediate Request):???????????ASMM????????????????????????(chunk)?,??????OUT-OF-MEMORY(ORA-04031)???,????????????????????(granule)????????????????????granule,????????????,?????????????????????????????,????granule??????????????? ??????(Deferred Request):???????????????????????????,??????????????granule???????????????MMON??????????delta. ??????(Manual Request):????????????alter system?????????????????????????????????????????????????granule,??????grow?????ORA-4033??,?????shrink?????ORA-4034??? ?ASMM????,????(Memory Broker)????????????????????????????(Deferred)??????????????????????(auto-tunable component)???????????????,???????????????MMON??????????????????????????????????,????????????????;MMON????Memory Broker?????????????????????????MMON????????????????????????????????????????(resize request system queue)?MMAN????(Memory Manager is a background process that manages the dynamic resizing of SGA memory areas as the workload increases or decreases)??????????????????? ?10gR1?Shared Pool?shrink??????????,?????????????Buffer Cache???????????granule,????Buffer Cache?granule????granule header?Metadata(???buffer header??RAC??Lock Elements)????,?????????????????????shared pool????????duration(?????)?chunk??????granule?,????????????granule??10gR2????Buffer Cache Granule????????granule header?buffer?Metadata(buffer header?LE)????,??shared pool???duration?chunk????????granule,??????buffer cache?shared pool??????????????10gr2?streams pool?????????(???????streams pool duration????) ??????????(Donor,???trace????)???,?????????granule???buffer cache,????granule????????????: ????granule???????granule header ?????chunk????granule?????????buffer header ???,???chunk??????????????????????metadata? ???2-4??,???granule???? ??????????????????,??buffer cache??granule???shared pool?,???????: MMAN??????????buffer cache???granule MMAN????granule??quiesce???(Moving 1 granule from inuse to quiesce list of DEFAULT buffer cache for an immediate req) DBWR???????quiesced???granule????buffer(dirty buffer) MMAN??shared pool????????(consume callback),granule?free?chunk???shared pool??(consume)?,????????????????????granule????shared granule??????,???????????granule???????????,??????pin??buffer??Metadata(???buffer header?LE)?????buffer cache??? ???granule???????shared pool,???granule?????shared??? ?????ASMM???????????,??????????: _enabled_shared_pool_duration:?????????10g????shared pool duration??,?????sga_target?0?????false;???10.2.0.5??cursor_space_for_time???true??????false,???10.2.0.5??cursor_space_for_time????? _memory_broker_shrink_heaps:???????0??Oracle?????shared pool?java pool,??????0,??shrink request??????????????????? _memory_management_tracing: ???????MMON?MMAN??????????(advisor)?????(Memory Broker)?????trace???;??ORA-04031????????36,???8?????????????trace,???23????Memory Broker decision???,???32???cache resize???;??????????: Level Contents 0×01 Enables statistics tracing 0×02 Enables policy tracing 0×04 Enables transfer of granules tracing 0×08 Enables startup tracing 0×10 Enables tuning tracing 0×20 Enables cache tracing ?????????_memory_management_tracing?????DUMP_TRANSFER_OPS????????????????,?????????????????trace?????????mman_trace?transfer_ops_dump? SQL> alter system set "_memory_management_tracing"=63; System altered Operation make shared pool grow and buffer cache shrink!!!.............. ???????granule?????,????default buffer pool?resize??: AUTO SGA: Request 0xdc9c2628 after pre-processing, ret=0 /* ???0xdc9c2628??????addr */ AUTO SGA: IMMEDIATE, FG request 0xdc9c2628 /* ???????????Immediate???? */ AUTO SGA: Receiver of memory is shared pool, size=16, state=3, flg=0 /* ?????????shared pool,???,????16?granule,??grow?? */ AUTO SGA: Donor of memory is DEFAULT buffer cache, size=106, state=4, flg=0 /* ???????Default buffer cache,????,????106?granule,??shrink?? */ AUTO SGA: Memory requested=3896, remaining=3896 /* ??immeidate request???????3896 bytes */ AUTO SGA: Memory received=0, minreq=3896, gransz=16777216 /* ????free?granule,??received?0,gransz?granule??? */ AUTO SGA: Request 0xdc9c2628 status is INACTIVE /* ??????????,??????inactive?? */ AUTO SGA: Init bef rsz for request 0xdc9c2628 /* ????????before-process???? */ AUTO SGA: Set rq dc9c2628 status to PENDING /* ?request??pending?? */ AUTO SGA: 0xca000000 rem=3896, rcvd=16777216, 105, 16777216, 17 /* ???????0xca000000?16M??granule */ AUTO SGA: Returning 4 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 4, 1, a AUTO SGA: Resize done for pool DEFAULT, 8192 /* ???default pool?resize */ AUTO SGA: Init aft rsz for request 0xdc9c2628 AUTO SGA: Request 0xdc9c2628 after processing AUTO SGA: IMMEDIATE, FG request 0x7fff917964a0 AUTO SGA: Receiver of memory is shared pool, size=17, state=0, flg=0 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=105, state=0, flg=0 AUTO SGA: Memory requested=3896, remaining=0 AUTO SGA: Memory received=16777216, minreq=3896, gransz=16777216 AUTO SGA: Request 0x7fff917964a0 status is COMPLETE /* shared pool????16M?granule */ AUTO SGA: activated granule 0xca000000 of shared pool ?????partial granule????????????trace: AUTO SGA: Request 0xdc9c2628 after pre-processing, ret=0 AUTO SGA: IMMEDIATE, FG request 0xdc9c2628 AUTO SGA: Receiver of memory is shared pool, size=82, state=3, flg=1 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=36, state=4, flg=1 /* ????????shared pool,?????default buffer cache */ AUTO SGA: Memory requested=4120, remaining=4120 AUTO SGA: Memory received=0, minreq=4120, gransz=16777216 AUTO SGA: Request 0xdc9c2628 status is INACTIVE AUTO SGA: Init bef rsz for request 0xdc9c2628 AUTO SGA: Set rq dc9c2628 status to PENDING AUTO SGA: Moving granule 0x93000000 of DEFAULT buffer cache to activate list AUTO SGA: Moving 1 granule 0x8c000000 from inuse to quiesce list of DEFAULT buffer cache for an immediate req /* ???buffer cache??????0x8c000000?granule??????inuse list, ???????quiesce list? */ AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: activated granule 0x93000000 of DEFAULT buffer cache AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 / * ??dbwr??0x8c000000 granule????dirty buffer */ AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 ......................................... AUTO SGA: Rcv shared pool consuming 8192 from 0x8c000000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 90112 from 0x8c002000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 24576 from 0x8c01a000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 65536 from 0x8c022000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 131072 from 0x8c034000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 286720 from 0x8c056000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 98304 from 0x8c09e000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 106496 from 0x8c0b8000 in granule 0x8c000000; owner is DEFAULT buffer cache ..................... /* ??shared pool????0x8c000000 granule??chunk, ??granule?owner????default buffer cache */ AUTO SGA: Imm xfer 0x8c000000 from quiesce list of DEFAULT buffer cache to partial inuse list of shared pool /* ???0x8c000000 granule?default buffer cache????????shared pool????inuse list */ AUTO SGA: Returning 4 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 4, 1, 20a AUTO SGA: Init aft rsz for request 0xdc9c2628 AUTO SGA: Request 0xdc9c2628 after processing AUTO SGA: IMMEDIATE, FG request 0x7fffe9bcd0e0 AUTO SGA: Receiver of memory is shared pool, size=83, state=0, flg=1 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=35, state=0, flg=1 AUTO SGA: Memory requested=4120, remaining=0 AUTO SGA: Memory received=14934016, minreq=4120, gransz=16777216 AUTO SGA: Request 0x7fffe9bcd0e0 status is COMPLETE /* ????partial transfer?? */ ?????partial transfer??????DUMP_TRANSFER_OPS????0x8c000000 partial granule???????,?: SQL> oradebug setmypid; Statement processed. SQL> oradebug dump DUMP_TRANSFER_OPS 1; Statement processed. SQL> oradebug tracefile_name; /s01/admin/G10R2/udump/g10r2_ora_21482.trc =======================trace content============================== GRANULE SIZE is 16777216 COMPONENT NAME : shared pool Number of granules in partially inuse list (listid 4) is 23 Granule addr is 0x8c000000 Granule owner is DEFAULT buffer cache /* ?0x8c000000 granule?shared pool?partially inuse list, ?????owner??default buffer cache */ Granule 0x8c000000 dump from owner perspective gptr = 0x8c000000, num buf hdrs = 1989, num buffers = 156, ghdr = 0x8cffe000 / * ?????granule?granule header????0x8cffe000, ????156?buffer block,1989?buffer header */ /* ??granule??????,??????buffer cache??shared pool chunk */ BH:0x8cf76018 BA:(nil) st:11 flg:20000 BH:0x8cf76128 BA:(nil) st:11 flg:20000 BH:0x8cf76238 BA:(nil) st:11 flg:20000 BH:0x8cf76348 BA:(nil) st:11 flg:20000 BH:0x8cf76458 BA:(nil) st:11 flg:20000 BH:0x8cf76568 BA:(nil) st:11 flg:20000 BH:0x8cf76678 BA:(nil) st:11 flg:20000 BH:0x8cf76788 BA:(nil) st:11 flg:20000 BH:0x8cf76898 BA:(nil) st:11 flg:20000 BH:0x8cf769a8 BA:(nil) st:11 flg:20000 BH:0x8cf76ab8 BA:(nil) st:11 flg:20000 BH:0x8cf76bc8 BA:(nil) st:11 flg:20000 BH:0x8cf76cd8 BA:0x8c018000 st:1 flg:622202 ............... Address 0x8cf30000 to 0x8cf74000 not in cache Address 0x8cf74000 to 0x8d000000 in cache Granule 0x8c000000 dump from receivers perspective Dumping layout Address 0x8c000000 to 0x8c018000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c018000 to 0x8c01a000 not in this pool Address 0x8c01a000 to 0x8c020000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c020000 to 0x8c022000 not in this pool Address 0x8c022000 to 0x8c032000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c032000 to 0x8c034000 not in this pool Address 0x8c034000 to 0x8c054000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c054000 to 0x8c056000 not in this pool Address 0x8c056000 to 0x8c09c000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c09c000 to 0x8c09e000 not in this pool Address 0x8c09e000 to 0x8c0b6000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c0b6000 to 0x8c0b8000 not in this pool Address 0x8c0b8000 to 0x8c0d2000 in sga heap(1,3) (idx=1, dur=4) ???????granule?????shared granule??????,?????????buffer block,????1?shared subpool??????durtaion?4?chunk,duration=4?execution duration;??duration?chunk???????????,??extent???quiesce list??????????????free?execution duration?????????????,??????duration???extent(??????extent????granule)??????? ?????????????ASMM?????????,????: V$SGAINFODisplays summary information about the system global area (SGA). V$SGADisplays size information about the SGA, including the sizes of different SGA components, the granule size, and free memory. V$SGASTATDisplays detailed information about the SGA. V$SGA_DYNAMIC_COMPONENTSDisplays information about the dynamic SGA components. This view summarizes information based on all completed SGA resize operations since instance startup. V$SGA_DYNAMIC_FREE_MEMORYDisplays information about the amount of SGA memory available for future dynamic SGA resize operations. V$SGA_RESIZE_OPSDisplays information about the last 400 completed SGA resize operations. V$SGA_CURRENT_RESIZE_OPSDisplays information about SGA resize operations that are currently in progress. A resize operation is an enlargement or reduction of a dynamic SGA component. V$SGA_TARGET_ADVICEDisplays information that helps you tune SGA_TARGET. ?????????shared pool duration???,?????????

Read the article
Jquery Serialize data

- by Richard

So I have several text boxes, drop down menus, and radio options on this form. When the user clicks submit i want to save ALL that information so I can put it into a database. So this is all the form's inputs <div id="reg2a"> First Name: <br/><input type="text" name="fname" /> <br/> Last Name: <br/><input type="text" name="lname" /> <br/> Address: <br/><input type="text" name="address" /> <br/> City: <br/><input type="text" name="city" /> <br/> State: <br/><input type="text" name="state" /> <br/> Zip Code: <br/><input type="text" name="zip" /> <br/> Phone Number: <br/><input type="text" name="phone" /> <br/> Fax: <br/><input type="text" name="fax" /> <br/> Email: <br/><input type="text" name="email" /> <br/> Ethnicity: <i>Used only for grant reporting purposes</i> <br/><input type="text" name="ethnicity" /> <br/><br/> Instutional Information Type (select the best option) <br/> <select name="iitype"> <option value="none">None</option> <option value="uni">University</option> <option value="commorg">Community Organization</option> </select> <br/><br/> Number of sessions willing to present: <select id="vennum_select" name="vnum"> <?php for($i=0;$i<=3;$i++) { ?> <option value="<?php echo $i ?>"><?php echo $i ?></option> <?php } ?><br/> </select><br/> Number of tables requested: <select id="tabnum_select" name="tnum"> <?php for($i=1;$i<=3;$i++) { ?> <option value="<?php echo $i ?>"><?php echo $i ?></option> <?php } ?> </select><br/><br/> Awarding of a door prize during the conference will result in a reduction in the cost of your first table. <br/><br/> I am providing a door prize for delivery during the conference of $75 or more <select id="prize_select" name="pnum"> <option value="0">No</option> <option value="1">Yes</option> </select><br/> Prize name: <input type="text" name="prize_name" /><br/> Description: <input type="text" name="descr" /><br/> Value: <input type="text" name="retail" /><br/><br/> Name of Institution: <br/><input type="text" name="institution" /> <br/><br/> Type (select the best option) <br/> <select name="type"> <option value="none">None</option> <option value="k5">K-5</option> <option value="k8">K-8</option> <option value="68">6-8</option> <option value="912">9-12</option> </select> <br/><br/> Address: <br/><input type="text" name="address_sch" /> <br/> City: <br/><input type="text" name="city_sch" /> <br/> State: <br/><input type="text" name="state_sch" /> <br/> Zip Code: <br/><input type="text" name="zip_sch" /> <br/> <form name="frm2sub" id="frm2sub" action="page3.php" method="post"> <input type="submit" name="submit" value="Submit" id="submit" /> </form> </div> This is my jquery function: $("#frm2sub").submit( function() { var values = {}; values["fname"] = $("#fname").val(); }); I can do this for each one of the input boxes but I want to give all this data to the next page. So how do I put this array into $_POST? Btw, I've tried doing var data = $("#reg2a").serialize(); and var data = $(document).serialize(); Data ended up being empty. Any ideas? Thanks

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
A Taxonomy of Numerical Methods v1

- by JoshReuben

Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

Read the article
help with fixing fwts errors log

- by jasmines

Here is an extract of results.log: MTRR validation. Test 1 of 3: Validate the kernel MTRR IOMEM setup. FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xc0000000 to 0xdfffffff (PCI Bus 0000:00) has incorrect attribute Write-Combining. FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xfee01000 to 0xffffffff (PCI Bus 0000:00) has incorrect attribute Write-Protect. ==================================================================================================== Test 1 of 1: Kernel log error check. Kernel message: [ 0.208079] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored ADVICE: This is not exactly a failure mode but a warning from the kernel. The _OSI() method has implemented a match to the 'Linux' query in the DSDT and this is redundant because the ACPI driver matches onto the Windows _OSI strings by default. FAILED [HIGH] KlogACPIErrorMethodExecutionParse: Test 1, HIGH Kernel message: [ 3.512783] ACPI Error : Method parse/execution failed [\_SB_.PCI0.GFX0._DOD] (Node f7425858), AE_AML_PACKAGE_LIMIT (20110623/psparse-536) ADVICE: This is a bug picked up by the kernel, but as yet, the firmware test suite has no diagnostic advice for this particular problem. Found 1 unique errors in kernel log. ==================================================================================================== Check if system is using latest microcode. ---------------------------------------------------------------------------------------------------- Cannot read microcode file /usr/share/misc/intel-microcode.dat. Aborted test, initialisation failed. ==================================================================================================== MSR register tests. FAILED [MEDIUM] MSRCPUsInconsistent: Test 1, MSR SYSENTER_ESP (0x175) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0xffffffffffffffff). MSR CPU 0 -> 0xf7bb9c40 vs CPU 1 -> 0xf7bc7c40 FAILED [MEDIUM] MSRCPUsInconsistent: Test 1, MSR MISC_ENABLE (0x1a0) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0x400c51889). MSR CPU 0 -> 0x850088 vs CPU 1 -> 0x850089 ==================================================================================================== Checks firmware has set PCI Express MaxReadReq to a higher value on non-motherboard devices. ---------------------------------------------------------------------------------------------------- Test 1 of 1: Check firmware settings MaxReadReq for PCI Express devices. MaxReadReq for pci://00:00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03) is low (128) [Audio device]. MaxReadReq for pci://00:02:00.0 Network controller: Intel Corporation PRO/Wireless 5100 AGN [Shiloh] Network Connection is low (128) [Network controller]. FAILED [LOW] LowMaxReadReq: Test 1, 2 devices have low MaxReadReq settings. Firmware may have configured these too low. ADVICE: The MaxReadRequest size is set too low and will affect performance. It will provide excellent bus sharing at the cost of bus data transfer rates. Although not a critical issue, it may be worth considering setting the MaxReadRequest size to 256 or 512 to increase throughput on the PCI Express bus. Some drivers (for example the Brocade Fibre Channel driver) allow one to override the firmware settings. Where possible, this BIOS configuration setting is worth increasing it a little more for better performance at a small reduction of bus sharing. ==================================================================================================== PCIe ASPM check. ---------------------------------------------------------------------------------------------------- Test 1 of 2: PCIe ASPM ACPI test. PCIE ASPM is not controlled by Linux kernel. ADVICE: BIOS reports that Linux kernel should not modify ASPM settings that BIOS configured. It can be intentional because hardware vendors identified some capability bugs between the motherboard and the add-on cards. Test 2 of 2: PCIe ASPM registers test. WARNING: Test 2, RP 00h:1Ch.01h L0s not enabled. WARNING: Test 2, RP 00h:1Ch.01h L1 not enabled. WARNING: Test 2, Device 02h:00h.00h L0s not enabled. WARNING: Test 2, Device 02h:00h.00h L1 not enabled. PASSED: Test 2, PCIE aspm setting matched was matched. WARNING: Test 2, RP 00h:1Ch.05h L0s not enabled. WARNING: Test 2, RP 00h:1Ch.05h L1 not enabled. WARNING: Test 2, Device 85h:00h.00h L0s not enabled. WARNING: Test 2, Device 85h:00h.00h L1 not enabled. PASSED: Test 2, PCIE aspm setting matched was matched. ==================================================================================================== Extract and analyse Windows Management Instrumentation (WMI). Test 1 of 2: Check Windows Management Instrumentation in DSDT Found WMI Method WMAA with GUID: 5FB7F034-2C63-45E9-BE91-3D44E2C707E4, Instance 0x01 Found WMI Event, Notifier ID: 0x80, GUID: 95F24279-4D7B-4334-9387-ACCDC67EF61C, Instance 0x01 PASSED: Test 1, GUID 95F24279-4D7B-4334-9387-ACCDC67EF61C is handled by driver hp-wmi (Vendor: HP). Found WMI Event, Notifier ID: 0xa0, GUID: 2B814318-4BE8-4707-9D84-A190A859B5D0, Instance 0x01 FAILED [MEDIUM] WMIUnknownGUID: Test 1, GUID 2B814318-4BE8-4707-9D84-A190A859B5D0 is unknown to the kernel, a driver may need to be implemented for this GUID. ADVICE: A WMI driver probably needs to be written for this event. It can checked for using: wmi_has_guid("2B814318-4BE8-4707-9D84-A190A859B5D0"). One can install a notify handler using wmi_install_notify_handler("2B814318-4BE8-4707-9D84-A190A859B5D0", handler, NULL). http://lwn.net/Articles/391230 describes how to write an appropriate driver. Found WMI Object, Object ID AB, GUID: 05901221-D566-11D1-B2F0-00A0C9062910, Instance 0x01, Flags: 00 Found WMI Method WMBA with GUID: 1F4C91EB-DC5C-460B-951D-C7CB9B4B8D5E, Instance 0x01 Found WMI Object, Object ID BC, GUID: 2D114B49-2DFB-4130-B8FE-4A3C09E75133, Instance 0x7f, Flags: 00 Found WMI Object, Object ID BD, GUID: 988D08E3-68F4-4C35-AF3E-6A1B8106F83C, Instance 0x19, Flags: 00 Found WMI Object, Object ID BE, GUID: 14EA9746-CE1F-4098-A0E0-7045CB4DA745, Instance 0x01, Flags: 00 Found WMI Object, Object ID BF, GUID: 322F2028-0F84-4901-988E-015176049E2D, Instance 0x01, Flags: 00 Found WMI Object, Object ID BG, GUID: 8232DE3D-663D-4327-A8F4-E293ADB9BF05, Instance 0x01, Flags: 00 Found WMI Object, Object ID BH, GUID: 8F1F6436-9F42-42C8-BADC-0E9424F20C9A, Instance 0x00, Flags: 00 Found WMI Object, Object ID BI, GUID: 8F1F6435-9F42-42C8-BADC-0E9424F20C9A, Instance 0x00, Flags: 00 Found WMI Method WMAC with GUID: 7391A661-223A-47DB-A77A-7BE84C60822D, Instance 0x01 Found WMI Object, Object ID BJ, GUID: DF4E63B6-3BBC-4858-9737-C74F82F821F3, Instance 0x05, Flags: 00 ==================================================================================================== Disassemble DSDT to check for _OSI("Linux"). ---------------------------------------------------------------------------------------------------- Test 1 of 1: Disassemble DSDT to check for _OSI("Linux"). This is not strictly a failure mode, it just alerts one that this has been defined in the DSDT and probably should be avoided since the Linux ACPI driver matches onto the Windows _OSI strings { If (_OSI ("Linux")) { Store (0x03E8, OSYS) } If (_OSI ("Windows 2001")) { Store (0x07D1, OSYS) } If (_OSI ("Windows 2001 SP1")) { Store (0x07D1, OSYS) } If (_OSI ("Windows 2001 SP2")) { Store (0x07D2, OSYS) } If (_OSI ("Windows 2006")) { Store (0x07D6, OSYS) } If (LAnd (MPEN, LEqual (OSYS, 0x07D1))) { TRAP (0x01, 0x48) } TRAP (0x03, 0x35) } WARNING: Test 1, DSDT implements a deprecated _OSI("Linux") test. ==================================================================================================== 0 passed, 0 failed, 1 warnings, 0 aborted, 0 skipped, 0 info only. ==================================================================================================== ACPI DSDT Method Semantic Tests. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP Failed to install global event handler. Test 22 of 93: Check _PSR (Power Source). ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 22, Detected an infinite loop when evaluating method '\_SB_.AC__._PSR'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 22, \_SB_.AC__._PSR correctly acquired and released locks 16 times. Test 35 of 93: Check _TMP (Thermal Zone Current Temp). ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.DTSZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.DTSZ._TMP correctly acquired and released locks 14 times. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.CPUZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.CPUZ._TMP correctly acquired and released locks 10 times. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.SKNZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.SKNZ._TMP correctly acquired and released locks 10 times. PASSED: Test 35, _TMP correctly returned sane looking value 0x00000b4c (289.2 degrees K) PASSED: Test 35, \_TZ_.BATZ._TMP correctly acquired and released locks 9 times. PASSED: Test 35, _TMP correctly returned sane looking value 0x00000aac (273.2 degrees K) PASSED: Test 35, \_TZ_.FDTZ._TMP correctly acquired and released locks 7 times. Test 46 of 93: Check _DIS (Disable). FAILED [MEDIUM] MethodShouldReturnNothing: Test 46, \_SB_.PCI0.LPCB.SIO_.COM1._DIS returned values, but was expected to return nothing. Object returned: INTEGER: 0x00000000 ADVICE: This probably won't cause any errors, but it should be fixed as the AML code is not conforming to the expected behaviour as described in the ACPI specification. FAILED [MEDIUM] MethodShouldReturnNothing: Test 46, \_SB_.PCI0.LPCB.SIO_.LPT0._DIS returned values, but was expected to return nothing. Object returned: INTEGER: 0x00000000 ADVICE: This probably won't cause any errors, but it should be fixed as the AML code is not conforming to the expected behaviour as described in the ACPI specification. Test 61 of 93: Check _WAK (System Wake). Test _WAK(1) System Wake, State S1. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(2) System Wake, State S2. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(3) System Wake, State S3. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(4) System Wake, State S4. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(5) System Wake, State S5. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test 87 of 93: Check _BCL (Query List of Brightness Control Levels Supported). Package has 2 elements: 00: INTEGER: 0x00000000 01: INTEGER: 0x00000000 FAILED [MEDIUM] Method_BCLElementCount: Test 87, Method _BCL should return a package of more than 2 integers, got just 2. Test 88 of 93: Check _BCM (Set Brightness Level). ACPICA Exception AE_AML_PACKAGE_LIMIT during execution of method _BCM FAILED [CRITICAL] AEAMLPackgeLimit: Test 88, Detected error 'Package limit' when evaluating '\_SB_.PCI0.GFX0.DD02._BCM'. ==================================================================================================== ACPI table settings sanity checks. ---------------------------------------------------------------------------------------------------- Test 1 of 1: Check ACPI tables. PASSED: Test 1, Table APIC passed. Table ECDT not present to check. FAILED [MEDIUM] FADT32And64BothDefined: Test 1, FADT 32 bit FIRMWARE_CONTROL is non-zero, and X_FIRMWARE_CONTROL is also non-zero. Section 5.2.9 of the ACPI specification states that if the FIRMWARE_CONTROL is non-zero then X_FIRMWARE_CONTROL must be set to zero. ADVICE: The FADT FIRMWARE_CTRL is a 32 bit pointer that points to the physical memory address of the Firmware ACPI Control Structure (FACS). There is also an extended 64 bit version of this, the X_FIRMWARE_CTRL pointer that also can point to the FACS. Section 5.2.9 of the ACPI specification states that if the X_FIRMWARE_CTRL field contains a non zero value then the FIRMWARE_CTRL field *must* be zero. This error is also detected by the Linux kernel. If FIRMWARE_CTRL and X_FIRMWARE_CTRL are defined, then the kernel just uses the 64 bit version of the pointer. PASSED: Test 1, Table HPET passed. PASSED: Test 1, Table MCFG passed. PASSED: Test 1, Table RSDT passed. PASSED: Test 1, Table RSDP passed. Table SBST not present to check. PASSED: Test 1, Table XSDT passed. ==================================================================================================== Re-assemble DSDT and find syntax errors and warnings. ---------------------------------------------------------------------------------------------------- Test 1 of 2: Disassemble and reassemble DSDT FAILED [HIGH] AMLAssemblerError4043: Test 1, Assembler error in line 2261 Line | AML source ---------------------------------------------------------------------------------------------------- 02258| 0x00000000, // Range Minimum 02259| 0xFEDFFFFF, // Range Maximum 02260| 0x00000000, // Translation Offset 02261| 0x00000000, // Length | ^ | error 4043: Invalid combination of Length and Min/Max fixed flags 02262| ,, _Y0E, AddressRangeMemory, TypeStatic) 02263| DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 02264| 0x00000000, // Granularity ==================================================================================================== ADVICE: (for error #4043): This occurs if the length is zero and just one of the resource MIF/MAF flags are set, or the length is non-zero and resource MIF/MAF flags are both set. These are illegal combinations and need to be fixed. See section 6.4.3.5 Address Space Resource Descriptors of version 4.0a of the ACPI specification for more details. FAILED [HIGH] AMLAssemblerError4050: Test 1, Assembler error in line 2268 Line | AML source ---------------------------------------------------------------------------------------------------- 02265| 0xFEE01000, // Range Minimum 02266| 0xFFFFFFFF, // Range Maximum 02267| 0x00000000, // Translation Offset 02268| 0x011FEFFF, // Length | ^ | error 4050: Length is not equal to fixed Min/Max window 02269| ,, , AddressRangeMemory, TypeStatic) 02270| }) 02271| Method (_CRS, 0, Serialized) ==================================================================================================== ADVICE: (for error #4050): The minimum address is greater than the maximum address. This is illegal. FAILED [HIGH] AMLAssemblerError1104: Test 1, Assembler error in line 8885 Line | AML source ---------------------------------------------------------------------------------------------------- 08882| Method (_DIS, 0, NotSerialized) 08883| { 08884| DSOD (0x02) 08885| Return (0x00) | ^ | warning level 0 1104: Reserved method should not return a value (_DIS) 08886| } 08887| 08888| Method (_SRS, 1, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 1, Assembler error in line 9195 Line | AML source ---------------------------------------------------------------------------------------------------- 09192| Method (_DIS, 0, NotSerialized) 09193| { 09194| DSOD (0x01) 09195| Return (0x00) | ^ | warning level 0 1104: Reserved method should not return a value (_DIS) 09196| } 09197| 09198| Method (_SRS, 1, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1127: Test 1, Assembler error in line 9242 Line | AML source ---------------------------------------------------------------------------------------------------- 09239| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y21._MAX, MAX2) 09240| CreateByteField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y21._LEN, LEN2) 09241| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y22._INT, IRQ0) 09242| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y23._DMA, DMA0) | ^ | warning level 0 1127: ResourceTag smaller than Field (Tag: 8 bits, Field: 16 bits) 09243| If (RLPD) 09244| { 09245| Store (0x00, Local0) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1128: Test 1, Assembler error in line 18682 Line | AML source ---------------------------------------------------------------------------------------------------- 18679| Store (0x01, Index (DerefOf (Index (Local0, 0x02)), 0x01)) 18680| If (And (WDPE, 0x40)) 18681| { 18682| Wait (\_SB.BEVT, 0x10) | ^ | warning level 0 1128: Result is not used, possible operator timeout will be missed 18683| } 18684| 18685| Store (BRID, Index (DerefOf (Index (Local0, 0x02)), 0x02)) ==================================================================================================== ADVICE: (for warning level 0 #1128): The operation can possibly timeout, and hence the return value indicates an timeout error. However, because the return value is not checked this very probably indicates that the code is buggy. A possible scenario is that a mutex times out and the code attempts to access data in a critical region when it should not. This will lead to undefined behaviour. This should be fixed. Table DSDT (0) reassembly: Found 2 errors, 4 warnings. Test 2 of 2: Disassemble and reassemble SSDT PASSED: Test 2, SSDT (0) reassembly, Found 0 errors, 0 warnings. FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 60 Line | AML source ---------------------------------------------------------------------------------------------------- 00057| { 00058| Store (CPDC (Arg0), Local0) 00059| GCAP (Local0) 00060| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00061| } 00062| 00063| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 174 Line | AML source ---------------------------------------------------------------------------------------------------- 00171| { 00172| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00173| GCAP (Local0) 00174| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00175| } 00176| 00177| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 244 Line | AML source ---------------------------------------------------------------------------------------------------- 00241| { 00242| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00243| GCAP (Local0) 00244| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00245| } 00246| 00247| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 290 Line | AML source ---------------------------------------------------------------------------------------------------- 00287| { 00288| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00289| GCAP (Local0) 00290| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00291| } 00292| 00293| Method (_OSC, 4, NotSerialized) ==================================================================================================== Table SSDT (1) reassembly: Found 0 errors, 4 warnings. PASSED: Test 2, SSDT (2) reassembly, Found 0 errors, 0 warnings. PASSED: Test 2, SSDT (3) reassembly, Found 0 errors, 0 warnings. ==================================================================================================== 3 passed, 10 failed, 0 warnings, 0 aborted, 0 skipped, 0 info only. ==================================================================================================== Critical failures: 1 method test, at 1 log line: 1449: Detected error 'Package limit' when evaluating '\_SB_.PCI0.GFX0.DD02._BCM'. High failures: 11 klog test, at 1 log line: 121: HIGH Kernel message: [ 3.512783] ACPI Error: Method parse/execution failed [\_SB_.PCI0.GFX0._DOD] (Node f7425858), AE_AML_PACKAGE_LIMIT (20110623/psparse-536) syntaxcheck test, at 1 log line: 1668: Assembler error in line 2261 syntaxcheck test, at 1 log line: 1687: Assembler error in line 2268 syntaxcheck test, at 1 log line: 1703: Assembler error in line 8885 syntaxcheck test, at 1 log line: 1716: Assembler error in line 9195 syntaxcheck test, at 1 log line: 1729: Assembler error in line 9242 syntaxcheck test, at 1 log line: 1742: Assembler error in line 18682 syntaxcheck test, at 1 log line: 1766: Assembler error in line 60 syntaxcheck test, at 1 log line: 1779: Assembler error in line 174 syntaxcheck test, at 1 log line: 1792: Assembler error in line 244 syntaxcheck test, at 1 log line: 1805: Assembler error in line 290 Medium failures: 9 mtrr test, at 1 log line: 76: Memory range 0xc0000000 to 0xdfffffff (PCI Bus 0000:00) has incorrect attribute Write-Combining. mtrr test, at 1 log line: 78: Memory range 0xfee01000 to 0xffffffff (PCI Bus 0000:00) has incorrect attribute Write-Protect. msr test, at 1 log line: 165: MSR SYSENTER_ESP (0x175) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0xffffffffffffffff). msr test, at 1 log line: 173: MSR MISC_ENABLE (0x1a0) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0x400c51889). wmi test, at 1 log line: 528: GUID 2B814318-4BE8-4707-9D84-A190A859B5D0 is unknown to the kernel, a driver may need to be implemented for this GUID. method test, at 1 log line: 1002: \_SB_.PCI0.LPCB.SIO_.COM1._DIS returned values, but was expected to return nothing. method test, at 1 log line: 1011: \_SB_.PCI0.LPCB.SIO_.LPT0._DIS returned values, but was expected to return nothing. method test, at 1 log line: 1443: Method _BCL should return a package of more than 2 integers, got just 2. acpitables test, at 1 log line: 1643: FADT 32 bit FIRMWARE_CONTROL is non-zero, and X_FIRMWARE_CONTROL is also non-zero. Se

Read the article
The Incremental Architect´s Napkin – #3 – Make Evolvability inevitable

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/06/04/the-incremental-architectacutes-napkin-ndash-3-ndash-make-evolvability-inevitable.aspxThe easier something to measure the more likely it will be produced. Deviations between what is and what should be can be readily detected. That´s what automated acceptance tests are for. That´s what sprint reviews in Scrum are for. It´s no small wonder our software looks like it looks. It has all the traits whose conformance with requirements can easily be measured. And it´s lacking traits which cannot easily be measured. Evolvability (or Changeability) is such a trait. If an operation is correct, if an operation if fast enough, that can be checked very easily. But whether Evolvability is high or low, that cannot be checked by taking a measure or two. Evolvability might correlate with certain traits, e.g. number of lines of code (LOC) per function or Cyclomatic Complexity or test coverage. But there is no threshold value signalling “evolvability too low”; also Evolvability is hardly tangible for the customer. Nevertheless Evolvability is of great importance - at least in the long run. You can get away without much of it for a short time. Eventually, though, it´s needed like any other requirement. Or even more. Because without Evolvability no other requirement can be implemented. Evolvability is the foundation on which all else is build. Such fundamental importance is in stark contrast with its immeasurability. To compensate this, Evolvability must be put at the very center of software development. It must become the hub around everything else revolves. Since we cannot measure Evolvability, though, we cannot start watching it more. Instead we need to establish practices to keep it high (enough) at all times. Chefs have known that for long. That´s why everybody in a restaurant kitchen is constantly seeing after cleanliness. Hygiene is important as is to have clean tools at standardized locations. Only then the health of the patrons can be guaranteed and production efficiency is constantly high. Still a kitchen´s level of cleanliness is easier to measure than software Evolvability. That´s why important practices like reviews, pair programming, or TDD are not enough, I guess. What we need to keep Evolvability in focus and high is… to continually evolve. Change must not be something to avoid but too embrace. To me that means the whole change cycle from requirement analysis to delivery needs to be gone through more often. Scrum´s sprints of 4, 2 even 1 week are too long. Kanban´s flow of user stories across is too unreliable; it takes as long as it takes. Instead we should fix the cycle time at 2 days max. I call that Spinning. No increment must take longer than from this morning until tomorrow evening to finish. Then it should be acceptance checked by the customer (or his/her representative, e.g. a Product Owner). For me there are several resasons for such a fixed and short cycle time for each increment: Clear expectations Absolute estimates (“This will take X days to complete.”) are near impossible in software development as explained previously. Too much unplanned research and engineering work lurk in every feature. And then pervasive interruptions of work by peers and management. However, the smaller the scope the better our absolute estimates become. That´s because we understand better what really are the requirements and what the solution should look like. But maybe more importantly the shorter the timespan the more we can control how we use our time. So much can happen over the course of a week and longer timespans. But if push comes to shove I can block out all distractions and interruptions for a day or possibly two. That´s why I believe we can give rough absolute estimates on 3 levels: Noon Tonight Tomorrow Think of a meeting with a Product Owner at 8:30 in the morning. If she asks you, how long it will take you to implement a user story or bug fix, you can say, “It´ll be fixed by noon.”, or you can say, “I can manage to implement it until tonight before I leave.”, or you can say, “You´ll get it by tomorrow night at latest.” Yes, I believe all else would be naive. If you´re not confident to get something done by tomorrow night (some 34h from now) you just cannot reliably commit to any timeframe. That means you should not promise anything, you should not even start working on the issue. So when estimating use these four categories: Noon, Tonight, Tomorrow, NoClue - with NoClue meaning the requirement needs to be broken down further so each aspect can be assigned to one of the first three categories. If you like absolute estimates, here you go. But don´t do deep estimates. Don´t estimate dozens of issues; don´t think ahead (“Issue A is a Tonight, then B will be a Tomorrow, after that it´s C as a Noon, finally D is a Tonight - that´s what I´ll do this week.”). Just estimate so Work-in-Progress (WIP) is 1 for everybody - plus a small number of buffer issues. To be blunt: Yes, this makes promises impossible as to what a team will deliver in terms of scope at a certain date in the future. But it will give a Product Owner a clear picture of what to pull for acceptance feedback tonight and tomorrow. Trust through reliability Our trade is lacking trust. Customers don´t trust software companies/departments much. Managers don´t trust developers much. I find that perfectly understandable in the light of what we´re trying to accomplish: delivering software in the face of uncertainty by means of material good production. Customers as well as managers still expect software development to be close to production of houses or cars. But that´s a fundamental misunderstanding. Software development ist development. It´s basically research. As software developers we´re constantly executing experiments to find out what really provides value to users. We don´t know what they need, we just have mediated hypothesises. That´s why we cannot reliably deliver on preposterous demands. So trust is out of the window in no time. If we switch to delivering in short cycles, though, we can regain trust. Because estimates - explicit or implicit - up to 32 hours at most can be satisfied. I´d say: reliability over scope. It´s more important to reliably deliver what was promised then to cover a lot of requirement area. So when in doubt promise less - but deliver without delay. Deliver on scope (Functionality and Quality); but also deliver on Evolvability, i.e. on inner quality according to accepted principles. Always. Trust will be the reward. Less complexity of communication will follow. More goodwill buffer will follow. So don´t wait for some Kanban board to show you, that flow can be improved by scheduling smaller stories. You don´t need to learn that the hard way. Just start with small batch sizes of three different sizes. Fast feedback What has been finished can be checked for acceptance. Why wait for a sprint of several weeks to end? Why let the mental model of the issue and its solution dissipate? If you get final feedback after one or two weeks, you hardly remember what you did and why you did it. Resoning becomes hard. But more importantly youo probably are not in the mood anymore to go back to something you deemed done a long time ago. It´s boring, it´s frustrating to open up that mental box again. Learning is harder the longer it takes from event to feedback. Effort can be wasted between event (finishing an issue) and feedback, because other work might go in the wrong direction based on false premises. Checking finished issues for acceptance is the most important task of a Product Owner. It´s even more important than planning new issues. Because as long as work started is not released (accepted) it´s potential waste. So before starting new work better make sure work already done has value. By putting the emphasis on acceptance rather than planning true pull is established. As long as planning and starting work is more important, it´s a push process. Accept a Noon issue on the same day before leaving. Accept a Tonight issue before leaving today or first thing tomorrow morning. Accept a Tomorrow issue tomorrow night before leaving or early the day after tomorrow. After acceptance the developer(s) can start working on the next issue. Flexibility As if reliability/trust and fast feedback for less waste weren´t enough economic incentive, there is flexibility. After each issue the Product Owner can change course. If on Monday morning feature slices A, B, C, D, E were important and A, B, C were scheduled for acceptance by Monday evening and Tuesday evening, the Product Owner can change her mind at any time. Maybe after A got accepted she asks for continuation with D. But maybe, just maybe, she has gotten a completely different idea by then. Maybe she wants work to continue on F. And after B it´s neither D nor E, but G. And after G it´s D. With Spinning every 32 hours at latest priorities can be changed. And nothing is lost. Because what got accepted is of value. It provides an incremental value to the customer/user. Or it provides internal value to the Product Owner as increased knowledge/decreased uncertainty. I find such reactivity over commitment economically very benefical. Why commit a team to some workload for several weeks? It´s unnecessary at beast, and inflexible and wasteful at worst. If we cannot promise delivery of a certain scope on a certain date - which is what customers/management usually want -, we can at least provide them with unpredecented flexibility in the face of high uncertainty. Where the path is not clear, cannot be clear, make small steps so you´re able to change your course at any time. Premature completion Customers/management are used to premeditating budgets. They want to know exactly how much to pay for a certain amount of requirements. That´s understandable. But it does not match with the nature of software development. We should know that by now. Maybe there´s somewhere in the world some team who can consistently deliver on scope, quality, and time, and budget. Great! Congratulations! I, however, haven´t seen such a team yet. Which does not mean it´s impossible, but I think it´s nothing I can recommend to strive for. Rather I´d say: Don´t try this at home. It might hurt you one way or the other. However, what we can do, is allow customers/management stop work on features at any moment. With spinning every 32 hours a feature can be declared as finished - even though it might not be completed according to initial definition. I think, progress over completion is an important offer software development can make. Why think in terms of completion beyond a promise for the next 32 hours? Isn´t it more important to constantly move forward? Step by step. We´re not running sprints, we´re not running marathons, not even ultra-marathons. We´re in the sport of running forever. That makes it futile to stare at the finishing line. The very concept of a burn-down chart is misleading (in most cases). Whoever can only think in terms of completed requirements shuts out the chance for saving money. The requirements for a features mostly are uncertain. So how does a Product Owner know in the first place, how much is needed. Maybe more than specified is needed - which gets uncovered step by step with each finished increment. Maybe less than specified is needed. After each 4–32 hour increment the Product Owner can do an experient (or invite users to an experiment) if a particular trait of the software system is already good enough. And if so, she can switch the attention to a different aspect. In the end, requirements A, B, C then could be finished just 70%, 80%, and 50%. What the heck? It´s good enough - for now. 33% money saved. Wouldn´t that be splendid? Isn´t that a stunning argument for any budget-sensitive customer? You can save money and still get what you need? Pull on practices So far, in addition to more trust, more flexibility, less money spent, Spinning led to “doing less” which also means less code which of course means higher Evolvability per se. Last but not least, though, I think Spinning´s short acceptance cycles have one more effect. They excert pull-power on all sorts of practices known for increasing Evolvability. If, for example, you believe high automated test coverage helps Evolvability by lowering the fear of inadverted damage to a code base, why isn´t 90% of the developer community practicing automated tests consistently? I think, the answer is simple: Because they can do without. Somehow they manage to do enough manual checks before their rare releases/acceptance checks to ensure good enough correctness - at least in the short term. The same goes for other practices like component orientation, continuous build/integration, code reviews etc. None of that is compelling, urgent, imperative. Something else always seems more important. So Evolvability principles and practices fall through the cracks most of the time - until a project hits a wall. Then everybody becomes desperate; but by then (re)gaining Evolvability has become as very, very difficult and tedious undertaking. Sometimes up to the point where the existence of a project/company is in danger. With Spinning that´s different. If you´re practicing Spinning you cannot avoid all those practices. With Spinning you very quickly realize you cannot deliver reliably even on your 32 hour promises. Spinning thus is pulling on developers to adopt principles and practices for Evolvability. They will start actively looking for ways to keep their delivery rate high. And if not, management will soon tell them to do that. Because first the Product Owner then management will notice an increasing difficulty to deliver value within 32 hours. There, finally there emerges a way to measure Evolvability: The more frequent developers tell the Product Owner there is no way to deliver anything worth of feedback until tomorrow night, the poorer Evolvability is. Don´t count the “WTF!”, count the “No way!” utterances. In closing For sustainable software development we need to put Evolvability first. Functionality and Quality must not rule software development but be implemented within a framework ensuring (enough) Evolvability. Since Evolvability cannot be measured easily, I think we need to put software development “under pressure”. Software needs to be changed more often, in smaller increments. Each increment being relevant to the customer/user in some way. That does not mean each increment is worthy of shipment. It´s sufficient to gain further insight from it. Increments primarily serve the reduction of uncertainty, not sales. Sales even needs to be decoupled from this incremental progress. No more promises to sales. No more delivery au point. Rather sales should look at a stream of accepted increments (or incremental releases) and scoup from that whatever they find valuable. Sales and marketing need to realize they should work on what´s there, not what might be possible in the future. But I digress… In my view a Spinning cycle - which is not easy to reach, which requires practice - is the core practice to compensate the immeasurability of Evolvability. From start to finish of each issue in 32 hours max - that´s the challenge we need to accept if we´re serious increasing Evolvability. Fortunately higher Evolvability is not the only outcome of Spinning. Customer/management will like the increased flexibility and “getting more bang for the buck”.

Read the article
PTLQueue : a scalable bounded-capacity MPMC queue

- by Dave

Title: Fast concurrent MPMC queue -- I've used the following concurrent queue algorithm enough that it warrants a blog entry. I'll sketch out the design of a fast and scalable multiple-producer multiple-consumer (MPSC) concurrent queue called PTLQueue. The queue has bounded capacity and is implemented via a circular array. Bounded capacity can be a useful property if there's a mismatch between producer rates and consumer rates where an unbounded queue might otherwise result in excessive memory consumption by virtue of the container nodes that -- in some queue implementations -- are used to hold values. A bounded-capacity queue can provide flow control between components. Beware, however, that bounded collections can also result in resource deadlock if abused. The put() and take() operators are partial and wait for the collection to become non-full or non-empty, respectively. Put() and take() do not allocate memory, and are not vulnerable to the ABA pathologies. The PTLQueue algorithm can be implemented equally well in C/C++ and Java. Partial operators are often more convenient than total methods. In many use cases if the preconditions aren't met, there's nothing else useful the thread can do, so it may as well wait via a partial method. An exception is in the case of work-stealing queues where a thief might scan a set of queues from which it could potentially steal. Total methods return ASAP with a success-failure indication. (It's tempting to describe a queue or API as blocking or non-blocking instead of partial or total, but non-blocking is already an overloaded concurrency term. Perhaps waiting/non-waiting or patient/impatient might be better terms). It's also trivial to construct partial operators by busy-waiting via total operators, but such constructs may be less efficient than an operator explicitly and intentionally designed to wait. A PTLQueue instance contains an array of slots, where each slot has volatile Turn and MailBox fields. The array has power-of-two length allowing mod/div operations to be replaced by masking. We assume sensible padding and alignment to reduce the impact of false sharing. (On x86 I recommend 128-byte alignment and padding because of the adjacent-sector prefetch facility). Each queue also has PutCursor and TakeCursor cursor variables, each of which should be sequestered as the sole occupant of a cache line or sector. You can opt to use 64-bit integers if concerned about wrap-around aliasing in the cursor variables. Put(null) is considered illegal, but the caller or implementation can easily check for and convert null to a distinguished non-null proxy value if null happens to be a value you'd like to pass. Take() will accordingly convert the proxy value back to null. An advantage of PTLQueue is that you can use atomic fetch-and-increment for the partial methods. We initialize each slot at index I with (Turn=I, MailBox=null). Both cursors are initially 0. All shared variables are considered "volatile" and atomics such as CAS and AtomicFetchAndIncrement are presumed to have bidirectional fence semantics. Finally T is the templated type. I've sketched out a total tryTake() method below that allows the caller to poll the queue. tryPut() has an analogous construction. Zebra stripping : alternating row colors for nice-looking code listings. See also google code "prettify" : https://code.google.com/p/google-code-prettify/ Prettify is a javascript module that yields the HTML/CSS/JS equivalent of pretty-print. -- pre:nth-child(odd) { background-color:#ff0000; } pre:nth-child(even) { background-color:#0000ff; } border-left: 11px solid #ccc; margin: 1.7em 0 1.7em 0.3em; background-color:#BFB; font-size:12px; line-height:65%; " // PTLQueue : Put(v) : // producer : partial method - waits as necessary assert v != null assert Mask = 1 && (Mask & (Mask+1)) == 0 // Document invariants // doorway step // Obtain a sequence number -- ticket // As a practical concern the ticket value is temporally unique // The ticket also identifies and selects a slot auto tkt = AtomicFetchIncrement (&PutCursor, 1) slot * s = &Slots[tkt & Mask] // waiting phase : // wait for slot's generation to match the tkt value assigned to this put() invocation. // The "generation" is implicitly encoded as the upper bits in the cursor // above those used to specify the index : tkt div (Mask+1) // The generation serves as an epoch number to identify a cohort of threads // accessing disjoint slots while s-Turn != tkt : Pause assert s-MailBox == null s-MailBox = v // deposit and pass message Take() : // consumer : partial method - waits as necessary auto tkt = AtomicFetchIncrement (&TakeCursor,1) slot * s = &Slots[tkt & Mask] // 2-stage waiting : // First wait for turn for our generation // Acquire exclusive "take" access to slot's MailBox field // Then wait for the slot to become occupied while s-Turn != tkt : Pause // Concurrency in this section of code is now reduced to just 1 producer thread // vs 1 consumer thread. // For a given queue and slot, there will be most one Take() operation running // in this section. // Consumer waits for producer to arrive and make slot non-empty // Extract message; clear mailbox; advance Turn indicator // We have an obvious happens-before relation : // Put(m) happens-before corresponding Take() that returns that same "m" for T v = s-MailBox if v != null : s-MailBox = null ST-ST barrier s-Turn = tkt + Mask + 1 // unlock slot to admit next producer and consumer return v Pause tryTake() : // total method - returns ASAP with failure indication for auto tkt = TakeCursor slot * s = &Slots[tkt & Mask] if s-Turn != tkt : return null T v = s-MailBox // presumptive return value if v == null : return null // ratify tkt and v values and commit by advancing cursor if CAS (&TakeCursor, tkt, tkt+1) != tkt : continue s-MailBox = null ST-ST barrier s-Turn = tkt + Mask + 1 return v The basic idea derives from the Partitioned Ticket Lock "PTL" (US20120240126-A1) and the MultiLane Concurrent Bag (US8689237). The latter is essentially a circular ring-buffer where the elements themselves are queues or concurrent collections. You can think of the PTLQueue as a partitioned ticket lock "PTL" augmented to pass values from lock to unlock via the slots. Alternatively, you could conceptualize of PTLQueue as a degenerate MultiLane bag where each slot or "lane" consists of a simple single-word MailBox instead of a general queue. Each lane in PTLQueue also has a private Turn field which acts like the Turn (Grant) variables found in PTL. Turn enforces strict FIFO ordering and restricts concurrency on the slot mailbox field to at most one simultaneous put() and take() operation. PTL uses a single "ticket" variable and per-slot Turn (grant) fields while MultiLane has distinct PutCursor and TakeCursor cursors and abstract per-slot sub-queues. Both PTL and MultiLane advance their cursor and ticket variables with atomic fetch-and-increment. PTLQueue borrows from both PTL and MultiLane and has distinct put and take cursors and per-slot Turn fields. Instead of a per-slot queues, PTLQueue uses a simple single-word MailBox field. PutCursor and TakeCursor act like a pair of ticket locks, conferring "put" and "take" access to a given slot. PutCursor, for instance, assigns an incoming put() request to a slot and serves as a PTL "Ticket" to acquire "put" permission to that slot's MailBox field. To better explain the operation of PTLQueue we deconstruct the operation of put() and take() as follows. Put() first increments PutCursor obtaining a new unique ticket. That ticket value also identifies a slot. Put() next waits for that slot's Turn field to match that ticket value. This is tantamount to using a PTL to acquire "put" permission on the slot's MailBox field. Finally, having obtained exclusive "put" permission on the slot, put() stores the message value into the slot's MailBox. Take() similarly advances TakeCursor, identifying a slot, and then acquires and secures "take" permission on a slot by waiting for Turn. Take() then waits for the slot's MailBox to become non-empty, extracts the message, and clears MailBox. Finally, take() advances the slot's Turn field, which releases both "put" and "take" access to the slot's MailBox. Note the asymmetry : put() acquires "put" access to the slot, but take() releases that lock. At any given time, for a given slot in a PTLQueue, at most one thread has "put" access and at most one thread has "take" access. This restricts concurrency from general MPMC to 1-vs-1. We have 2 ticket locks -- one for put() and one for take() -- each with its own "ticket" variable in the form of the corresponding cursor, but they share a single "Grant" egress variable in the form of the slot's Turn variable. Advancing the PutCursor, for instance, serves two purposes. First, we obtain a unique ticket which identifies a slot. Second, incrementing the cursor is the doorway protocol step to acquire the per-slot mutual exclusion "put" lock. The cursors and operations to increment those cursors serve double-duty : slot-selection and ticket assignment for locking the slot's MailBox field. At any given time a slot MailBox field can be in one of the following states: empty with no pending operations -- neutral state; empty with one or more waiting take() operations pending -- deficit; occupied with no pending operations; occupied with one or more waiting put() operations -- surplus; empty with a pending put() or pending put() and take() operations -- transitional; or occupied with a pending take() or pending put() and take() operations -- transitional. The partial put() and take() operators can be implemented with an atomic fetch-and-increment operation, which may confer a performance advantage over a CAS-based loop. In addition we have independent PutCursor and TakeCursor cursors. Critically, a put() operation modifies PutCursor but does not access the TakeCursor and a take() operation modifies the TakeCursor cursor but does not access the PutCursor. This acts to reduce coherence traffic relative to some other queue designs. It's worth noting that slow threads or obstruction in one slot (or "lane") does not impede or obstruct operations in other slots -- this gives us some degree of obstruction isolation. PTLQueue is not lock-free, however. The implementation above is expressed with polite busy-waiting (Pause) but it's trivial to implement per-slot parking and unparking to deschedule waiting threads. It's also easy to convert the queue to a more general deque by replacing the PutCursor and TakeCursor cursors with Left/Front and Right/Back cursors that can move either direction. Specifically, to push and pop from the "left" side of the deque we would decrement and increment the Left cursor, respectively, and to push and pop from the "right" side of the deque we would increment and decrement the Right cursor, respectively. We used a variation of PTLQueue for message passing in our recent OPODIS 2013 paper. ul { list-style:none; padding-left:0; padding:0; margin:0; margin-left:0; } ul#myTagID { padding: 0px; margin: 0px; list-style:none; margin-left:0;} -- -- There's quite a bit of related literature in this area. I'll call out a few relevant references: Wilson's NYU Courant Institute UltraComputer dissertation from 1988 is classic and the canonical starting point : Operating System Data Structures for Shared-Memory MIMD Machines with Fetch-and-Add. Regarding provenance and priority, I think PTLQueue or queues effectively equivalent to PTLQueue have been independently rediscovered a number of times. See CB-Queue and BNPBV, below, for instance. But Wilson's dissertation anticipates the basic idea and seems to predate all the others. Gottlieb et al : Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors Orozco et al : CB-Queue in Toward high-throughput algorithms on many-core architectures which appeared in TACO 2012. Meneghin et al : BNPVB family in Performance evaluation of inter-thread communication mechanisms on multicore/multithreaded architecture Dmitry Vyukov : bounded MPMC queue (highly recommended) Alex Otenko : US8607249 (highly related). John Mellor-Crummey : Concurrent queues: Practical fetch-and-phi algorithms. Technical Report 229, Department of Computer Science, University of Rochester Thomasson : FIFO Distributed Bakery Algorithm (very similar to PTLQueue). Scott and Scherer : Dual Data Structures I'll propose an optimization left as an exercise for the reader. Say we wanted to reduce memory usage by eliminating inter-slot padding. Such padding is usually "dark" memory and otherwise unused and wasted. But eliminating the padding leaves us at risk of increased false sharing. Furthermore lets say it was usually the case that the PutCursor and TakeCursor were numerically close to each other. (That's true in some use cases). We might still reduce false sharing by incrementing the cursors by some value other than 1 that is not trivially small and is coprime with the number of slots. Alternatively, we might increment the cursor by one and mask as usual, resulting in a logical index. We then use that logical index value to index into a permutation table, yielding an effective index for use in the slot array. The permutation table would be constructed so that nearby logical indices would map to more distant effective indices. (Open question: what should that permutation look like? Possibly some perversion of a Gray code or De Bruijn sequence might be suitable). As an aside, say we need to busy-wait for some condition as follows : "while C == 0 : Pause". Lets say that C is usually non-zero, so we typically don't wait. But when C happens to be 0 we'll have to spin for some period, possibly brief. We can arrange for the code to be more machine-friendly with respect to the branch predictors by transforming the loop into : "if C == 0 : for { Pause; if C != 0 : break; }". Critically, we want to restructure the loop so there's one branch that controls entry and another that controls loop exit. A concern is that your compiler or JIT might be clever enough to transform this back to "while C == 0 : Pause". You can sometimes avoid this by inserting a call to a some type of very cheap "opaque" method that the compiler can't elide or reorder. On Solaris, for instance, you could use :"if C == 0 : { gethrtime(); for { Pause; if C != 0 : break; }}". It's worth noting the obvious duality between locks and queues. If you have strict FIFO lock implementation with local spinning and succession by direct handoff such as MCS or CLH,then you can usually transform that lock into a queue. Hidden commentary and annotations - invisible : * And of course there's a well-known duality between queues and locks, but I'll leave that topic for another blog post. * Compare and contrast : PTLQ vs PTL and MultiLane * Equivalent : Turn; seq; sequence; pos; position; ticket * Put = Lock; Deposit Take = identify and reserve slot; wait; extract & clear; unlock * conceptualize : Distinct PutLock and TakeLock implemented as ticket lock or PTL Distinct arrival cursors but share per-slot "Turn" variable provides exclusive role-based access to slot's mailbox field put() acquires exclusive access to a slot for purposes of "deposit" assigns slot round-robin and then acquires deposit access rights/perms to that slot take() acquires exclusive access to slot for purposes of "withdrawal" assigns slot round-robin and then acquires withdrawal access rights/perms to that slot At any given time, only one thread can have withdrawal access to a slot at any given time, only one thread can have deposit access to a slot Permissible for T1 to have deposit access and T2 to simultaneously have withdrawal access * round-robin for the purposes of; role-based; access mode; access role mailslot; mailbox; allocate/assign/identify slot rights; permission; license; access permission; * PTL/Ticket hybrid Asymmetric usage ; owner oblivious lock-unlock pairing K-exclusion add Grant cursor pass message m from lock to unlock via Slots[] array Cursor performs 2 functions : + PTL ticket + Assigns request to slot in round-robin fashion Deconstruct protocol : explication put() : allocate slot in round-robin fashion acquire PTL for "put" access store message into slot associated with PTL index take() : Acquire PTL for "take" access // doorway step seq = fetchAdd (&Grant, 1) s = &Slots[seq & Mask] // waiting phase while s-Turn != seq : pause Extract : wait for s-mailbox to be full v = s-mailbox s-mailbox = null Release PTL for both "put" and "take" access s-Turn = seq + Mask + 1 * Slot round-robin assignment and lock "doorway" protocol leverage the same cursor and FetchAdd operation on that cursor FetchAdd (&Cursor,1) + round-robin slot assignment and dispersal + PTL/ticket lock "doorway" step waiting phase is via "Turn" field in slot * PTLQueue uses 2 cursors -- put and take. Acquire "put" access to slot via PTL-like lock Acquire "take" access to slot via PTL-like lock 2 locks : put and take -- at most one thread can access slot's mailbox Both locks use same "turn" field Like multilane : 2 cursors : put and take slot is simple 1-capacity mailbox instead of queue Borrow per-slot turn/grant from PTL Provides strict FIFO Lock slot : put-vs-put take-vs-take at most one put accesses slot at any one time at most one put accesses take at any one time reduction to 1-vs-1 instead of N-vs-M concurrency Per slot locks for put/take Release put/take by advancing turn * is instrumental in ... * P-V Semaphore vs lock vs K-exclusion * See also : FastQueues-excerpt.java dice-etc/queue-mpmc-bounded-blocking-circular-xadd/ * PTLQueue is the same as PTLQB - identical * Expedient return; ASAP; prompt; immediately * Lamport's Bakery algorithm : doorway step then waiting phase Threads arriving at doorway obtain a unique ticket number Threads enter in ticket order * In the terminology of Reed and Kanodia a ticket lock corresponds to the busy-wait implementation of a semaphore using an eventcount and a sequencer It can also be thought of as an optimization of Lamport's bakery lock was designed for fault-tolerance rather than performance Instead of spinning on the release counter, processors using a bakery lock repeatedly examine the tickets of their peers --

Read the article
Issue 15: Oracle Exadata Marketing Campaigns

- by rituchhibber

PARTNER FOCUS Oracle ExadataMarketing Campaign Steve McNickleVP Europe, cVidya Steve McNickle is VP Europe for cVidya, an innovative provider of revenue intelligence solutions for telecom, media and entertainment service providers including AT&T, BT, Deutsche Telecom and Vodafone. The company's product portfolio helps operators and service providers maximise margins, improve customer experience and optimise ecosystem relationships through revenue assurance, fraud and security management, sales performance management, pricing analytics, and inter-carrier services. cVidya has partnered with Oracle for more than a decade. RESOURCES -- Oracle PartnerNetwork (OPN) Oracle Exastack Program Oracle Exastack Optimized Oracle Exastack Labs and Enablement Resources Oracle Engineered Systems Oracle Communications cVidya SUBSCRIBE FEEDBACK PREVIOUS ISSUES Are you ready for Oracle OpenWorld this October? -- -- Please could you tell us a little about cVidya's partnering history with Oracle, and expand on your Oracle Exastack accreditations? "cVidya was established just over ten years ago and we've had a strong relationship with Oracle almost since the very beginning. Through our Revenue Intelligence work with some of the world's largest service providers we collect tremendous amounts of information, amounting to billions of records per day. We help our clients to collect, store and analyse that data to ensure that their end customers are getting the best levels of service, are billed correctly, and are happy that they are on the correct price plan. We have been an Oracle Gold level partner for seven years, and crucially just two months ago we were also accredited as Oracle Exastack Optimized for MoneyMap, our core Revenue Assurance solution. Very soon we also expect to be Oracle Exastack Optimized DRMap, our Data Retention solution." What unique capabilities and customer benefits does Oracle Exastack add to your applications? "Oracle Exastack enables us to deliver radical benefits to our customers. A typical mobile operator in the UK might handle between 500 million and two billion call data record details daily. Each transaction needs to be validated, billed correctly and fraud checked. Because of the enormous volumes involved, our clients demand scalable infrastructure that allows them to efficiently acquire, store and process all that data within controlled cost, space and environmental constraints. We have proved that the Oracle Exadata system can process data up to seven times faster and load it as much as 20 times faster than other standard best-of-breed server approaches. With the Oracle Exadata Database Machine they can reduce their datacentre equipment from say, the six or seven cabinets that they needed in the past, down to just one. This dramatic simplification delivers incredible value to the customer by cutting down enormously on all of their significant cost, space, energy, cooling and maintenance overheads." "The Oracle Exastack Program has given our clients the ability to switch their focus from reactive to proactive. Traditionally they may have spent 80 percent of their day processing, and just 20 percent enabling end customers to see advanced analytics, and avoiding issues before they occur. With our solutions and Oracle Exadata they can now switch that balance around entirely, resulting not only in reduced revenue leakage, but a far higher focus on proactive leakage prevention. How has the Oracle Exastack Program transformed your customer business? "We can already see the impact. Oracle solutions allow our delivery teams to achieve successful deployments, happy customers and self-satisfaction, and the power of Oracle's Exa solutions is easy to measure in terms of their transformational ability. We gained our first sale into a major European telco by demonstrating the major performance gains that would transform their business. Clients can measure the ease of organisational change, the early prevention of business issues, the reduction in manpower required to provide protection and coverage across all their products and services, plus of course end customer satisfaction. If customers know that that service is provided accurately and that their bills are calculated correctly, then over time this satisfaction can be attributed to revenue intelligence and the underlying systems which provide it. Combine this with the further integration we have with the other layers of the Oracle stack, including the telecommunications offerings such as NCC, OCDM and BRM, and the result is even greater customer value—not to mention the increased speed to market and the reduced project risk." What does the Oracle Exastack community bring to cVidya, both in terms of general benefits, and also tangible new opportunities and partnerships? "A great deal. We have participated in the Oracle Exastack community heavily over the past year, and have had lots of meetings with Oracle and our peers around the globe. It brings us into contact with like-minded, innovative partners, who like us are not happy to just stand still and want to take fresh technology to their customer base in order to gain enhanced value. We identified three new partnerships in each of two recent meetings, and hope these will open up new opportunities, not only in areas that exactly match where we operate today, but also in some new associative areas that will expand our reach into new business sectors. Notably, thanks to the Exastack community we were invited on stage at last year's Oracle OpenWorld conference. Appearing so publically with Oracle senior VP Judson Althoff elevated awareness and visibility of cVidya and has enabled us to participate in a number of other events with Oracle over the past eight months. We've been involved in speaking opportunities, forums and exhibitions, providing us with invaluable opportunities that we wouldn't otherwise have got close to." How has Exastack differentiated cVidya as an ISV, and helped you to evolve your business to the next level? "When we are selling to our core customer base of Tier 1 telecommunications providers, we know that they want more than just software. They want an enduring partnership that will last many years, they want innovation, and a forward thinking partner who knows how to guide them on where they need to be to meet market demand three, five or seven years down the line. Membership of respected global bodies, such as the Telemanagement Forum enables us to lead standard adherence in our area of business, giving us a lot of credibility, but Oracle is also involved in this forum with its own telecommunications portfolio, strengthening our position still further. When we approach CEOs, CTOs and CIOs at the very largest Tier 1 operators, not only can we easily show them that our technology is fantastic, we can also talk about our strong partnership with Oracle, and our joint embracing of today's standards and tomorrow's innovation." Where would you like cVidya to be in one year's time? "We want to get all of our relevant products Oracle Exastack Optimized. Our MoneyMap Revenue Assurance solution is already Exastack Optimised, our DRMAP Data Retention Solution should be Exastack Optimised within the next month, and our FraudView Fraud Management solution within the next two to three months. We'd then like to extend our Oracle accreditation out to include other members of the Oracle Engineered Systems family. We are moving into the 'Big Data' space, and so we're obviously very keen to work closely with Oracle to conduct pilots, map new technologies onto Oracle Big Data platforms, and embrace and measure the benefits of other Oracle systems, namely Oracle Exalogic Elastic Cloud, the Oracle Exalytics In-Memory Machine and the Oracle SPARC SuperCluster. We would also like to examine how the Oracle Database Appliance might benefit our Tier 2 service provider customers. Finally, we'd also like to continue working with the Oracle Communications Global Business Unit (CGBU), furthering our integration with Oracle billing products so that we are able to quickly deploy fraud solutions into Oracle's Engineered System stack, give operational benefits to our clients that are pre-integrated, more cost-effective, and can be rapidly deployed rapidly and producing benefits in three months, not nine months." Chris Baker ,Senior Vice President, Oracle Worldwide ISV-OEM-Java Sales Chris Baker is the Global Head of ISV/OEM Sales responsible for working with ISV/OEM partners to maximise Oracle's business through those partners, whilst maximising those partners' business to their end users. Chris works with partners, customers, innovators, investors and employees to develop innovative business solutions using Oracle products, services and skills. Firstly, could you please explain Oracle's current strategy for ISV partners, globally and in EMEA? "Oracle customers use independent software vendor (ISV) applications to run their businesses. They use them to generate revenue and to fulfil obligations to their own customers. Our strategy is very straight-forward. We want all of our ISV partners and OEMs to concentrate on the things that they do the best – building applications to meet the unique industry and functional requirements of their customer. We want to ensure that we deliver a best in class application platform so the ISV is free to concentrate their effort on their application functionality and user experience We invest over four billion dollars in research and development every year, and we want our ISVs to benefit from all of that investment in operating systems, virtualisation, databases, middleware, engineered systems, and other hardware. By doing this, we help them to reduce their costs, gain more consistency and agility for quicker implementations, and also rapidly differentiate themselves from other application vendors. It's all about simplification because we believe that around 25 to 30 percent of the development costs incurred by many ISVs are caused by customising infrastructure and have nothing to do with their applications. Our strategy is to enable our ISV partners to standardise their application platform using engineered architecture, so they can write once to the Oracle stack and deploy seamlessly in the cloud, on-premise, or in hybrid deployments. It's really important that architecture is the same in order to keep cost and time overheads at a minimum, so we provide standardisation and an environment that enables our ISVs to concentrate on the core business that makes them the most money and brings them success." How do you believe this strategy is helping the ISVs to work hand-in-hand with Oracle to ensure that end customers get the industry-leading solutions that they need? "We work with our ISVs not just to help them be successful, but also to help them market themselves. We have something called the 'Oracle Exastack Ready Program', which enables ISVs to publicise themselves as 'Ready' to run the core software platforms that run on Oracle's engineered systems including Exadata and Exalogic. So, for example, they can become 'Database Ready' which means that they use the latest version of Oracle Database and therefore can run their application without modification on Exadata or the Oracle Database Appliance. Alternatively, they can become WebLogic Ready, Oracle Linux Ready and Oracle Solaris Ready which means they run on the latest release and therefore can run their application, with no new porting work, on Oracle Exalogic. Those 'Ready' logos are important in helping ISVs advertise to their customers that they are using the latest technologies which have been fully tested. We now also have Exadata Ready and Exalogic Ready programmes which allow ISVs to promote the certification of their applications on these platforms. This highlights these partners to Oracle customers as having solutions that run fluently on the Oracle Exadata Database Machine, the Oracle Exalogic Elastic Cloud or one of our other engineered systems. This makes it easy for customers to identify solutions and provides ISVs with an avenue to connect with Oracle customers who are rapidly adopting engineered systems. We have also taken this programme to the next level in the shape of 'Oracle Exastack Optimized' for partners whose applications run best on the Oracle stack and have invested the time to fully optimise application performance. We ensure that Exastack Optimized partner status is promoted and supported by press releases, and we help our ISVs go to market and differentiate themselves through the use our technology and the standardisation it delivers. To date we have had several hundred organisations successfully work through our Exastack Optimized programme." How does Oracle's strategy of offering pre-integrated open platform software and hardware allow ISVs to bring their products to market more quickly? "One of the problems for many ISVs is that they have to think very carefully about the technology on which their solutions will be deployed, particularly in the cloud or hosted environments. They have to think hard about how they secure these environments, whether the concern is, for example, middleware, identity management, or securing personal data. If they don't use the technology that we build-in to our products to help them to fulfil these roles, they then have to build it themselves. This takes time, requires testing, and must be maintained. By taking advantage of our technology, partners will now know that they have a standard platform. They will know that they can confidently talk about implementation being the same every time they do it. Very large ISV applications could once take a year or two to be implemented at an on-premise environment. But it wasn't just the configuration of the application that took the time, it was actually the infrastructure - the different hardware configurations, operating systems and configurations of databases and middleware. Now we strongly believe that it's all about standardisation and repeatability. It's about making sure that our partners can do it once and are then able to roll it out many different times using standard componentry." What actions would you recommend for existing ISV partners that are looking to do more business with Oracle and its customer base, not only to maximise benefits, but also to maximise partner relationships? "My team, around the world and in the EMEA region, is available and ready to talk to any of our ISVs and to explore the possibilities together. We run programmes like 'Excite' and 'Insight' to help us to understand how we can help ISVs with architecture and widen their environments. But we also want to work with, and look at, new opportunities - for example, the Machine-to-Machine (M2M) market or 'The Internet of Things'. Over the next few years, many millions, indeed billions of devices will be collecting massive amounts of data and communicating it back to the central systems where ISVs will be running their applications. The only way that our partners will be able to provide a single vendor 'end-to-end' solution is to use Oracle integrated systems at the back end and Java on the 'smart' devices collecting the data – a complete solution from device to data centre. So there are huge opportunities to work closely with our ISVs, using Oracle's complete M2M platform, to provide the infrastructure that enables them to extract maximum value from the data collected. If any partners don't know where to start or who to contact, then they can contact me directly at [email protected] or indeed any of our teams across the EMEA region. We want to work with ISVs to help them to be as successful as they possibly can through simplification and speed to market, and we also want all of the top ISVs in the world based on Oracle." What opportunities are immediately opened to new ISV partners joining the OPN? "As you know OPN is very, very important. New members will discover a huge amount of content that instantly becomes accessible to them. They can access a wealth of no-cost training and enablement materials to build their expertise in Oracle technology. They can download Oracle software and use it for development projects. They can help themselves become more competent by becoming part of a true community and uncovering new opportunities by working with Oracle and their peers in the Oracle Partner Network. As well as publishing massive amounts of information on OPN, we also hold our global Oracle OpenWorld event, at which partners play a huge role. This takes place at the end of September and the beginning of October in San Francisco. Attending ISV partners have an unrivalled opportunity to contribute to elements such as the OpenWorld / OPN Exchange, at which they can talk to other partners and really begin thinking about how they can move their businesses on and play key roles in a very large ecosystem which revolves around technology and standardisation." Finally, are there any other messages that you would like to share with the Oracle ISV community? "The crucial message that I always like to reinforce is architecture, architecture and architecture! The key opportunities that ISVs have today revolve around standardising their architectures so that they can confidently think: “I will I be able to do exactly the same thing whenever a customer is looking to deploy on-premise, hosted or in the cloud”. The right architecture is critical to being competitive and to really start changing the game. We want to help our ISV partners to do just that; to establish standard architecture and to seize the opportunities it opens up for them. New market opportunities like M2M are enormous - just look at how many devices are all around you right now. We can help our partners to interface with these devices more effectively while thinking about their entire ecosystem, rather than just the piece that they have traditionally focused upon. With standardised architecture, we can help people dramatically improve their speed, reach, agility and delivery of enhanced customer satisfaction and value all the way from the Java side to their centralised systems. All Oracle ISV partners must take advantage of these opportunities, which is why Oracle will continue to invest in and support them." -- Gergely Strbik is Oracle Hardware and Software Product Manager for Avnet in Hungary. Avnet Technology Solutions is an OracleValue Added Distributor focused on the development of the existing Oracle channel. This includes the recruitment and enablement of Oracle partners as well as driving deeper adoption of Oracle's technology and application products within the IT channel. "The main business benefits of ODA for our customers and partners are scalability, flexibility, a great price point for the high performance delivered, and the easily configurable embedded Linux operating system. People welcome a lower point of entry and the ability to grow capacity on demand as their business expands." "Marketing and selling the ODA requires another way of thinking because it is an appliance. We have to transform the ways in which our partners and customers think from buying hardware and software independently to buying complete solutions. Successful early adopters and satisfied customer reactions will certainly help us to sell the ODA. We will have more experience with the product after the first deliveries and installations—end users need to see the power and benefits for themselves." "Our typical ODA customers will be those looking for complete solutions from a single reseller partner who is also able to manage the appliance. They will have enjoyed using Oracle Database but now want a new product that is able to unlock new levels of performance. A higher proportion of potential customers will come from our existing Oracle base, with around 30% from new business, but we intend to evangelise the ODA on the market to see how we can change this balance as all our customers adjust to the concept of 'Hardware and Software, Engineered to Work Together'. -- Back to the welcome page

Read the article
Optimized OCR black/white pixel algorithm

- by eagle

I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?

Read the article

< Previous Page | 6 7 8 9 10