schema compare - Page 137

IMAPSync Migration to Exchange 2010 SP1: Exchange drops connections while checking for existence of folders

- by Benjamin Priestman

I'm migrating from ZImbra Collaboration Suite to Exchange 2010 SP1. I'm testing IMAPSync as a possible migration tool and have hit a problem with the IMAP server in Exchange 2010. For each account it migrates, IMAPSync loops through the list of folders in the source mailbox and tests for the existence of each one in the destination mailbox. It then goes on to create those folders that do not exist and copy over the messages. It's the intial testing for the existence of the folders that is giving me a problem. The response given by the Exchange server when the folder does not yet exist is given as an error: "R=""16 NO IMAPSyncTest/8 doesn't exist."" After ten of these errors have been issued in succession, the Exchange server appears to stop responding to the IMAP session. Enabling protocol logging for IMAP confirms that the 10th request for a non-existant folder is the last request to be logged on the server. IMAPSync carries on merrily without seeming to realise its connection has gone and thus fails to create any folders. I've logged this with the tool's creator. Does anyone have any idea why Exchange is stopping responding to the connections though? The behaviour looks rather like throttling, although the 'ten strikes and you're out' trigger does not seem to correspond to any of the triggers on the ThrottlingPolicies. Just to check, I've tried creating a new ThrottlingPolicy, turned everything that I think might be relevant up to 11 and applied it to the my test mailbox. Policy settings are listed below, along with IMAP settings. Everything else should be pretty much as default. Throttling Policy RunspaceId : afa3159c-32a6-4906-986f-8adfbe50868b IsDefault : False AnonymousMaxConcurrency : 1 AnonymousPercentTimeInAD : AnonymousPercentTimeInCAS : AnonymousPercentTimeInMailboxRPC : EASMaxConcurrency : 10 EASPercentTimeInAD : EASPercentTimeInCAS : EASPercentTimeInMailboxRPC : EASMaxDevices : 10 EASMaxDeviceDeletesPerMonth : EWSMaxConcurrency : 10 EWSPercentTimeInAD : 50 EWSPercentTimeInCAS : 90 EWSPercentTimeInMailboxRPC : 60 EWSMaxSubscriptions : 5000 EWSFastSearchTimeoutInSeconds : 60 EWSFindCountLimit : 1000 IMAPMaxConcurrency : 1000 IMAPPercentTimeInAD : 400 IMAPPercentTimeInCAS : 400 IMAPPercentTimeInMailboxRPC : 400 OWAMaxConcurrency : 5 OWAPercentTimeInAD : 30 OWAPercentTimeInCAS : 150 OWAPercentTimeInMailboxRPC : 150 POPMaxConcurrency : 20 POPPercentTimeInAD : POPPercentTimeInCAS : POPPercentTimeInMailboxRPC : PowerShellMaxConcurrency : 18 PowerShellMaxTenantConcurrency : PowerShellMaxCmdlets : PowerShellMaxCmdletsTimePeriod : ExchangeMaxCmdlets : PowerShellMaxCmdletQueueDepth : PowerShellMaxDestructiveCmdlets : PowerShellMaxDestructiveCmdletsTimePeriod : RCAMaxConcurrency : 1000 RCAPercentTimeInAD : 400 RCAPercentTimeInCAS : 400 RCAPercentTimeInMailboxRPC : 400 CPAMaxConcurrency : 20 CPAPercentTimeInCAS : 205 CPAPercentTimeInMailboxRPC : 200 MessageRateLimit : RecipientRateLimit : ForwardeeLimit : CPUStartPercent : 75 AdminDisplayName : ExchangeVersion : 0.10 (14.0.100.0) Name : TestMigrationThrottling DistinguishedName : CN=TestMigrationThrottling,CN=Global Settings,CN=Our Company,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=cimex,DC=com Identity : TestMigrationThrottling Guid : 240049b3-2023-4df1-8edc-fbfc1fc80b87 ObjectCategory : domain.com/Configuration/Schema/ms-Exch-Throttling-Policy ObjectClass : {top, msExchGenericPolicy, msExchThrottlingPolicy} WhenChanged : 21/04/2011 18:48:19 WhenCreated : 21/04/2011 18:07:20 WhenChangedUTC : 21/04/2011 17:48:19 WhenCreatedUTC : 21/04/2011 17:07:20 OrganizationId : OriginatingServer : a-domain-controller IsValid : True IMAPSettings RunspaceId : afa3159c-32a6-4906-986f-8adfbe50868b ProtocolName : IMAP4 Name : 1 MaxCommandSize : 10240 ShowHiddenFoldersEnabled : False UnencryptedOrTLSBindings : {192.168.x.x:143} SSLBindings : {192.168.x.x:993} InternalConnectionSettings : {mail.office.domain.com:143:TLS, mail.office.domain.com:993:SSL} ExternalConnectionSettings : {mail.office.domain.com:143:TLS, mail.office.domain.com:993:SSL} X509CertificateName : mail.domain.com Banner : The Microsoft Exchange IMAP4 service is ready. LoginType : SecureLogin AuthenticatedConnectionTimeout : 00:30:00 PreAuthenticatedConnectionTimeout : 00:01:00 MaxConnections : 2147483647 MaxConnectionFromSingleIP : 2147483647 MaxConnectionsPerUser : 16 MessageRetrievalMimeFormat : BestBodyFormat ProxyTargetPort : 143 CalendarItemRetrievalOption : iCalendar OwaServerUrl : EnableExactRFC822Size : False LiveIdBasicAuthReplacement : False SuppressReadReceipt : False ProtocolLogEnabled : True EnforceCertificateErrors : False LogFileLocation : C:\Program Files\Microsoft\Exchange Server\V14\Logging\Imap4 LogFileRollOverSettings : Daily LogPerFileSizeQuota : 0 B (0 bytes) ExtendedProtectionPolicy : None EnableGSSAPIAndNTLMAuth : True Server : CMX-OFFICE-EX01 AdminDisplayName : ExchangeVersion : 0.10 (14.0.100.0) DistinguishedName : CN=1,CN=IMAP4,CN=Protocols,CN=EXCHANGE01,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Our COmpany,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com Identity : EXCHANGE01\1 Guid : 48f9dc37-74c2-4fb0-a042-641f863f45f2 ObjectCategory : domain.com/Configuration/Schema/ms-Exch-Protocol-Cfg-IMAP-Server ObjectClass : {top, protocolCfg, protocolCfgIMAP, protocolCfgIMAPServer} WhenChanged : 21/04/2011 17:03:39 WhenCreated : 15/04/2011 13:51:58 WhenChangedUTC : 21/04/2011 16:03:39 WhenCreatedUTC : 15/04/2011 12:51:58 OrganizationId : OriginatingServer : a-domain-server IsValid : True

Read the article

Trying to connect phpMyAdmin to remote mySQL server ( 2002: can't connect )

- by Malcolm Jones

Trying to get phpMyAdmin to talk to a remote mySQL server. The config is below and there is already a user set up in mySQL DB to be able to log in from the specified host that PMA sits on. Hosting is provided by Rackspace (Rightscale) and both cloud servers behind the same firewall. [config.inc.php] <?php $cfg['blowfish_secret'] = ''; $i = 0; $i++; $cfg['Servers'][$i]['host'] = 'XX.XX.XX.XX'; // MySQL hostname or IP address $cfg['Servers'][$i]['port'] = ''; // MySQL port - leave blank for default port $cfg['Servers'][$i]['socket'] = ''; // Path to the socket - leave blank for default socket $cfg['Servers'][$i]['connect_type'] = 'tcp'; // How to connect to MySQL server ('tcp' or 'socket') $cfg['Servers'][$i]['extension'] = 'mysql'; // The php MySQL extension to use ('mysql' or 'mysqli') $cfg['Servers'][$i]['compress'] = FALSE; // Use compressed protocol for the MySQL connection // (requires PHP >= 4.3.0) $cfg['Servers'][$i]['controluser'] = ''; // MySQL control user settings // (this user must have read-only $cfg['Servers'][$i]['controlpass'] = ''; // access to the "mysql/user" // and "mysql/db" tables). // The controluser is also // used for all relational // features (pmadb) $cfg['Servers'][$i]['auth_type'] = 'config'; // Authentication method (config, http or cookie based)? $cfg['Servers'][$i]['user'] = 'USERNAME'; // MySQL user $cfg['Servers'][$i]['password'] = 'PASSWORD'; // MySQL password (only needed // with 'config' auth_type) $cfg['Servers'][$i]['only_db'] = ''; // If set to a db-name, only // this db is displayed in left frame // It may also be an array of db-names, where sorting order is relevant. $cfg['Servers'][$i]['hide_db'] = ''; // Database name to be hidden from listings $cfg['Servers'][$i]['verbose'] = ''; // Verbose name for this host - leave blank to show the hostname $cfg['Servers'][$i]['pmadb'] = ''; // Database used for Relation, Bookmark and PDF Features // (see scripts/create_tables.sql) // - leave blank for no support // DEFAULT: 'phpmyadmin' $cfg['Servers'][$i]['bookmarktable'] = ''; // Bookmark table // - leave blank for no bookmark support // DEFAULT: 'pma_bookmark' $cfg['Servers'][$i]['relation'] = ''; // table to describe the relation between links (see doc) // - leave blank for no relation-links support // DEFAULT: 'pma_relation' $cfg['Servers'][$i]['table_info'] = ''; // table to describe the display fields // - leave blank for no display fields support // DEFAULT: 'pma_table_info' $cfg['Servers'][$i]['table_coords'] = ''; // table to describe the tables position for the PDF schema // - leave blank for no PDF schema support // DEFAULT: 'pma_table_coords' $cfg['Servers'][$i]['pdf_pages'] = ''; // table to describe pages of relationpdf // - leave blank if you don't want to use this // DEFAULT: 'pma_pdf_pages' $cfg['Servers'][$i]['column_info'] = ''; // table to store column information // - leave blank for no column comments/mime types // DEFAULT: 'pma_column_info' $cfg['Servers'][$i]['history'] = ''; // table to store SQL history // - leave blank for no SQL query history // DEFAULT: 'pma_history' $cfg['Servers'][$i]['verbose_check'] = TRUE; // set to FALSE if you know that your pma_* tables // are up to date. This prevents compatibility // checks and thereby increases performance. $cfg['Servers'][$i]['AllowRoot'] = TRUE; // whether to allow root login $cfg['Servers'][$i]['AllowDeny']['order'] // Host authentication order, leave blank to not use = ''; $cfg['Servers'][$i]['AllowDeny']['rules'] // Host authentication rules, leave blank for defaults = array(); Please let me know if you need anymore info. -- Malcolm

Read the article

Installing Lubuntu 14.04.1 forcepae fails

- by Rantanplan

I tried to install Lubuntu 14.04.1 from a CD. First, I chose Try Lubuntu without installing which gave: ERROR: PAE is disabled on this Pentium M (PAE can potentially be enabled with kernel parameter "forcepae" ... Following the description on https://help.ubuntu.com/community/PAE, I used forcepae and tried Try Lubuntu without installing again. That worked fine. dmesg | grep -i pae showed: [ 0.000000] Kernel command line: file=/cdrom/preseed/lubuntu.seed boot=casper initrd=/casper/initrd.lz quiet splash -- forcepae [ 0.008118] PAE forced! On the live-CD session, I tried installing Lubuntu double clicking on the install button on the desktop. Here, the CD starts running but then stops running and nothing happens. Next, I rebooted and tried installing Lubuntu directly from the boot menu screen using forcepae again. After a while, I receive the following error message: The installer encountered an unrecoverable error. A desktop session will now be run so that you may investigate the problem or try installing again. Hitting Enter brings me to the desktop. For what errors should I search? And how? Finally, I rebooted once more and tried Check disc for defects with forcepae option; no errors have been found. Now, I am wondering how to find the error or whether it would be better to follow advice c in https://help.ubuntu.com/community/PAE: "Move the hard disk to a computer on which the processor has PAE capability and PAE flag (that is, almost everything else than a Banias). Install the system as usual but don't add restricted drivers. After the install move the disk back." Thanks for some hints! Perhaps some of the following can help: On Lubuntu 12.04: cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 13 model name : Intel(R) Pentium(R) M processor 1.50GHz stepping : 6 microcode : 0x17 cpu MHz : 600.000 cache size : 2048 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr mce cx8 mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2 bogomips : 1284.76 clflush size : 64 cache_alignment : 64 address sizes : 32 bits physical, 32 bits virtual power management: uname -a Linux humboldt 3.2.0-67-generic #101-Ubuntu SMP Tue Jul 15 17:45:51 UTC 2014 i686 i686 i386 GNU/Linux lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 12.04.5 LTS Release: 12.04 Codename: precise cpuid eax in eax ebx ecx edx 00000000 00000002 756e6547 6c65746e 49656e69 00000001 000006d6 00000816 00000180 afe9f9bf 00000002 02b3b001 000000f0 00000000 2c04307d 80000000 80000004 00000000 00000000 00000000 80000001 00000000 00000000 00000000 00000000 80000002 20202020 20202020 65746e49 2952286c 80000003 6e655020 6d756974 20295228 7270204d 80000004 7365636f 20726f73 30352e31 007a4847 Vendor ID: "GenuineIntel"; CPUID level 2 Intel-specific functions: Version 000006d6: Type 0 - Original OEM Family 6 - Pentium Pro Model 13 - Stepping 6 Reserved 0 Brand index: 22 [not in table] Extended brand string: " Intel(R) Pentium(R) M processor 1.50GHz" CLFLUSH instruction cache line size: 8 Feature flags afe9f9bf: FPU Floating Point Unit VME Virtual 8086 Mode Enhancements DE Debugging Extensions PSE Page Size Extensions TSC Time Stamp Counter MSR Model Specific Registers MCE Machine Check Exception CX8 COMPXCHG8B Instruction SEP Fast System Call MTRR Memory Type Range Registers PGE PTE Global Flag MCA Machine Check Architecture CMOV Conditional Move and Compare Instructions FGPAT Page Attribute Table CLFSH CFLUSH instruction DS Debug store ACPI Thermal Monitor and Clock Ctrl MMX MMX instruction set FXSR Fast FP/MMX Streaming SIMD Extensions save/restore SSE Streaming SIMD Extensions instruction set SSE2 SSE2 extensions SS Self Snoop TM Thermal monitor 31 reserved TLB and cache info: b0: unknown TLB/cache descriptor b3: unknown TLB/cache descriptor 02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries f0: unknown TLB/cache descriptor 7d: unknown TLB/cache descriptor 30: unknown TLB/cache descriptor 04: Data TLB: 4MB pages, 4-way set assoc, 8 entries 2c: unknown TLB/cache descriptor On Lubuntu 14.04.1 live-CD with forcepae: cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 13 model name : Intel(R) Pentium(R) M processor 1.50GHz stepping : 6 microcode : 0x17 cpu MHz : 600.000 cache size : 2048 KB physical id : 0 siblings : 1 core id : 0 cpu cores : 1 apicid : 0 initial apicid : 0 fdiv_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe bts est tm2 bogomips : 1284.68 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 32 bits virtual power management: uname -a Linux lubuntu 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:12 UTC 2014 i686 i686 i686 GNU/Linux lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 14.04.1 LTS Release: 14.04 Codename: trusty cpuid CPU 0: vendor_id = "GenuineIntel" version information (1/eax): processor type = primary processor (0) family = Intel Pentium Pro/II/III/Celeron/Core/Core 2/Atom, AMD Athlon/Duron, Cyrix M2, VIA C3 (6) model = 0xd (13) stepping id = 0x6 (6) extended family = 0x0 (0) extended model = 0x0 (0) (simple synth) = Intel Pentium M (Dothan B1) / Celeron M (Dothan B1), 90nm miscellaneous (1/ebx): process local APIC physical ID = 0x0 (0) cpu count = 0x0 (0) CLFLUSH line size = 0x8 (8) brand index = 0x16 (22) brand id = 0x16 (22): Intel Pentium M, .13um feature information (1/edx): x87 FPU on chip = true virtual-8086 mode enhancement = true debugging extensions = true page size extensions = true time stamp counter = true RDMSR and WRMSR support = true physical address extensions = false machine check exception = true CMPXCHG8B inst. = true APIC on chip = false SYSENTER and SYSEXIT = true memory type range registers = true PTE global bit = true machine check architecture = true conditional move/compare instruction = true page attribute table = true page size extension = false processor serial number = false CLFLUSH instruction = true debug store = true thermal monitor and clock ctrl = true MMX Technology = true FXSAVE/FXRSTOR = true SSE extensions = true SSE2 extensions = true self snoop = true hyper-threading / multi-core supported = false therm. monitor = true IA64 = false pending break event = true feature information (1/ecx): PNI/SSE3: Prescott New Instructions = false PCLMULDQ instruction = false 64-bit debug store = false MONITOR/MWAIT = false CPL-qualified debug store = false VMX: virtual machine extensions = false SMX: safer mode extensions = false Enhanced Intel SpeedStep Technology = true thermal monitor 2 = true SSSE3 extensions = false context ID: adaptive or shared L1 data = false FMA instruction = false CMPXCHG16B instruction = false xTPR disable = false perfmon and debug = false process context identifiers = false direct cache access = false SSE4.1 extensions = false SSE4.2 extensions = false extended xAPIC support = false MOVBE instruction = false POPCNT instruction = false time stamp counter deadline = false AES instruction = false XSAVE/XSTOR states = false OS-enabled XSAVE/XSTOR = false AVX: advanced vector extensions = false F16C half-precision convert instruction = false RDRAND instruction = false hypervisor guest status = false cache and TLB information (2): 0xb0: instruction TLB: 4K, 4-way, 128 entries 0xb3: data TLB: 4K, 4-way, 128 entries 0x02: instruction TLB: 4M pages, 4-way, 2 entries 0xf0: 64 byte prefetching 0x7d: L2 cache: 2M, 8-way, sectored, 64 byte lines 0x30: L1 cache: 32K, 8-way, 64 byte lines 0x04: data TLB: 4M pages, 4-way, 8 entries 0x2c: L1 data cache: 32K, 8-way, 64 byte lines extended feature flags (0x80000001/edx): SYSCALL and SYSRET instructions = false execution disable = false 1-GB large page support = false RDTSCP = false 64-bit extensions technology available = false Intel feature flags (0x80000001/ecx): LAHF/SAHF supported in 64-bit mode = false LZCNT advanced bit manipulation = false 3DNow! PREFETCH/PREFETCHW instructions = false brand = " Intel(R) Pentium(R) M processor 1.50GHz" (multi-processing synth): none (multi-processing method): Intel leaf 1 (synth) = Intel Pentium M (Dothan B1), 90nm

Read the article

Michael Crump’s notes for 70-563 PRO – Designing and Developing Windows Applications usi

- by mbcrump

TIME TO GO PRO! This is my notes for 70-563 PRO – Designing and Developing Windows Applications using .NET Framework 3.5 I created it using several resources (various certification web sites, msdn, official ms 70-548 book). The reason that I created this review is because a) I am taking the exam. b) MS did not create a book for this exam. Use the(MS 70-548)book. c) To make sure I am familiar with each before the exam. I hope that it provides a good start for your own notes. I hope that someone finds this useful. At least, it will give you a starting point of what to expect to know on the PRO exam. Also, for those wondering, the PRO exam does contains very little code. It is basically all theory. 1. Validation Controls – How to prevent users from entering invalid data on forms. (MaskedTextBox control and RegEx) 2. ServiceController – used to start and control the behavior of existing services. 3. User Feedback (know winforms Status Bar, Tool Tips, Color, Error Provider, Context-Sensitive and Accessibility) 4. Specific (derived) exceptions must be handled before general (base class) exceptions. By moving the exception handling for the base type Exception to after exception handling of ArgumentNullException, all ArgumentNullException thrown by the Helper method will be caught and logged correctly. 5. A heartbeat method is a method exposed by a Web service that allows external applications to check on the status of the service. 6. New users must master key tasks quickly. Giving these tasks context and appropriate detail will help. However, advanced users will demand quicker paths. Shortcuts, accelerators, or toolbar buttons will speed things along for the advanced user. 7. MSBuild uses project files to instruct the build engine what to build and how to build it. MSBuild project files are XML files that adhere to the MSBuild XML schema. The MSBuild project files contain complete file, build action, and dependency information for each individual projects. 8. Evaluating whether or not to fix a bug involves a triage process. You must identify the bug's impact, set the priority, categorize it, and assign a developer. Many times the person doing the triage work will assign the bug to a developer for further investigation. In fact, the workflow for the bug work item inside of Team System supports this step. Developers are often asked to assess the impact of a given bug. This assessment helps the person doing the triage make a decision on how to proceed. When assessing the impact of a bug, you should consider time and resources to fix it, bug risk, and impacts of the bug. 9. In large projects it is generally impossible and unfeasible to fix all bugs because of the impact on schedule and budget. 10. Code reviews should be conducted by a technical lead or a technical peer. 11. Testing Applications 12. WCF Services – application state 13. SQL Server 2005 / 2008 Express Edition – reliable storage of data / Microsoft SQL Server 3.5 Compact Database– used for client computers to retrieve and save data from a shared location. 14. SQL Server 2008 Compact Edition – used for minimum possible memory and can synchronize data with a corporate SQL Server 2008 Database. Supports offline user and minimum dependency on external components. 15. MDI and SDI Forms (specifically IsMDIContainer) 16. GUID – in the case of data warehousing, it is important to define unique keys. 17. Encrypting / Security Data 18. Understanding of Isolated Storage/Proper location to store items 19. LINQ to SQL 20. Multithreaded access 21. ADO.NET Entity Framework model 22. Marshal.ReleaseComObject 23. Common User Interface Layout (ComboBox, ListBox, Listview, MaskedTextBox, TextBox, RichTextBox, SplitContainer, TableLayoutPanel, TabControl) 24. DataSets Class - http://msdn.microsoft.com/en-us/library/system.data.dataset%28VS.71%29.aspx 25. SQL Server 2008 Reporting Services (SSRS) 26. SystemIcons.Shield (Vista UAC) 27. Leverging stored procedures to perform data manipulation for a database schema that can change. 28. DataContext 29. Microsoft Windows Installer Packages, ClickOnce(bootstrapping features), XCopy. 30. Client Application Services – will authenticate users by using the same data source as a ASP.NET web application. 31. SQL Server 2008 Caching 32. StringBuilder 33. Accessibility Guidelines for Windows Applications http://msdn.microsoft.com/en-us/library/ms228004.aspx 34. Logging erros 35. Testing performance related issues. 36. Role Based Security, GenericIdentity and GenericPrincipal 37. System.Net.CookieContainer will store session data for webapps (see isolated storage for winforms) 38. .NET CLR Profiler tool will identify objects that cause performance issues. 39. ADO.NET Synchronization (SyncGroup) 40. Globalization - CultureInfo 41. IDisposable Interface- reports on several questions relating to this. 42. Adding timestamps to determine whether data has changed or not. 43. Converting applications to .NET Framework 3.5 44. MicrosoftReportViewer 45. Composite Controls 46. Windows Vista KNOWN folders. 47. Microsoft Sync Framework 48. TypeConverter -Provides a unified way of converting types of values to other types, as well as for accessing standard values and sub properties. http://msdn.microsoft.com/en-us/library/system.componentmodel.typeconverter.aspx 49. Concurrency control mechanisms The main categories of concurrency control mechanisms are: Optimistic - Delay the checking of whether a transaction meets the isolation rules (e.g., serializability and recoverability) until its end, without blocking any of its (read, write) operations, and then abort a transaction, if the desired rules are violated. Pessimistic - Block operations of a transaction, if they may cause violation of the rules. Semi-optimistic - Block operations in some situations, and do not block in other situations, while delaying rules checking to transaction's end, as done with optimistic. 50. AutoResetEvent 51. Microsoft Messaging Queue (MSMQ) 4.0 52. Bulk imports 53. KeyDown event of controls 54. WPF UI components 55. UI process layer 56. GAC (installing, removing and queuing) 57. Use a local database cache to reduce the network bandwidth used by applications. 58. Sound can easily be annoying and distracting to users, so use it judiciously. Always give users the option to turn sound off. Because a user might have sound off, never convey important information through sound alone.

Read the article

LLBLGen Pro v3.1 released!

- by FransBouma

Yesterday we released LLBLGen Pro v3.1! Version 3.1 comes with new features and enhancements, which I'll describe briefly below. v3.1 is a free upgrade for v3.x licensees. What's new / changed? Designer Extensible Import system. An extensible import system has been added to the designer to import project data from external sources. Importers are plug-ins which import project meta-data (like entity definitions, mappings and relational model data) from an external source into the loaded project. In v3.1, an importer plug-in for importing project elements from existing LLBLGen Pro v3.x project files has been included. You can use this importer to create source projects from which you import parts of models to build your actual project with. Model-only relationships. In v3.1, relationships of the type 1:1, m:1 and 1:n can be marked as model-only. A model-only relationship isn't required to have a backing foreign key constraint in the relational model data. They're ideal for projects which have to work with relational databases where changes can't always be made or some relationships can't be added to (e.g. the ones which are important for the entity model, but are not allowed to be added to the relational model for some reason). Custom field ordering. Although fields in an entity definition don't really have an ordering, it can be important for some situations to have the entity fields in a given order, e.g. when you use compound primary keys. Field ordering can be defined using a pop-up dialog which can be opened through various ways, e.g. inside the project explorer, model view and entity editor. It can also be set automatically during refreshes based on new settings. Command line relational model data refresher tool, CliRefresher.exe. The command line refresh tool shipped with v2.6 is now available for v3.1 as well Navigation enhancements in various designer elements. It's now easier to find elements like entities, typed views etc. in the project explorer from editors, to navigate to related entities in the project explorer by right clicking a relationship, navigate to the super-type in the project explorer when right-clicking an entity and navigate to the sub-type in the project explorer when right-clicking a sub-type node in the project explorer. Minor visual enhancements / tweaks LLBLGen Pro Runtime Framework Entity creation is now up to 30% faster and takes 5% less memory. Creating an entity object has been optimized further by tweaks inside the framework to make instantiating an entity object up to 30% faster. It now also takes up to 5% less memory than in v3.0 Prefetch Path node merging is now up to 20-25% faster. Setting entity references required the creation of a new relationship object. As this relationship object is always used internally it could be cached (as it's used for syncing only). This increases performance by 20-25% in the merging functionality. Entity fetches are now up to 20% faster. A large number of tweaks have been applied to make entity fetches up to 20% faster than in v3.0. Full WCF RIA support. It's now possible to use your LLBLGen Pro runtime framework powered domain layer in a WCF RIA application using the VS.NET tools for WCF RIA services. WCF RIA services is a Microsoft technology for .NET 4 and typically used within silverlight applications. SQL Server DQE compatibility level is now per instance. (Usable in Adapter). It's now possible to set the compatibility level of the SQL Server Dynamic Query Engine (DQE) per instance of the DQE instead of the global setting it was before. The global setting is still available and is used as the default value for the compatibility level per-instance. You can use this to switch between CE Desktop and normal SQL Server compatibility per DataAccessAdapter instance. Support for COUNT_BIG aggregate function (SQL Server specific). The aggregate function COUNT_BIG has been added to the list of available aggregate functions to be used in the framework. Minor changes / tweaks I'm especially pleased with the import system, as that makes working with entity models a lot easier. The import system lets you import from another LLBLGen Pro v3 project any entity definition, mapping and / or meta-data like table definitions. This way you can build repository projects where you store model fragments, e.g. the building blocks for a customer-order system, a user credential model etc., any model you can think of. In most projects, you'll recognize that some parts of your new model look familiar. In these cases it would have been easier if you would have been able to import these parts from projects you had pre-created. With LLBLGen Pro v3.1 you can. For example, say you have an Oracle schema called CRM which contains the bread 'n' butter customer-order-product kind of model. You create an entity model from that schema and save it in a project file. Now you start working on another project for another customer and you have to use SQL Server. You also start using model-first development, so develop the entity model from scratch as there's no existing database. As this customer also requires some CRM like entity model, you import the entities from your saved Oracle project into this new SQL Server targeting project. Because you don't work with Oracle this time, you don't import the relational meta-data, just the entities, their relationships and possibly their inheritance hierarchies, if any. As they're now entities in your project you can change them a bit to match the new customer's requirements. This can save you a lot of time, because you can re-use pre-fab model fragments for new projects. In the example above there are no tables yet (as you work model first) so using the forward mapping capabilities of LLBLGen Pro v3 creates the tables, PK constraints, Unique Constraints and FK constraints for you. This way you can build a nice repository of model fragments which you can re-use in new projects.

Read the article

PeopleSoft Upgrades, Fusion, & BI for Leading European PeopleSoft Applications Customers

- by Mark Rosenberg

With so many industry-leading services firms around the globe managing their businesses with PeopleSoft, it’s always an adventure setting up times and meetings for us to keep in touch with them, especially those outside of North America who often do not get to join us at Oracle OpenWorld. Fortunately, during the first two weeks of May, Nigel Woodland (Oracle’s Service Industries Director for the EMEA region) and I successfully blocked off our calendars to visit seven different customers spanning four countries in Western Europe. We met executives and leaders at four Staffing industry firms, two Professional Services firms that engage in consulting and auditing, and a Financial Services firm. As we shared the latest information regarding product capabilities and plans, we also gained valuable insight into the hot technology topics facing these businesses. What we heard was both informative and inspiring, and I suspect other Oracle PeopleSoft applications customers can benefit from one or more of the following observations from our trip. Great IT Plans Get Executed When You Respect the Users Each of our visits followed roughly the same pattern. After introductions, Nigel outlined Oracle’s product and technology strategy, including a discussion of how we at Oracle invest in each layer of the “technology stack” to provide customers with unprecedented business management capabilities and choice. Then, I provided the specifics of the PeopleSoft product line’s investment strategy, detailing the dramatic number of rich usability and functionality enhancements added to release 9.1 since its general availability in 2009 and the game-changing capabilities slated for 9.2. What was most exciting about each of these discussions was that shortly after my talking about what customers can do with release 9.1 right now to drive up user productivity and satisfaction, I saw the wheels turning in the minds of our audiences. Business analyst and end user-configurable tools and technologies, such as WorkCenters and the Related Action Framework, that provide the ability to tailor a “central command center” to the exact needs of each recruiter, biller, and every other role in the organization were exactly what each of our customers had been looking for. Every one of our audiences agreed that these tools which demonstrate a respect for the user would finally help IT pole vault over the wall of resistance that users had often raised in the past. With these new user-focused capabilities, IT is positioned to definitively partner with the business, instead of drag the business along, to unlock the value of their investment in PeopleSoft. This topic of respecting the user emerged during our very first visit, which was at Vital Services Group at their Head Office “The Mill” in Manchester, England. (If you are a student of architecture and are ever in Manchester, you should stop in to see this amazingly renovated old mill building.) I had just finished explaining our PeopleSoft 9.2 roadmap, and Mike Code, PeopleSoft Systems Manager for this innovative staffing company, said, “Mark, the new features you’ve shown us in 9.1/9.2 are very relevant to our business. As we forge ahead with the 9.1 upgrade, the ability to configure a targeted user interface with WorkCenters, Related Actions, Pivot Grids, and Alerts will enable us to satisfy the business that this upgrade is for them and will deliver tangible benefits. In fact, you’ve highlighted that we need to start talking to the business to keep up the momentum to start reviewing the 9.2 upgrade after we get to 9.1, because as much as 9.1 and PeopleTools 8.52 offers, what you’ve shown us for 9.2 is what we’ve envisioned was ultimately possible with our investment in PeopleSoft applications.” We also received valuable feedback about our investment for the Staffing industry when we visited with Hans Wanders, CIO of Randstad (the second largest Staffing company in the world) in the Netherlands. After our visit, Hans noted, “It was very interesting to see how the PeopleSoft applications have developed. I was truly impressed by many of the new developments.” Hans and Mike, sincere thanks for the validation that our team’s hard work and dedication to “respecting the users” is worth the effort! Co-existence of PeopleSoft and Fusion Applications Just Makes Sense As a “product person,” one of the most rewarding things about visiting customers is that they actually want to talk to me. Sometimes, they want to discuss a product area that we need to enhance; other times, they are interested in learning how to extract more value from their applications; and still others, they want to tell me how they are using the applications to drive real value for the business. During this trip, I was very pleased to hear that several of our customers not only thought the co-existence of Fusion applications alongside PeopleSoft applications made sense in theory, but also that they were aggressively looking at how to deploy one or more Fusion applications alongside their PeopleSoft HCM and FSCM applications. The most common deployment plan in the works by three of the organizations is to upgrade to PeopleSoft 9.1 or 9.2, and then adopt one of the new Fusion HCM applications, such as Fusion Performance Management or the full suite of Fusion Talent Management. For example, during an applications upgrade planning discussion with the staffing company Hays plc., Mark Thomas, who is Hays’ UK IT Director, commented, “We are very excited about where we can go with the latest versions of the PeopleSoft applications in conjunction with Fusion Talent Management.” Needless to say, this news was very encouraging, because it reiterated that our applications investment strategy makes good business sense for our customers. Next Generation Business Intelligence Is the Key to the Future The third, and perhaps most exciting, lesson I learned during this journey is that our audiences already know that the latest generation of Business Intelligence technologies will be the “secret sauce” for organizations to transform business in radical ways. While a number of the organizations we visited on the trip have deployed or are deploying Oracle Business Intelligence Enterprise Edition and the associated analytics applications to provide dashboards of easy-to-understand, user-configurable metrics that help optimize business performance according to current operating procedures, what’s most exciting to them is being able to use Business Intelligence to change the way an organization does business, grows revenue, and makes a profit. In particular, several executives we met asked whether we can help them minimize the need to have perfectly structured data and at the same time generate analytics that improve order fulfillment decision-making. To them, the path to future growth lies in having the ability to analyze unstructured data rapidly and intuitively and leveraging technology’s ability to detect patterns that a human cannot reasonably be expected to see. For illustrative purposes, here is a good example of a business problem where analyzing a combination of structured and unstructured data can produce better results. If you have a resource manager trying to decide which person would be the best fit for an assignment in terms of ensuring (a) client satisfaction, (b) the individual’s satisfaction with the work, (c) least travel distance, and (d) highest margin, you traditionally compare resource qualifications to assignment needs, calculate margins on past work with the client, and measure distances. To perform these comparisons, you are likely to need the organization to have profiles setup, people ranked against profiles, margin targets setup, margins measured, distances setup, distances measured, and more. As you can imagine, this requires organizations to plan and implement data setup, capture, and quality management initiatives to ensure that dependable information is available to support resourcing analysis and decisions. In the fast-paced, tight-budget world in which most organizations operate today, the effort and discipline required to maintain high-quality, structured data like those described in the above example are certainly not desirable and in some cases are not feasible. You can imagine how intrigued our audiences were when I informed them that we are ready to help them analyze volumes of unstructured data, detect trends, and produce recommendations. Our discussions delved into examples of how the firms could leverage Oracle’s Secure Enterprise Search and Endeca technologies to keyword search against, compare, and learn from unstructured resource and assignment data. We also considered examples of how they could employ Oracle Real-Time Decisions to generate statistically significant recommendations based on similar resourcing scenarios that have produced the desired satisfaction and profit margin results. --- Although I had almost no time for sight-seeing during this trip to Europe, I have to say that it may have been one of the most energizing and engaging trips of my career. Showing these dedicated customers how they can give every user a uniquely tailored set of tools and address business problems in ways that have to date been impossible made the journey across the Atlantic more than worth it. If any of these three topics intrigue you, I’d recommend you contact your Oracle applications representative to arrange for more detailed discussions with the appropriate members of our organization.

Read the article

Using SQL Source Control with Fortress or Vault – Part 2

- by AjarnMark

In Part 1, I started talking about using Red-Gate’s newest version of SQL Source Control and how I really like it as a viable method to source control your database development. It looks like this is going to turn into a little series where I will explain how we have done things in the past, and how life is different with SQL Source Control. I will also explain some of my philosophy and methodology around deployment with these tools. But for now, let’s talk about some of the good and the bad of the tool itself. More Kudos and Features I mentioned previously how impressed I was with the responsiveness of Red-Gate’s team. I have been having an ongoing email conversation with Gyorgy Pocsi, and as I have run into problems or requested things behave a little differently, it has not been more than a day or two before a new Build is ready for me to download and test. Quite impressive! I’m sure much of the requests I put in were already in the plans, so I can’t really take credit for them, but throughout this conversation, Red-Gate has implemented several features that were not in the first Early Access version. Those include: Honoring the Fortress configuration option to require Work Item (Bug) IDs on check-ins. Adding the check-in comment text as a comment to the Work Item. Adding the list of checked-in files, along with the Fortress links for automatic History and DIFF view Updating the status of a Work Item on check-in (e.g. setting the item to Complete or, in our case “Dev-Complete”) Support for the Fortress 2.0 API, and not just the Vault Pro 5.1 API. (See later notes regarding support for Fortress 2.0). These were all features that I felt we really needed to have in-place before I could honestly consider converting my team to using SQL Source Control on a regular basis. Now that I have those, my only excuse is not wanting to switch boats on the team mid-stream. So when we wrap up our current release in a few weeks, we will make the jump. In the meantime, I will continue to bang on it to make sure it is stable. It passed one test for stability when I did a test load of one of our larger database schemas into Fortress with SQL Source Control. That database has about 150 tables, 200 User-Defined Functions and nearly 900 Stored Procedures. The initial load to source control went smoothly and took just a brief amount of time. Warnings Remember that this IS still in pre-release stage and while I have not had any problems after that first hiccup I wrote about last time, you still need to treat it with a healthy respect. As I understand it, the RTM is targeted for February. There are a couple more features that I hope make it into the final release version, but if not, they’ll probably be coming soon thereafter. Those are: A Browse feature to let me lookup the Work Item ID instead of having to remember it or look back in my Item details. This is just a matter of convenience. I normally have my Work Item list open anyway, so I can easily look it up, but hey, why not make it even easier. A multi-line comment area. The current space for writing check-in comments is a single-line text box. I would like to have a multi-line space as I sometimes write lengthy commentary. But I recognize that it is a struggle to get most developers to put in more than the word “fixed” as their comment, so this meets the need of the majority as-is, and it’s not a show-stopper for us. Merge. SQL Source Control currently does not have a Merge feature. If two or more people make changes to the same database object, you will get a warning of the conflict and have to choose which one wins (and then manually edit to include the others’ changes). I think it unlikely you will run into actual conflicts in Stored Procedures and Functions, but you might with Views or Tables. This will be nice to have, but I’m not losing any sleep over it. And I have multiple tools at my disposal to do merges manually, so really not a show-stopper for us. Automation has its limits. As cool as this automation is, it has its limits and there are some changes that you will be better off scripting yourself. For example, if you are refactoring table definitions, and want to change a column name, you can write that as a quick sp_rename command and preserve the data within that column. But because this tool is looking just at a before and after picture, it cannot tell that you just renamed a column. To the tool, it looks like you dropped one column and added another. This is not a knock against Red-Gate. All automated scripting tools have this issue, unless the are actively monitoring your every step to know exactly what you are doing. This means that when you go to Deploy your changes, SQL Compare will script the change as a column drop and add, or will attempt to rebuild the entire table. Unfortunately, neither of these approaches will preserve the existing data in that column the way an sp_rename will, and so you are better off scripting that change yourself. Thankfully, SQL Compare will produce warnings about the potential loss of data before it does the actual synchronization and give you a chance to intercept the script and do it yourself. Also, please note that the current official word is that SQL Source Control supports Vault Professional 5.1 and later. Vault Professional is the new name for what was previously known as Fortress. (You can read about the name change on SourceGear’s site.) The last version of Fortress was 2.x, and the API for Fortress 2.x is different from the API for Vault Pro. At my company, we are currently running Fortress 2.0, with plans to upgrade to Vault Pro early next year. Gyorgy was able to come up with a work-around for me to be able to use SQL Source Control with Fortress 2.0, even though it is not officially supported. If you are using Fortress 2.0 and want to use SQL Source Control, be aware that this is not officially supported, but it is working for us, and you can probably get the work-around instructions from Red-Gate if you’re really, really nice to them. Upcoming Topics Some of the other topics I will likely cover in this series over the next few weeks are: How we used to do source control back in the old days (a few weeks ago) before SQL Source Control was available to Vault users What happens when you restore a database that is linked to source control Handling multiple development branches of source code Concurrent Development practices and handling Conflicts Deployment Tips and Best Practices A recap after using the tool for a while

Read the article

Updating a SharePoint master page via a solution (WSP)

- by Kelly Jones

In my last blog post, I wrote how to deploy a SharePoint theme using Features and a solution package. As promised in that post, here is how to update an already deployed master page. There are several ways to update a master page in SharePoint. You could upload a new version to the master page gallery, or you could upload a new master page to the gallery, and then set the site to use this new page. Manually uploading your master page to the master page gallery might be the best option, depending on your environment. For my client, I did these steps in code, which is what they preferred. (Image courtesy of: http://www.joiningdots.net/blog/2007/08/sharepoint-and-quick-launch.html ) Before you decide which method you need to use, take a look at your existing pages. Are they using the SharePoint dynamic token or the static token for the master page reference? The wha, huh? SO, there are four ways to tell an .aspx page hosted in SharePoint which master page it should use: “~masterurl/default.master” – tells the page to use the default.master property of the site “~masterurl/custom.master” – tells the page to use the custom.master property of the site “~site/default.master” – tells the page to use the file named “default.master” in the site’s master page gallery “~sitecollection/default.master” – tells the page to use the file named “default.master” in the site collection’s master page gallery For more information about these tokens, take a look at this article on MSDN. Once you determine which token your existing pages are pointed to, then you know which file you need to update. So, if the ~masterurl tokens are used, then you upload a new master page, either replacing the existing one or adding another one to the gallery. If you’ve uploaded a new file with a new name, you’ll just need to set it as the master page either through the UI (MOSS only) or through code (MOSS or WSS Feature receiver code – or using SharePoint Designer). If the ~site or ~sitecollection tokens were used, then you’re limited to either replacing the existing master page, or editing all of your existing pages to point to another master page. In most cases, it probably makes sense to just replace the master page. For my project, I’m working with WSS and the existing pages are set to the ~sitecollection token. Based on this, I decided to just upload a new version of the existing master page (and not modify the dozens of existing pages). Also, since my client prefers Features and solutions, I created a master page Feature and a corresponding Feature Receiver. For information on creating the elements and feature files, check out this post: http://sharepointmagazine.net/technical/development/deploying-the-master-page . This works fine, unless you are overwriting an existing master page, which was my case. You’ll run into errors because the master page file needs to be checked out, replaced, and then checked in. I wrote code in my Feature Activated event handler to accomplish these steps. Here are the steps necessary in code: Get the file name from the elements file of the Feature Check out the file from the master page gallery Upload the file to the master page gallery Check in the file to the master page gallery Here’s the code in my Feature Receiver: 1: public override void FeatureActivated(SPFeatureReceiverProperties properties) 2: { 3: try 4: { 5: 6: SPElementDefinitionCollection col = properties.Definition.GetElementDefinitions(System.Globalization.CultureInfo.CurrentCulture); 7: 8: using (SPWeb curweb = GetCurWeb(properties)) 9: { 10: foreach (SPElementDefinition ele in col) 11: { 12: if (string.Compare(ele.ElementType, "Module", true) == 0) 13: { 14: // <Module Name="DefaultMasterPage" List="116" Url="_catalogs/masterpage" RootWebOnly="FALSE"> 15: // <File Url="myMaster.master" Type="GhostableInLibrary" IgnoreIfAlreadyExists="TRUE" 16: // Path="MasterPages/myMaster.master" /> 17: // </Module> 18: string Url = ele.XmlDefinition.Attributes["Url"].Value; 19: foreach (System.Xml.XmlNode file in ele.XmlDefinition.ChildNodes) 20: { 21: string Url2 = file.Attributes["Url"].Value; 22: string Path = file.Attributes["Path"].Value; 23: string fileType = file.Attributes["Type"].Value; 24: 25: if (string.Compare(fileType, "GhostableInLibrary", true) == 0) 26: { 27: //Check out file in document library 28: SPFile existingFile = curweb.GetFile(Url + "/" + Url2); 29: 30: if (existingFile != null) 31: { 32: if (existingFile.CheckOutStatus != SPFile.SPCheckOutStatus.None) 33: { 34: throw new Exception("The master page file is already checked out. Please make sure the master page file is checked in, before activating this feature."); 35: } 36: else 37: { 38: existingFile.CheckOut(); 39: existingFile.Update(); 40: } 41: } 42: 43: //Upload file to document library 44: string filePath = System.IO.Path.Combine(properties.Definition.RootDirectory, Path); 45: string fileName = System.IO.Path.GetFileName(filePath); 46: char slash = Convert.ToChar("/"); 47: string[] folders = existingFile.ParentFolder.Url.Split(slash); 48: 49: if (folders.Length > 2) 50: { 51: Logger.logMessage("More than two folders were detected in the library path for the master page. Only two are supported.", 52: Logger.LogEntryType.Information); //custom logging component 53: } 54: 55: SPFolder myLibrary = curweb.Folders[folders[0]].SubFolders[folders[1]]; 56: 57: FileStream fs = File.OpenRead(filePath); 58: 59: SPFile newFile = myLibrary.Files.Add(fileName, fs, true); 60: 61: myLibrary.Update(); 62: newFile.CheckIn("Updated by Feature", SPCheckinType.MajorCheckIn); 63: newFile.Update(); 64: } 65: } 66: } 67: } 68: } 69: } 70: catch (Exception ex) 71: { 72: string msg = "Error occurred during feature activation"; 73: Logger.logException(ex, msg, ""); 74: } 75: 76: } 77: 78: /// <summary> 79: /// Using a Feature's properties, get a reference to the Current Web 80: /// </summary> 81: /// <param name="properties"></param> 82: public SPWeb GetCurWeb(SPFeatureReceiverProperties properties) 83: { 84: SPWeb curweb; 85: 86: //Check if the parent of the web is a site or a web 87: if (properties != null && properties.Feature.Parent.GetType().ToString() == "Microsoft.SharePoint.SPWeb") 88: { 89: 90: //Get web from parent 91: curweb = (SPWeb)properties.Feature.Parent; 92: 93: } 94: else 95: { 96: //Get web from Site 97: using (SPSite cursite = (SPSite)properties.Feature.Parent) 98: { 99: curweb = (SPWeb)cursite.OpenWeb(); 100: } 101: } 102: 103: return curweb; 104: } This did the trick. It allowed me to update my existing master page, through an easily repeatable process (which is great when you are working with more than one environment and what to do things like TEST it!). I did run into what I would classify as a strange issue with one of my subsites, but that’s the topic for another blog post.

Read the article

TFS 2010 SDK: Smart Merge - Programmatically Create your own Merge Tool

- by Tarun Arora

Technorati Tags: Team Foundation Server 2010,TFS SDK,TFS API,TFS Merge Programmatically,TFS Work Items Programmatically,TFS Administration Console,ALM The information available in the Merge window in Team Foundation Server 2010 is very important in the decision making during the merging process. However, at present the merge window shows very limited information, more that often you are interested to know the work item, files modified, code reviewer notes, policies overridden, etc associated with the change set. Our friends at Microsoft are working hard to change the game again with vNext, but because at present the merge window is a model window you have to cancel the merge process and go back one after the other to check the additional information you need. If you can relate to what i am saying, you will enjoy this blog post! I will show you how to programmatically create your own merging window using the TFS 2010 API. A few screen shots of the WPF TFS 2010 API – Custom Merging Application that we will be creating programmatically, Excited??? Let’s start coding… 1. Get All Team Project Collections for the TFS Server You can read more on connecting to TFS programmatically on my blog post => How to connect to TFS Programmatically 1: public static ReadOnlyCollection<CatalogNode> GetAllTeamProjectCollections() 2: { 3: TfsConfigurationServer configurationServer = 4: TfsConfigurationServerFactory. 5: GetConfigurationServer(new Uri("http://xxx:8080/tfs/")); 6: 7: CatalogNode catalogNode = configurationServer.CatalogNode; 8: return catalogNode.QueryChildren(new Guid[] 9: { CatalogResourceTypes.ProjectCollection }, 10: false, CatalogQueryOptions.None); 11: } 2. Get All Team Projects for the selected Team Project Collection You can read more on connecting to TFS programmatically on my blog post => How to connect to TFS Programmatically 1: public static ReadOnlyCollection<CatalogNode> GetTeamProjects(string instanceId) 2: { 3: ReadOnlyCollection<CatalogNode> teamProjects = null; 4: 5: TfsConfigurationServer configurationServer = 6: TfsConfigurationServerFactory.GetConfigurationServer(new Uri("http://xxx:8080/tfs/")); 7: 8: CatalogNode catalogNode = configurationServer.CatalogNode; 9: var teamProjectCollections = catalogNode.QueryChildren(new Guid[] {CatalogResourceTypes.ProjectCollection }, 10: false, CatalogQueryOptions.None); 11: 12: foreach (var teamProjectCollection in teamProjectCollections) 13: { 14: if (string.Compare(teamProjectCollection.Resource.Properties["InstanceId"], instanceId, true) == 0) 15: { 16: teamProjects = teamProjectCollection.QueryChildren(new Guid[] { CatalogResourceTypes.TeamProject }, false, 17: CatalogQueryOptions.None); 18: } 19: } 20: 21: return teamProjects; 22: } 3. Get All Branches with in a Team Project programmatically I will be passing the name of the Team Project for which i want to retrieve all the branches. When consuming the ‘Version Control Service’ you have the method QueryRootBranchObjects, you need to pass the recursion type => none, one, full. Full implies you are interested in all branches under that root branch. 1: public static List<BranchObject> GetParentBranch(string projectName) 2: { 3: var branches = new List<BranchObject>(); 4: 5: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<teamProjectName>")); 6: var versionControl = tfs.GetService<VersionControlServer>(); 7: 8: var allBranches = versionControl.QueryRootBranchObjects(RecursionType.Full); 9: 10: foreach (var branchObject in allBranches) 11: { 12: if (branchObject.Properties.RootItem.Item.ToUpper().Contains(projectName.ToUpper())) 13: { 14: branches.Add(branchObject); 15: } 16: } 17: 18: return branches; 19: } 4. Get All Branches associated to the Parent Branch Programmatically Now that we have the parent branch, it is important to retrieve all child branches of that parent branch. Lets see how we can achieve this using the TFS API. 1: public static List<ItemIdentifier> GetChildBranch(string parentBranch) 2: { 3: var branches = new List<ItemIdentifier>(); 4: 5: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 6: var versionControl = tfs.GetService<VersionControlServer>(); 7: 8: var i = new ItemIdentifier(parentBranch); 9: var allBranches = 10: versionControl.QueryBranchObjects(i, RecursionType.None); 11: 12: foreach (var branchObject in allBranches) 13: { 14: foreach (var childBranche in branchObject.ChildBranches) 15: { 16: branches.Add(childBranche); 17: } 18: } 19: 20: return branches; 21: } 5. Get Merge candidates between two branches Programmatically Now that we have the parent and the child branch that we are interested to perform a merge between we will use the method ‘GetMergeCandidates’ in the namespace ‘Microsoft.TeamFoundation.VersionControl.Client’ => http://msdn.microsoft.com/en-us/library/bb138934(v=VS.100).aspx 1: public static MergeCandidate[] GetMergeCandidates(string fromBranch, string toBranch) 2: { 3: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 4: var versionControl = tfs.GetService<VersionControlServer>(); 5: 6: return versionControl.GetMergeCandidates(fromBranch, toBranch, RecursionType.Full); 7: } 6. Get changeset details Programatically Now that we have the changeset id that we are interested in, we can get details of the changeset. The Changeset object contains the properties => http://msdn.microsoft.com/en-us/library/microsoft.teamfoundation.versioncontrol.client.changeset.aspx - Changes: Gets or sets an array of Change objects that comprise this changeset. - CheckinNote: Gets or sets the check-in note of the changeset. - Comment: Gets or sets the comment of the changeset. - PolicyOverride: Gets or sets the policy override information of this changeset. - WorkItems: Gets an array of work items that are associated with this changeset. 1: public static Changeset GetChangeSetDetails(int changeSetId) 2: { 3: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 4: var versionControl = tfs.GetService<VersionControlServer>(); 5: 6: return versionControl.GetChangeset(changeSetId); 7: } 7. Possibilities In future posts i will try and extend this idea to explore further possibilities, but few features that i am sure will further help during the merge decision making process would be, - View changed files - Compare modified file with current/previous version - Merge Preview - Last Merge date Any other features that you can think of?

Read the article

Complex type support in process flow – XMLTYPE

- by shawn

Before OWB 11.2 release, there are only 5 simple data types supported in process flow: DATE, BOOLEAN, INTEGER, FLOAT and STRING. A new complex data type – XMLTYPE is added in 11.2, in order to support complex data being passed between the process flow activities. In this article we will give a simple example to illustrate the usage of the new type and some related editors. Suppose there is a bookstore that uses XML format orders as shown below (we use the simplest form for the illustration purpose), then we can create a process flow to handle the order, take the order as the input, then extract necessary information, and generate a confirmation email to the customer automatically. <order id=’0001’> <customer> <name>Tom</name> <email>[email protected]</email> </customer> <book id=’Java_001’> <quantity>3</quantity> </book> </order> Considering a simple user case here: we use an input parameter/variable with XMLTYPE to hold the XML content of the order; then we can use an Assign activity to retrieve the email info from the order; after that, we can create an email activity to send the email (Other activities might be added in practical case, but will not be described here). 1) Set XML content value For testing purpose, we will create a variable to hold the sample order, and then this will be used among the process flow activities. When the variable is of XMLTYPE and the “Literal” value is set the true, the advance editor will be enabled. Click the “Advance Editor” shown as above, a simple xml editor will popup. The editor has basic features like syntax highlight and check as shown below: We can also do the basic validation or validation against schema with the editor by selecting the normalized schema. With this, it will be easier to provide the value for XMLTYPE variables. 2) Extract information from XML content After setting the value, we need to extract the email information with the Assign activity. In process flow, an enhanced expression builder is used to help users construct the XPath for extracting values from XML content. When the variable’s literal value is set the false, the advance editor is enabled. Click the button, the advance editor will popup, as shown below: The editor is based on the expression builder (which is often used in mapping etc), an XPath lib panel is appended which provides some help information on how to write the XPath. The expression used here is: “XMLTYPE.EXTRACT(XML_ORDER,'/order/customer/email/text()').getStringVal()”, which uses ‘/order/customer/email/text()’ as the XPath to extract the email info from the XML document. A variable called “EMAIL_ADDR” is created with String data type to hold the value extracted. Then we bind the “VARIABLE” parameter of Assign activity to “EMAIL_ADDR” variable, which means the value of the “EMAIL_ADDR” activity will be set to the result of the “VALUE” parameter of Assign activity. 3) Use the extracted information in Email activity We bind the “TO_ADDRESS” parameter of the email activity to the “EMAIL_ADDR” variable created in above step. We can also extract other information from the xml order directly through the expression, for example, we can set the “MESSAGE_BODY” with value “'Dear '||XMLTYPE.EXTRACT(XML_ORDER,'/order/customer/name/text()').getStringVal()||chr(13)||chr(10)||' You have ordered '||XMLTYPE.EXTRACT(XML_ORDER,'/order/book/quantity/text()').getStringVal()||' '||XMLTYPE.EXTRACT(XML_ORDER,'/order/book/@id').getStringVal()”. This expression will extract the customer name, the quantity and the book id from the order to compose the message body. To make the email activity work, we need provide some other necessary information, Such as “SMTP_SERVER” (which is the SMTP server used to send the emails, like “mail.bookstore.com”. The default PORT number is set to 25. You need to change the value accordingly), “FROM_ADDRESS” and “SUBJECT”. Then the process flow is ready to go. After deploying the process flow package, we can simply run the process flow to check if the result is as expected (An email will be sent to the specified email address with proper subject and message body). Note: In oracle 11g, there is an enhanced security feature - ACL (Access Control List), which restrict the network access within db, so we need to edit the list to allow UTL_SMTP work if you are using oracle 11g. Refer to chapter “Access Control Lists for UTL_TCP/HTTP/SMTP” and “Managing Fine-Grained Access to External Network Services” for more details. In previous releases, XMLTYPE already exists in other OWB objects, like mapping/transformation etc. When the mapping/transformation is dragged into a process flow, the parameters with XMLTYPE are mapped to STRING. Now with the XMLTYPE support in process flow, the XMLTYPE will map to XMLTYPE in a more natural way, and we can leverage the new data type for the design.

Read the article

Replication Services as ETL extraction tool

- by jorg

In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication… The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button. In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines! On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication. Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions The New Subscription Wizard appears Select the publisher on which you just created your publication and select the database and publication (SnapshotTest) You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful. You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here Click Next Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot: Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot: Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data: Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

Read the article

On StringComparison Values

- by Jesse

When you use the .NET Framework’s String.Equals and String.Compare methods do you use an overloStringComparison enumeration value? If not, you should be because the value provided for that StringComparison argument can have a big impact on the results of your string comparison. The StringComparison enumeration defines values that fall into three different major categories: Culture-sensitive comparison using a specific culture, defaulted to the Thread.CurrentThread.CurrentCulture value (StringComparison.CurrentCulture and StringComparison.CurrentCutlureIgnoreCase) Invariant culture comparison (StringComparison.InvariantCulture and StringComparison.InvariantCultureIgnoreCase) Ordinal (byte-by-byte) comparison of (StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase) There is a lot of great material available that detail the technical ins and outs of these different string comparison approaches. If you’re at all interested in the topic these two MSDN articles are worth a read: Best Practices For Using Strings in the .NET Framework: http://msdn.microsoft.com/en-us/library/dd465121.aspx How To Compare Strings: http://msdn.microsoft.com/en-us/library/cc165449.aspx Those articles cover the technical details of string comparison well enough that I’m not going to reiterate them here other than to say that the upshot is that you typically want to use the culture-sensitive comparison whenever you’re comparing strings that were entered by or will be displayed to users and the ordinal comparison in nearly all other cases. So where does that leave the invariant culture comparisons? The “Best Practices For Using Strings in the .NET Framework” article has the following to say: “On balance, the invariant culture has very few properties that make it useful for comparison. It does comparison in a linguistically relevant manner, which prevents it from guaranteeing full symbolic equivalence, but it is not the choice for display in any culture. One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display. For example, if a large data file that contains a list of sorted identifiers for display accompanies an application, adding to this list would require an insertion with invariant-style sorting.” I don’t know about you, but I feel like that paragraph is a bit lacking. Are there really any “real world” reasons to use the invariant culture comparison? I think the answer to this question is, “yes”, but in order to understand why we should first think about what the invariant culture comparison really does. The invariant culture comparison is really just a culture-sensitive comparison using a special invariant culture (Michael Kaplan has a great post on the history of the invariant culture on his blog: http://blogs.msdn.com/b/michkap/archive/2004/12/29/344136.aspx). This means that the invariant culture comparison will apply the linguistic customs defined by the invariant culture which are guaranteed not to differ between different machines or execution contexts. This sort of consistently does prove useful if you needed to maintain a list of strings that are sorted in a meaningful and consistent way regardless of the user viewing them or the machine on which they are being viewed. Example: Prototype Names Let’s say that you work for a large multi-national toy company with branch offices in 10 different countries. Each year the company would work on 15-25 new toy prototypes each of which is assigned a “code name” while it is under development. Coming up with fun new code names is a big part of the company culture that everyone really enjoys, so to be fair the CEO of the company spent a lot of time coming up with a prototype naming scheme that would be fun for everyone to participate in, fair to all of the different branch locations, and accessible to all members of the organization regardless of the country they were from and the language that they spoke. Each new prototype will get a code name that begins with a letter following the previously created name using the alphabetical order of the Latin/Roman alphabet. Each new year prototype names would start back at “A”. The country that leads the prototype development effort gets to choose the name in their native language. (An appropriate Romanization system will be used for countries where the primary language is not written in the Latin/Roman alphabet. For example, the Pinyin system could be used for Chinese). To avoid repeating names, a list of all current and past prototype names will be maintained on each branch location’s company intranet site. Assuming that maintaining a single pre-sorted list is not feasible among all of the highly distributed intranet implementations, what string comparison method would you use to sort each year’s list of prototype names so that the list is both meaningful and consistent regardless of the country within which the list is being viewed? Sorting the list with a culture-sensitive comparison using the default configured culture on each country’s intranet server the list would probably work most of the time, but subtle differences between cultures could mean that two different people would see a list that was sorted slightly differently. The CEO wants the prototype names to be a unifying aspect of company culture and is adamant that everyone see the the same list sorted in the same order and there’s no way to guarantee a consistent sort across different cultures using the culture-sensitive string comparison rules. The culture-sensitive sort would produce a meaningful list for the specific user viewing it, but it wouldn’t always be consistent between different users. Sorting with the ordinal comparison would certainly be consistent regardless of the user viewing it, but would it be meaningful? Let’s say that the current year’s prototype name list looks like this: Antílope (Spanish) Babouin (French) Cahoun (Czech) Diamond (English) Flosse (German) If you were to sort this list using ordinal rules you’d end up with: Antílope Babouin Diamond Flosse Cahoun This sort is no good because the entry for “C” appears the bottom of the list after “F”. This is because the Czech entry for the letter “C” makes use of a diacritic (accent mark). The ordinal string comparison does a byte-by-byte comparison of the code points that make up each character in the string and the code point for the “C” with the diacritic mark is higher than any letter without a diacritic mark, which pushes that entry to the bottom of the sorted list. The CEO wants each country to be able to create prototype names in their native language, which means we need to allow for names that might begin with letters that have diacritics, so ordinal sorting kills the meaningfulness of the list. As it turns out, this situation is actually well-suited for the invariant culture comparison. The invariant culture accounts for linguistically relevant factors like the use of diacritics but will provide a consistent sort across all machines that perform the sort. Now that we’ve walked through this example, the following line from the “Best Practices For Using Strings in the .NET Framework” makes a lot more sense: One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display That line describes the prototype name example perfectly: we need a way to persist ordered data for a cross-culturally identical display. While this example is 100% made-up, I think it illustrates that there are indeed real-world situations where the invariant culture comparison is useful.

Read the article

Parallel LINQ - PLINQ

- by nmarun

Turns out now with .net 4.0 we can run a query like a multi-threaded application. Say you want to query a collection of objects and return only those that meet certain conditions. Until now, we basically had one ‘control’ that iterated over all the objects in the collection, checked the condition on each object and returned if it passed. We obviously agree that if we can ‘break’ this task into smaller ones, assign each task to a different ‘control’ and ask all the controls to do their job - in-parallel, the time taken the finish the entire task will be much lower. Welcome to PLINQ. Let’s take some examples. I have the following method that uses our good ol’ LINQ. 1: private static void Linq(int lowerLimit, int upperLimit) 2: { 3: // populate an array with int values from lowerLimit to the upperLimit 4: var source = Enumerable.Range(lowerLimit, upperLimit); 5: 6: // Start a timer 7: Stopwatch stopwatch = new Stopwatch(); 8: stopwatch.Start(); 9: 10: // set the expectation => build the expression tree 11: var evenNumbers = from num in source 12: where IsDivisibleBy(num, 2) 13: select num; 14: 15: // iterate over and print the returned items 16: foreach (var number in evenNumbers) 17: { 18: Console.WriteLine(string.Format("** {0}", number)); 19: } 20: 21: stopwatch.Stop(); 22: 23: // check the metrics 24: Console.WriteLine(String.Format("Elapsed {0}ms", stopwatch.ElapsedMilliseconds)); 25: } I’ve added comments for the major steps, but the only thing I want to talk about here is the IsDivisibleBy() method. I know I could have just included the logic directly in the where clause. I called a method to add ‘delay’ to the execution of the query - to simulate a loooooooooong operation (will be easier to compare the results). 1: private static bool IsDivisibleBy(int number, int divisor) 2: { 3: // iterate over some database query 4: // to add time to the execution of this method; 5: // the TableB has around 10 records 6: for (int i = 0; i < 10; i++) 7: { 8: DataClasses1DataContext dataContext = new DataClasses1DataContext(); 9: var query = from b in dataContext.TableBs select b; 10: 11: foreach (var row in query) 12: { 13: // Do NOTHING (wish my job was like this) 14: } 15: } 16: 17: return number % divisor == 0; 18: } Now, let’s look at how to modify this to PLINQ. 1: private static void Plinq(int lowerLimit, int upperLimit) 2: { 3: // populate an array with int values from lowerLimit to the upperLimit 4: var source = Enumerable.Range(lowerLimit, upperLimit); 5: 6: // Start a timer 7: Stopwatch stopwatch = new Stopwatch(); 8: stopwatch.Start(); 9: 10: // set the expectation => build the expression tree 11: var evenNumbers = from num in source.AsParallel() 12: where IsDivisibleBy(num, 2) 13: select num; 14: 15: // iterate over and print the returned items 16: foreach (var number in evenNumbers) 17: { 18: Console.WriteLine(string.Format("** {0}", number)); 19: } 20: 21: stopwatch.Stop(); 22: 23: // check the metrics 24: Console.WriteLine(String.Format("Elapsed {0}ms", stopwatch.ElapsedMilliseconds)); 25: } That’s it, this is now in PLINQ format. Oh and if you haven’t found the difference, look line 11 a little more closely. You’ll see an extension method ‘AsParallel()’ added to the ‘source’ variable. Couldn’t be more simpler right? So this is going to improve the performance for us. Let’s test it. So in my Main method of the Console application that I’m working on, I make a call to both. 1: static void Main(string[] args) 2: { 3: // set lower and upper limits 4: int lowerLimit = 1; 5: int upperLimit = 20; 6: // call the methods 7: Console.WriteLine("Calling Linq() method"); 8: Linq(lowerLimit, upperLimit); 9: 10: Console.WriteLine(); 11: Console.WriteLine("Calling Plinq() method"); 12: Plinq(lowerLimit, upperLimit); 13: 14: Console.ReadLine(); // just so I get enough time to read the output 15: } YMMV, but here are the results that I got: It’s quite obvious from the above results that the Plinq() method is taking considerably less time than the Linq() version. I’m sure you’ve already noticed that the output of the Plinq() method is not in order. That’s because, each of the ‘control’s we sent to fetch the results, reported with values as and when they obtained them. This is something about parallel LINQ that one needs to remember – the collection cannot be guaranteed to be undisturbed. This could be counted as a negative about PLINQ (emphasize ‘could’). Nevertheless, if we want the collection to be sorted, we can use a SortedSet (.net 4.0) or build our own custom ‘sorter’. Either way we go, there’s a good chance we’ll end up with a better performance using PLINQ. And there’s another negative of PLINQ (depending on how you see it). This is regarding the CPU cycles. See the usage for Linq() method (used ResourceMonitor): I have dual CPU’s and see the height of the peak in the bottom two blocks and now compare to what happens when I run the Plinq() method. The difference is obvious. Higher usage, but for a shorter duration (width of the peak). Both these points make sense in both cases. Linq() runs for a longer time, but uses less resources whereas Plinq() runs for a shorter time and consumes more resources. Even after knowing all these, I’m still inclined towards PLINQ. PLINQ rocks! (no hard feelings LINQ)

Read the article

SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

- by pinaldave

I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article

ANTS Memory Profiler 7.0

- by James Michael Hare

I had always been a fan of ANTS products (Reflector is absolutely invaluable, and their performance profiler is great as well – very easy to use!), so I was curious to see what the ANTS Memory Profiler could show me. Background While a performance profiler will track how much time is typically spent in each unit of code, a memory profiler gives you much more detail on how and where your memory is being consumed and released in a program. As an example, I’d been working on a data access layer at work to call a market data web service. This web service would take a list of symbols to quote and would return back the quote data. To help consolidate the thousands of web requests per second we get and reduce load on the web services, we implemented a 5-second cache of quote data. Not quite long enough to where customers will typically notice a quote go “stale”, but just long enough to be able to collapse multiple quote requests for the same symbol in a short period of time. A 5-second cache may not sound like much, but it actually pays off by saving us roughly 42% of our web service calls, while still providing relatively up-to-date information. The question is whether or not the extra memory involved in maintaining the cache was worth it, so I decided to fire up the ANTS Memory Profiler and take a look at memory usage. First Impressions The main thing I’ve always loved about the ANTS tools is their ease of use. Pretty much everything is right there in front of you in a way that makes it easy for you to find what you need with little digging required. I’ve worked with other, older profilers before (that shall remain nameless other than to hint it was created by a very large chip maker) where it was a mind boggling experience to figure out how to do simple tasks. Not so with AMP. The opening dialog is very straightforward. You can choose from here whether to debug an executable, a web application (either in IIS or from VS’s web development server), windows services, etc. So I chose a .NET Executable and navigated to the build location of my test harness. Then began profiling. At this point while the application is running, you can see a chart of the memory as it ebbs and wanes with allocations and collections. At any given point in time, you can take snapshots (to compare states) zoom in, or choose to stop at any time. Snapshots Taking a snapshot also gives you a breakdown of the managed memory heaps for each generation so you get an idea how many objects are staying around for extended periods of time (as an object lives and survives collections, it gets promoted into higher generations where collection becomes less frequent). Generating a snapshot brings up an analysis view with very handy graphs that show your generation sizes. Almost all my memory is in Generation 1 in the managed memory component of the first graph, which is good news to me, because Gen 2 collections are much rarer. I once3 made the mistake once of caching data for 30 minutes and found it didn’t get collected very quick after I released my reference because it had been promoted to Gen 2 – doh! Analysis It looks like (from the second pie chart) that the majority of the allocations were in the string class. This also is expected for me because the majority of the memory allocated is in the web service responses, so it doesn’t seem the entities I’m adapting to (to prevent being too tightly coupled to the web service proxy classes, which can change easily out from under me) aren’t taking a significant portion of memory. I also appreciate that they have clear summary text in key places such as “No issues with large object heap fragmentation were detected”. For novice users, this type of summary information can be critical to getting them to use a tool and develop a good working knowledge of it. There is also a handy link at the bottom for “What to look for on the summary” which loads a web page of help on key points to look for. Clicking over to the session overview, it’s easy to compare the samples at each snapshot to see how your memory is growing, shrinking, or staying relatively the same. Looking at my snapshots, I’m pretty happy with the fact that memory allocation and heap size seems to be fairly stable and in control: Once again, you can check on the large object heap, generation one heap, and generation two heap across each snapshot to spot trends. Back on the analysis tab, we can go to the [Class List] button to get an idea what classes are making up the majority of our memory usage. As was little surprise to me, System.String was the clear majority of my allocations, though I found it surprising that the System.Reflection.RuntimeMehtodInfo came in second. I was curious about this, so I selected it and went into the [Instance Categorizer]. This view let me see where these instances to RuntimeMehtodInfo were coming from. So I scrolled back through the graph, and discovered that these were being held by the System.ServiceModel.ChannelFactoryRefCache and I was satisfied this was just an artifact of my WCF proxy. I also like that down at the bottom of the Instance Categorizer it gives you a series of filters and offers to guide you on which filter to use based on the problem you are trying to find. For example, if I suspected a memory leak, I might try to filter for survivors in growing classes. This means that for instances of a class that are growing in memory (more are being created than cleaned up), which ones are survivors (not collected) from garbage collection. This might allow me to drill down and find places where I’m holding onto references by mistake and not freeing them! Finally, if you want to really see all your instances and who is holding onto them (preventing collection), you can go to the “Instance Retention Graph” which creates a graph showing what references are being held in memory and who is holding onto them. Visual Studio Integration Of course, VS has its own profiler built in – and for a free bundled profiler it is quite capable – but AMP gives a much cleaner and easier-to-use experience, and when you install it you also get the option of letting it integrate directly into VS. So once you go back into VS after installation, you’ll notice an ANTS menu which lets you launch the ANTS profiler directly from Visual Studio. Clicking on one of these options fires up the project in the profiler immediately, allowing you to get right in. It doesn’t integrate with the Visual Studio windows themselves (like the VS profiler does), but still the plethora of information it provides and the clear and concise manner in which it presents it makes it well worth it. Summary If you like the ANTS series of tools, you shouldn’t be disappointed with the ANTS Memory Profiler. It was so easy to use that I was able to jump in with very little product knowledge and get the information I was looking it for. I’ve used other profilers before that came with 3-inch thick tomes that you had to read in order to get anywhere with the tool, and this one is not like that at all. It’s built for your everyday developer to get in and find their problems quickly, and I like that! Tweet Technorati Tags: Influencers,ANTS,Memory,Profiler

Read the article

MySQL for Excel 1.1.3 has been released

- by Javier Treviño

The MySQL Windows Experience Team is proud to announce the release of MySQL for Excel version 1.1.3, the latest addition to the MySQL Installer for Windows. MySQL for Excel is an application plug-in enabling data analysts to very easily access and manipulate MySQL data within Microsoft Excel. It enables you to directly work with a MySQL database from within Microsoft Excel so you can easily do tasks such as: Importing MySQL Data into Excel Exporting Excel data directly into MySQL to a new or existing table Editing MySQL data directly within Excel MySQL for Excel is installed using the MySQL Installer for Windows. The MySQL installer comes in 2 versions Full (150 MB) which includes a complete set of MySQL products with their binaries included in the download Web (1.5 MB - a network install) which will just pull MySQL for Excel over the web and install it when run. You can download MySQL Installer from our official Downloads page at http://dev.mysql.com/downloads/installer/. MySQL for Excel 1.1.3 introduces the following features: Upon saving a Workbook containing Worksheets in Edit Mode, the user is asked if he wants to exit the Edit Mode on all Worksheets before their parent Workbook is saved so the Worksheets are saved unprotected, otherwise the Worksheets will remain protected and the users will be able to unprotect them later retrieving the passkeys from the application log after closing MySQL for Excel. Added background coloring to the column names header row of an Import Data operation to have the same look as the one in an Edit Data operation (i.e. gray-ish background). Connection passwords can be stored securely just like MySQL Workbench does and these secured passwords are shared with Workbench in the same way connections are. Changed the way the MySQL for Excel ribbon toggle button works, instead of just showing or hiding the add-in it actually opens and closes it. Added a connection test before any operation against the database (schema creation, data import, append, export or edition) so the operation dialog is not shown and a friendlier error message is shown. Also this release contains the following bug fixes: Added a check on every connection test for an expired password, if the password has been expired a dialog is now shown to the user to reset the password. Bug #17354118 - DON'T HANDLE EXPIRED PASSWORDS Added code to escape text values to be imported to an Excel worksheet that start with an equals sign so Excel does not treat those values as formulas that will fail evaluation. This is an option turned on by default that can be turned off by users if they wish to import values to be treated as Excel formulas. Bug #17354102 - ERROR IMPORTING TEXT VALUES TO EXCEL STARTING WITH AN EQUALS SIGN Added code to properly check the reason for a failing connection, if it's a failing password the user gets a dialog to retry the connection with a different password until the connection succeeds, a connection error not related to the password is thrown or the user cancels. If the failing connection is not related to a bad password an error message is shown to the users indicating the reason of the failure. Bug #16239007 - CONNECTIONS TO MYSQL SERVICES NOT RUNNING DISPLAY A WRONG PASSWORD ERROR MESSAGE Added global options dialog that can be accessed from the Schema Selection and DB Object Selection panels where the timeouts for the connection to the DB Server and for the query commands can be changed from their default values (15 seconds for the connection timeout and 30 seconds for the query timeout). MySQL Bug #68732, Bug #17191646 - QUERY TIMEOUT CANNOT BE ADJUSTED IN MYSQL FOR EXCEL Changed the Varchar(65,535) data type shown in the Export Data data type combo box to Text since the maximum row size is 65,535 bytes and any autodetected column data type with a length greater than 4,000 should be set to Text actually for the table to be created successfully. MySQL Bug #69779, Bug #17191633 - EXPORT FAILS FOR EXCEL FILES CONTAINING > 4000 CHARACTERS OF TEXT PER CELL Removed code that was replacing all spaces typed by the user in an overriden data type for a new column in an Export Data operation, also improved the data type detection code to flag as invalid data types with parenthesis but without any text inside or where the contents inside the parenthesis are not valid for the specific data type. Bug #17260260 - EXPORT DATA SET TYPE NOT WORKING WITH MEMBER VALUES CONTAINING SPACES Added support for the year data type with a length of 2 or 4 and a validation that valid values are integers between 1901-2155 (for 4-digit years) or between 0-99 (for 2-digit years). Bug #17259915 - EXPORT DATA YEAR DATA TYPE NOT RECOGNIZED IF DECLARED WITH A DISPLAY WIDTH) Fixed code for Export Data operations where users overrode the data type for columns typing Text in the data type combobox, which is a valid data type but was not recognized as such. Bug #17259490 - EXPORT DATA TEXT DATA TYPE NOT RECOGNIZED AS A VALID DATA TYPE Changed the location of the registry where the MySQL for Excel add-in is installed to HKEY_LOCAL_MACHINE instead of HKEY_CURRENT_USER so the add-in is accessible by all users and not only to the user that installed it. For this to work with Excel 2007 a hotfix may be required (see http://support.microsoft.com/kb/976477). MySQL Bug #68746, Bug #16675992 - EXCEL-ADD-IN IS ONLY INSTALLED FOR USER ACCOUNT THAT THE INSTALLATION RUNS UNDER Added support for Excel 2013 Single Document Interface, now that Excel 2013 creates 1 window per workbook also the Excel Add-In maintains an independent custom task pane in each window. MySQL Bug #68792, Bug #17272087 - MYSQL FOR EXCEL SIDEBAR DOES NOT APPEAR IN EXCEL 2013 (WITH WORKAROUND) Included the latest MySQL Utility with a code fix for the COM exception thrown when attempting to open Workbench in the Manage Connections window. Bug #17258966 - MYSQL WORKBENCH NOT OPENED BY CLICKING MANAGE CONNECTIONS HOTLABEL Fixed code for Append Data operations that was not applying a calculated automatic mapping correctly when the source and target tables had different number of columns, some columns with the same name but some of those lying on column indexes beyond the limit of the other source/target table. MySQL Bug #69220, Bug #17278349 - APPEND DOESN'T AUTOMATICALLY DETECT EXCEL COL HEADER WITH SAME NAME AS SQL FIELD Fixed some code for Edit Data operations that was escaping special characters twice (during edition in Excel and then upon sending the query to the MySQL server). MySQL Bug #68669, Bug #17271693 - A BACKSLASH IS INSERTED BEFORE AN APOSTROPHE EDITING TABLE WITH MYSQL FOR EXCEL Upgraded MySQL Utility with latest version that encapsulates dialog base classes and introduces more classes to handle Workbench connections, and removed these from the Excel project. Bug #16500331 - CAN'T DELETE CONNECTIONS CREATED WITHIN ADDIN You can access the MySQL for Excel documentation at http://dev.mysql.com/doc/refman/5.6/en/mysql-for-excel.html You can find our team’s blog at http://blogs.oracle.com/MySQLOnWindows. You can also post questions on our MySQL for Excel forum found at http://forums.mysql.com/. Enjoy and thanks for the support!

Read the article

Securing an ADF Application using OES11g: Part 1

- by user12587121

Future releases of the Oracle stack should allow ADF applications to be secured natively with Oracle Entitlements Server (OES). In a sequence of postings here I explore one way to achive this with the current technology, namely OES 11.1.1.5 and ADF 11.1.1.6. ADF Security Basics ADF Bascis The Application Development Framework (ADF) is Oracle’s preferred technology for developing GUI based Java applications. It can be used to develop a UI for Swing applications or, more typically in the Oracle stack, for Web and J2EE applications. ADF is based on and extends the Java Server Faces (JSF) technology. To get an idea, Oracle provides an online demo to showcase ADF components. ADF can be used to develop just the UI part of an application, where, for example, the data access layer is implemented using some custom Java beans or EJBs. However ADF also has it’s own data access layer, ADF Business Components (ADF BC) that will allow rapid integration of data from data bases and Webservice interfaces to the ADF UI component. In this way ADF helps implement the MVC approach to building applications with UI and data components. The canonical tutorial for ADF is to open JDeveloper, define a connection to a database, drag and drop a table from the database view to a UI page, build and deploy. One has an application up and running very quickly with the ability to quickly integrate changes to, for example, the DB schema. ADF allows web pages to be created graphically and components like tables, forms, text fields, graphs and so on to be easily added to a page. On top of JSF Oracle have added drag and drop tooling with JDeveloper and declarative binding of the UI to the data layer, be it database, WebService or Java beans. An important addition is the bounded task flow which is a reusable set of pages and transitions. ADF adds some steps to the page lifecycle defined in JSF and adds extra widgets including powerful visualizations. It is worth pointing out that the Oracle Web Center product (portal, content management and so on) is based on and extends ADF. ADF Security ADF comes with it’s own security mechanism that is exposed by JDeveloper at development time and in the WLS Console and Enterprise Manager (EM) at run time. The security elements that need to be addressed in an ADF application are: authentication, authorization of access to web pages, task-flows, components within the pages and data being returned from the model layer. One typically relies on WLS to handle authentication and because of this users and groups will also be handled by WLS. Typically in a Dev environment, users and groups are stored in the WLS embedded LDAP server. One has a choice when enabling ADF security (Application->Secure->Configure ADF Security) about whether to turn on ADF authorization checking or not: In the case where authorization is enabled for ADF one defines a set of roles in which we place users and then we grant access to these roles to the different ADF elements (pages or task flows or elements in a page). An important notion here is the difference between Enterprise Roles and Application Roles. The idea behind an enterprise role is that is defined in terms of users and LDAP groups from the WLS identity store. “Enterprise” in the sense that these are things available for use to all applications that use that store. The other kind of role is an Application Role and the idea is that a given application will make use of Enterprise roles and users to build up a set of roles for it’s own use. These application roles will be available only to that application. The general idea here is that the enterprise roles are relatively static (for example an Employees group in the LDAP directory) while application roles are more dynamic, possibly depending on time, location, accessed resource and so on. One of the things that OES adds that is that we can define these dynamic membership conditions in Role Mapping Policies. To make this concrete, here is how, at design time in Jdeveloper, one assigns these rights in Jdeveloper, which puts them into a file called jazn-data.xml: When the ADF app is deployed to a WLS this JAZN security data is pushed to the system-jazn-data.xml file of the WLS deployment for the policies and application roles and to the WLS backing LDAP for the users and enterprise roles. Note the difference here: after deploying the application we will see the users and enterprise roles show up in the WLS LDAP server. But the policies and application roles are defined in the system-jazn-data.xml file. Consult the embedded WLS LDAP server to manage users and enterprise roles by going to the domain console and then Security Realms->myrealm->Users and Groups: For production environments (or in future to share this data with OES) one would then perform the operation of “reassociating” this security policy and application role data to a DB schema (or an LDAP). This is done in the EM console by reassociating the Security Provider. This blog posting has more explanations and references on this reassociation process. If ADF Authentication and Authorization are enabled then the Security Policies for a deployed application can be managed in EM. Our goal is to be able to manage security policies for the applicaiton rather via OES and it's console. Security Requirements for an ADF Application With this package tour of ADF security we can see that to secure an ADF application with we would expect to be able to take care of at least the following items: Authentication, including a user and user-group store Authorization for page access Authorization for bounded Task Flow access. A bounded task flow has only one point of entry and so if we protect that entry point by calling to OES then all the pages in the flow are protected. Authorization for viewing data coming from the data access layer In the next posting we will describe a sample ADF application and required security policies. References ADF Dev Guide: Fusion Middleware Fusion Developer's Guide for Oracle Application Development Framework: Enabling ADF Security in a Fusion Web Application Oracle tutorial on securing a sample ADF application, appears to require ADF 11.1.2 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

Read the article

The Sensemaking Spectrum for Business Analytics: Translating from Data to Business Through Analysis

- by Joe Lamantia

One of the most compelling outcomes of our strategic research efforts over the past several years is a growing vocabulary that articulates our cumulative understanding of the deep structure of the domains of discovery and business analytics. Modes are one example of the deep structure we’ve found. After looking at discovery activities across a very wide range of industries, question types, business needs, and problem solving approaches, we've identified distinct and recurring kinds of sensemaking activity, independent of context. We label these activities Modes: Explore, compare, and comprehend are three of the nine recognizable modes. Modes describe *how* people go about realizing insights. (Read more about the programmatic research and formal academic grounding and discussion of the modes here: https://www.researchgate.net/publication/235971352_A_Taxonomy_of_Enterprise_Search_and_Discovery) By analogy to languages, modes are the 'verbs' of discovery activity. When applied to the practical questions of product strategy and development, the modes of discovery allow one to identify what kinds of analytical activity a product, platform, or solution needs to support across a spread of usage scenarios, and then make concrete and well-informed decisions about every aspect of the solution, from high-level capabilities, to which specific types of information visualizations better enable these scenarios for the types of data users will analyze. The modes are a powerful generative tool for product making, but if you've spent time with young children, or had a really bad hangover (or both at the same time...), you understand the difficult of communicating using only verbs. So I'm happy to share that we've found traction on another facet of the deep structure of discovery and business analytics. Continuing the language analogy, we've identified some of the ‘nouns’ in the language of discovery: specifically, the consistently recurring aspects of a business that people are looking for insight into. We call these discovery Subjects, since they identify *what* people focus on during discovery efforts, rather than *how* they go about discovery as with the Modes. Defining the collection of Subjects people repeatedly focus on allows us to understand and articulate sense making needs and activity in more specific, consistent, and complete fashion. In combination with the Modes, we can use Subjects to concretely identify and define scenarios that describe people’s analytical needs and goals. For example, a scenario such as ‘Explore [a Mode] the attrition rates [a Measure, one type of Subject] of our largest customers [Entities, another type of Subject] clearly captures the nature of the activity — exploration of trends vs. deep analysis of underlying factors — and the central focus — attrition rates for customers above a certain set of size criteria — from which follow many of the specifics needed to address this scenario in terms of data, analytical tools, and methods. We can also use Subjects to translate effectively between the different perspectives that shape discovery efforts, reducing ambiguity and increasing impact on both sides the perspective divide. For example, from the language of business, which often motivates analytical work by asking questions in business terms, to the perspective of analysis. The question posed to a Data Scientist or analyst may be something like “Why are sales of our new kinds of potato chips to our largest customers fluctuating unexpectedly this year?” or “Where can innovate, by expanding our product portfolio to meet unmet needs?”. Analysts translate questions and beliefs like these into one or more empirical discovery efforts that more formally and granularly indicate the plan, methods, tools, and desired outcomes of analysis. From the perspective of analysis this second question might become, “Which customer needs of type ‘A', identified and measured in terms of ‘B’, that are not directly or indirectly addressed by any of our current products, offer 'X' potential for ‘Y' positive return on the investment ‘Z' required to launch a new offering, in time frame ‘W’? And how do these compare to each other?”. Translation also happens from the perspective of analysis to the perspective of data; in terms of availability, quality, completeness, format, volume, etc. By implication, we are proposing that most working organizations — small and large, for profit and non-profit, domestic and international, and in the majority of industries — can be described for analytical purposes using this collection of Subjects. This is a bold claim, but simplified articulation of complexity is one of the primary goals of sensemaking frameworks such as this one. (And, yes, this is in fact a framework for making sense of sensemaking as a category of activity - but we’re not considering the recursive aspects of this exercise at the moment.) Compellingly, we can place the collection of subjects on a single continuum — we call it the Sensemaking Spectrum — that simply and coherently illustrates some of the most important relationships between the different types of Subjects, and also illuminates several of the fundamental dynamics shaping business analytics as a domain. As a corollary, the Sensemaking Spectrum also suggests innovation opportunities for products and services related to business analytics. The first illustration below shows Subjects arrayed along the Sensemaking Spectrum; the second illustration presents examples of each kind of Subject. Subjects appear in colors ranging from blue to reddish-orange, reflecting their place along the Spectrum, which indicates whether a Subject addresses more the viewpoint of systems and data (Data centric and blue), or people (User centric and orange). This axis is shown explicitly above the Spectrum. Annotations suggest how Subjects align with the three significant perspectives of Data, Analysis, and Business that shape business analytics activity. This rendering makes explicit the translation and bridging function of Analysts as a role, and analysis as an activity. Subjects are best understood as fuzzy categories [http://georgelakoff.files.wordpress.com/2011/01/hedges-a-study-in-meaning-criteria-and-the-logic-of-fuzzy-concepts-journal-of-philosophical-logic-2-lakoff-19731.pdf], rather than tightly defined buckets. For each Subject, we suggest some of the most common examples: Entities may be physical things such as named products, or locations (a building, or a city); they could be Concepts, such as satisfaction; or they could be Relationships between entities, such as the variety of possible connections that define linkage in social networks. Likewise, Events may indicate a time and place in the dictionary sense; or they may be Transactions involving named entities; or take the form of Signals, such as ‘some Measure had some value at some time’ - what many enterprises understand as alerts. The central story of the Spectrum is that though consumers of analytical insights (represented here by the Business perspective) need to work in terms of Subjects that are directly meaningful to their perspective — such as Themes, Plans, and Goals — the working realities of data (condition, structure, availability, completeness, cost) and the changing nature of most discovery efforts make direct engagement with source data in this fashion impossible. Accordingly, business analytics as a domain is structured around the fundamental assumption that sense making depends on analytical transformation of data. Analytical activity incrementally synthesizes more complex and larger scope Subjects from data in its starting condition, accumulating insight (and value) by moving through a progression of stages in which increasingly meaningful Subjects are iteratively synthesized from the data, and recombined with other Subjects. The end goal of ‘laddering’ successive transformations is to enable sense making from the business perspective, rather than the analytical perspective.Synthesis through laddering is typically accomplished by specialized Analysts using dedicated tools and methods. Beginning with some motivating question such as seeking opportunities to increase the efficiency (a Theme) of fulfillment processes to reach some level of profitability by the end of the year (Plan), Analysts will iteratively wrangle and transform source data Records, Values and Attributes into recognizable Entities, such as Products, that can be combined with Measures or other data into the Events (shipment of orders) that indicate the workings of the business. More complex Subjects (to the right of the Spectrum) are composed of or make reference to less complex Subjects: a business Process such as Fulfillment will include Activities such as confirming, packing, and then shipping orders. These Activities occur within or are conducted by organizational units such as teams of staff or partner firms (Networks), composed of Entities which are structured via Relationships, such as supplier and buyer. The fulfillment process will involve other types of Entities, such as the products or services the business provides. The success of the fulfillment process overall may be judged according to a sophisticated operating efficiency Model, which includes tiered Measures of business activity and health for the transactions and activities included. All of this may be interpreted through an understanding of the operational domain of the businesses supply chain (a Domain). We'll discuss the Spectrum in more depth in succeeding posts.

Read the article

Benchmarking MySQL Replication with Multi-Threaded Slaves

- by Mat Keep

0 0 1 1145 6530 Homework 54 15 7660 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as a part MySQL 5.6. As the results demonstrate, Multi-Threaded Slaves delivers 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to: · Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data) · Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog) The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments. Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include: · Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion · Crash Safe Slaves and Binlog · Optimized Row Based Replication · Replication Event Checksums · Time Delayed Replication These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article Back to the benchmark - details are as follows. Environment The test environment consisted of two Linux servers: · one running the replication master · one running the replication slave. Only the slave was involved in the actual measurements, and was based on the following configuration: - Hardware: Oracle Sun Fire X4170 M2 Server - CPU: 2 sockets, 6 cores with hyper-threading, 2930 MHz. - OS: 64-bit Oracle Enterprise Linux 6.1 - Memory: 48 GB Test Procedure Initial Setup: Two MySQL servers were started on two different hosts, configured as replication master and slave. 10 sysbench schemas were created, each with a single table: CREATE TABLE `sbtest` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k` (`k`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 10,000 rows were inserted in each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location and the slave threads position in the master binlog noted. 10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate a random schema load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements: UPDATE sbtest set k=k+1 WHERE id = <random row> In total, this generated 100,000 update statements to later replicate during the test itself. Test Methodology: The number of slave workers to test with was configured using: SET GLOBAL slave_parallel_workers=<workers> Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave. The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries, doing "select master_pos_wait()". When master_pos_wait() returned, the benchmark clock was stopped and the duration calculated. The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS). Test Reset: The test-reset cycle was implemented as follows: · the slave was stopped · the slave data directory replaced with the previous backup · the slave restarted with the slave threads replication pointer repositioned to the point before the update queries in the binlog. The test could then be repeated with identical set of queries but a different number of slave worker threads, enabling a fair comparison. The Test-Reset cycle was repeated 3 times for 0-24 number of workers and the QPS metric calculated and averaged for each worker count. MySQL Configuration The relevant configuration settings used for MySQL are as follows: binlog-format=STATEMENT relay-log-info-repository=TABLE master-info-repository=TABLE As described in the test procedure, the slave_parallel_workers setting was modified as part of the test logic. The consequence of changing this setting is: 0 worker threads: - current (i.e. single threaded) sequential mode - 1 x IO thread and 1 x SQL thread - SQL thread both reads and executes the events 1 worker thread: - sequential mode - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread - coordinator reads the event and hands it to the worker who executes 2+ worker threads: - parallel execution - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads - coordinator reads events and hands them to the workers who execute them Results Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 x schemas. This result is compared to the current replication implementation which is based on a single SQL thread only (i.e. zero worker threads). Figure 1: 5x Higher Performance with Multi-Threaded Slaves The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented. The raw numbers behind this graph are reported in the Appendix section of this post. Figure 2: Detailed Results As the results above show, the configuration does not scale noticably from 5 to 9 worker threads. When configured with 10 worker threads however, scalability increases significantly. The conclusion therefore is that it is desirable to configure the same number of worker threads as schemas. Other conclusions from the results: · Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution. · As expected, having more workers than schemas adds no visible benefit. Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance: relay-log-info-repository=TABLE master-info-repository=TABLE For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE. Conclusion As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas. This, and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select Development Release tab). You can learn more about MySQL 5.6 from the documentation Please don’t hesitate to comment on this or other replication blogs with feedback and questions. Appendix – Detailed Results

Read the article

When is a Seek not a Seek?

- by Paul White

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10. One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before. The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return. The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0). Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate. Adding the opaque modulo expression results in SQL Server guessing at the selectivity. As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement. For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models. Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054. The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution. We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1). It is no coincidence that 126 = 63 * 2, by the way. It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing. There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon. To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query. I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101. Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value. Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf). We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101). It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169). Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks. It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster. Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms. The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek? When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

Read the article

BI Applications overview

- by sv744

Welcome to Oracle BI applications blog! This blog will talk about various features, general roadmap, description of functionality and implementation steps related to Oracle BI applications. In the first post we start with an overview of the BI apps and will delve deeper into some of the topics below in the upcoming weeks and months. If there are other topics you would like us to talk about, pl feel free to provide feedback on that. The Oracle BI applications are a set of pre-built applications that enable pervasive BI by providing role-based insight for each functional area, including sales, service, marketing, contact center, finance, supplier/supply chain, HR/workforce, and executive management. For example, Sales Analytics includes role-based applications for sales executives, sales management, as well as front-line sales reps, each of whom have different needs. The applications integrate and transform data from a range of enterprise sources—including Siebel, Oracle, PeopleSoft, SAP, and others—into actionable intelligence for each business function and user role. This blog starts with the key benefits and characteristics of Oracle BI applications. In a series of subsequent blogs, each of these points will be explained in detail. Why BI apps? Demonstrate the value of BI to a business user, show reports / dashboards / model that can answer their business questions as part of the sales cycle. Demonstrate technical feasibility of BI project and significantly lower risk and improve success Build Vs Buy benefit Don’t have to start with a blank sheet of paper. Help consolidate disparate systems Data integration in M&A situations Insulate BI consumers from changes in the OLTP Present OLTP data and highlight issues of poor data / missing data – and improve data quality and accuracy Prebuilt Integrations BI apps support prebuilt integrations against leading ERP sources: Fusion Applications, E- Business Suite, Peoplesoft, JD Edwards, Siebel, SAP Co-developed with inputs from functional experts in BI and Applications teams. Out of the box dimensional model to source model mappings Multi source and Multi Instance support Rich Data Model BI apps have a very rich dimensionsal data model built over 10 years that incorporates best practises from BI modeling perspective as well as reflect the source system complexities Thanks for reading a long post, and be on the lookout for future posts. We will look forward to your valuable feedback on these topics as well as suggestions on what other topics would you like us to cover. I Conformed dimensional model across all business subject areas allows cross functional reporting, e.g. customer / supplier 360 Over 360 fact tables across 7 product areas CRM – 145, SCM – 47, Financials – 28, Procurement – 20, HCM – 27, Projects – 18, Campus Solutions – 21, PLM - 56 Supported by 300 physical dimensions Support for extensive calendars; Gregorian, enterprise and ledger based Conformed data model and metrics for real time vs warehouse based reporting Multi-tenant enabled Extensive BI related transformations BI apps ETL and data integration support various transformations required for dimensional models and reporting requirements. All these have been distilled into common patterns and abstracted logic which can be readily reused across different modules Slowly Changing Dimension support Hierarchy flattening support Row / Column Hybrid Hierarchy Flattening As Is vs. As Was hierarchy support Currency Conversion :- Support for 3 corporate, CRM, ledger and transaction currencies UOM conversion Internationalization / Localization Dynamic Data translations Code standardization (Domains) Historical Snapshots Cycle and process lifecycle computations Balance Facts Equalization of GL accounting chartfields/segments Standardized values for categorizing GL accounts Reconciliation between GL and subledgers to track accounted/transferred/posted transactions to GL Materialization of data only available through costly and complex APIs e.g. Fusion Payroll, EBS / Fusion Accruals Complex event Interpretation of source data – E.g. o What constitutes a transfer o Deriving supervisors via position hierarchy o Deriving primary assignment in PSFT o Categorizing and transposition to measures of Payroll Balances to specific metrics to support side by side comparison of measures of for example Fixed Salary, Variable Salary, Tax, Bonus, Overtime Payments. o Counting of Events – E.g. converting events to fact counters so that for example the number of hires can easily be added up and compared alongside the total transfers and terminations. Multi pass processing of multiple sources e.g. headcount, salary, promotion, performance to allow side to side comparison. Adding value to data to aid analysis through banding, additional domain classifications and groupings to allow higher level analytical reporting and data discovery Calculation of complex measures examples: o COGs, DSO, DPO, Inventory turns etc o Transfers within a Hierarchy or out of / into a hierarchy relative to view point in hierarchy. Configurability and Extensibility support BI apps offer support for extensibility for various entities as automated extensibility or part of extension methodology Key Flex fields and Descriptive Flex support Extensible attribute support (JDE) Conformed Domains ETL Architecture BI apps offer a modular adapter architecture which allows support of multiple product lines into a single conformed model Multi Source Multi Technology Orchestration – creates load plan taking into account task dependencies and customers deployment to generate a plan based on a customers of multiple complex etl tasks Plan optimization allowing parallel ETL tasks Oracle: Bit map indexes and partition management High availability support Follow the sun support. TCO BI apps support several utilities / capabilities that help with overall total cost of ownership and ensure a rapid implementation Improved cost of ownership – lower cost to deploy On-going support for new versions of the source application Task based setups flows Data Lineage Functional setup performed in Web UI by Functional person Configuration Test to Production support Security BI apps support both data and object security enabling implementations to quickly configure the application as per the reporting security needs Fine grain object security at report / dashboard and presentation catalog level Data Security integration with source systems Extensible to support external data security rules Extensive Set of KPIs Over 7000 base and derived metrics across all modules Time series calculations (YoY, % growth etc) Common Currency and UOM reporting Cross subject area KPIs (analyzing HR vs GL data, drill from GL to AP/AR, etc) Prebuilt reports and dashboards 3000+ prebuilt reports supporting a large number of industries Hundreds of role based dashboards Dynamic currency conversion at dashboard level Highly tuned Performance The BI apps have been tuned over the years for both a very performant ETL and dashboard performance. The applications use best practises and advanced database features to enable the best possible performance. Optimized data model for BI and analytic queries Prebuilt aggregates& the ability for customers to create their own aggregates easily on warehouse facts allows for scalable end user performance Incremental extracts and loads Incremental Aggregate build Automatic table index and statistics management Parallel ETL loads Source system deletes handling Low latency extract with Golden Gate Micro ETL support Bitmap Indexes Partitioning support Modularized deployment, start small and add other subject areas seamlessly Source Specfic Staging and Real Time Schema Support for source specific operational reporting schema for EBS, PSFT, Siebel and JDE Application Integrations The BI apps also allow for integration with source systems as well as other applications that provide value add through BI and enable BI consumption during operational decision making Embedded dashboards for Fusion, EBS and Siebel applications Action Link support Marketing Segmentation Sales Predictor Dashboard Territory Management External Integrations The BI apps data integration choices include support for loading extenral data External data enrichment choices : UNSPSC, Item class etc. Extensible Spend Classification Broad Deployment Choices Exalytics support Databases : Oracle, Exadata, Teradata, DB2, MSSQL ETL tool of choice : ODI (coming), Informatica Extensible and Customizable Extensible architecture and Methodology to add custom and external content Upgradable across releases

Read the article

How to begin? Windows 8 Development

- by Dennis Vroegop

Ok. I convinced you in my last post to do some Win8 development. You want a piece of that cake, or whatever your reasons may be. Good! Welcome to the club! Now let me ask you a question: what are you going to write? Ah. That’s the big one, isn’t it? What indeed? If you have been creating applications for computers before you’re in for quite a shock. The way people perceive apps on a tablet is quite different from what we know as applications. There’s a reason we call them apps instead of applications! Yes, technically they are applications but we don’t call them apps only because it sounds cool. The abbreviated form of the word applications itself is a pointer. Apps are small. Apps are focused. Apps are more lightweight. Apps do one thing but they do that one thing extremely good. In the ‘old’ days we wrote huge systems. We build ecosystems of services, screens, databases and more to create a system that provides value for the user. Think about it: what application do you use most at work? Can you in one sentence describe what it is, or what it does and yet still distinctively describe its purpose? I doubt you can. Let’s have a look at Outlouk. We all know it and we all love or hate it. But what is it? A mail program? No, there’s so much more there: calendar, contacts, RSS feeds and so on. Some call it a ‘collaboration’ application but that’s not really true as well. After all, why should a collaboration application give me my schedule for the day? I think the best way to describe Outlook is “client for Exchange” although that isn’t accurate either. Anyway: Outlook is a great application but it’s not an ‘app’ and therefor not very suitable for WinRT. Ok. Disclaimer here: yes, you can write big applications for WinRT. Some will. But that’s not what 99.9% of the developers will do. So I am stating here that big applications are not meant for WinRT. If 0.01% of the developers think that this is nonsense then they are welcome to go ahead but for the majority here this is not what we’re talking about. So: Apps are small, lightweight and good at what they do but only at that. If you’re a Phone developer you already know that: Phone apps on any platform fit the description I have above. If you’ve ever worked in a large cooperation before you might have seen one of these before: the Mission Statement. It’s supposed to be a oneliner that sums up what the company is supposed to do. Funny enough: although this doesn’t work for large companies it does work for defining your app. A mission statement for an app describes what it does. If it doesn’t fit in the mission statement then your app is going to get to big and will fail. A statement like this should be in the following style “<your app name> is the best app to <describe single task>” Fill in the blanks, write it and go! Mmm.. not really. There are some things there we need to think about. But the statement is a very, very important one. If you cannot fit your app in that line you’re preparing to fail. Your app will become to big, its purpose will be unclear and it will be hard to use. People won’t download it and those who do will give it a bad rating therefor preventing that huge success you’ve been dreaming about. Stick to the statement! Ok, let’s give it a try: “PlanesAreCool” is the best app to do planespotting in the field. You might have seen these people along runways of airports: taking photographs of airplanes and noting down their numbers and arrival- and departure times. We are going to help them out with our great app! If you look at the statement, can you guess what it does? I bet you can. If you find out it isn’t clear enough of if it’s too broad, refine it. This is probably the most important step in the development of your app so give it enough time! So. We’ve got the statement. Print it out, stick it to the wall and look at it. What does it tell you? If you see this, what do you think the app does? Write that down. Sit down with some friends and talk about it. What do they expect from an app like this? Write that down as well. Brainstorm. Make a list of features. This is mine: Note planes Look up aircraft carriers Add pictures of that plane Look up airfields Notify friends of new spots Look up details of a type of plane Plot a graph with arrival and departure times Share new spots on social media Look up history of a particular aircraft Compare your spots with friends Write down arrival times Write down departure times Write down wind conditions Write down the runway they take Look up weather conditions for next spotting day Invite friends to join you for a day of spotting. Now, I must make it clear that I am not a planespotter nor do I know what one does. So if the above list makes no sense, I apologize. There is a lesson: write apps for stuff you know about…. First of all, let’s look at our statement and then go through the list of features. Remove everything that has nothing to do with that statement! If you end up with an empty list, try again with both steps. Note planes Look up aircraft carriers Add pictures of that plane Look up airfields Notify friends of new spots Look up details of a type of plane Plot a graph with arrival and departure times Share new spots on social media Look up history of a particular aircraft Compare your spots with friends Write down arrival times Write down departure times Write down wind conditions Write down the runway they take Look up weather conditions for next spotting day Invite friends to join you for a day of spotting. That's better. The things I removed could be pretty useful to a plane spotter and could be fun to write. But do they match the statement? I said that the app is for spotting in the field, so “look up airfields” doesn’t belong there: I know where I am so why look it up? And the same goes for inviting friends or looking up the weather conditions for tomorrow. I am at the airfield right now, looking through my binoculars at the planes. I know the weather now and I don’t care about tomorrow. If you feel the items you’ve crossed out are valuable, then why not write another app? One that says “SpotNoter” is the best app for preparing a day of spotting with my friends. That’s a different app! Remember: Win8 apps are small and very good at doing ONE thing, and one thing only! If you have made that list, it’s time to prepare the navigation of your app. The navigation is how users see your app and how they use it. We’ll do that next time!

Read the article

RTS style fog of war woes

- by Fricken Hamster

So I'm trying to make a rts style line of sight fog of war style engine for my grid based game. Currently I am getting a set of vertices by raycasting in 360 degree. Then I use that list of vertices to do a graphics style polygon scanline fill to get a list of all points within the polygon. The I compare the new list of seen tiles and compare that with the old one and increment or decrement the world vision array as needed. The polygon scanline function is giving me trouble. I'm mostly following this http://www.cs.uic.edu/~jbell/CourseNotes/ComputerGraphics/PolygonFilling.html So far this is my code without cleaning anything up var edgeMinX:Vector.<int> = new Vector.<int>; var edgeMinY:Vector.<int> = new Vector.<int>; var edgeMaxY:Vector.<int> = new Vector.<int>; var edgeInvSlope:Vector.<Number> = new Vector.<Number>; var ilen:int = outvert.length; var miny:int = -1; var maxy:int = -1; for (i = 0; i < ilen; i++) { var curpoint:Point = outvert[i]; if (i == ilen -1) { var nextpoint:Point = outvert[0]; } else { nextpoint = outvert[i + 1]; } if (nextpoint.y == curpoint.y) { continue; } if (curpoint.y < nextpoint.y) { var curslope:Number = ((nextpoint.y - curpoint.y) / (nextpoint.x - curpoint.x)); edgeMinY.push(curpoint.y); edgeMinX.push(curpoint.x); edgeMaxY.push(nextpoint.y); edgeInvSlope.push(1 / curslope); if (curpoint.y < miny || miny == -1) { miny = curpoint.y; } if (nextpoint.y > maxy) { maxy = nextpoint.y; } } else { curslope = ((curpoint.y - nextpoint.y) / (curpoint.x - nextpoint.x)); edgeMinY.push(nextpoint.y); edgeMinX.push(nextpoint.x); edgeMaxY.push(curpoint.y); edgeInvSlope.push(1 / curslope); if (nextpoint.y < miny || miny == -1) { miny = curpoint.y; } if (curpoint.y > maxy) { maxy = nextpoint.y; } } } var activeMaxY:Vector.<int> = new Vector.<int>; var activeCurX:Vector.<Number> = new Vector.<Number>; var activeInvSlope:Vector.<Number> = new Vector.<Number>; for (var scanline:int = miny; scanline < maxy + 1; scanline++) { ilen = edgeMinY.length; for (i = 0; i < ilen; i++) { if (edgeMinY[i] == scanline) { activeMaxY.push(edgeMaxY[i]); activeCurX.push(edgeMinX[i]); activeInvSlope.push(edgeInvSlope[i]); //trace("added(" + edgeMinX[i]); edgeMaxY.splice(i, 1); edgeMinX.splice(i, 1); edgeMinY.splice(i, 1); edgeInvSlope.splice(i, 1); i--; ilen--; } } ilen = activeCurX.length; for (i = 0; i < ilen - 1; i++) { for (var j:int = i; j < ilen - 1; j++) { if (activeCurX[j] > activeCurX[j + 1]) { var tempint:int = activeMaxY[j]; activeMaxY[j] = activeMaxY[j + 1]; activeMaxY[j + 1] = tempint; var tempnum:Number = activeCurX[j]; activeCurX[j] = activeCurX[j + 1]; activeCurX[j + 1] = tempnum; tempnum = activeInvSlope[j]; activeInvSlope[j] = activeInvSlope[j + 1]; activeInvSlope[j + 1] = tempnum; } } } var prevx:int = -1; var jlen:int = activeCurX.length; for (j = 0; j < jlen; j++) { if (prevx == -1) { prevx = activeCurX[j]; } else { for (var k:int = prevx; k < activeCurX[j]; k++) { graphics.lineStyle(2, 0x124132); graphics.drawCircle(k * 20 + 10, scanline * 20 + 10, 5); if (k == prevx || k > activeCurX[j] - 1) { graphics.lineStyle(3, 0x004132); graphics.drawCircle(k * 20 + 10, scanline * 20 + 10, 2); } prevx = -1; //tileLightList.push(k, scanline); } } } ilen = activeCurX.length; for (i = 0; i < ilen; i++) { if (activeMaxY[i] == scanline + 1) { activeCurX.splice(i, 1); activeMaxY.splice(i, 1); activeInvSlope.splice(i, 1); i--; ilen--; } else { activeCurX[i] += activeInvSlope[i]; } } } It works in some cases but some of the x intersections are skipped, primarily when there are more than 2 x intersections in one scanline I think. Is there a way to fix this, or a better way to do what I described? Thanks

Read the article

The Incremental Architect’s Napkin - #5 - Design functions for extensibility and readability

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities. PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

Read the article

Accelerated C++, problem 5-6 (copying values from inside a vector to the front)

- by Darel

Hello, I'm working through the exercises in Accelerated C++ and I'm stuck on question 5-6. Here's the problem description: (somewhat abbreviated, I've removed extraneous info.) 5-6. Write the extract_fails function so that it copies the records for the passing students to the beginning of students, and then uses the resize function to remove the extra elements from the end of students. (students is a vector of student structures. student structures contain an individual student's name and grades.) More specifically, I'm having trouble getting the vector.insert function to properly copy the passing student structures to the start of the vector students. Here's the extract_fails function as I have it so far (note it doesn't resize the vector yet, as directed by the problem description; that should be trivial once I get past my current issue.) // Extract the students who failed from the "students" vector. void extract_fails(vector<Student_info>& students) { typedef vector<Student_info>::size_type str_sz; typedef vector<Student_info>::iterator iter; iter it = students.begin(); str_sz i = 0, count = 0; while (it != students.end()) { // fgrade tests wether or not the student failed if (!fgrade(*it)) { // if student passed, copy to front of vector students.insert(students.begin(), it, it); // tracks of the number of passing students(so we can properly resize the array) count++; } cout << it->name << endl; // output to verify that each student is iterated to it++; } } The code compiles and runs, but the students vector isn't adding any student structures to its front. My program's output displays that the students vector is unchanged. Here's my complete source code, followed by a sample input file (I redirect input from the console by typing " < grades" after the compiled program name at the command prompt.) #include <iostream> #include <string> #include <algorithm> // to get the declaration of `sort' #include <stdexcept> // to get the declaration of `domain_error' #include <vector> // to get the declaration of `vector' //driver program for grade partitioning examples using std::cin; using std::cout; using std::endl; using std::string; using std::domain_error; using std::sort; using std::vector; using std::max; using std::istream; struct Student_info { std::string name; double midterm, final; std::vector<double> homework; }; bool compare(const Student_info&, const Student_info&); std::istream& read(std::istream&, Student_info&); std::istream& read_hw(std::istream&, std::vector<double>&); double median(std::vector<double>); double grade(double, double, double); double grade(double, double, const std::vector<double>&); double grade(const Student_info&); bool fgrade(const Student_info&); void extract_fails(vector<Student_info>& v); int main() { vector<Student_info> vs; Student_info s; string::size_type maxlen = 0; while (read(cin, s)) { maxlen = max(maxlen, s.name.size()); vs.push_back(s); } sort(vs.begin(), vs.end(), compare); extract_fails(vs); // display the new, modified vector - it should be larger than // the input vector, due to some student structures being // added to the front of the vector. cout << "count: " << vs.size() << endl << endl; vector<Student_info>::iterator it = vs.begin(); while (it != vs.end()) cout << it++->name << endl; return 0; } // Extract the students who failed from the "students" vector. void extract_fails(vector<Student_info>& students) { typedef vector<Student_info>::size_type str_sz; typedef vector<Student_info>::iterator iter; iter it = students.begin(); str_sz i = 0, count = 0; while (it != students.end()) { // fgrade tests wether or not the student failed if (!fgrade(*it)) { // if student passed, copy to front of vector students.insert(students.begin(), it, it); // tracks of the number of passing students(so we can properly resize the array) count++; } cout << it->name << endl; // output to verify that each student is iterated to it++; } } bool compare(const Student_info& x, const Student_info& y) { return x.name < y.name; } istream& read(istream& is, Student_info& s) { // read and store the student's name and midterm and final exam grades is >> s.name >> s.midterm >> s.final; read_hw(is, s.homework); // read and store all the student's homework grades return is; } // read homework grades from an input stream into a `vector<double>' istream& read_hw(istream& in, vector<double>& hw) { if (in) { // get rid of previous contents hw.clear(); // read homework grades double x; while (in >> x) hw.push_back(x); // clear the stream so that input will work for the next student in.clear(); } return in; } // compute the median of a `vector<double>' // note that calling this function copies the entire argument `vector' double median(vector<double> vec) { typedef vector<double>::size_type vec_sz; vec_sz size = vec.size(); if (size == 0) throw domain_error("median of an empty vector"); sort(vec.begin(), vec.end()); vec_sz mid = size/2; return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid]; } // compute a student's overall grade from midterm and final exam grades and homework grade double grade(double midterm, double final, double homework) { return 0.2 * midterm + 0.4 * final + 0.4 * homework; } // compute a student's overall grade from midterm and final exam grades // and vector of homework grades. // this function does not copy its argument, because `median' does so for us. double grade(double midterm, double final, const vector<double>& hw) { if (hw.size() == 0) throw domain_error("student has done no homework"); return grade(midterm, final, median(hw)); } double grade(const Student_info& s) { return grade(s.midterm, s.final, s.homework); } // predicate to determine whether a student failed bool fgrade(const Student_info& s) { return grade(s) < 60; } Sample input file: Moo 100 100 100 100 100 100 100 100 Fail1 45 55 65 80 90 70 65 60 Moore 75 85 77 59 0 85 75 89 Norman 57 78 73 66 78 70 88 89 Olson 89 86 70 90 55 73 80 84 Peerson 47 70 82 73 50 87 73 71 Baker 67 72 73 40 0 78 55 70 Davis 77 70 82 65 70 77 83 81 Edwards 77 72 73 80 90 93 75 90 Fail2 55 55 65 50 55 60 65 60 Thanks to anyone who takes the time to look at this!

Search Results

Search found 6441 results on 258 pages for 'schema compare'.

Page 137/258 | < Previous Page | 133 134 135 136 137 138 139 140 141 142 143 144 | Next Page >

- by Benjamin Priestman

- by Malcolm Jones

- by Rantanplan

- by mbcrump

- by FransBouma

- by Mark Rosenberg

- by AjarnMark

- by Kelly Jones

- by Tarun Arora

- by shawn

- by jorg

- by Jesse

- by nmarun

- by pinaldave

- by James Michael Hare

- by Javier Treviño

- by user12587121

- by Joe Lamantia

- by Mat Keep

- by Paul White

- by sv744

- by Dennis Vroegop

- by Fricken Hamster

- by Ralf Westphal

- by Darel

< Previous Page | 133 134 135 136 137 138 139 140 141 142 143 144 | Next Page >