Search Results

Search found 1228 results on 50 pages for 'comparing'.

Page 43/50 | < Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50 | Next Page >

Benchmarking hosting providers IO with Bonnie

- by Derek Organ

Ok, because of a bunch of projects I'm working on I've access to dedicated Servers on a 3 hosting providers. As an experiment and for educational purposes I decided to see if I could benchmark how good the IO is with each. Bit of research lead me to Bonnie++ So I installed it on the server and ran this simple command /usr/sbin/bonnie -d /tmp/foo The 3 machines in different hosting providers are all dedicated machines, one is a VPS, other two are on some cloud platform e.g. VMWare / Xen using some kind of clustered SAN for storage This might be a naive thing to do but here are the results I found. HOST A Version 1.03c ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP xxxxxxxxxxxxxxxx 1G 45081 88 56244 14 19167 4 20965 40 67110 6 67.2 0 ------Sequential Create------ --------Random Create-------- -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 15264 28 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ xxxxxxxx,1G,45081,88,56244,14,19167,4,20965,40,67110,6,67.2,0,16,15264,28,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++ HOST B Version 1.03d ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP xxxxxxxxxxxx 4G 43070 91 64510 15 19092 0 29276 47 39169 0 448.2 0 ------Sequential Create------ --------Random Create-------- -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 24799 52 +++++ +++ +++++ +++ 25443 54 +++++ +++ +++++ +++ xxxxxxx,4G,43070,91,64510,15,19092,0,29276,47,39169,0,448.2,0,16,24799,52,+++++,+++,+++++,+++,25443,54,+++++,+++,+++++,+++ HOST C Version 1.03c ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP xxxxxxxxxxxxx 1536M 15598 22 85698 13 258969 20 16194 22 723655 21 +++++ +++ ------Sequential Create------ --------Random Create-------- -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 14142 22 +++++ +++ 18621 22 13544 22 +++++ +++ 17363 21 xxxxxxxx,1536M,15598,22,85698,13,258969,20,16194,22,723655,21,+++++,+++,16,14142,22,+++++,+++,18621,22,13544,22,+++++,+++,17363,21 Ok, so first off what is the best way to read the figures and are there any issues with really comparing these numbers? Is this in any way a true representation of IO Speed? If not is there any way for me to test that? Note: these 3 machines are using either Ubuntu or Debian (I presume that doesn't really matter)

Read the article
where on disk is space allocated for new files inside LVM lv with ext4 file system?

- by Jost

I run a multi-disk server with LVM2. Several large disks serve as LVM2 physical volumes for one volume group, containing one logical volume formatted with ext4. Nothing fancy, just your standard linear setup. Recently an additional, very small disk was added as physical volume to that volume group and I expanded both the logical volume, and the ext4 file system therein onto that disk. This lv is used to store incremental backups using rsync and is only about 30% full, there have rarely been any files deleted from it, only incremental writes. Now this new HDD I added to the pre-existing volume group has unexpectedly died on me, and the volume group won't come up because it is missing one physical volume. As fate will have it, this WAS the "in an event of catastrophic failure on the primary server"-backup, the event happened, the boss is not happy, so this kinda has to work... According to this (Part 3): http://www.novell.com/coolsolutions/appnote/19386.html it is possible to trick LVM into starting anyway by creating a new pv with identical metadata to the failed disk, which will make the volume accessible, but of course leave giant holes in the file system. I have'n tried it yet, because it involves repairing (writing to) the file system which eliminates the possibility of trying other things if it fails. Now my question is: How does this setup actually allocate disk space for new data? Is it allocated linearly from beginning to end of PVs, in the order they were added to the vg? Is it striped somehow in order to increase performance/balance load? since this defective disk was added only later to an existing lvm2 vg and lv, containing a half-empty ext4, what are the chances that there was never any data written to the defective disk? In other words: what are the chances of recovering all my data, even without the defective disk, by just starting the volume group as-is? Am I about to go spend $1500 on having 250GB of empty space recovered when I send the defective disk in for repair? Is there a way to check without mounting the file system and opening the files, hoping they contain something other than zeros? (comparing addresses of used data blocks inside ext4 to address ranges that were on the missing pv, something like that, preferably easy to automate) I know bitwise-copying the entire lv into an image file before trying to repair the ext4 would probably be a good idea, but since this lv is very large and I just suffered major file system failure on several systems it is probably a luxury I don't have... Any suggestions?

Read the article
Windows 7 fails to install on KVM with qemu

- by kief_morris

I'm trying to install Windows 7 on a virtual machine on my 64 bit Ubuntu Karmic box. I get to the point of selecting my language settings and clicking 'install now', but a short while later I get a blue screen of death. I've tried a few variations, including using the 32 bit (fails very quickly). The virt-install command I've tried includes this: sudo virt-install --connect qemu:///system -n ksm-win7 -r 2048 \ --disk path=/home/kief/VM-Images/ksm-win7.qcow2,size=50 \ -c /var/Software/Windows7/Full/64bit/SW_DVD5_SA_Win_Ent_7_64BIT_English_Full_MLF_X15-70749.ISO \ --vnc --os-type windows --os-variant vista --hvm The limited info I could find suggested that 'vista' should work as the --os-variant, I haven't found any values specific to windows 7. Here's my blue screen: I've found very little by Googling, so I'm guessing this isn't a case of KVM simply not supporting Windows 7. Thanks for any help. Update: I have been able to successfully create a Windows 7 VM using the graphical "Virtual Machine Manager" app, although I don't really understand the cause of the problem with the VM created with virt-install. Comparing the configuration files under /etc/libvirt/qemu provides some clues, although I don't know enough to interpret them properly. The interesting differences in the two VM configurations are: --- win7-virt-install.xml +++ win7-vmm.xml -<domain type='qemu'> +<domain type='kvm'> @@ -21 +21 @@ - <emulator>/usr/bin/qemu-system-x86_64</emulator> + <emulator>/usr/bin/kvm</emulator> @@ -23 +23 @@ - <source file='/home/kief/VM-Images/ksm-win7.qcow2'/> + <source file='/var/lib/libvirt/images/ksm-win7x64.img'/> I'm not sure if this means the working VM is not using qemu at all, or if there is some other difference in the way it's used with kvm. Update2: So I've answered my own question (mostly) below. A KVM VM needs to use KVM's own CPU emulation rather than qemu's in order for me to get Windows 7 installed. I'm not sure whether there is something that can be done to get it working on a qemu-emulation CPU, or whether a newer version will support it. But at least it is possible to get it running on a KVM VM.

Read the article
Are file access times not properly maintained in Mac OS X?

- by Ether

I'm trying to determine how file access times are maintained by default in Mac OS X, as I'm trying to diagnose some odd behaviour I'm seeing in a new MBP Unibody (running Snow Leopard, 10.6.2): The symptoms (drilling down to the specific behaviour that seems to be causing the issue): mutt is unable to switch to mailboxes which have recently received new mail mail is delivered by procmail, which updates the mtime of the mbox folder it is updating, but does not alter the atime (this is how new mail detection works: by comparing atime to mtime) however, both the mtime and atime of the mbox file is getting updated Through testing, it does not appear that atimes can be set separately in the filesystem: : [ether@tequila ~]$; touch test : [ether@tequila ~]$; touch -m -t 200801010000 test2 : [ether@tequila ~]$; touch -a -t 200801010000 test3 : [ether@tequila ~]$; ls -l test* -rw------- 1 ether staff 0 Dec 30 11:42 test -rw------- 1 ether staff 0 Jan 1 2008 test2 -rw------- 1 ether staff 0 Dec 30 11:43 test3 : [ether@tequila ~]$; ls -lu test* -rw------- 1 ether staff 0 Dec 30 11:42 test -rw------- 1 ether staff 0 Dec 30 11:43 test2 -rw------- 1 ether staff 0 Dec 30 11:43 test3 The test2 file is created with an old mtime, and the atime is set to now (as it is a new file), which is correct. However, test3 is created with an old atime, but is not set properly on the file. To be sure this is not just behaviour seen with new files, let's modify an old file: : [ether@tequila ~]$; touch -a -t 200801010000 test : [ether@tequila ~]$; ls -l test -rw------- 1 ether staff 0 Dec 30 11:42 test : [ether@tequila ~]$; ls -lu test -rw------- 1 ether staff 0 Dec 30 11:45 test So it would seem that atimes cannot be set explicitly (it is always reset to "now" when either mtime or atime modifications are submitted). Is this something inherent to the filesystem itself, is it something that can be changed, or am I totally crazy and looking in the wrong place? PS. the output of mount is: : [ether@tequila ~]$; mount /dev/disk0s2 on / (hfs, local, journaled) devfs on /dev (devfs, local, nobrowse) map -hosts on /net (autofs, nosuid, automounted, nobrowse) map auto_home on /home (autofs, automounted, nobrowse) ...and Disk Utility says that the drive is of type "Mac OS Extended (Journaled)".

Read the article
Why is Windows Task Scheduler trying to launch multiple instances?

- by Paul H

We have a number of Windows Scheduled tasks that run on one Server 2008 Webserver (not R2) which is in a cluster. We recently moved from an original webserver Cluster to a new webserver Cluser (Server 2008 - not R2). The new webserver (in the cluster) running the Windows Tasks is setup the same as on the original we believe. BUT we now find that on the new Windows Server the Windows Task Scheduler seems to want to instantly start each task three times. If we set the option to queue up a new task we get: Event ID 324 Task Scheduler queued instance "{9a1a8411-b042-45ff-8e6b-89874df230d7}" of task "\Client Reporting" and will launch it as soon as instance "{2bcc3df6-ea3b-4453-90c2-75b8b1946388}" completes. If we set the option to stop an existing task we get: Event ID 323 Task Scheduler stopped instance "{e685a910-b32b-414e-85fd-96bbe54314a2}" of task "\Client Reporting" in order to launch new instance "{4db66265-1f51-4ede-8535-ac7c3cb5c4c1}" . Ticked settings: Allow task to be run on demand. Run task as soon as possible after a scheduled start is missed. Stop the task if running for longer than 1 hour. If the running task does not end when requested force it to stop. Start the task only if the computer is on AC power. Stop the task if the computer switches to battery power. Selected option: If the task is already running - stop the existing instance. Note: We moved the tasks from one server to another in the cluster to see if it the Task Scheduler on the particular server we'd picked causing the problem. Same behaviour. Could it be something to do with the build of the new servers? We have very similar tasks set up on another server cluster that work OK without all this multiple starting. Comparing those tasks to the ones here - there does not seem to be anything obviously different in terms of settings available to us through the options within the Task Scheduler. Trigger: The task is scheduled to be triggered daily, once an hour - and to be stopped if it exceeds this time. Action: Runs a .bat file. What could be causing this/where we can look to see what logic is causing the tasks to start multiple times in this way?

Read the article
Looking for app to work fluidly with CSV data in graph form

- by Aszurom

It often occurs to me that if I had a good tool for viewing CSV data in graphical format, and comparing two sets of numbers to each other, I could do a great deal of meaningful trend watching and data interpretation. For example, perfmon can output quite a lot of data about a server into a CSV file, but there's no good way to view it. A lot of scripts could/have been written that would populate CSV files. I could write these all day long. My problem is that I need a great viewer. I've seen quite a few things that will take a CSV file and after a lot of tweaking and user adjustment produce a static gif/png image. A static image doesn't do me a lot of good, because I have to look at it, then re-calibrate the parameters of the program, regenerate the image, repeat. That sucks. I could do this in Excel. Ideally, I would want a FLUID graph viewer. On the fly, I can adjust how much of my timeline I'm viewing. I could adjust the scaling so that one big spike doesn't make 99.9% of the data an unreadable line across the bottom of the X axis. Stuff like that. I should be able to say "show me CSV column 3 and column 5 as graphs. Show me the data scaled for 20 or 150 entries, and let me slide that window up and down the column of data. Auto scale to fit 95% of data within the Y axis and let crazy spikes go off the screen." Maybe I'm terribly spoiled by how you can drag, zoom, and slide data around on my iPad. I want to be able to view a spreadsheet of data with that fluidity and not have to guess at what sort of static snapshot I want to create from it. I don't want to have to make a study of how to tweak some data plotting program to let me import my file and do what I could just do in Excel. I want to scale, zoom, and transform my graph on the fly and then export a snapshot of it once I have it the way I want it. Is there anything out there that fills this need? I'll take linux, osx, win32 or even iOS suggestions.

Read the article
Kernel-mode Authentication: 401 errors when accessing site from remote machines

- by CJM

I have several Classic ASP sites that use Integrated Windows Authentication and Kerberos delegation. They work OK on the live servers (recently moved to a Server 2008/IIS7 servers), but do not work fully on my development PC or my development server. The IIS on both machines were configured through an IIS web deployment tool package which was exported from an old machine; the deployment didn't work perfectly, and I had to tinker a bit to get the sites working. When accessing the apps locally on either machine, they work fine; when accessing from another machine, the user is prompted by a username/password dialog, and regardless of what you enter, ultimately it results in a 401 (Unauthorised) error. I've tried comparing the configuration of these machines against similar live servers (that all work fine), and they seem generally comparable (given that none of the live servers are yet on IIS7.5 (Windows 7/Server 2008 R2). These applications run in a common application pool which uses a special domain user as it's identity - this user has similar permissions on the live and development machines. On IIS6 platforms, to enable kerberos delegation, I needed to set up some SPNs for this user, and they are still in place (even though I don't believe they are needed any longer for IIS7+ due to kernel-mode authentication), Furthermore, this account is enabled for Kerberos delegation in Active Directory, as is each machine I am dealing with. I'm considering the possibility that the deployment might have made changes/failed to make changes to the IIS configuration thus causing this problem. Perhaps a complete rebuild (minus another web deployment attempt) would solve the problem, but I'd rather fix (thus understand) the current problem. Any ideas so far? I've just had another attempt at fixing this issue, and I've made some progress, but I don't have a complete fix...yet. I've discovered that if I access the sites via IP address (than via NetBIOS name), I get the same dialog, except that it accepts my credentials and thus the application works - not quite a fix, but a useful step. More interestingly, I discovered that if I disable Kernel-mode authentication (in IIS Manager Website Authentication Advanced Settings), the applications work perfectly. My foggy understanding is that this is effectively working in the pre-IIS7 way. A reasonable short-term solution, but consider the following explicit advice from IIS on this issue: By default, IIS enables kernel-mode authentication, which may improve authentication performance and prevent authentication problems with application pools configured to use a custom identity. As a best practice, do not disable this setting if Kerberos authentication is used in your environment and the application pool is configured to use a custom identity. Clearly, this is not the way my applications should be working. So what is the issue?

Read the article
Windows Server 2008 Software Raid 5 - Data integrity issues

- by Fopedush

I've got a server running Windows Server 2008 R2, with a (windows native) software raid-5 array. The array consists of 7x 1TB Western Digital RE3 and RE4 drives. I have offline backups of this array. The problem is this: I noticed a few days ago after copying a large file to the disk that there was an integrity issue with that file - it was a ~12GB file that I had downloaded via uTorrent. After moving it to the raid array, I used uTorrent to relocate the download location, and performed a re-check so I could seed it from that location. The recheck found that only 6308/6310 chunks of the copied file were intact. My next step was to write a quick powershell script that would copy files to the array, while performing a SHA1 hash of the original and resultant files and comparing them. Smaller files (100-1000MB) copied over just fine. When I started copying larger data (~15GB), I found that the hash check failed about 2/3rds of the time. The corrupt files had very, very small inconsistencies - less than .01%. I further eliminated the possibility of networking or client issues by placing this large file on the C:\ of the server, and copying it repeatedly from there to the array, seeing similar results. Copying the data via explorer, powershell, or the standard windows command prompt yield the same results. None of the copies fail or report any problems. The raid array itself is listed as healthy in disk management. After a few experiments, I shut down the server and ran memtest overnight. No errors were detected. A basic run of chkdsk found no problems, but I did not use the /R flag, as I was unsure how that might affect a software raid-5 volume. I next ran Crystal Disk Info to check the smart data on the drives - but found that CDI only detected 5 out of 7 of the disks in the array. I have no idea why. Nevertheless, CDI shows the following "caution" flags on a single one of the drives: 05 199 199 140 000000000001 Reallocated Sectors Count C5 200 200 __0 000000000001 Current Pending Sector Count Which is a little bit alarming, but I don't really know what to do with the information. I hardly feel like one reallocated sector could be causing this. At this point, I'm looking for some guidance on what to do next. I need to determine the cause of this issue, but I'm hesitant to run chkdsk /R or any bootable disk health checkers because I'm afraid they might break the array. I've considered triggering a re-sync of the array, but I'm not actually sure how to do that without doing something silly like manually dropping a disk and then restoring it. Any advice that could help me ferret out the precise cause of this issue would be greatly appreciated.

Read the article
Suspected network performance issue on VirtualBox Ubuntu guest on Win7 host

- by Adam

I set up Ubuntu 12.04 in VirtualBox on the Win7 machine I was allocated on my new project. I am running Java, Eclipse, Tomcat to develop a large data-intensive application and I noticed that this application runs at half the speed of my colleague's identical machine, where he runs it all under Windows. I think I have narrowed down the performance issue to the network, after comparing and equalising all the Java VM settings with my colleague. Is there a ping test I can do or some other network diagnostic test to flag up any problems? To give some background, the network performance is confusing. Running a network speed test to my colleague's machine with iperf shows speeds of 6 Mb/s from my Ubuntu guest, and 90 Mb/s from the win7 host. Large downloads, e.g. the Java SDK, come down at about 1.2 MB/s on both the guest and the host. Pings are sub-1ms on the host, but 1.5ms on the guest. I also did a broadband speed test, and got 10Mb/s download speed on both, but the host has an upload speed of 10Mb/s but the guest only uploads at 3Mb/s. I've been trying to diagnose any MTU problems with ping -M do to identify any kind of packet fragmentation problem but it's progressing very slow because I don't have much experience in this area. From what I read on other people's networking issues with VB and Linux guests on Win7 hosts, I should be able to get the speed on the guest up to the same level as the host. I installed a fresh VM with Ubuntu again to see if I'd foobar'd it somehow, but I'm getting the same readings with iperf on the virgin installation. My setup is: Adapter 1: Intel PRO/1000 MT Desktop (NAT) Adapter 2: ditto (host-only adapter) eth0 Link encap:Ethernet HWaddr 08:00:27:0b:76:bf inet addr:10.0.2.15 Bcast:10.0.2.255 Mask:255.255.255.0 inet6 addr: fe80::a00:27ff:fe0b:76bf/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:86236 errors:0 dropped:0 overruns:0 frame:0 TX packets:49369 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:69163946 (69.1 MB) TX bytes:3530535 (3.5 MB) eth2 Link encap:Ethernet HWaddr 08:00:27:a3:26:b8 inet addr:192.168.56.101 Bcast:192.168.56.255 Mask:255.255.255.0 inet6 addr: fe80::a00:27ff:fea3:26b8/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:59 errors:0 dropped:0 overruns:0 frame:0 TX packets:57 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:9148 (9.1 KB) TX bytes:7648 (7.6 KB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:701 errors:0 dropped:0 overruns:0 frame:0 TX packets:701 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:66321 (66.3 KB) TX bytes:66321 (66.3 KB)

Read the article
Synchronize the same set of files to 2 different locations with 2 different programs for 2 different purposes

- by Hedgetrimmer

Because of stupid questionable IT policies at my not-to-be-named place of occupation, I have been (and will be, for the forseeable future) carrying on an external hard drive a unison-synchronized copy of all of my documents and code, including code which resides in some of my "dotfiles" and other code which resides in ~/bin (things I've made are there because ~/bin is in my $PATH) along with some cruft generated (and to be generated) by conscript and its related "giter8" templating system for Scala project boilerplates. Despite this, I do use a symlinking program to store all of my important dotfiles in a subdirectory. Thanks to that somewhat complicated setup, I have resorted to making a directory full of symlinks to every directory (or file, as is the case with stuff under ~/bin) that I want synchronized, and then follow = True is in my unison profile. It happens to be that this collection of odds and ends—plus an automatically-generated text file containing every package installed on my system—is everything under ~ that needs to be backed up to a remote (rsync-over-ssh) host with client-side encryption and signing from GPG. I already believe that duplicity is the most appropriate program to do that. What isn't as clear-cut is how to make duplicity use the exact same set of files when it runs a backup; it would be simple if duplicity would follow symlinks, but it does not and the manpage lists no option for enabling any such behavior. Comparing unison's file selection algorithm to duplicity's, I don't think I can write a program that could compute a ruleset for one program given one for the other. For the record, I would rather not keep the symlinks manually synchronized with duplicity file-selection rules, as they can change thanks to the above-mentioned complications regarding ~/bin. I don't think running duplicity on the external hard disk is such a good idea either; I usually keep that hard disk unmounted and unplugged in case of a power failure or other physical problem with the computer, plus I'm not sure about duplicity's performance given that: the hard disk is NTFS-formatted in order to be useable at my Windows-imprisoned place of occupation. despite being a USB 3.0 disk, my computer has no USB 3.0 ports so it acts as a USB 2.0 disk. How can I have duplicity (or is there a better program that I have overlooked?) back up the exact same set of files that is bidirectionally synchronized with my external hard disk?

Read the article
Fixing up Visual Studio’s gitignore , using IFix

- by terje

Originally posted on: http://geekswithblogs.net/terje/archive/2014/06/13/fixing-up-visual-studiorsquos-gitignore--using-ifix.aspxDownload tool Is there anything wrong with the built-in Visual Studio gitignore ???? Yes, there is ! First, some background: When you set up a git repo, it should be small and not contain anything not really needed. One thing you should not have in your git repo is binary files. These binary files may come from two sources, one is the output files, in the bin and obj folders. If you have a gitignore file present, which you should always have (!!), these folders are excluded by the standard included file (the one included when you choose Team Explorer/Settings/GitIgnore – Add.) The other source are the packages folder coming from your NuGet setup. You do use NuGet, right ? Of course you do ! But, that gitignore file doesn’t have any exclude clause for those folders. You have to add that manually. (It will very probably be included in some upcoming update or release). This is one thing that is missing from the built-in gitignore. To add those few lines is a no-brainer, you just include this: # NuGet Packages packages/* *.nupkg # Enable "build/" folder in the NuGet Packages folder since # NuGet packages use it for MSBuild targets. # This line needs to be after the ignore of the build folder # (and the packages folder if the line above has been uncommented) !packages/build/ Now, if you are like me, and you probably are, you add git repo’s faster than you can code, and you end up with a bunch of repo’s, and then start to wonder: Did I fix up those gitignore files, or did I forget it? The next thing you learn, for example by reading this blog post, is that the “standard” latest Visual Studio gitignore file exist at https://github.com/github/gitignore, and you locate it under the file name VisualStudio.gitignore. Here you will find all the new stuff, for example, the exclusion of the roslyn ide folders was commited on May 24th. So, you think, all is well, Visual Studio will use this file ….. I am very sorry, it won’t. Visual Studio comes with a gitignore file that is baked into the release, and that is by this time “very old”. The one at github is the latest. The included gitignore miss the exclusion of the nuget packages folder, it also miss a lot of new stuff, like the Roslyn stuff. So, how do you fix this ? … note .. while we wait for the next version… You can manually update it for every single repo you create, which works, but it does get boring after a few times, doesn’t it ? IFix Enter IFix , install it from here. IFix is a command line utility (and the installer adds it to the system path, you might need to reboot), and one of the commands is gitignore If you run it from a directory, it will check and optionally fix all gitignores in all git repo’s in that folder or below. So, start up by running it from your C:/<user>/source/repos folder. To run it in check mode – which will not change anything, just do a check: IFix gitignore --check What it will do is to check if the gitignore file is present, and if it is, check if the packages folder has been excluded. If you want to see those that are ok, add the --verbose command too. The result may look like this: Fixing missing packages Let us fix a single repo by adding the missing packages structure, using IFix --fix We first check, then fix, then check again to verify that the gitignore is correct, and that the “packages/” part has been added. If we open up the .gitignore, we see that the block shown below has been added to the end of the .gitignore file. Comparing and fixing with latest standard Visual Studio gitignore (from github) Now, this tells you if you miss the nuget packages folder, but what about the latest gitignore from github ? You can check for this too, just add the option –merge (why this is named so will be clear later down) So, IFix gitignore --check –merge The result may come out like this (sorry no colors, not got that far yet here): As you can see, one repo has the latest gitignore (test1), the others are missing either 57 or 150 lines. IFix has three ways to fix this: --add --merge --replace The options work as follows: Add: Used to add standard gitignore in the cases where a .gitignore file is missing, and only that, that means it won’t touch other existing gitignores. Merge: Used to merge in the missing lines from the standard into the gitignore file. If gitignore file is missing, the whole standard will be added. Replace: Used to force a complete replacement of the existing gitignore with the standard one. The Add and Replace options can be used without Fix, which means they will actually do the action. If you combine with --check it will otherwise not touch any files, just do a verification. So a Merge Check will tell you if there is any difference between the local gitignore and the standard gitignore, a Compare in effect. When you do a Fix Merge it will combine the local gitignore with the standard, and add what is missing to the end of the local gitignore. It may mean some things may be doubled up if they are spelled a bit differently. You might also see some extra comments added, but they do no harm. Init new repo with standard gitignore One cool thing is that with a new repo, or a repo that is missing its gitignore, you can grab the latest standard just by using either the Add or the Replace command, both will in effect do the same in this case. So, IFix gitignore --add will add it in, as in the complete example below, where we set up a new git repo and add in the latest standard gitignore: Notes The project is open sourced at github, and you can also report issues there.

Read the article
Oracle Open World 2012: SQL Developer Recap

- by thatjeffsmith

Last week was the ‘big show’ in San Francisco. I was very happy to meet many of you in person. And many of you had questions – lots of questions! We had full or overflowing rooms for our sessions and hands-on-labs. The SQL Developer ‘booths’ were also slammed several times. So exciting to see so many of YOU excited about SQL Developer. It’s very cool to hear the stories of our tools saving you and your organizations so much time (and money!) Instead of doing a Day 0 – Day 9 recap, I thought I’d share with you the questions that I heard more than once. And just for giggles, I’ll throw in some answers as well So in no particular order… What’s the difference between Oracle SQL Developer & Oracle SQL Developer Data Modeler? Mathematically speaking – two words. But as far as the actual modeling features go, there’s no difference between the two applications. The same ‘code’ or features as it pertains to data modeling and design are in both tools. However, in SQL Developer you have all of the OTHER features fighting for real estate in the UI. So I have a general rule of thumb – if you spend MOST of your time in the database, use SQL Developer. And if you spend most of your time in the data model, run the separate and dedicated program, Oracle SQL Developer Data Modeler. Here’s a couple of screenshots to drive home the UI point: Oracle SQL Developer Oracle SQL Developer Data Modeler running INSIDE of SQL Developer. Notice how the Modeler menu items fold under the file menu? Oracle SQL Developer Data Modeler Easier to navigate and manipulate your models with the stand alone modeler. Just no worksheet to run your ad-hoc queries, etc. Don’t forget you can disable the Data Modeler inside of SQL Developer via the Extensions preference page. How can I model my table partitions? Partitioning is defined via the Physical model. So after you have finished your relational model, you need to generate a physical model. Oracle SQL Developer Data Modeler Physical Model and Partitioning Open the properties for your physical model table. Enable the ‘partitioned’ property. Once you do so, the ‘Partitioning’ page will activate. Lots and lots of partitioning support and options here But what about Interval Partitioning? An extension of range partitioning in 11gR2, we don’t currently support this partitioning scheme in SQL Developer. But we’re working on it! Can SQL Developer ignore column order when comparing models? Yes! After you start a model compare, one of your options is to disregard the order of an attribute or column definition. Tell SQL Developer you don’t care when your column shows up, just as long as it DOES show up. Wow, you got a lot of questions around modeling! Is that normal? Yes! While we appreciate that many folks inherit their applications and associated designs, new applications are being ‘born’ every day. Since both of our tools are free for anyone to design their new Oracle applications with, we attract a fair amount of attention I want to do a Hands On Lab. How do I get your software and instructional guides? Go here. Download VirtualBox. Then download the VB image. Import the appliance. Start it. Connect oracle/oracle on the OEL VM. Click on ‘Start Here’ in the desktop. Follow the instructions. If you need help, ask away! You went too fast in your Tips & Tricks session. Do you have cliff notes? Yes! And you’re SO close to finding them! Just go to my SQL Developer resources page. All of my tips are documented on this blog somewhere. I’ve indexed the most popular ones on the resource page. You can use the Search dialog on the right to find the rest. Or just send me a comment or question, and I’ll do my best to answer them as they come in.

Read the article
How to build Open JavaFX for Android.

- by PictureCo

Here's a short recipe for baking JavaFX for Android dalvik. We will need just a few ingredients but each one requires special care. So let's get down to the business. SourcesThe first ingredient is an open JavaFX repository. This should be piece of cake. As always there's a catch. You probably know that dalvik is jdk6 compatible and also that certain APIs are missing comparing to good old java vm from Oracle. Fortunately there is a repository which is a backport of regular OpenJFX to jdk7 and going from jdk7 to jdk6 is possible. The first thing to do is to clone or download the repository from https://bitbucket.org/narya/jfx78. Main page of the project says "It works in some cases" so we will presume that it will work in most cases As I've said dalvik vm misses some APIs which would lead to a build failures. To get them use another compatibility repository which is available on GitHub https://github.com/robovm/robovm-jfx78-compat. Download the zip and unzip sources into jfx78/modules/base.We need also a javafx binary stubs. Use jfxrt.jar from jdk8.The last thing to download are freetype sources from http://freetype.org. These will be necessary for native font rendering. Toolchain setup I have to point out that these instructions were tested only on linux. I suppose they will work with minimal changes also on Mac OS. I also presume that you were able to build open JavaFX. That means all tools like ant, gradle, gcc and jdk8 have been installed and are working all right. In addition to this you will need to download and install jdk7, Android SDK and Android NDK for native code compilation. Installing all of them will take some time. Don't forget to put them in your path. export ANDROID_SDK=/opt/android-sdk-linux export ANDROID_NDK=/opt/android-ndk-r9b export JAVA_HOME=/opt/jdk1.7.0 export PATH=$JAVA_HOME/bin:$ANDROID_SDK/tools:$ANDROID_SDK/platform-tools:$ANDROID_NDK FreetypeUnzip freetype release sources first. We will have to cross compile them for arm. Firstly we will create a standalone toolchain for cross compiling installed in ~/work/ndk-standalone-19. $ANDROID_NDK/build/tools/make-standalone-toolchain.sh --platform=android-19 --install-dir=~/work/ndk-standalone-19 After the standalone toolchain has been created cross compile freetype with following script: export TOOLCHAIN=~/work/freetype/ndk-standalone-19 export PATH=$TOOLCHAIN/bin:$PATH export FREETYPE=`pwd` ./configure --host=arm-linux-androideabi --prefix=$FREETYPE/install --without-png --without-zlib --enable-shared sed -i 's/\-version\-info \$(version_info)/-avoid-version/' builds/unix/unix-cc.mk make make install It will compile and install freetype library into $FREETYPE/install. We will link to this install dir later on. It would be possible also to link openjfx font support dynamically against skia library available on Android which already contains freetype. It creates smaller result but can have compatibility problems. Patching Download patches javafx-android-compat.patch + android-tools.patch and patch jfx78 repository. I recommend to have look at patches. First one android-compat.patch updates openjfx build script, removes dependency on SharedSecret classes and updates LensLogger to remove dependency on jdk specific PlatformLogger. Second one android-tools.patch creates helper script in android-tools. The script helps to setup javaFX Android projects. Building Now is time to try the build. Run following script: JAVA_HOME=/opt/jdk1.7.0 JDK_HOME=/opt/jdk1.7.0 ANDROID_SDK=/opt/android-sdk-linux ANDROID_NDK=/opt/android-ndk-r9b PATH=$JAVA_HOME/bin:$ANDROID_SDK/tools:$ANDROID_SDK/platform-tools:$ANDROID_NDK:$PATH gradle -PDEBUG -PDALVIK_VM=true -PBINARY_STUB=~/work/binary_stub/linux/rt/lib/ext/jfxrt.jar \ -PFREETYPE_DIR=~/work/freetype/install -PCOMPILE_TARGETS=android If everything went all right the output is in build/android-sdk Create first JavaFX Android project Use gradle script int android-tools. The script sets the project structure for you. Following command creates Android HelloWorld project which links to a freshly built javafx runtime and to a HelloWorld application. NAME is a name of Android project. DIR where to create our first project. PACKAGE is package name required by Android. It has nothing to do with a packaging of javafx application. JFX_SDK points to our recently built runtime. JFX_APP points to dist directory of javafx application. (where all application jars sit) JFX_MAIN is fully qualified name of a main class. gradle -PDEBUG -PDIR=/home/user/work -PNAME=HelloWorld -PPACKAGE=com.helloworld \ -PJFX_SDK=/home/user/work/jfx78/build/android-sdk -PJFX_APP=/home/user/NetBeansProjects/HelloWorld/dist \ -PJFX_MAIN=com.helloworld.HelloWorld createProject Now cd to the created project and use it like any other android project. ant clean, debug, uninstall, installd will work. I haven't tried it from any IDE Eclipse nor Netbeans. Special thanks to Stefan Fuchs and Daniel Zwolenski for the repositories used in this blog post.

Read the article
Connection Pooling is Busted

- by MightyZot

A few weeks ago we started getting complaints about performance in an application that has performed very well for many years. The application is a n-tier application that uses ADODB with the SQLOLEDB provider to talk to a SQL Server database. Our object model is written in such a way that each public method validates security before performing requested actions, so there is a significant number of queries executed to get information about file cabinets, retrieve images, create workflows, etc. (PaperWise is a document management and workflow system.) A common factor for these customers is that they have remote offices connected via MPLS networks. Naturally, the first thing we looked at was the query performance in SQL Profiler. All of the queries were executing within expected timeframes, most of them were so fast that the duration in SQL Profiler was zero. After getting nowhere with SQL Profiler, the situation was escalated to me. I decided to take a peek with Process Monitor. Procmon revealed some “gaps” in the TCP/IP traffic. There were notable delays between send and receive pairs. The send and receive pairs themselves were quite snappy, but quite often there was a notable delay between a receive and the next send. You might expect some delay because, presumably, the application is doing some thinking in-between the pairs. But, comparing the procmon data at the remote locations with the procmon data for workstations on the local network showed that the remote workstations were significantly delayed. Procmon also showed a high number of disconnects. Wireshark traces showed that connections to the database were taking between 75ms and 150ms. Not only that, but connections to a file share containing images were taking 2 seconds! So, I asked about a trust. Sure enough there was a trust between two domains and the file share was on the second domain. Joining a remote workstation to the domain hosting the share containing images alleviated the time delay in accessing the file share. Removing the trust had no affect on the connections to the database. Microsoft Network Monitor includes filters that parse TDS packets. TDS is the protocol that SQL Server uses to communicate. There is a certificate exchange and some SSL that occurs during authentication. All of this was evident in the network traffic. After staring at the network traffic for a while, and examining packets, I decided to call it a night. On the way home that night, something about the traffic kept nagging at me. Then it dawned on me…at the beginning of the dance of packets between the client and the server all was well. Connection pooling was working and I could see multiple queries getting executed on the same connection and ethereal port. After a particular query, connecting to two different servers, I noticed that ADODB and SQLOLEDB started making repeated connections to the database on different ethereal ports. SQL Server would execute a single query and respond on a port, then open a new port and execute the next query. Connection pooling appeared to be broken. The next morning I wrote a test to confirm my hypothesis. Turns out that the sequence causing the connection nastiness goes something like this: Make a connection to the database. Open a result set that returns enough records to require multiple roundtrips to the server. For each result, query for some other data in the database (this will open a new implicit connection.) Close the inner result set and repeat for every item in the original result set. Close the original connection. Provided that the first result set returns enough data to require multiple roundtrips to the server, ADODB and SQLOLEDB will start making new connections to the database for each query executed in the loop. Originally, I thought this might be due to Microsoft’s denial of service (ddos) attack protection. After turning those features off to no avail, I eventually thought to switch my queries to client-side cursors instead of server-side cursors. Server-side cursors are the default, by the way. Voila! After switching to client-side cursors, the disconnects were gone and the above sequence yielded two connections as expected. While the real problem is the amount of time it takes to make connections over these MPLS networks (100ms on average), switching to client-side cursors made the problem go away. Believe it or not, this is actually documented by Microsoft, and rather difficult to find. (At least it was while we were trying to troubleshoot the problem!) So, if you’re noticing performance issues on slower networks, or networks with slower switching, take a look at the traffic in a tool like Microsoft Network Monitor. If you notice a high number of disconnects, and you’re using fire-hose or server-side cursors, then try switching to client-side cursors and you may see the problem go away. Most likely, Microsoft believes this to be appropriate behavior, because ADODB can’t guarantee that all of the data has been retrieved when you execute the inner queries. I’m not convinced, though, because the problem remains even after replacing all of the implicit connections with explicit connections and closing those connections in-between each of the inner queries. In that case, there doesn’t seem to be a reason why ADODB can’t use a single connection from the connection pool to make the additional queries, bringing the total number of connections to two. Instead ADO appears to make an assumption about the state of the connection. I’ve reported the behavior to Microsoft and am awaiting to hear from the appropriate team, so that I can demonstrate the problem. Maybe they can explain to us why this is appropriate behavior. :)

Read the article
Back Up to Tape the Way You Shop For Groceries

- by rickramsey

Imagine if this was how you shopped for groceries: From the end of the aisle sprint to the point where you reach the ketchup. Pull a bottle from the shelf and yell at the top of your lungs, “Got it!” Sprint back to the end of the aisle. Start again and sprint down the same aisle to the mustard, pull a bottle from the shelf and again yell for the whole store to hear, “Got it!” Sprint back to the end of the aisle. Repeat this procedure for every item you need in the aisle. Proceed to the next aisle and follow the same steps for the list of items you need from that aisle. Sounds ridiculous, doesn’t it? Not only is it horribly inefficient, it’s exhausting and can lead to wear out failures on your grocery cart, or worse, yourself. This is essentially how NetApp and some other applications write NDMP backups to tape. In the analogy, the ketchup and mustard are the files to be written, yelling “Got it!” is the equivalent of a sync mark at the end of a file, and the sprint back to the end of an aisle is the process most commonly called a “backhitch” where the drive has to back up on a tape to start writing again. Writing to tape in this way results in very slow tape drive performance and imposes unnecessary wear on the tape drive and the media, especially when writing small files. The good news is not all tape drives behave this way when writing small files. Unlike midrange LTO drives, Oracle’s StorageTek T10000D tape drive is designed to handle this scenario efficiently. The difference between the two drive types is that the T10000D drive gives you the ability to write files in a NetApp NDMP backup environment the way you would normally shop for groceries. With grocery shopping, you essentially stream through aisles picking up items as you go, and then after checking out, yell, “Got it!”, though you might do that last step silently. With the T10000D, it has a feature called the Tape Application Accelerator, which prevents the drive from having to stop after each file is written to notify NetApp or another application that the write was successful. When enabled in the T10000D tape drive, Tape Application Accelerator causes the tape drive to respond to tape mark and file sync commands differently than when disabled: A tape mark received by the tape drive is treated as a buffered tape mark. A file sync received by the tape drive is treated as a no op command. Since buffered tape marks and no op commands do not cause the tape drive to empty the contents of its buffer to tape and backhitch, the data is written to tape in significantly less time. Oracle has emulated NetApp environments with a number of different file sizes and found the following when comparing the T10000D with the Tape Application Accelerator enabled versus LTO6 tape drives. Notice how the T10000D is not only monumentally faster, but also remarkably consistent? In addition, the writing of the 50 GB of files is done without a single backhitch. The LTO6 drive, meanwhile, will perform as many as 3,800 backhitches! At the end of writing the entire set of files, the T10000D tape drive reports back to the application, in this case NetApp, that the write was successful via a tape mark. So if the Tape Application Accelerator dramatically improves performance and reliability, why wouldn’t you always have it enabled? The reason is because tape drive buffers are meant to be just temporary data repositories so in the event of a power loss, there could be data loss in certain environments for the files that resided in the buffer. Fortunately, we do have best practices depending on your environment to avoid this from happening. I highly recommend reading Maximizing Tape Performance with StorageTek T10000 Tape Drives (pdf) to decide which best practice is right for you. The white paper also digs deeper into the benefits of the Tape Application Accelerator. The white paper is free, and after downloading it you can decide for yourself whether you want to yell “Got it!” out loud or just silently to yourself. Customer Advisory Panel One final link: Oracle has started up a Customer Advisory Panel program to collect feedback from customers on their current experiences with Oracle products, as well as desires for future product development. If you would like to participate in the program, go to this link at oracle.com. photo taken on Idaho's Sacajewea Historic Biway by Rick Ramsey - Brian Zents Follow OTN on Blog | Facebook | Twitter | YouTube

Read the article
SQL SERVER – Fundamentals of Columnstore Index

- by pinaldave

There are two kind of storage in database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data is relevant or not, column store queries need only to search much lesser number of the columns. This means major increases in search speed and hard drive use. Additionally, the column store indexes are heavily compressed, which translates to even greater memory and faster searches. I am sure this looks very exciting and it does not mean that you convert every single index from row store to column store index. One has to understand the proper places where to use row store or column store indexes. Let us understand in this article what is the difference in Columnstore type of index. Column store indexes are run by Microsoft’s VertiPaq technology. However, all you really need to know is that this method of storing data is columns on a single page is much faster and more efficient. Creating a column store index is very easy, and you don’t have to learn new syntax to create them. You just need to specify the keyword “COLUMNSTORE” and enter the data as you normally would. Keep in mind that once you add a column store to a table, though, you cannot delete, insert or update the data – it is READ ONLY. However, since column store will be mainly used for data warehousing, this should not be a big problem. You can always use partitioning to avoid rebuilding the index. A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. The difference between column store and row store approaches is illustrated below: In case of the row store indexes multiple pages will contain multiple rows of the columns spanning across multiple pages. In case of column store indexes multiple pages will contain multiple single columns. This will lead only the columns needed to solve a query will be fetched from disk. Additionally there is good chance that there will be redundant data in a single column which will further help to compress the data, this will have positive effect on buffer hit rate as most of the data will be in memory and due to same it will not need to be retrieved. Let us see small example of how columnstore index improves the performance of the query on a large table. As a first step let us create databaseset which is large enough to show performance impact of columnstore index. The time taken to create sample database may vary on different computer based on the resources. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 Now let us do quick performance test. I have kept STATISTICS IO ON for measuring how much IO following queries take. In my test first I will run query which will use regular index. We will note the IO usage of the query. After that we will create columnstore index and will measure the IO of the same. -- Performance Test -- Comparing Regular Index with ColumnStore Index USE AdventureWorks GO SET STATISTICS IO ON GO -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO -- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0. -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO -- Select Table with Columnstore Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO It is very clear from the results that query is performance extremely fast after creating ColumnStore Index. The amount of the pages it has to read to run query is drastically reduced as the column which are needed in the query are stored in the same page and query does not have to go through every single page to read those columns. If we enable execution plan and compare we can see that column store index performance way better than regular index in this case. Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO In future posts we will see cases where Columnstore index is not appropriate solution as well few other tricks and tips of the columnstore index. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Analysing and measuring the performance of a .NET application (survey results)

- by Laila

Back in December last year, I asked myself: could it be that .NET developers think that you need three days and a PhD to do performance profiling on their code? What if developers are shunning profilers because they perceive them as too complex to use? If so, then what method do they use to measure and analyse the performance of their .NET applications? Do they even care about performance? So, a few weeks ago, I decided to get a 1-minute survey up and running in the hopes that some good, hard data would clear the matter up once and for all. I posted the survey on Simple Talk and got help from a few people to promote it. The survey consisted of 3 simple questions: Amazingly, 533 developers took the time to respond - which means I had enough data to get representative results! So before I go any further, I would like to thank all of you who contributed, because I now have some pretty good answers to the troubling questions I was asking myself. To thank you properly, I thought I would share some of the results with you. First of all, application performance is indeed important to most of you. In fact, performance is an intrinsic part of the development cycle for a good 40% of you, which is much higher than I had anticipated, I have to admit. (I know, "Have a little faith Laila!") When asked what tool you use to measure and analyse application performance, I found that nearly half of the respondents use logging statements, a third use performance counters, and 70% of respondents use a profiler of some sort (a 3rd party performance profilers, the CLR profiler or the Visual Studio profiler). The importance attributed to logging statements did surprise me a little. I am still not sure why somebody would go to the trouble of manually instrumenting code in order to measure its performance, instead of just using a profiler. I personally find the process of annotating code, calculating times from log files, and relating it all back to your source terrifyingly laborious. Not to mention that you then need to remember to turn it all off later! Even when you have logging in place throughout all your code anyway, you still have a fair amount of potentially error-prone calculation to sift through the results; in addition, you'll only get method-level rather than line-level timings, and you won't get timings from any framework or library methods you don't have source for. To top it all, we all know that bottlenecks are rarely where you would expect them to be, so you could be wasting time looking for a performance problem in the wrong place. On the other hand, profilers do all the work for you: they automatically collect the CPU and wall-clock timings, and present the results from method timing all the way down to individual lines of code. Maybe I'm missing a trick. I would love to know about the types of scenarios where you actively prefer to use logging statements. Finally, while a third of the respondents didn't have a strong opinion about code performance profilers, those who had an opinion thought that they were mainly complex to use and time consuming. Three respondents in particular summarised this perfectly: "sometimes, they are rather complex to use, adding an additional time-sink to the process of trying to resolve the existing problem". "they are simple to use, but the results are hard to understand" "Complex to find the more advanced things, easy to find some low hanging fruit". These results confirmed my suspicions: Profilers are seen to be designed for more advanced users who can use them effectively and make sense of the results. I found yet more interesting information when I started comparing samples of "developers for whom performance is an important part of the dev cycle", with those "to whom performance is only looked at in times of crisis", and "developers to whom performance is not important, as long as the app works". See the three graphs below. Sample of developers to whom performance is an important part of the dev cycle: Sample of developers to whom performance is important only in times of crisis: Sample of developers to whom performance is not important, as long as the app works: As you can see, there is a strong correlation between the usage of a profiler and the importance attributed to performance: indeed, the more important performance is to a development team, the more likely they are to use a profiler. In addition, developers to whom performance is an important part of the dev cycle have a higher tendency to use a much wider range of methods for performance measurement and analysis. And, unsurprisingly, the less important performance is, the less varied the methods of measurement are. So all in all, to come back to my random questions: .NET developers do care about performance. Those who care the most use a wider range of performance measurement methods than those who care less. But overall, logging statements, performance counters and third party performance profilers are the performance measurement methods of choice for most developers. Finally, although most of you find code profilers complex to use, those of you who care the most about performance tend to use profilers more than those of you to whom performance is not so important.

Read the article
CLR via C# 3rd Edition is out

- by Abhijeet Patel

Time for some book news update. CLR via C#, 3rd Edition seems to have been out for a little while now. The book was released in early Feb this year, and needless to say my copy is on it’s way. I can barely wait to dig in and chew on the goodies that one of the best technical authors and software professionals I respect has in store. The 2nd edition of the book was an absolute treat and this edition promises to be no less. Here is a brief description of what’s new and updated from the 2nd edition. Part I – CLR Basics Chapter 1-The CLR’s Execution Model Added about discussion about C#’s /optimize and /debug switches and how they relate to each other. Chapter 2-Building, Packaging, Deploying, and Administering Applications and Types Improved discussion about Win32 manifest information and version resource information. Chapter 3-Shared Assemblies and Strongly Named Assemblies Added discussion of TypeForwardedToAttribute and TypeForwardedFromAttribute. Part II – Designing Types Chapter 4-Type Fundamentals No new topics. Chapter 5-Primitive, Reference, and Value Types Enhanced discussion of checked and unchecked code and added discussion of new BigInteger type. Also added discussion of C# 4.0’s dynamic primitive type. Chapter 6-Type and Member Basics No new topics. Chapter 7-Constants and Fields No new topics. Chapter 8-Methods Added discussion of extension methods and partial methods. Chapter 9-Parameters Added discussion of optional/named parameters and implicitly-typed local variables. Chapter 10-Properties Added discussion of automatically-implemented properties, properties and the Visual Studio debugger, object and collection initializers, anonymous types, the System.Tuple type and the ExpandoObject type. Chapter 11-Events Added discussion of events and thread-safety as well as showing a cool extension method to simplify the raising of an event. Chapter 12-Generics Added discussion of delegate and interface generic type argument variance. Chapter 13-Interfaces No new topics. Part III – Essential Types Chapter 14-Chars, Strings, and Working with Text No new topics. Chapter 15-Enums Added coverage of new Enum and Type methods to access enumerated type instances. Chapter 16-Arrays Added new section on initializing array elements. Chapter 17-Delegates Added discussion of using generic delegates to avoid defining new delegate types. Also added discussion of lambda expressions. Chapter 18-Attributes No new topics. Chapter 19-Nullable Value Types Added discussion on performance. Part IV – CLR Facilities Chapter 20-Exception Handling and State Management This chapter has been completely rewritten. It is now about exception handling and state management. It includes discussions of code contracts and constrained execution regions (CERs). It also includes a new section on trade-offs between writing productive code and reliable code. Chapter 21-Automatic Memory Management Added discussion of C#’s fixed state and how it works to pin objects in the heap. Rewrote the code for weak delegates so you can use them with any class that exposes an event (the class doesn’t have to support weak delegates itself). Added discussion on the new ConditionalWeakTable class, GC Collection modes, Full GC notifications, garbage collection modes and latency modes. I also include a new sample showing how your application can receive notifications whenever Generation 0 or 2 collections occur. Chapter 22-CLR Hosting and AppDomains Added discussion of side-by-side support allowing multiple CLRs to be loaded in a single process. Added section on the performance of using MarshalByRefObject-derived types. Substantially rewrote the section on cross-AppDomain communication. Added section on AppDomain Monitoring and first chance exception notifications. Updated the section on the AppDomainManager class. Chapter 23-Assembly Loading and Reflection Added section on how to deploy a single file with dependent assemblies embedded inside it. Added section comparing reflection invoke vs bind/invoke vs bind/create delegate/invoke vs C#’s dynamic type. Chapter 24-Runtime Serialization This is a whole new chapter that was not in the 2nd Edition. Part V – Threading Chapter 25-Threading Basics Whole new chapter motivating why Windows supports threads, thread overhead, CPU trends, NUMA Architectures, the relationship between CLR threads and Windows threads, the Thread class, reasons to use threads, thread scheduling and priorities, foreground thread vs background threads. Chapter 26-Performing Compute-Bound Asynchronous Operations Whole new chapter explaining the CLR’s thread pool. This chapter covers all the new .NET 4.0 constructs including cooperative cancelation, Tasks, the aralle class, parallel language integrated query, timers, how the thread pool manages its threads, cache lines and false sharing. Chapter 27-Performing I/O-Bound Asynchronous Operations Whole new chapter explaining how Windows performs synchronous and asynchronous I/O operations. Then, I go into the CLR’s Asynchronous Programming Model, my AsyncEnumerator class, the APM and exceptions, Applications and their threading models, implementing a service asynchronously, the APM and Compute-bound operations, APM considerations, I/O request priorities, converting the APM to a Task, the event-based Asynchronous Pattern, programming model soup. Chapter 28-Primitive Thread Synchronization Constructs Whole new chapter discusses class libraries and thread safety, primitive user-mode, kernel-mode constructs, and data alignment. Chapter 29-Hybrid Thread Synchronization Constructs Whole new chapter discussion various hybrid constructs such as ManualResetEventSlim, SemaphoreSlim, CountdownEvent, Barrier, ReaderWriterLock(Slim), OneManyResourceLock, Monitor, 3 ways to solve the double-check locking technique, .NET 4.0’s Lazy and LazyInitializer classes, the condition variable pattern, .NET 4.0’s concurrent collection classes, the ReaderWriterGate and SyncGate classes.

Read the article
Working with Reporting Services Filters–Part 1

- by smisner

There are two ways that you can filter data in Reporting Services. The first way, which usually provides a faster performance, is to use query parameters to apply a filter using the WHERE clause in a SQL statement. In that case, the structure of the filter depends upon the syntax recognized by the source database. Another way to filter data in Reporting Services is to apply a filter to a dataset, data region, or a group. Using this latter method, you can even apply multiple filters. However, the use of filter operators or the setup of multiple filters is not always obvious, so in this series of posts, I'll provide some more information about the configuration of filters. First, why not use query parameters exclusively for filtering? Here are a few reasons: You might want to apply a filter to part of the report, but not all of the report. Your dataset might retrieve data from a stored procedure, and doesn't allow you to pass a query parameter for filtering purposes. Your report might be set up as a snapshot on the report server and, in that case, cannot be dynamically filtered based on a query parameter. Next, let's look at how to set up a report filter in general. The process is the same whether you are applying the filter to a dataset, data region, or a group. When you go to the Filters page in the Properties dialog box for whichever of these items you selected (dataset, data region, group), you click the Add button to create a new filter. The interface looks like this: The Expression field is usually a field in the dataset, so to make it easier for you to make a selection,the drop-down list displays all of the current dataset fields. But notice the expression button to the right, which means that you can set up any type of expression-not just a dataset field. To the right of the expression button, you'll find a data type drop-down list. It's important to specify the correct data type for the field or expression you're using. Now for the operators. Here's a list of the options that you have: This Operator Performs This Action =, <>, >, >=, <, <=, Like Compares expression to value Top N, Bottom N Compares expression to Top (Bottom) set of N values (N = integer) Top %, Bottom % Compares expression to Top (Bottom) N percent of values (N = integer or float) Between Determines whether expression is between two values, inclusive In Determines whether expression is found in list of values Last, the Value is what you're comparing to the expression using the operator. The construction of a filter using some operators (=, <>, >, etc.) is fairly simple. If my dataset (for AdventureWorks data) has a Category field, and I have a parameter that prompts the user for a single category, I can set up a filter like this: Expression Data Type Operator Value [Category] Text = [@Category] But if I set the parameter to accept multiple values, I need to change the operator from = to In, just as I would have to do if I were using a query parameter. The parameter expression, [@Category], which translates to =Parameters!Category.Value, doesn’t need to change because it represents an array as soon as I change the parameter to allow multiple values. The “In” operator requires an array. With that in mind, let’s consider a variation on Value. Let’s say that I have a parameter that prompts the user for a particular year – and for simplicity’s sake, this parameter only allows a single value, and I have an expression that evaluates the previous year based on the user’s selection. Then I want to use these two values in two separate filters with an OR condition. That is, I want to filter either by the year selected OR by the year that was computed. If I create two filters, one for each year (as shown below), then the report will only display results if BOTH filter conditions are met – which would never be true. Expression Data Type Operator Value [CalendarYear] Integer = [@Year] [CalendarYear] Integer = =Parameters!Year.Value-1 To handle this scenario, we need to create a single filter that uses the “In” operator, and then set up the Value expression as an array. To create an array, we use the Split function after creating a string that concatenates the two values (highlighted in yellow) as shown below. Expression Data Type Operator Value =Cstr(Fields!CalendarYear.Value) Text In =Split( CStr(Parameters!Year.Value) + ”,” + CStr(Parameters!Year.Value-1) , “,”) Note that in this case, I had to apply a string conversion on the year integer so that I could concatenate the parameter selection with the calculated year. Pay attention to the second argument of the Split function—you must use a comma delimiter for the result to work correctly with the In operator. I also had to change the Expression value from [CalendarYear] (or =Fields!CalendarYear.Value) so that the expression would return a string that I could compare with the values in the string array. More fun with filter expressions in future posts!

Read the article
Remote Debug Windows Azure Cloud Service

- by Shaun

Originally posted on: http://geekswithblogs.net/shaunxu/archive/2013/11/02/remote-debug-windows-azure-cloud-service.aspxOn the 22nd of October Microsoft Announced the new Windows Azure SDK 2.2. It introduced a lot of cool features but one of it shocked most, which is the remote debug support for Windows Azure Cloud Service (a.k.a. WACS). Live Debug is Nightmare for Cloud Application When we are developing against public cloud, debug might be the most difficult task, especially after the application had been deployed. In order to minimize the debug effort, Microsoft provided local emulator for cloud service and storage once the Windows Azure platform was announced. By using local emulator developers could be able run their application on local machine with almost the same behavior as running on Windows Azure, and that could be debug easily and quickly. But when we deployed our application to Azure, we have to use log, diagnostic monitor to debug, which is very low efficient. Visual Studio 2012 introduced a new feature named "anonymous remote debug" which allows any workstation under any user could be able to attach the remote process. This is less secure comparing the authenticated remote debug but much easier and simpler to use. Now in Windows Azure SDK 2.2, we could be able to attach our application from our local machine to Windows Azure, and it's very easy. How to Use Remote Debugger First, let's create a new Windows Azure Cloud Project in Visual Studio and selected ASP.NET Web Role. Then create an ASP.NET WebForm application. Then right click on the cloud project and select "publish". In the publish dialog we need to make sure the application will be built in debug mode, since .NET assembly cannot be debugged in release mode. I enabled Remote Desktop as I will log into the virtual machine later in this post. It's NOT necessary for remote debug. And selected "advanced settings" tab, make sure we checked "Enable Remote Debugger for all roles". In WACS, a cloud service could be able to have one or more roles and each role could be able to have one or more instances. The remote debugger will be enabled for all roles and all instances if we checked. Currently there's no way for us to specify which role(s) and which instance(s) to enable. Finally click "publish" button. In the windows azure activity window in Visual Studio we can find some information about remote debugger. To attache remote process would be easy. Open the "server explorer" window in Visual Studio and expand "cloud services" node, find the cloud service, role and instance we had just published and wanted to debug, right click on the instance and select "attach debugger". Then after a while (it's based on how fast our Internet connect to Windows Azure Data Center) the Visual Studio will be switched to debug mode. Let's add a breakpoint in the default web page's form load function and refresh the page in browser to see what's happen. We can see that the our application was stopped at the breakpoint. The call stack, watch features are all available to use. Now let's hit F5 to continue the step, then back to the browser we will find the page was rendered successfully. What Under the Hood Remote debugger is a WACS plugin. When we checked the "enable remote debugger" in the publish dialog, Visual Studio will add two cloud configuration settings in the CSCFG file. Since they were appended when deployment, we cannot find in our project's CSCFG file. But if we opened the publish package we could find as below. At the same time, Visual Studio will generate a certificate and included into the package for remote debugger. If we went to the azure management portal we will find there will a certificate under our application which was created, uploaded by remote debugger plugin. Since I enabled Remote Desktop there will be two certificates in the screenshot below. The other one is for remote debugger. When our application was deployed, windows azure system will open related ports for remote debugger. As below you can see there are two new ports opened on my application. Finally, in our WACS virtual machine, windows azure system will copy the remote debug component based on which version of Visual Studio we are using and start. Our application then can be debugged remotely through the visual studio remote debugger. Below is the task manager on the virtual machine of my WACS application. Summary In this post I demonstrated one of the feature introduced in Windows Azure SDK 2.2, which is Remote Debugger. It allows us to attach our application from local machine to windows azure virtual machine once it had been deployed. Remote debugger is powerful and easy to use, but it brings more security risk. And since it's only available for debug build this means the performance will be worse than release build. Hence we should only use this feature for staging test and bug fix (publish our beta version to azure staging slot), rather than for production. Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

Read the article
Getting started with Oracle Database In-Memory Part III - Querying The IM Column Store

- by Maria Colgan

In my previous blog posts, I described how to install, enable, and populate the In-Memory column store (IM column store). This weeks post focuses on how data is accessed within the IM column store. Let’s take a simple query “What is the most expensive air-mail order we have received to date?” SELECT Max(lo_ordtotalprice) most_expensive_order FROM lineorderWHERE lo_shipmode = 5; The LINEORDER table has been populated into the IM column store and since we have no alternative access paths (indexes or views) the execution plan for this query is a full table scan of the LINEORDER table. You will notice that the execution plan has a new set of keywords “IN MEMORY" in the access method description in the Operation column. These keywords indicate that the LINEORDER table has been marked for INMEMORY and we may use the IM column store in this query. What do I mean by “may use”? There are a small number of cases were we won’t use the IM column store even though the object has been marked INMEMORY. This is similar to how the keyword STORAGE is used on Exadata environments. You can confirm that the IM column store was actually used by examining the session level statistics, but more on that later. For now let's focus on how the data is accessed in the IM column store and why it’s faster to access the data in the new column format, for analytical queries, rather than the buffer cache. There are four main reasons why accessing the data in the IM column store is more efficient. 1. Access only the column data needed The IM column store only has to scan two columns – lo_shipmode and lo_ordtotalprice – to execute this query while the traditional row store or buffer cache has to scan all of the columns in each row of the LINEORDER table until it reaches both the lo_shipmode and the lo_ordtotalprice column. 2. Scan and filter data in it's compressed format When data is populated into the IM column it is automatically compressed using a new set of compression algorithms that allow WHERE clause predicates to be applied against the compressed formats. This means the volume of data scanned in the IM column store for our query will be far less than the same query in the buffer cache where it will scan the data in its uncompressed form, which could be 20X larger. 3. Prune out any unnecessary data within each column The fastest read you can execute is the read you don’t do. In the IM column store a further reduction in the amount of data accessed is possible due to the In-Memory Storage Indexes(IM storage indexes) that are automatically created and maintained on each of the columns in the IM column store. IM storage indexes allow data pruning to occur based on the filter predicates supplied in a SQL statement. An IM storage index keeps track of minimum and maximum values for each column in each of the In-Memory Compression Unit (IMCU). In our query the WHERE clause predicate is on the lo_shipmode column. The IM storage index on the lo_shipdate column is examined to determine if our specified column value 5 exist in any IMCU by comparing the value 5 to the minimum and maximum values maintained in the Storage Index. If the value 5 is outside the minimum and maximum range for an IMCU, the scan of that IMCU is avoided. For the IMCUs where the value 5 does fall within the min, max range, an additional level of data pruning is possible via the metadata dictionary created when dictionary-based compression is used on IMCU. The dictionary contains a list of the unique column values within the IMCU. Since we have an equality predicate we can easily determine if 5 is one of the distinct column values or not. The combination of the IM storage index and dictionary based pruning, enables us to only scan the necessary IMCUs. 4. Use SIMD to apply filter predicates For the IMCU that need to be scanned Oracle takes advantage of SIMD vector processing (Single Instruction processing Multiple Data values). Instead of evaluating each entry in the column one at a time, SIMD vector processing allows a set of column values to be evaluated together in a single CPU instruction. The column format used in the IM column store has been specifically designed to maximize the number of column entries that can be loaded into the vector registers on the CPU and evaluated in a single CPU instruction. SIMD vector processing enables the Oracle Database In-Memory to scan billion of rows per second per core versus the millions of rows per second per core scan rate that can be achieved in the buffer cache. I mentioned earlier in this post that in order to confirm the IM column store was used; we need to examine the session level statistics. You can monitor the session level statistics by querying the performance views v$mystat and v$statname. All of the statistics related to the In-Memory Column Store begin with IM. You can see the full list of these statistics by typing: display_name format a30 SELECT display_name FROM v$statname WHERE display_name LIKE 'IM%'; If we check the session statistics after we execute our query the results would be as follow; SELECT Max(lo_ordtotalprice) most_expensive_order FROM lineorderWHERE lo_shipmode = 5; SELECT display_name FROM v$statname WHERE display_name IN ('IM scan CUs columns accessed', 'IM scan segments minmax eligible', 'IM scan CUs pruned'); As you can see, only 2 IMCUs were accessed during the scan as the majority of the IMCUs (44) in the LINEORDER table were pruned out thanks to the storage index on the lo_shipmode column. In next weeks post I will describe how you can control which queries use the IM column store and which don't. +Maria Colgan

Read the article
IBM "per core" comparisons for SPECjEnterprise2010

- by jhenning

I recently stumbled upon a blog entry from Roman Kharkovski (an IBM employee) comparing some SPECjEnterprise2010 results for IBM vs. Oracle. Mr. Kharkovski's blog claims that SPARC delivers half the transactions per core vs. POWER7. Prior to any argument, I should say that my predisposition is to like Mr. Kharkovski, because he says that his blog is intended to be factual; that the intent is to try to avoid marketing hype and FUD tactic; and mostly because he features a picture of himself wearing a bike helmet (me too). Therefore, in a spirit of technical argument, rather than FUD fight, there are a few areas in his comparison that should be discussed. Scaling is not free For any benchmark, if a small system scores 13k using quantity R1 of some resource, and a big system scores 57k using quantity R2 of that resource, then, sure, it's tempting to divide: is 13k/R1 > 57k/R2 ? It is tempting, but not necessarily educational. The problem is that scaling is not free. Building big systems is harder than building small systems. Scoring 13k/R1 on a little system provides no guarantee whatsoever that one can sustain that ratio when attempting to handle more than 4 times as many users. Choosing the denominator radically changes the picture When ratios are used, one can vastly manipulate appearances by the choice of denominator. In this case, lots of choices are available for the resource to be compared (R1 and R2 above). IBM chooses to put cores in the denominator. Mr. Kharkovski provides some reasons for that choice in his blog entry. And yet, it should be noted that the very concept of a core is: arbitrary: not necessarily comparable across vendors; fluid: modern chips shift chip resources in response to load; and invisible: unless you have a microscope, you can't see it. By contrast, one can actually see processor chips with the naked eye, and they are a bit easier to count. If we put chips in the denominator instead of cores, we get: 13161.07 EjOPS / 4 chips = 3290 EjOPS per chip for IBM vs 57422.17 EjOPS / 16 chips = 3588 EjOPS per chip for Oracle The choice of denominator makes all the difference in the appearance. Speaking for myself, dividing by chips just seems to make more sense, because: I can see chips and count them; and I can accurately compare the number of chips in my system to the count in some other vendor's system; and Tthe probability of being able to continue to accurately count them over the next 10 years of microprocessor development seems higher than the probability of being able to accurately and comparably count "cores". SPEC Fair use requirements Speaking as an individual, not speaking for SPEC and not speaking for my employer, I wonder whether Mr. Kharkovski's blog article, taken as a whole, meets the requirements of the SPEC Fair Use rule www.spec.org/fairuse.html section I.D.2. For example, Mr. Kharkovski's footnote (1) begins Results from http://www.spec.org as of 04/04/2013 Oracle SUN SPARC T5-8 449 EjOPS/core SPECjEnterprise2010 (Oracle's WLS best SPECjEnterprise2010 EjOPS/core result on SPARC). IBM Power730 823 EjOPS/core (World Record SPECjEnterprise2010 EJOPS/core result) The questionable tactic, from a Fair Use point of view, is that there is no such metric at the designated location. At www.spec.org, You can find the SPEC metric 57422.17 SPECjEnterprise2010 EjOPS for Oracle and You can also find the SPEC metric 13161.07 SPECjEnterprise2010 EjOPS for IBM. Despite the implication of the footnote, you will not find any mention of 449 nor anything that says 823. SPEC says that you can, under its fair use rule, derive your own values; but it emphasizes: "The context must not give the appearance that SPEC has created or endorsed the derived value." Substantiation and transparency Although SPEC disclaims responsibility for non-SPEC information (section I.E), it says that non-SPEC data and methods should be accurate, should be explained, should be substantiated. Unfortunately, it is difficult or impossible for the reader to independently verify the pricing: Were like units compared to like (e.g. list price to list price)? Were all components (hw, sw, support) included? Were all fees included? Note that when tpc.org shows IBM pricing, there are often items such as "PROCESSOR ACTIVATION" and "MEMORY ACTIVATION". Without the transparency of a detailed breakdown, the pricing claims are questionable. T5 claim for "Fastest Processor" Mr. Kharkovski several times questions Oracle's claim for fastest processor, writing You see, when you publish industry benchmarks, people may actually compare your results to other vendor's results. Well, as we performance people always say, "it depends". If you believe in performance-per-core as the primary way of looking at the world, then yes, the POWER7+ is impressive, spending its chip resources to support up to 32 threads (8 cores x 4 threads). Or, it just might be useful to consider performance-per-chip. Each SPARC T5 chip allows 128 hardware threads to be simultaneously executing (16 cores x 8 threads). The Industry Standard Benchmark that focuses specifically on processor chip performance is SPEC CPU2006. For this very well known and popular benchmark, SPARC T5: provides better performance than both POWER7 and POWER7+, for 1 chip vs. 1 chip, for 8 chip vs. 8 chip, for integer (SPECint_rate2006) and floating point (SPECfp_rate2006), for Peak tuning and for Base tuning. For example, at the 8-chip level, integer throughput (SPECint_rate2006) is: 3750 for SPARC 2170 for POWER7+. You can find the details at the March 2013 BestPerf CPU2006 page SPEC is a trademark of the Standard Performance Evaluation Corporation, www.spec.org. The two specific results quoted for SPECjEnterprise2010 are posted at the URLs linked from the discussion. Results for SPEC CPU2006 were verified at spec.org 1 July 2013, and can be rechecked here.

Read the article
SSAS Maestro Training in July 2012 #ssasmaestro #ssas

- by Marco Russo (SQLBI)

A few hours ago Chris Webb blogged about SSAS Maestro and I’d like to propagate the news, adding also some background info. SSAS Maestro is the premier certification on Analysis Services that selects the best experts in Analysis Services around the world. In 2011 Microsoft organized two rounds of training/exams for SSAS Maestros and up to now only 11 people from the first wave have been announced – around 10% of attendees of the course! In the next few days the new Maestros from the second round should be announced and this long process is caused by many factors that I’m going to explain. First, the course is just a step in the process. Before the course you receive a list of topics to study, including the slides of the course. During the course, students receive a lot of information that might not have been included in the slides and the best part of the course is class interaction. Students are expected to bring their experience to the table and comparing case studies, experiences and having long debates is an important part of the learning process. And it is also a part of the evaluation: good questions might be also more important than good answers! Finally, after the course, students have their homework and this may require one or two months to be completed. After that, a long (very long) evaluation process begins, taking into account homework, labs, participation… And for this reason the final evaluation may arrive months later after the course. We are going to improve and shorten this process with the next courses. The first wave of SSAS Maestro had been made by invitation only and now the program is opening, requiring a fee to participate in order to cover the cost of preparation, training and exam. The number of attendees will be limited and candidates will have to send their CV in order to be admitted to the course. Only experienced Analysis Services developers will be able to participate to this challenging program. So why you should do that? Well, only 10% of students passed the exam until now. So if you need 100% guarantee to pass the exam, you need to study a lot, before, during and after the course. But the course by itself is a precious opportunity to share experience, create networking and learn mission-critical enterprise-level best practices that it’s hard to find written on books. Oh, well, many existing white papers are a required reading *before* the course! The course is now 5 days long, and every day can be *very* long. We’ll have lectures and discussions in the morning and labs in the afternoon/evening. Plus some more lectures in one or two afternoons. A heavy part of the course is about performance optimization, capacity planning, monitoring. This edition will introduce also Tabular models, and don’t expect something you might find in the SSAS Tabular Workshop – only performance, scalability monitoring and optimization will be covered, knowing Analysis Services is a requirement just to be accepted! I and Chris Webb will be the teachers for this edition. The course is expensive. Applying for SSAS Maestro will cost around 7000€ plus taxes (reduced to 5000€ for students of a previous SSAS Maestro edition). And you will be locked in a training room for the large part of the week. So why you should do that? Well, as I said, this is a challenging course. You will not find the time to check your email – the content is just too much interesting to think you can be distracted by something else. Another good reason is that this course will take place in Italy. Well, the course will take place in the brand new Microsoft Innovation Campus, but in general we’ll be able to provide you hints to get great food and, if you are willing to attach one week-end to your trip, there are plenty of places to visit (and I’m not talking about the classic Rome-Florence-Venice) – you might really need to relax after such a week! Finally, the marking process after the course will be faster – we’d like to complete the evaluation within three months after the course, considering that 1-2 months might be required to complete the homework. If at this point you are not scared: registration will open in mid-April, but you can already write to [email protected] sending your CV/resume and a short description of your level of SSAS knowledge and experience. The selection process will start early and you may want to put your admission form on top of the FIFO queue!

Read the article
Using a Predicate as a key to a Dictionary

- by Tom Hines

I really love Linq and Lambda Expressions in C#. I also love certain community forums and programming websites like DaniWeb. A user on DaniWeb posted a question about comparing the results of a game that is like poker (5-card stud), but is played with dice. The question stemmed around determining what was the winning hand. I looked at the question and issued some comments and suggestions toward a potential answer, but I thought it was a neat homework exercise. [A little explanation] I eventually realized not only could I compare the results of the hands (by name) with a certain construct – I could also compare the values of the individual dice with the same construct. That piece of code eventually became a Dictionary with the KEY as a Predicate<int> and the Value a Func<T> that returns a string from the another structure that contains the mapping of an ENUM to a string. In one instance, that string is the name of the hand and in another instance, it is a string (CSV) representation of of the digits in the hand. An added benefit is that the digits re returned in the order they would be for a proper poker hand. For instance the hand 1,2,5,3,1 would be returned as ONE_PAIR (1,1,5,3,2). [Getting to the point] 1: using System; 2: using System.Collections.Generic; 3: 4: namespace DicePoker 5: { 6: using KVP_E2S = KeyValuePair<CDicePoker.E_DICE_POKER_HAND_VAL, string>; 7: public partial class CDicePoker 8: { 9: /// <summary> 10: /// Magical construction to determine the winner of given hand Key/Value. 11: /// </summary> 12: private static Dictionary<Predicate<int>, Func<List<KVP_E2S>, string>> 13: map_prd2fn = new Dictionary<Predicate<int>, Func<List<KVP_E2S>, string>> 14: { 15: {new Predicate<int>(i => i.Equals(0)), PlayerTie},//first tie 16: 17: {new Predicate<int>(i => i > 0), 18: (m => string.Format("Player One wins\n1={0}({1})\n2={2}({3})", 19: m[0].Key, m[0].Value, m[1].Key, m[1].Value))}, 20: 21: {new Predicate<int>(i => i < 0), 22: (m => string.Format("Player Two wins\n2={2}({3})\n1={0}({1})", 23: m[0].Key, m[0].Value, m[1].Key, m[1].Value))}, 24: 25: {new Predicate<int>(i => i.Equals(0)), 26: (m => string.Format("Tie({0}) \n1={1}\n2={2}", 27: m[0].Key, m[0].Value, m[1].Value))} 28: }; 29: } 30: } When this is called, the code calls the Invoke method of the predicate to return a bool. The first on matching true will have its value invoked. 1: private static Func<DICE_HAND, E_DICE_POKER_HAND_VAL> GetHandEval = dh => 2: map_dph2fn[map_dph2fn.Keys.Where(enm2fn => enm2fn(dh)).First()]; After coming up with this process, I realized (with a little modification) it could be called to evaluate the individual values in the dice hand in the event of a tie. 1: private static Func<List<KVP_E2S>, string> PlayerTie = lst_kvp => 2: map_prd2fn.Skip(1) 3: .Where(x => x.Key.Invoke(RenderDigits(dhPlayerOne).CompareTo(RenderDigits(dhPlayerTwo)))) 4: .Select(s => s.Value) 5: .First().Invoke(lst_kvp); After that, I realized I could now create a program completely without “if” statements or “for” loops! 1: static void Main(string[] args) 2: { 3: Dictionary<Predicate<int>, Action<Action<string>>> main = new Dictionary<Predicate<int>, Action<Action<string>>> 4: { 5: {(i => i.Equals(0)), PlayGame}, 6: {(i => true), Usage} 7: }; 8: 9: main[main.Keys.Where(m => m.Invoke(args.Length)).First()].Invoke(Display); 10: } …and there you have it. :) ZIPPED Project

Read the article
Cowboy Agile?

- by Robert May

In a previous post, I outlined the rules of Scrum. This post details one of those rules. I’ve often heard similar phrases around Scrum that clue me in to someone who doesn’t understand Scrum. The phrases go something like this: “We don’t do Agile because the idea of letting people just do whatever they want is wrong. We believe in a more structured approach.” (i.e. Work is Prison, and I’m the Warden!) “I love Agile. Agile lets us do whatever we want!” (Cowboy Agile?) “We’re Agile, but we use a process that I’ve created.” (Cowboy Agile?) All of those phrases have one thing in common: The assumption that Agile, and I mean Scrum, lets you do whatever you want. This is simply not true. Executing Scrum properly requires more dedication, rigor, and diligence than happens in most traditional development methods. Scrum and Waterfall Compared Since Scrum and Waterfall are two of the most commonly used methodologies, a little bit of contrasting and comparing is in order. Waterfall Scrum A project manager defines all tasks and then manages the tasks that team members are working on. The team members define the tasks and estimates of the stories for the current iteration. Any team member may work on any task in the iteration. Usually only a few milestones that need to be met, the milestones are measured in months, and these milestones are expected to be missed. Little work is ever done to improve estimates and poor estimators can hide behind high estimates. Stories must be delivered every iteration, milestones are measured in hours, and the team is expected to figure out why their estimates were wrong, even when they were under. Repeated misses can get the entire team fired. Partially completed work is normal. Partially completed work doesn’t count. Nobody knows the task you’re working on. Everyone knows what you’re working on, whether or not you’re making progress and how much longer you think its going to take, in hours. Little requirement to show working code. Prototypes are ok. Working code must be shown each iteration. No smoke and mirrors allowed. Testing is done in lengthy cycles at the end of development. Developers aren’t held accountable. Testing is part of the team. If the testers don’t accept the story as complete, the team can’t count it. Complete means that the story’s functionality works as designed. The team can’t have any open defects on the story. Velocity is rarely truly measured and difficult to evaluate. Velocity is integral to the process and can be seen at a glance and everyone in the company knows what it is. A business analyst writes requirements. Designers mock up screens. Developers hide behind “I did it just like the spec doc told me to and made the screen exactly like the picture” Developers are expected to collaborate in real time. If a design is bad or lacks needed details, the developers are required to get it right in the iteration, because all software must be functional. Designers and Business Analysts are part of the team and must do their work in iterations slightly ahead of the developers. Upper Management is often surprised. “You told me things were going well two months ago!” Management receives updates at the end of every iteration showing them exactly what the team did and how that compares to what' is remaining in the backlog. Managers know every iteration what their money is buying. Status meetings are rare or don’t occur. Email is a primary form of communication. Teams coordinate every single day with each other and use other high bandwidth communication channels to make sure they’re making progress. Email is used only as a last resort. Instead, team members stand up, walk to each other, and talk, face to face. If that’s not possible, they pick up the phone. IF someone asks what happened, its at the end of a lengthy development cycle measured in months, and nobody really knows why it happened. Someone asks what happened every iteration. The team talks about what happened, and then adapts to make sure that what happened either never happens again or happens every time. That’s probably enough for now. As you can see, a lot is required of Scrum teams! One of the key differences in Scrum is that the burden for many activities is shifted to a group of people who share responsibility, instead of a single person having responsibility. This is a very good thing, since small groups usually come up with better and more insightful work than single individuals. This shift also results in better velocity. Team members can take vacations and the rest of the team simply picks up the slack. With Waterfall, if a key team member takes a vacation, delays can ensue. Scrum requires much more out of every team member and as a result, Scrum teams outperform non-Scrum teams working 60 hour weeks. Recommended Reading Everyone considering Scrum should read Mike Cohn’s excellent book, User Stories Applied. Technorati Tags: Agile,Scrum,Waterfall

Read the article

Search Results

Search found 1228 results on 50 pages for 'comparing'.

Page 43/50 | < Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50 | Next Page >

- by Derek Organ

- by Jost

- by kief_morris

- by Ether

- by Paul H

- by Aszurom

- by CJM

- by Fopedush

- by Adam

- by Hedgetrimmer

- by terje

- by thatjeffsmith

- by PictureCo

- by MightyZot

- by rickramsey

- by pinaldave

- by Laila

- by Abhijeet Patel

- by smisner

- by Shaun

- by Maria Colgan

- by jhenning

- by Marco Russo (SQLBI)

- by Tom Hines

- by Robert May

< Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50 | Next Page >