Search Results

Search found 1329 results on 54 pages for 'garbage collecting'.

Page 47/54 | < Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >

  • Casting/dereferencing member variable pointer from void*, is this safe?

    - by Damien
    Hi all, I had a problem while hacking a bigger project so I made a simpel test case. If I'm not omitting something, my test code works fine, but maybe it works accidentally so I wanted to show it to you and ask if there are any pitfalls in this approach. I have an OutObj which has a member variable (pointer) InObj. InObj has a member function. I send the address of this member variable object (InObj) to a callback function as void*. The type of this object never changes so inside the callback I recast to its original type and call the aFunc member function in it. In this exampel it works as expected, but in the project I'm working on it doesn't. So I might be omitting something or maybe there is a pitfall here and this works accidentally. Any comments? Thanks a lot in advance. (The problem I have in my original code is that InObj.data is garbage). #include <stdio.h> class InObj { public: int data; InObj(int argData); void aFunc() { printf("Inside aFunc! data is: %d\n", data); }; }; InObj::InObj(int argData) { data = argData; } class OutObj { public: InObj* objPtr; OutObj(int data); ~OutObj(); }; OutObj::OutObj(int data) { objPtr = new InObj(data); } OutObj::~OutObj() { delete objPtr; } void callback(void* context) { ((InObj*)context)->aFunc(); } int main () { OutObj a(42); callback((void*)a.objPtr); }

    Read the article

  • Optimising ruby regexp -- lots of match groups

    - by Farcaller
    I'm working on a ruby baser lexer. To improve performance, I joined up all tokens' regexps into one big regexp with match group names. The resulting regexp looks like: /\A(?<__anonymous_-1038694222803470993>(?-mix:\n+))|\A(?<__anonymous_-1394418499721420065>(?-mix:\/\/[\A\n]*))|\A(?<__anonymous_3077187815313752157>(?-mix:include\s+"[\A"]+"))|\A(?<LET>(?-mix:let\s))|\A(?<IN>(?-mix:in\s))|\A(?<CLASS>(?-mix:class\s))|\A(?<DEF>(?-mix:def\s))|\A(?<DEFM>(?-mix:defm\s))|\A(?<MULTICLASS>(?-mix:multiclass\s))|\A(?<FUNCNAME>(?-mix:![a-zA-Z_][a-zA-Z0-9_]*))|\A(?<ID>(?-mix:[a-zA-Z_][a-zA-Z0-9_]*))|\A(?<STRING>(?-mix:"[\A"]*"))|\A(?<NUMBER>(?-mix:[0-9]+))/ I'm matching it to my string producing a MatchData where exactly one token is parsed: bigregex =~ "\n ... garbage" puts $~.inspect Which outputs #<MatchData "\n" __anonymous_-1038694222803470993:"\n" __anonymous_-1394418499721420065:nil __anonymous_3077187815313752157:nil LET:nil IN:nil CLASS:nil DEF:nil DEFM:nil MULTICLASS:nil FUNCNAME:nil ID:nil STRING:nil NUMBER:nil> So, the regex actually matched the "\n" part. Now, I need to figure the match group where it belongs (it's clearly visible from #inspect output that it's _anonymous-1038694222803470993, but I need to get it programmatically). I could not find any option other than iterating over #names: m.names.each do |n| if m[n] type = n.to_sym resolved_type = (n.start_with?('__anonymous_') ? nil : type) val = m[n] break end end which verifies that the match group did have a match. The problem here is that it's slow (I spend about 10% of time in the loop; also 8% grabbing the @input[@pos..-1] to make sure that \A works as expected to match start of string (I do not discard input, just shift the @pos in it). You can check the full code at GH repo. Any ideas on how to make it at least a bit faster? Is there any option to figure the "successful" match group easier?

    Read the article

  • C#: How to implement a smart cache

    - by Svish
    I have some places where implementing some sort of cache might be useful. For example in cases of doing resource lookups based on custom strings, finding names of properties using reflection, or to have only one PropertyChangedEventArgs per property name. A simple example of the last one: public static class Cache { private static Dictionary<string, PropertyChangedEventArgs> cache; static Cache() { cache = new Dictionary<string, PropertyChangedEventArgs>(); } public static PropertyChangedEventArgs GetPropertyChangedEventArgsa(string propertyName) { if (cache.ContainsKey(propertyName)) return cache[propertyName]; return cache[propertyName] = new PropertyChangedEventArgs(propertyName); } } But, will this work well? For example if we had a whole load of different propertyNames, that would mean we would end up with a huge cache sitting there never being garbage collected or anything. I'm imagining if what is cached are larger values and if the application is a long-running one, this might end up as kind of a problem... or what do you think? How should a good cache be implemented? Is this one good enough for most purposes? Any examples of some nice cache implementations that are not too hard to understand or way too complex to implement?

    Read the article

  • VB .NET Passing a Structure containing an array of String and an array of Integer into a C++ DLL

    - by DanJunior
    Hi everyone, I'm having problems with marshalling in VB .NET to C++, here's the code : In the C++ DLL : struct APP_PARAM { int numData; LPCSTR *text; int *values; }; int App::StartApp(APP_PARAM params) { for (int i = 0; i < numLines; i++) { OutputDebugString(params.text[i]); } } In VB .NET : <StructLayoutAttribute(LayoutKind.Sequential)> _ Public Structure APP_PARAM Public numData As Integer Public text As System.IntPtr Public values As System.IntPtr End Structure Declare Function StartApp Lib "AppSupport.dll" (ByVal params As APP_PARAM) As Integer Sub Main() Dim params As APP_PARAM params.numData = 3 Dim text As String() = {"A", "B", "C"} Dim textHandle As GCHandle = GCHandle.Alloc(text) params.text = GCHandle.ToIntPtr(textHandle) Dim values As Integer() = {10, 20, 30} Dim valuesHandle As GCHandle = GCHandle.Alloc(values) params.values = GCHandle.ToIntPtr(heightHandle) StartApp(params) textHandle.Free() valuesHandle.Free() End Sub I checked the C++ side, the output from the OutputDebugString is garbage, the text array contains random characters. What is the correct way to do this?? Thanks a lot...

    Read the article

  • How to optimize my game calendar in C#?

    - by MartyIX
    Hi, I've implemented a simple calendar (message system) for my game which consists from: 1) List<Event> calendar; 2) public class Event { /// <summary> /// When to process the event /// </summary> public Int64 when; /// <summary> /// Which object should process the event /// </summary> public GameObject who; /// <summary> /// Type of event /// </summary> public EventType what; public int posX; public int posY; public int EventID; } 3) calendar.Add(new Event(...)) The problem with this code is that even thought the number of messages is not excessise per second. It allocates still new memory and GC will once need to take care of that. The garbage collection may lead to a slight lag in my game and therefore I'd like to optimalize my code. My considerations: To change Event class in a structure - but the structure is not entirely small and it takes some time to copy it wherever I need it. Reuse Event object somehow (add queue with used events and when new event is needed I'll just take from this queue). Does anybody has other idea how to solve the problem? Thanks for suggestions!

    Read the article

  • safe structures embedded systems

    - by user405633
    I have a packet from a server which is parsed in an embedded system. I need to parse it in a very efficient way, avoiding memory issues, like overlapping, corrupting my memory and others variables. The packet has this structure "String A:String B:String C". As example, here the packet received is compounded of three parts separated using a separator ":", all these parts must be accesibles from an structure. Which is the most efficient and safe way to do this. A.- Creating an structure with attributes (partA, PartB PartC) sized with a criteria based on avoid exceed this sized from the source of the packet, and attaching also an index with the length of each part in a way to avoid extracting garbage, this part length indicator could be less or equal to 300 (ie: part B). typedef struct parsedPacket_struct { char partA[2];int len_partA; char partB[300];int len_partB; char partC[2];int len_partC; }parsedPacket; The problem here is that I am wasting memory, because each structure should copy the packet content to each the structure, is there a way to only save the base address of each part and still using the len_partX.

    Read the article

  • How can I write output to a new external file with Perl?

    - by Structure
    Maybe I am searching with the wrong keywords or this is a very basic question but I cannot find the answer to my question. I am having trouble writing the result of my whois command to a new external file. My code is below. It takes $readfilename, which is a file name which has a list of IPs, and $writefilename, which is the destination file for output. Both are user-specified. For my tests, $readfilename contains three IP addresses on three separate lines so there should be three separate whois results in the user specified output file. if ($readfilename) { open (my $inputfile, "<", $readfilename) || die "\n Cannot open the specified file. Please double check your file name and path.\n\n"; open (my $outputfile, ">", $writefilename) || die "\n Could not create write file.\n\n"; while (<$inputfile>) { my $iplookupresult = `whois $_`; print $outputfile $iplookupresult; } close $outputfile; close $inputfile; } I can execute this script and end up with a new external file, but over half of the file has binary garbage data (running on CentOS) and only one (or a portion of one) of the whois lookups is readable. I have no idea how half of my file is ending up binary... but my approach must be incorrect. Is there a better way to achieve the same result?

    Read the article

  • VB.Net Memory Issue

    - by Skulmuk
    We have an application that has some interesting memory usage issues. When it first opens, the program uses aroun 50-60MB of memory. This stays consistent on 32-bit machines. On 64-bit machines, however, re-activating the form in any way (clicking, dragging, alt-tabbing, etc.) adds around another 50MB to it's memory usage. It repeats this process several times before resetting back to around 45MB, at which point the cycle begins again. I've done some research and a lot of people have said that VB in general has pretty poor garbage collection, which could be affecting the software in some way. However, I've yet to find a solution. There are no events fired when the application is activated (as shown by 32-bit usage) - the applications is merely sitting awaiting the user's actions. At load, the system pulls some data into a tree view, but that's the only external connection, and it only re-fires the routine when the user makes a change to something and saves the change. Has anyone else experienced anything this strange, and if so, does anyone know of what might fix it? It seems strange that it only occurs under x64 systems. Thanks

    Read the article

  • When my object is no longer referenced why do its events continue to run?

    - by Ryan Peschel
    Say I am making a game and have a base Buff class. I may also have a subclass named ThornsBuff which subscribes to the Player class's Damaged event. The Player may have a list of Buffs and one of them may be the ThornsBuff. For example: Test Class Player player = new Player(); player.ActiveBuffs.Add(new ThornsBuff(player)); ThornsBuff Class public ThornsBuff(Player player) { player.DamageTaken += player_DamageTaken; } private void player_DamageTaken(MessagePlayerDamaged m) { m.Assailant.Stats.Health -= (int)(m.DamageAmount * .25); } This is all to illustrate an example. If I were to remove the buff from the list, the event is not detached. Even though the player no longer has the buff, the event is still executed as if he did. Now, I could have a Dispel method to unregister the event, but that forces the developer to call Dispel in addition to removing the Buff from the list. What if they forget, increased coupling, etc. What I don't understand is why the event doesn't detatch itself automatically when the Buff is removed from the list. The Buff only existed in the list and that is its one reference. After it is removed shouldn't the event be detached? I tried adding the detaching code to the finalizer of the Buff but that didn't fix it either. The event is still running even after it has 0 references. I suppose it is because the garbage collector had not run yet? Is there any way to make it automatic and instant so that when the object has no references all its events are unregisted? Thanks.

    Read the article

  • [PHP] Does unsetting array values during iterating save on memory?

    - by saturn_rising
    Hello fellow code warriors, This is a simple programming question, coming from my lack of knowledge of how PHP handles array copying and unsetting during a foreach loop. It's like this, I have an array that comes to me from an outside source formatted in a way I want to change. A simple example would be: $myData = array('Key1' => array('value1', 'value2')); But what I want would be something like: $myData = array([0] => array('MyKey' => array('Key1' => array('value1', 'value2')))); So I take the first $myData and format it like the second $myData. I'm totally fine with my formatting algorithm. My question lies in finding a way to conserve memory since these arrays might get a little unwieldy. So, during my foreach loop I copy the current array value(s) into the new format, then I unset the value I'm working with from the original array. E.g.: $formattedData = array(); foreach ($myData as $key => $val) { // do some formatting here, copy to $reformattedVal $formattedData[] = $reformattedVal; unset($myData[$key]); } Is the call to unset() a good idea here? I.e., does it conserve memory since I have copied the data and no longer need the original value? Or, does PHP automatically garbage collect the data since I don't reference it in any subsequent code? The code runs fine, and so far my datasets have been too negligible in size to test for performance differences. I just don't know if I'm setting myself up for some weird bugs or CPU hits later on. Thanks for any insights. -sR

    Read the article

  • C++: Shortest and best way to "reinitialize"/clean a class instance

    - by Oper
    I will keep it short and just show you a code example: class myClass { public: myClass(); int a; int b; int c; } // In the myClass.cpp or whatever myClass::myClass( ) { a = 0; b = 0; c = 0; } Okay. If I know have an instance of myClass and set some random garbage to a, b and c. What is the best way to reset them all to the state after the class constructor was called, so: 0, 0 and 0? I came up with this way: myClass emptyInstance; myUsedInstance = emptyInstance; // Ewww.. code smell? Or.. myUsedInstance.a = 0; myUsedInstance.c = 0; myUsedInstance.c = 0; I think you know what I want, is there any better way to achieve this?

    Read the article

  • JVM process resident set size "equals" max heap size, not current heap size

    - by Volune
    After a few reading about jvm memory (here, here, here, others I forgot...), I am expecting the resident set size of my java process to be roughly equal to the current heap space capacity. That's not what the numbers are saying, it seems to be roughly equal to the max heap space capacity: Resident set size: # echo 0 $(cat /proc/1/smaps | grep Rss | awk '{print $2}' | sed 's#^#+#') | bc 11507912 # ps -C java -O rss | gawk '{ count ++; sum += $2 }; END {count --; print "Number of processes =",count; print "Memory usage per process =",sum/1024/count, "MB"; print "Total memory usage =", sum/1024, "MB" ;};' Number of processes = 1 Memory usage per process = 11237.8 MB Total memory usage = 11237.8 MB Java heap # jmap -heap 1 Attaching to process ID 1, please wait... Debugger attached successfully. Server compiler detected. JVM version is 24.55-b03 using thread-local object allocation. Garbage-First (G1) GC with 18 thread(s) Heap Configuration: MinHeapFreeRatio = 10 MaxHeapFreeRatio = 20 MaxHeapSize = 10737418240 (10240.0MB) NewSize = 1363144 (1.2999954223632812MB) MaxNewSize = 17592186044415 MB OldSize = 5452592 (5.1999969482421875MB) NewRatio = 2 SurvivorRatio = 8 PermSize = 20971520 (20.0MB) MaxPermSize = 85983232 (82.0MB) G1HeapRegionSize = 2097152 (2.0MB) Heap Usage: G1 Heap: regions = 2560 capacity = 5368709120 (5120.0MB) used = 1672045416 (1594.586769104004MB) free = 3696663704 (3525.413230895996MB) 31.144272834062576% used G1 Young Generation: Eden Space: regions = 627 capacity = 3279945728 (3128.0MB) used = 1314914304 (1254.0MB) free = 1965031424 (1874.0MB) 40.089514066496164% used Survivor Space: regions = 49 capacity = 102760448 (98.0MB) used = 102760448 (98.0MB) free = 0 (0.0MB) 100.0% used G1 Old Generation: regions = 147 capacity = 1986002944 (1894.0MB) used = 252273512 (240.5867691040039MB) free = 1733729432 (1653.413230895996MB) 12.702574926293766% used Perm Generation: capacity = 39845888 (38.0MB) used = 38884120 (37.082786560058594MB) free = 961768 (0.9172134399414062MB) 97.58628042120682% used 14654 interned Strings occupying 2188928 bytes. Are my expectations wrong? What should I expect? I need the heap space to be able to grow during spikes (to avoid very slow Full GC), but I would like to have the resident set size as low as possible the rest of the time, to benefit the other processes running on the server. Is there a better way to achieve that? Linux 3.13.0-32-generic x86_64 java version "1.7.0_55" Running in Docker version 1.1.2 Java is running elasticsearch 1.2.0: /usr/bin/java -Xms5g -Xmx10g -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -Xss256k -Djava.awt.headless=true -XX:+UseG1GC -XX:MaxGCPauseMillis=350 -XX:InitiatingHeapOccupancyPercent=45 -XX:+AggressiveOpts -XX:+UseCompressedOops -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:/opt/elasticsearch/logs/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt elasticsearch/logs/heapdump.hprof -XX:ErrorFile=/opt/elasticsearch/logs/hs_err.log -Des.logger.port=99999 -Des.logger.host=999.999.999.999 -Delasticsearch -Des.foreground=yes -Des.path.home=/opt/elasticsearch -cp :/opt/elasticsearch/lib/elasticsearch-1.2.0.jar:/opt/elasticsearch/lib/*:/opt/elasticsearch/lib/sigar/* org.elasticsearch.bootstrap.Elasticsearch There actually are 5 elasticsearch nodes, each in a different docker container. All have about the same memory usage. Some stats about the index: size: 9.71Gi (19.4Gi) docs: 3,925,398 (4,052,694)

    Read the article

  • Why are people trying to connect to me network on TCP port 445?

    - by Solignis
    I was playing with my new syslog server and had my m0n0wall firewall logs forwarded as a test, I noticed a bunch of recent firewall log entries that say that it blocked other WAN IPs from my ISP (I checked) from connecting to me on TCP port 445. Why would a random computer be trying to connect to me on a port apperently used for Windows SMB shares? Just internet garbage? A port scan? I am just curious. here is what I am seeing Mar 15 23:38:41 gateway/gateway ipmon[121]: 23:38:40.614422 fxp0 @0:19 b 98.82.198.238,60653 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN broadcast Mar 15 23:38:42 gateway/gateway ipmon[121]: 23:38:41.665571 fxp0 @0:19 b 98.82.198.238,60665 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN Mar 15 23:38:43 gateway/gateway ipmon[121]: 23:38:43.165622 fxp0 @0:19 b 98.82.198.238,60670 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN broadcast Mar 15 23:38:44 gateway/gateway ipmon[121]: 23:38:43.614524 fxp0 @0:19 b 98.82.198.238,60653 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN broadcast Mar 15 23:38:44 gateway/gateway ipmon[121]: 23:38:43.808856 fxp0 @0:19 b 98.82.198.238,60665 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN Mar 15 23:38:44 gateway/gateway ipmon[121]: 23:38:43.836313 fxp0 @0:19 b 98.82.198.238,60670 -> 98.103.xxx,xxx,445 PR tcp len 20 48 -S IN broadcast Mar 15 23:38:48 gateway/gateway ipmon[121]: 23:38:48.305633 fxp0 @0:19 b 98.103.22.25 -> 98.103.xxx.xxx PR icmp len 20 92 icmp echo/0 IN broadcast Mar 15 23:38:48 gateway/gateway ipmon[121]: 23:38:48.490778 fxp0 @0:19 b 98.103.22.25 -> 98.103.xxx.xxx PR icmp len 20 92 icmp echo/0 IN Mar 15 23:38:48 gateway/gateway ipmon[121]: 23:38:48.550230 fxp0 @0:19 b 98.103.22.25 -> 98.103.xxx.xxx PR icmp len 20 92 icmp echo/0 IN broadcast Mar 15 23:43:33 gateway/gateway ipmon[121]: 23:43:33.185836 fxp0 @0:19 b 98.86.34.225,64060 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN broadcast Mar 15 23:43:34 gateway/gateway ipmon[121]: 23:43:33.405137 fxp0 @0:19 b 98.86.34.225,64081 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN Mar 15 23:43:34 gateway/gateway ipmon[121]: 23:43:33.454384 fxp0 @0:19 b 98.86.34.225,64089 -> 98.103.xxx.xxx,445 PR tcp len 20 48 -S IN broadcast I blacked out part of my IP address for my own safety.

    Read the article

  • Apache won't serve images larger than ~2K

    - by dtbaker
    Hello, Just upgraded an old box to Ubuntu to 10.04.2 LTS. Apache will not display images to a browser that are over about 2K. Small images seem to display fine. Static HTML and PHP continues to works fine as well. Installed: apache2 2.2.14-5ubuntu8.4 apache2-mpm-prefork 2.2.14-5ubuntu8.4 apache2-utils 2.2.14-5ubuntu8.4 apache2.2-bin 2.2.14-5ubuntu8.4 apache2.2-common 2.2.14-5ubuntu8.4 here is an ngrep of an image that doesn't display fine in the browser: T 192.168.0.4:32907 - 192.168.0.54:80 [AP] GET /path/path/logo.png HTTP/1.1..Host: 192.1 68.0.54..Connection: keep-alive..Accept: application/xml,application/xhtml+ xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5..User-Ag ent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.98 Safari/534.13..Accept-Enco ding: gzip,deflate,sdch..Accept-Language: en-US,en;q=0.8..Accept- Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3.... T 192.168.0.54:80 - 192.168.0.4:32907 [A] HTTP/1.1 200 OK..Date: Wed, 09 Mar 2011 05:28:38 GMT..Server: Apa che/2.2.14 (Ubuntu)..Last-Modified: Tue, 05 Oct 2010 11:59:17 GMT ..ETag: "17b6f4-15fe-491dd63eb2f40"..Accept-Ranges: bytes..Conten t-Length: 5630..Keep-Alive: timeout=15, max=100..Connection: Keep -Alive..Content-Type: image/png.....PNG........IHDR...!...v...... .%.....sRGB.........bKGD..............pHYs.................tIME.. etc... This looks ok to me! I have tried firefox and chrome, both display small images fine but when a large image is requested the browser prompts to download the file. When the image file is saved to the local computer it is corrupt, it also takes a long time to save which makes me think the browser cannot see the content-length header sent from apache. Also when I look at the saved image file it includes the headers from apache, along with a bit of garbage at the top, like so: vi logo.png: ^@^UÅd^@$^]V^S^H^@E^@^Q,n!@^@@^F^@^@À¨^@6À¨^@^D^@P^Y¬rÇŹéw^P^@Ú^@^@^A^A^H ^@^GÝ^]^@pbSHTTP/1.1 200 OK^M Date: Wed, 09 Mar 2011 04:47:04 GMT^M Server: Apache/2.2.14 (Ubuntu)^M Last-Modified: Tue, 05 Oct 2010 11:59:17 GMT^M ETag: "17b6ff-157c-491dd63eb2f40"^M Accept-Ranges: bytes^M Content-Length: 5500^M Keep-Alive: timeout=15, max=94^M Connection: Keep-Alive^M Content-Type: image/png^M ^M PNG^M etc... Any ideas? It's driving me nuts. There is nothing in apache error logs, and permissions are fine (because the image data is there, it's just somewhat corrupt). There's no proxy or iptables on this ubuntu box either. Thanks heaps!! Dave ps: just tried on IE from a different computer, same problem :( pps: rebooted server, no help.

    Read the article

  • How do large blobs affect SQL delete performance, and how can I mitigate the impact?

    - by Max Pollack
    I'm currently experiencing a strange issue that my understanding of SQL Server doesn't quite mesh with. We use SQL as our file storage for our internal storage service, and our database has about half a million rows in it. Most of the files (86%) are 1mb or under, but even on fresh copies of our database where we simply populate the table with data for the purposes of a test, it appears that rows with large amounts of data stored in a BLOB frequently cause timeouts when our SQL Server is under load. My understanding of how SQL Server deletes rows is that it's a garbage collection process, i.e. the row is marked as a ghost and the row is later deleted by the ghost cleanup process after the changes are copied to the transaction log. This suggests to me that regardless of the size of the data in the blob, row deletion should be close to instantaneous. However when deleting these rows we are definitely experiencing large numbers of timeouts and astoundingly low performance. In our test data set, its files over 30mb that cause this issue. This is an edge case, we don't frequently encounter these, and even though we're looking into SQL filestream as a solution to some of our problems, we're trying to narrow down where these issues are originating from. We ARE performing our deletes inside of a transaction. We're also performing updates to metadata such as file size stats, but these exist in a separate table away from the file data itself. Hierarchy data is stored in the table that contains the file information. Really, in the end it's not so much what we're doing around the deletes that matters, we just can't find any references to low delete performance on rows that contain a large amount of data in a BLOB. We are trying to determine if this is even an avenue worth exploring, or if it has to be one of our processes around the delete that's causing the issue. Are there any situations in which this could occur? Is it common for a database server to come to the point of complete timeouts when many of these deletes are occurring simultaneously? Is there a way to combat this issue if it exists? (cross-posted from StackOverflow )

    Read the article

  • Big Data&rsquo;s Killer App&hellip;

    - by jean-pierre.dijcks
    Recently Keith spent  some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

    Read the article

  • JMX Based Monitoring - Part Four - Business App Server Monitoring

    - by Anthony Shorten
    In the last blog entry I talked about the Oracle Utilities Application Framework V4 feature for monitoring and managing aspects of the Web Application Server using JMX. In this blog entry I am going to discuss a similar new feature that allows JMX to be used for management and monitoring the Oracle Utilities business application server component. This feature is primarily focussed on performance tracking of the product. In first release of Oracle Utilities Customer Care And Billing (V1.x I am talking about), we used to use Oracle Tuxedo as part of the architecture. In Oracle Utilities Application Framework V2.0 and above, we removed Tuxedo from the architecture. One of the features that some customers used within Tuxedo was the performance tracking ability. The idea was that you enabled performance logging on the individual Tuxedo servers and then used a utility named txrpt to produce a performance report. This report would list every service called, the number of times it was called and the average response time. When I worked a performance consultant, I used this report to identify badly performing services and also gauge the overall performance characteristics of a site. When Tuxedo was removed from the architecture this information was also lost. While you can get some information from access.log and some Mbeans supplied by the Web Application Server it was not at the same granularity as txrpt or as useful. I am happy to say we have not only reintroduced this facility in Oracle Utilities Application Framework but it is now accessible via JMX and also we have added more detail into the performance tracking. Most of this new design was working with customers around the world to make sure we introduced a new feature that not only satisfied their performance tracking needs but allowed for finer grained performance analysis. As with the Web Application Server, the Business Application Server JMX monitoring is enabled by specifying a JMX port number in RMI Port number for JMX Business and initial credentials in the JMX Enablement System User ID and JMX Enablement System Password configuration options. These options are available using the configureEnv[.sh] -a utility. These credentials are shared across the Web Application Server and Business Application Server for authorization purposes. Once this is information is supplied a number of configuration files are built (by the initialSetup[.sh] utility) to configure the facility: spl.properties - contains the JMX URL, the security configuration and the mbeans that are enabled. For example, on my demonstration machine: spl.runtime.management.rmi.port=6750 spl.runtime.management.connector.url.default=service:jmx:rmi:///jndi/rmi://localhost:6750/oracle/ouaf/ejbAppConnector jmx.remote.x.password.file=scripts/ouaf.jmx.password.file jmx.remote.x.access.file=scripts/ouaf.jmx.access.file ouaf.jmx.com.splwg.ejb.service.management.PerformanceStatistics=enabled ouaf.jmx.* files - contain the userid and password. The default configuration uses the JMX default configuration. You can use additional security features by altering the spl.properties file manually or using a custom template. For more security options see JMX Security for more details. Once it has been configured and the changes reflected in the product using the initialSetup[.sh] utility the JMX facility can be used. For illustrative purposes I will use jconsole but any JSR160 complaint browser or client can be used (with the appropriate configuration). Once you start jconsole (ensure that splenviron[.sh] is executed prior to execution to set the environment variables or for remote connection, ensure java is in your path and jconsole.jar in your classpath) you specify the URL in the spl.runtime.management.connnector.url.default entry. For example: You are then able to track performance of the product using the PerformanceStatistics Mbean. The attributes of the PerformanceStatistics Mbean are counts of each object type. This is where this facility differs from txrpt. The information that is collected includes the following: The Service Type is captured so you can filter the results in terms of the type of service. For maintenance type services you can even see the transaction type (ADD, CHANGE etc) so you can see the performance of updates against read transactions. The Minimum and Maximum are also collected to give you an idea of the spread of performance. The last call is recorded. The date, time and user of the last call are recorded to give you an idea of the timeliness of the data. The Mbean maintains a set of counters per Service Type to give you a summary of the types of transactions being executed. This gives you an overall picture of the types of transactions and volumes at your site. There are a number of interesting operations that can also be performed: reset - This resets the statistics back to zero. This is an important operation. For example, txrpt is restricted to collecting statistics per hour, which is ok for most people. But what if you wanted to be more granular? This operation allows to set the collection period to anything you wish. The statistics collected will represent values since the last restart or last reset. completeExecutionDump - This is the operation that produces a CSV in memory to allow extraction of the data. All the statistics are extracted (see the Server Administration Guide for a full list). This can be then loaded into a database, a tool or simply into your favourite spreadsheet for analysis. Here is an extract of an execution dump from my demonstration environment to give you an idea of the format: ServiceName, ServiceType, MinTime, MaxTime, Avg Time, # of Calls, Latest Time, Latest Date, Latest User ... CFLZLOUL, EXECUTE_LIST, 15.0, 64.0, 22.2, 10, 16.0, 2009-12-16::11-25-36-932, ASHORTEN CILBBLLP, READ, 106.0, 1184.0, 466.3333333333333, 6, 106.0, 2009-12-16::11-39-01-645, BOBAMA CILBBLLP, DELETE, 70.0, 146.0, 108.0, 2, 70.0, 2009-12-15::12-53-58-280, BPAYS CILBBLLP, ADD, 860.0, 4903.0, 2243.5, 8, 860.0, 2009-12-16::17-54-23-862, LELLISON CILBBLLP, CHANGE, 112.0, 3410.0, 815.1666666666666, 12, 112.0, 2009-12-16::11-40-01-103, ASHORTEN CILBCBAL, EXECUTE_LIST, 8.0, 84.0, 26.0, 22, 23.0, 2009-12-16::17-54-01-643, LJACKMAN InitializeUserInfoService, READ_SYSTEM, 49.0, 962.0, 70.83777777777777, 450, 63.0, 2010-02-25::11-21-21-667, ASHORTEN InitializeUserService, READ_SYSTEM, 130.0, 2835.0, 234.85777777777778, 450, 216.0, 2010-02-25::11-21-21-446, ASHORTEN MenuLoginService, READ_SYSTEM, 530.0, 1186.0, 703.3333333333334, 9, 530.0, 2009-12-16::16-39-31-172, ASHORTEN NavigationOptionDescriptionService, READ_SYSTEM, 2.0, 7.0, 4.0, 8, 2.0, 2009-12-21::09-46-46-892, ASHORTEN ... There are other operations and attributes available. Refer to the Server Administration Guide provided with your product to understand the full et of operations and attributes. This is one of the many features I am proud that we implemented as it allows flexible monitoring of the performance of the product.

    Read the article

  • New Validation Attributes in ASP.NET MVC 3 Future

    - by imran_ku07
         Introduction:             Validating user inputs is an very important step in collecting information from users because it helps you to prevent errors during processing data. Incomplete or improperly formatted user inputs will create lot of problems for your application. Fortunately, ASP.NET MVC 3 makes it very easy to validate most common input validations. ASP.NET MVC 3 includes Required, StringLength, Range, RegularExpression, Compare and Remote validation attributes for common input validation scenarios. These validation attributes validates most of your user inputs but still validation for Email, File Extension, Credit Card, URL, etc are missing. Fortunately, some of these validation attributes are available in ASP.NET MVC 3 Future. In this article, I will show you how to leverage Email, Url, CreditCard and FileExtensions validation attributes(which are available in ASP.NET MVC 3 Future) in ASP.NET MVC 3 application.       Description:             First of all you need to download ASP.NET MVC 3 RTM Source Code from here. Then extract all files in a folder. Then open MvcFutures project from mvc3-rtm-sources\mvc3\src\MvcFutures folder. Build the project. In case, if you get compile time error(s) then simply remove the reference of System.Web.WebPages and System.Web.Mvc assemblies and add the reference of System.Web.WebPages and System.Web.Mvc 3 assemblies again but from the .NET tab and then build the project again, it will create a Microsoft.Web.Mvc assembly inside mvc3-rtm-sources\mvc3\src\MvcFutures\obj\Debug folder. Now we can use Microsoft.Web.Mvc assembly inside our application.             Create a new ASP.NET MVC 3 application. For demonstration purpose, I will create a dummy model UserInformation. So create a new class file UserInformation.cs inside Model folder and add the following code,   public class UserInformation { [Required] public string Name { get; set; } [Required] [EmailAddress] public string Email { get; set; } [Required] [Url] public string Website { get; set; } [Required] [CreditCard] public string CreditCard { get; set; } [Required] [FileExtensions(Extensions = "jpg,jpeg")] public string Image { get; set; } }             Inside UserInformation class, I am using Email, Url, CreditCard and FileExtensions validation attributes which are defined in Microsoft.Web.Mvc assembly. By default FileExtensionsAttribute allows png, jpg, jpeg and gif extensions. You can override this by using Extensions property of FileExtensionsAttribute class.             Then just open(or create) HomeController.cs file and add the following code,   public class HomeController : Controller { public ActionResult Index() { return View(); } [HttpPost] public ActionResult Index(UserInformation u) { return View(); } }             Next just open(or create) Index view for Home controller and add the following code,  @model NewValidationAttributesinASPNETMVC3Future.Model.UserInformation @{ ViewBag.Title = "Index"; Layout = "~/Views/Shared/_Layout.cshtml"; } <h2>Index</h2> <script src="@Url.Content("~/Scripts/jquery.validate.min.js")" type="text/javascript"></script> <script src="@Url.Content("~/Scripts/jquery.validate.unobtrusive.min.js")" type="text/javascript"></script> @using (Html.BeginForm()) { @Html.ValidationSummary(true) <fieldset> <legend>UserInformation</legend> <div class="editor-label"> @Html.LabelFor(model => model.Name) </div> <div class="editor-field"> @Html.EditorFor(model => model.Name) @Html.ValidationMessageFor(model => model.Name) </div> <div class="editor-label"> @Html.LabelFor(model => model.Email) </div> <div class="editor-field"> @Html.EditorFor(model => model.Email) @Html.ValidationMessageFor(model => model.Email) </div> <div class="editor-label"> @Html.LabelFor(model => model.Website) </div> <div class="editor-field"> @Html.EditorFor(model => model.Website) @Html.ValidationMessageFor(model => model.Website) </div> <div class="editor-label"> @Html.LabelFor(model => model.CreditCard) </div> <div class="editor-field"> @Html.EditorFor(model => model.CreditCard) @Html.ValidationMessageFor(model => model.CreditCard) </div> <div class="editor-label"> @Html.LabelFor(model => model.Image) </div> <div class="editor-field"> @Html.EditorFor(model => model.Image) @Html.ValidationMessageFor(model => model.Image) </div> <p> <input type="submit" value="Save" /> </p> </fieldset> } <div> @Html.ActionLink("Back to List", "Index") </div>             Now just run your application. You will find that both client side and server side validation for the above validation attributes works smoothly.                      Summary:             Email, URL, Credit Card and File Extension input validations are very common. In this article, I showed you how you can validate these input validations into your application. I explained this with an example. I am also attaching a sample application which also includes Microsoft.Web.Mvc.dll. So you can add a reference of Microsoft.Web.Mvc assembly directly instead of doing any manual work. Hope you will enjoy this article too.   SyntaxHighlighter.all()

    Read the article

  • OS Analytics - Deep Dive Into Your OS

    - by Eran_Steiner
    Enterprise Manager Ops Center provides a feature called "OS Analytics". This feature allows you to get a better understanding of how the Operating System is being utilized. You can research the historical usage as well as real time data. This post will show how you can benefit from OS Analytics and how it works behind the scenes. We will have a call to discuss this blog - please join us!Date: Thursday, November 1, 2012Time: 11:00 am, Eastern Daylight Time (New York, GMT-04:00)1. Go to https://oracleconferencing.webex.com/oracleconferencing/j.php?ED=209833067&UID=1512092402&PW=NY2JhMmFjMmFh&RT=MiMxMQ%3D%3D2. If requested, enter your name and email address.3. If a password is required, enter the meeting password: oracle1234. Click "Join". To join the teleconference:Call-in toll-free number:       1-866-682-4770  (US/Canada)      Other countries:                https://oracle.intercallonline.com/portlets/scheduling/viewNumbers/viewNumber.do?ownerNumber=5931260&audioType=RP&viewGa=true&ga=ONConference Code:       7629343#Security code:            7777# Here is quick summary of what you can do with OS Analytics in Ops Center: View historical charts and real time value of CPU, memory, network and disk utilization Find the top CPU and Memory processes in real time or at a certain historical day Determine proper monitoring thresholds based on historical data View Solaris services status details Drill down into a process details View the busiest zones if applicable Where to start To start with OS Analytics, choose the OS asset in the tree and click the Analytics tab. You can see the CPU utilization, Memory utilization and Network utilization, along with the current real time top 5 processes in each category (click the image to see a larger version):  In the above screen, you can click each of the top 5 processes to see a more detailed view of that process. Here is an example of one of the processes: One of the cool things is that you can see the process tree for this process along with some port binding and open file descriptors. On Solaris machines with zones, you get an extra level of tabs, allowing you to get more information on the different zones: This is a good way to see the busiest zones. For example, one zone may not take a lot of CPU but it can consume a lot of memory, or perhaps network bandwidth. To see the detailed Analytics for each of the zones, simply click each of the zones in the tree and go to its Analytics tab. Next, click the "Processes" tab to see real time information of all the processes on the machine: An interesting column is the "Target" column. If you configured Ops Center to work with Enterprise Manager Cloud Control, then the two products will talk to each other and Ops Center will display the correlated target from Cloud Control in this table. If you are only using Ops Center - this column will remain empty. Next, if you view a Solaris machine, you will have a "Services" tab: By default, all services will be displayed, but you can choose to display only certain states, for example, those in maintenance or the degraded ones. You can highlight a service and choose to view the details, where you can see the Dependencies, Dependents and also the location of the service log file (not shown in the picture as you need to scroll down to see the log file). The "Threshold" tab is particularly helpful - you can view historical trends of different monitored values and based on the graph - determine what the monitoring values should be: You can ask Ops Center to suggest monitoring levels based on the historical values or you can set your own. The different colors in the graph represent the current set levels: Red for critical, Yellow for warning and Blue for Information, allowing you to quickly see how they're positioned against real data. It's important to note that when looking at longer periods, Ops Center smooths out the data and uses averages. So when looking at values such as CPU Usage, try shorter time frames which are more detailed, such as one hour or one day. Applying new monitoring values When first applying new values to monitored attributes - a popup will come up asking if it's OK to get you out of the current Monitoring Policy. This is OK if you want to either have custom monitoring for a specific machine, or if you want to use this current machine as a "Gold image" and extract a Monitoring Policy from it. You can later apply the new Monitoring Policy to other machines and also set it as a default Monitoring Profile. Once you're done with applying the different monitoring values, you can review and change them in the "Monitoring" tab. You can also click the "Extract a Monitoring Policy" in the actions pane on the right to save all the new values to a new Monitoring Policy, which can then be found under "Plan Management" -> "Monitoring Policies". Visiting the past Under the "History" tab you can "go back in time". This is very helpful when you know that a machine was busy a few hours ago (perhaps in the middle of the night?), but you were not around to take a look at it in real time. Here's a view into yesterday's data on one of the machines: You can see an interesting CPU spike happening at around 3:30 am along with some memory use. In the bottom table you can see the top 5 CPU and Memory consumers at the requested time. Very quickly you can see that this spike is related to the Solaris 11 IPS repository synchronization process using the "pkgrecv" command. The "time machine" doesn't stop here - you can also view historical data to determine which of the zones was the busiest at a given time: Under the hood The data collected is stored on each of the agents under /var/opt/sun/xvm/analytics/historical/ An "os.zip" file exists for the main OS. Inside you will find many small text files, named after the Epoch time stamp in which they were taken If you have any zones, there will be a file called "guests.zip" containing the same small files for all the zones, as well as a folder with the name of the zone along with "os.zip" in it If this is the Enterprise Controller or the Proxy Controller, you will have folders called "proxy" and "sat" in which you will find the "os.zip" for that controller The actual script collecting the data can be viewed for debugging purposes as well: On Linux, the location is: /opt/sun/xvmoc/private/os_analytics/collect On Solaris, the location is /opt/SUNWxvmoc/private/os_analytics/collect If you would like to redirect all the standard error into a file for debugging, touch the following file and the output will go into it: # touch /tmp/.collect.stderr   The temporary data is collected under /var/opt/sun/xvm/analytics/.collectdb until it is zipped. If you would like to review the properties for the Analytics, you can view those per each agent in /opt/sun/n1gc/lib/XVM.properties. Find the section "Analytics configurable properties for OS and VSC" to view the Analytics specific values. I hope you find this helpful! Please post questions in the comments below. Eran Steiner

    Read the article

  • Try a sample: Using the counter predicate for event sampling

    - by extended_events
    Extended Events offers a rich filtering mechanism, called predicates, that allows you to reduce the number of events you collect by specifying criteria that will be applied during event collection. (You can find more information about predicates in Using SQL Server 2008 Extended Events (by Jonathan Kehayias)) By evaluating predicates early in the event firing sequence we can reduce the performance impact of collecting events by stopping event collection when the criteria are not met. You can specify predicates on both event fields and on a special object called a predicate source. Predicate sources are similar to action in that they typically are related to some type of global information available from the server. You will find that many of the actions available in Extended Events have equivalent predicate sources, but actions and predicates sources are not the same thing. Applying predicates, whether on a field or predicate source, is very similar to what you are used to in T-SQL in terms of how they work; you pick some field/source and compare it to a value, for example, session_id = 52. There is one predicate source that merits special attention though, not just for its special use, but for how the order of predicate evaluation impacts the behavior you see. I’m referring to the counter predicate source. The counter predicate source gives you a way to sample a subset of events that otherwise meet the criteria of the predicate; for example you could collect every other event, or only every tenth event. Simple CountingThe counter predicate source works by creating an in memory counter that increments every time the predicate statement is evaluated. Here is a simple example with my favorite event, sql_statement_completed, that only collects the second statement that is run. (OK, that’s not much of a sample, but this is for demonstration purposes. Here is the session definition: CREATE EVENT SESSION counter_test ON SERVERADD EVENT sqlserver.sql_statement_completed    (ACTION (sqlserver.sql_text)    WHERE package0.counter = 2)ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS) You can find general information about the session DDL syntax in BOL and from Pedro’s post Introduction to Extended Events. The important part here is the WHERE statement that defines that I only what the event where package0.count = 2; in other words, only the second instance of the event. Notice that I need to provide the package name along with the predicate source. You don’t need to provide the package name if you’re using event fields, only for predicate sources. Let’s say I run the following test queries: -- Run three statements to test the sessionSELECT 'This is the first statement'GOSELECT 'This is the second statement'GOSELECT 'This is the third statement';GO Once you return the event data from the ring buffer and parse the XML (see my earlier post on reading event data) you should see something like this: event_name sql_text sql_statement_completed SELECT ‘This is the second statement’ You can see that only the second statement from the test was actually collected. (Feel free to try this yourself. Check out what happens if you remove the WHERE statement from your session. Go ahead, I’ll wait.) Percentage Sampling OK, so that wasn’t particularly interesting, but you can probably see that this could be interesting, for example, lets say I need a 25% sample of the statements executed on my server for some type of QA analysis, that might be more interesting than just the second statement. All comparisons of predicates are handled using an object called a predicate comparator; the simple comparisons such as equals, greater than, etc. are mapped to the common mathematical symbols you know and love (eg. = and >), but to do the less common comparisons you will need to use the predicate comparators directly. You would probably look to the MOD operation to do this type sampling; we would too, but we don’t call it MOD, we call it divides_by_uint64. This comparator evaluates whether one number is divisible by another with no remainder. The general syntax for using a predicate comparator is pred_comp(field, value), field is always first and value is always second. So lets take a look at how the session changes to answer our new question of 25% sampling: CREATE EVENT SESSION counter_test_25 ON SERVERADD EVENT sqlserver.sql_statement_completed    (ACTION (sqlserver.sql_text)    WHERE package0.divides_by_uint64(package0.counter,4))ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS)GO Here I’ve replaced the simple equivalency check with the divides_by_uint64 comparator to check if the counter is evenly divisible by 4, which gives us back every fourth record. I’ll leave it as an exercise for the reader to test this session. Why order matters I indicated at the start of this post that order matters when it comes to the counter predicate – it does. Like most other predicate systems, Extended Events evaluates the predicate statement from left to right; as soon as the predicate statement is proven false we abandon evaluation of the remainder of the statement. The counter predicate source is only incremented when it is evaluated so whether or not the counter is incremented will depend on where it is in the predicate statement and whether a previous criteria made the predicate false or not. Here is a generic example: Pred1: (WHERE statement_1 AND package0.counter = 2)Pred2: (WHERE package0.counter = 2 AND statement_1) Let’s say I cause a number of events as follows and examine what happens to the counter predicate source. Iteration Statement Pred1 Counter Pred2 Counter A Not statement_1 0 1 B statement_1 1 2 C Not statement_1 1 3 D statement_1 2 4 As you can see, in the case of Pred1, statement_1 is evaluated first, when it fails (A & C) predicate evaluation is stopped and the counter is not incremented. With Pred2 the counter is evaluated first, so it is incremented on every iteration of the event and the remaining parts of the predicate are then evaluated. In this example, Pred1 would return an event for D while Pred2 would return an event for B. But wait, there is an interesting side-effect here; consider Pred2 if I had run my statements in the following order: Not statement_1 Not statement_1 statement_1 statement_1 In this case I would never get an event back from the system because the point at which counter=2, the rest of the predicate evaluates as false so the event is not returned. If you’re using the counter target for sampling and you’re not getting the expected events, or any events, check the order of the predicate criteria. As a general rule I’d suggest that the counter criteria should be the last element of your predicate statement since that will assure that your sampling rate will apply to the set of event records defined by the rest of your predicate. Aside: I’m interested in hearing about uses for putting the counter predicate criteria earlier in the predicate statement. If you have one, post it in a comment to share with the class. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Elegance, thy Name is jQuery

    - by SGWellens
    So, I'm browsing though some questions over on the Stack Overflow website and I found a good jQuery question just a few minutes old. Here is a link to it. It was a tough question; I knew that by answering it, I could learn new stuff and reinforce what I already knew: Reading is good, doing is better. Maybe I could help someone in the process too. I cut and pasted the HTML from the question into my Visual Studio IDE and went back to Stack Overflow to reread the question. Dang, someone had already answered it! And it was a great answer. I never even had a chance to start analyzing the issue. Now I know what a one-legged man feels like in an ass-kicking contest. Nevertheless, since the question and answer were so interesting, I decided to dissect them and learn as much as possible. The HTML consisted of some divs separated by h3 headings.  Note the elements are laid out sequentially with no programmatic grouping: <h3 class="heading">Heading 1</h3> <div>Content</div> <div>More content</div> <div>Even more content</div><h3 class="heading">Heading 2</h3> <div>some content</div> <div>some more content</div><h3 class="heading">Heading 3</h3> <div>other content</div></form></body>  The requirement was to wrap a div around each h3 heading and the subsequent divs grouping them into sections. Why? I don't know, I suppose if you screen-scrapped some HTML from another site, you might want to reformat it before displaying it on your own. Anyways… Here is the marvelously, succinct posted answer: $('.heading').each(function(){ $(this).nextUntil('.heading').andSelf().wrapAll('<div class="section">');}); I was familiar with all the parts except for nextUntil and andSelf. But, I'll analyze the whole answer for completeness. I'll do this by rewriting the posted answer in a different style and adding a boat-load of comments: function Test(){ // $Sections is a jQuery object and it will contain three elements var $Sections = $('.heading'); // use each to iterate over each of the three elements $Sections.each(function () { // $this is a jquery object containing the current element // being iterated var $this = $(this); // nextUntil gets the following sibling elements until it reaches // an element with the CSS class 'heading' // andSelf adds in the source element (this) to the collection $this = $this.nextUntil('.heading').andSelf(); // wrap the elements with a div $this.wrapAll('<div class="section" >'); });}  The code here doesn't look nearly as concise and elegant as the original answer. However, unless you and your staff are jQuery masters, during development it really helps to work through algorithms step by step. You can step through this code in the debugger and examine the jQuery objects to make sure one step is working before proceeding on to the next. It's much easier to debug and troubleshoot when each logical coding step is a separate line. Note: You may think the original code runs much faster than this version. However, the time difference is trivial: Not enough to worry about: Less than 1 millisecond (tested in IE and FF). Note: You may want to jam everything into one line because it results in less traffic being sent to the client. That is true. However, most Internet servers now compress HTML and JavaScript by stripping out comments and white space (go to Bing or Google and view the source). This feature should be enabled on your server: Let the server compress your code, you don't need to do it. Free Career Advice: Creating maintainable code is Job One—Maximum Priority—The Prime Directive. If you find yourself suddenly transferred to customer support, it may be that the code you are writing is not as readable as it could be and not as readable as it should be. Moving on… I created a CSS class to see the results: .section{ background-color: yellow; border: 2px solid black; margin: 5px;} Here is the rendered output before:   …and after the jQuery code runs.   Pretty Cool! But, while playing with this code, the logic of nextUntil began to bother me: What happens in the last section? What stops elements from being collected since there are no more elements with the .heading class? The answer is nothing.  In this case it stopped because it was at the end of the page.  But what if there were additional HTML elements? I added an anchor tag and another div to the HTML: <h3 class="heading">Heading 1</h3> <div>Content</div> <div>More content</div> <div>Even more content</div><h3 class="heading">Heading 2</h3> <div>some content</div> <div>some more content</div><h3 class="heading">Heading 3</h3> <div>other content</div><a>this is a link</a><div>unrelated div</div> </form></body> The code as-is will include both the anchor and the unrelated div. This isn't what we want.   My first attempt to correct this used the filter parameter of the nextUntil function: nextUntil('.heading', 'div')  This will only collect div elements. But it merely skipped the anchor tag and it still collected the unrelated div:   The problem is we need a way to tell the nextUntil function when to stop. CSS selectors to the rescue: nextUntil('.heading, a')  This tells nextUntil to stop collecting sibling elements when it gets to an element with a .heading class OR when it gets to an anchor tag. In this case it solved the problem. FYI: The comma operator in a CSS selector allows multiple criteria.   Bingo! One final note, we could have broken the code down even more: We could have replaced the andSelf function here: $this = $this.nextUntil('.heading, a').andSelf(); With this: // get all the following siblings and then add the current item$this = $this.nextUntil('.heading, a');$this.add(this);  But in this case, the andSelf function reads real nice. In my opinion. Here's a link to a jsFiddle if you want to play with it. I hope someone finds this useful Steve Wellens CodeProject

    Read the article

  • NUMA-aware placement of communication variables

    - by Dave
    For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

    Read the article

  • TGIF: Engagement Wrap-up

    - by Michael Snow
    We've had a very busy week here at Oracle and as we build up to Oracle OpenWorld starting in less than 10 days - it doesn't look like things will be slowing down. Engagement is definitely in the air this week. Our friend, John Mancini published a great article entitled: "The World of Engagement" on his Digital Landfill blog yesterday and we hosted a great webcast with R "Ray" Wang from Constellation Research yesterday on the "9 C's of Engagement". 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} I wanted to wrap-up the week with some key takeaways from our webcast yesterday with Ray Wang. If you missed the webcast yesterday, fear not - it is now available  On-Demand. We'll leave you this week with lots of questions about how to navigate these churning waters of engagement. Stay tuned to the Oracle WebCenter Social Business Thought Leaders Webcast Series as we fuel this dialogue. 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Company Culture Does company support a culture of putting customer satisfaction ahead of profits? Does culture promote creativity and cross functional employee collaboration? Does culture accept different views of multi-generational workforce? Does culture promote employee training and skills development Does culture support upward mobility and long term retention? Does culture support work-life balance? Does the culture provide rewards for employee for outstanding customer support? Channels What are the current primary channels for customer communications? What do you think will be the primary channels in two years? Is company developing support model for emerging channels? Do all channels consistently deliver the same level of customer support? Do you know the cost per transaction across all channels? Do you engage customers proactively across multiple channels? Do all channels have access to the same customer information? Community Does company extend customer support into virtual communities of interest? Does company facilitate educating users through its virtual communities? Does company mine its customer’s experience into useful data? Does company increase the value for customers through using data to deliver new products and services? Does company support two way interactions with its customers through communities of interest? Does company actively support social CRM, online communities and social media markets? Credibility Does company market its trustworthiness through external certificates such as business licenses, BBB certificates or other validations? Does company promote trust through customer testimonials and case studies on ethical business practices? Does company promote truthful market campaigns Does company make it easy for customers to complain? Does company build its reputation for standing behind its products with guarantees for satisfaction? Does company protect its customer data with high security measures> Content What sources do you use to create customer content? Does company mine social media and blogs for customer content? How does your company sort, store and retain its customer content? How frequently does content get updated? What external sources do you use for customer content? How many responses are typically received from a knowledge management system inquiry? Does your company use customer content to design and develop new product and services? Context Does your company market to customers in clusters or individually? Does your company customize its messages and personalize them to specific needs of each individual customer? Does your company store customer data based on their past behaviors, purchases, sentiment analysis and current activities? Does your company manage customer context according to channels used? For example identify personal use channels versus business channels? What is your frequency of collecting customer activities across various touch points? How is your customer data stored and analyzed? Is contextual data used for future customer outreach? Cadence Which channels does your company measure-web site visits, phone calls, IVR, store visits, face to face, social media? Does company make effective use of cross channel marketing to promote more frequent customer engagement? Does your company rate the patterns relevant for your product or service and monitor usage against this pattern? Does your company measure the frequency of both online and offline channels? Does your company apply metrics to the frequency of customer engagements with product or services revenues? Does your company consolidate data for customer engagement across various channels for a complete view of its customer? Catalyst Does company offer coupon discounts? Does company have a customer loyalty program or a VIP membership program? Does company mine customer data to target specific groups of buyers? Do internal employees serve as ambassadors for customer programs? Does company drive loyalty through social media loyalty programs? Does company build rewards based on using loyalty data? Does company offer an employee incentive program to drive customer loyalty?

    Read the article

  • Effectiveness and Efficiency

    - by Daniel Moth
    In the professional environment, i.e. at work, I am always seeking personal growth and to be challenged. The result is that my assignments, my work list, my tasks, my goals, my commitments, my [insert whatever word resonates with you] keep growing (in scope and desired impact). Which in turn means I have to keep finding new ways to deliver more value, while not falling into the trap of working more hours. To do that I continuously evaluate both my effectiveness and my efficiency. EFFECTIVENESS The first thing I check is my effectiveness: Am I doing the right things? Am I focusing too much on unimportant things? Am I spending more time doing stuff that is important to my team/org/division/business/company, or am I spending it on stuff that is important to me and that I enjoy doing? Am I valuing activities that maybe I have outgrown and should be delegated to others who are at a stage I have surpassed (in Microsoft speak: is the work I am doing level appropriate or am I still operating at the previous level)? Notice how the answers to those questions change over time and due to certain events, so I have to remind myself to revisit them frequently. Events that force me to re-examine them are: change of role, change of team/org/etc, change of direction of team/org/etc, re-org, new hires on the team that take on some of the work I did, personal promotion, change of manager... and if none of those events has occurred since the last annual review, I ask myself those at each annual review anyway. If you think you are not being effective at work, make a list of the stuff that you do and start tracking where your time goes. In parallel, have a discussion with your manager about where they think your time should go. Ultimately your time is finite and hence it is your most precious investment, don't waste it. If your management doesn't value as highly what you spend your time on, then either convince your management, or stop spending your time on it, or find different management: Lead, Follow, or get out of the way! That's my view on effectiveness. You have to fix that before moving to being efficient, or you may end up being very efficient at stuff that nobody wants you to be doing in the first place. For example, you may be spending your time writing blog posts and becoming better and faster at it all the time. If your manager thinks that is not even part of your job description, you are wasting your time to satisfy your inner desires. Nobody can help you with your effectiveness other than your management chain and your management peers - they are the judges of it. EFFICIENCY The second thing I check is my efficiency: Am I doing things right? For me, doing things right means that I deliver the same quality of work faster [than what I used to, and than my peers, and than expected of me]. The result is that I can achieve more [than what I used to, and than my peers, and than expected of me]. Notice how the efficiency goal is a more portable one. If, by whatever criteria, you think you are the best at [insert your own skill here], this can change at two events: because you have new colleagues (who are potentially better than your older ones), and it can change with a change of manager (who has potentially higher expectations). That's about it. Once you are efficient at something, you carry that with you... All you need to really be doing here is, when taking on new kinds of work that you haven't done before, try a few approaches and devise a system so that you can become efficient at this new activity too... Just keep "collecting" stuff that you are efficient at. If you think you are not being efficient at something, break it down: What are the steps you take to complete that task? How long do you spend on each step? Talk to others about what steps they take, to see if you can optimize some steps away or trade them for better steps, or just learn how to complete a step faster. Have a system for every task you take so that you can have repeatable success. That's my view on efficiency. You have to fix it so that you can free up time to do more. When you plan a route from A to B - all else being equal - you try to get there as fast as possible so why would you not want to do that with your everyday work? For example, imagine you are inefficient at processing email: You spend more time than necessary dealing with email, and you still end up with dropped email threads and with slower response times than others. How can you improve? Talk to someone that you think is good at this, understand their system (e.g. here is my email processing system) and come up with one that works for you. Parting Thoughts Are you considered, by your colleagues and manager, an effective and efficient person at your workplace? If you are, what would you change if you were asked by your management to do the job of two people? Seriously, think about that! Your immediate reaction may be "that is not possible", but it actually is. You just have to re-assess what things that were previously important will now stop being important, by discussing them with your management and reaching agreement on relative priorities. For example, stuff that was previously on your plate may now have to be delegated or dropped. Where you thought you were efficient, maybe now you have to find an even faster path to completion, perhaps keeping in mind that Perfect is the Enemy of “Good Enough”. My personal experience (from both observing others and from my own reflection) is that when folks are struggling to keep up at work it is because of two reasons: They are investing energy in stuff that they enjoy doing which the business regards as having a lower priority than a lot of other things on their plate. They are completing tasks to a level of higher quality than what is required (due to personal pride) missing the big picture which almost always mandates completing three tasks at good enough quality than knocking only one of them out of the park while the other two come in late or not at all. There is a lot of content on the web, so I strongly encourage you to use your favorite search engine to read other views on effectiveness and efficiency (Bing, Google). Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

    - by Jason Crease
    At RedGate Software, I work on a .NET obfuscator  called SmartAssembly.  Various features of it use a database to store various things (exception reports, name-mappings, etc.) The user is given the option of using either a SQL-Server database (which requires them to have Microsoft SQL Server), or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to using a SQL Server database because it offers better performance and data-sharing. In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is the most popular? SQL Server or MDB?' We've collected data about this fact, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and more recently our 'Application Metrics' technology: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 28 19.0 8115 8115 MDB 114 77.6 1449 1449 (As a disclaimer, please note than SmartAssembly has far more than 132 users . This data is just a selection of one build) So, it would appear that SQL-Server is used by fewer users, but more often. Great. But here's why these numbers are useless to me: Only the original developers understand the data What does a single 'usage' of 'MDB' mean? Does this happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version or just from the UI version? Each question could skew the data 10-fold either way, and the answers only known by the developer that instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret the data unaided. Most of the data is from uninterested users About half of people who download and run a free-trial from the internet quit it almost immediately. Only a small fraction use it sufficiently to make informed choices. Since the MDB option is the default one, we don't know how many of those 114 were people CHOOSING to use the MDB, or how many were JUST HAPPENING to use this MDB default for their 20-second trial. This is a problem we see across all our metrics: Are people are using X because it's the default or are they using X because they want to use X? We need to segment the data further - asking what percentage of each percentage meet our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun. You can't find out why they used this feature Metrics can answer the when and what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most-important and more-complex features which people actually buy the software for, leaving just big buttons on the main page and the About-Box. "Ah, that's interesting!" rather than "Ah, that's actionable!" People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors' . But unless the statistic is both SUPRISING and ACTIONABLE, it's useless. Most metrics are collected, reviewed with lots of cooing. and then forgotten. Unless a piece-of-data could change things, it's useless collecting it. People get obsessed with significance levels The first things that lots of people do with this data is do a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!") Believe me: other causes of error/misinterpretation in your data are FAR more significant than your t-test could ever comprehend. Confirmation bias prevents objectivity If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit-hole of segmentation and filtering until we give-up and move-on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence?  I don't. There's always multiple plausible ways to interpret/action any data Let's say we segment the above data, and get this data: Post-trial users (i.e. those using a paid version after the 14-day free-trial is over): Parameter Number of users % of total users Number of sessions Number of usages SQL Server 13 9.0 1115 1115 MDB 5 4.2 449 449 Trial users: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 15 10.0 7000 7000 MDB 114 77.6 1000 1000 How do you interpret this data? It's one of: Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB-support. Our MDB support is so poor and buggy that our massive MDB user-base doesn't buy it.  Therefore, spend loads of money improving it, and think about ditching SQL-Server support. People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are. We're marketing the tool wrong. The large number of MDB users represent uninformed downloaders. Tell marketing to aggressively target SQL Server users. To choose an interpretation you need to segment again. And again. And again, and again. Opting-out is correlated with feature-usage Metrics tends to be opt-in. This skews the data even further. Between 5% and 30% of people choose to opt-in to metrics (often called 'customer improvement program' or something like that). Casual trial-users who are uninterested in your product or company are less likely to opt-in. This group is probably also likely to be MDB users. How much does this skew your data by? Who knows? It's not all doom and gloom. There are some things metrics can answer well. Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows? Minor optimizations.  Is the text-box big enough for average user-input? Performance data. How long does our app take to start? How many databases does the average user have on their server? As you can see, questions about who-the-user-is rather than what-the-user-does are easier to answer and action. Conclusion Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting. Data raises more questions than it answers. Questions about environment are the easiest to answer.

    Read the article

< Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >