management data warehouse - Page 135

Memory management with Objective-C Distributed Objects: my temporary instances live forever!

- by jkp

I'm playing with Objective-C Distributed Objects and I'm having some problems understanding how memory management works under the system. The example given below illustrates my problem:

Protocol.h

    #import <Foundation/Foundation.h>

    @protocol DOServer
    - (byref id)createTarget;
    @end

Server.m

    #import <Foundation/Foundation.h>
    #import "Protocol.h"

    @interface DOTarget : NSObject
    @end

    @interface DOServer : NSObject < DOServer >
    @end

    @implementation DOTarget

    - (id)init
    {
        if ((self = [super init]))
        {
            NSLog(@"Target created");
        }
        return self;
    }

    - (void)dealloc
    {
        NSLog(@"Target destroyed");
        [super dealloc];
    }

    @end

    @implementation DOServer

    - (byref id)createTarget
    {
        return [[[DOTarget alloc] init] autorelease];
    }

    @end

    int main()
    {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

        DOServer *server = [[DOServer alloc] init];
        NSConnection *connection = [[NSConnection new] autorelease];
        [connection setRootObject:server];
        if ([connection registerName:@"test-server"] == NO)
        {
            NSLog(@"Failed to vend server object");
        }
        else
            [[NSRunLoop currentRunLoop] run];

        [pool drain];
        return 0;
    }

Client.m

    #import <Foundation/Foundation.h>
    #import "Protocol.h"

    int main()
    {
        unsigned i = 0;
        for (; i < 3; i ++)
        {
            NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
            id server = [NSConnection rootProxyForConnectionWithRegisteredName:@"test-server" host:nil];
            [server setProtocolForProxy:@protocol(DOServer)];
            NSLog(@"Created target: %@", [server createTarget]);
            [[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:1.0]];
            [pool drain];
        }
        return 0;
    }

The issue is that any remote objects created by the root proxy are not released when their proxy counterparts in the client go out of scope. According to the documentation:

    When an object’s remote proxy is deallocated, a message is sent back to the receiver to notify it that the local object is no longer shared over the connection.

I would therefore expect that as each DOTarget goes out of scope (each time around the loop) its remote counterpart would be deallocated, since there is no other reference to it being held on the remote side of the connection. In reality this does not happen: the temporary objects are only deallocated when the client application quits or, more accurately, when the connection is invalidated.

I can force the temporary objects on the remote side to be deallocated by explicitly invalidating the NSConnection object I'm using each time around the loop and creating a new one, but somehow this just feels wrong.

Is this the correct behaviour from DO? Should all temporary objects live as long as the connection that created them? Are connections therefore to be treated as temporary objects which should be opened and closed with each series of requests against the server? Any insights would be appreciated.

Generate and merge data with python multiprocessing

- by Bobby

I have a list of starting data. I want to apply a function to the starting data that creates a few pieces of new data for each element in the starting data. Some pieces of the new data are the same and I want to remove them. The sequential version is essentially:

    def create_new_data_for(datum):
        """make a list of new data from some old datum"""
        return [datum.modified_copy(k) for k in datum.k_list]

    data = [...]  # some data to start with

    # generate a list of new data from the old data, we'll reduce it next
    newdata = []
    for d in data:
        newdata.extend(create_new_data_for(d))

    # now reduce the data under ".matches(other)"
    reduced = []
    for d in newdata:
        for seen in reduced:
            if d.matches(seen):
                break
        else:
            # no break, so we haven't seen anything like d yet
            reduced.append(d)

    # now reduced is finished and is what we want!

I want to speed this up with multiprocessing. I was thinking that I could use a multiprocessing.Queue for the generation. Each process would just put the stuff it creates on the queue, and when the processes are reducing the data, they can just get the data from the Queue. But I'm not sure how to have the different processes loop over reduced and modify it without any race conditions or other issues. What is the best way to do this safely? Or is there a different, better way to accomplish this goal?
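One possible shape for this, offered as a sketch rather than a definitive answer: parallelise only the generation step with a process pool and keep the reduction in a single process, so that nothing shared is ever mutated concurrently. The `Datum` class below is a hypothetical stand-in for the real data type, and `reduce_matches`, `generate_and_reduce` and `run_parallel` are names invented for the illustration; only `create_new_data_for`, `modified_copy`, `k_list` and `matches` come from the question.

```python
from itertools import chain
from multiprocessing import Pool


class Datum:
    """Hypothetical stand-in for the asker's data type."""

    def __init__(self, value, k_list=(1, 2)):
        self.value = value
        self.k_list = k_list

    def modified_copy(self, k):
        return Datum(self.value * k, self.k_list)

    def matches(self, other):
        return self.value == other.value


def create_new_data_for(datum):
    """Make a list of new data from one old datum (safe to run in a worker)."""
    return [datum.modified_copy(k) for k in datum.k_list]


def reduce_matches(items):
    """Keep the first item of every group of matching items."""
    reduced = []
    for d in items:
        if not any(d.matches(seen) for seen in reduced):
            reduced.append(d)
    return reduced


def generate_and_reduce(data, map_fn=map):
    """Generate with map_fn (serial by default), then reduce in one process."""
    newdata = chain.from_iterable(map_fn(create_new_data_for, data))
    return reduce_matches(newdata)


def run_parallel(data, processes=None):
    """Same pipeline, with the generation step spread over a process pool."""
    with Pool(processes) as pool:
        return generate_and_reduce(data, pool.map)
```

Calling `run_parallel(data)` from under an `if __name__ == "__main__":` guard runs the generation in worker processes. Whether this helps depends on how expensive `modified_copy` is compared with pickling the data between processes, and the pairwise reduction stays quadratic either way.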

How to read a file with variable multi-row data in Python

- by dr.bunsen

I have a file that is about 100 MB that looks like this:

    #meta data 1
    skadjflaskdjfasljdfalskdjfl
    sdkfjhasdlkgjhsdlkjghlaskdj
    asdhfk
    #meta data 2
    jflaksdjflaksjdflkjasdlfjas
    ldaksjflkdsajlkdfj
    #meta data 3
    alsdkjflasdjkfglalaskdjf

Each row of metadata is followed by several data rows of variable length, containing only alphanumeric characters. What is the best way to read this data into a simple list like this:

    data = [[#meta data 1, skadjflaskdjfasljdfalskdjflsdkfjhasdlkgjhsdlkjghlaskdjasdhfk],
            [#meta data 2, jflaksdjflaksjdflkjasdlfjasldaksjflkdsajlkdfj],
            [#meta data 3, alsdkjflasdjkfglalaskdjf]]

My initial idea was to use the read() method to read the whole file into memory and then use regular expressions to parse the data into the desired format. Is there a better, more Pythonic way? All metadata lines start with an octothorpe and all data lines are entirely alphanumeric. Thanks!
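A minimal sketch of a line-by-line alternative to read() plus regular expressions, assuming the layout described above (every record starts with a line beginning with `#`, followed by its data lines). The function name `read_records` is made up for the example.

```python
def read_records(lines):
    """Group data lines under the '#' metadata line that precedes them."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            # A new metadata line opens a new record.
            records.append([line, []])
        elif records:
            # A data line belongs to the most recent record.
            records[-1][1].append(line)
    # Join each record's data chunks into one string.
    return [[meta, "".join(chunks)] for meta, chunks in records]
```

Because a file object yields one line at a time, `with open(path) as f: data = read_records(f)` never needs the raw file and a second copy of it in memory at once, and no regular expression is involved.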

How to process large block data visualization with Flex?

- by hydra1983

I know that's a big topic. However, it's better to know some general ideas to handle such problems. I have an application which requires Flex to render statistics data calculated instantly on the client side from a downloaded data set. The problems are:

- The data set is large and needs more than 10 seconds to be downloaded.
- There are some filters to control the statistics calculation algorithms. If the user changes the filters, it would take a long time to recalculate the result and freeze the UI.

Looking for issue tracker software for residential property management

- by Rob

This question is about a computer software application (as per SU guidelines) for centrally tracking issues concerning the management of a residential block of flats (apartments, as they say in the US and France). Issues are incidents and the resulting unplanned maintenance to address them, as well as planned one-off maintenance and regular planned routine maintenance.

I live in a block of flats (apartments) and, along with other residents, am looking to watch more closely over issues with the communal, shared areas of the premises (corridors, courtyards, stairs, lifts, lights, trash/bin shed, bike stands, parking areas etc.) and their maintenance, currently done by a property management company. Our own homes are our own affair internally; it's the outside communal areas that I'm interested in.

The aim is to control costs and possibly reduce them by proactively managing the property, using historical data to predict issues and to scrutinise maintenance charges against such data to ensure that the costs are as expected. Trending could also be established, whereby recurrences of things can be detected and pre-empted to reduce costs.

As a software professional, I'm aware of Bugzilla and eventum, which are free tools for software development and could be customised to fit this application, but I wondered if there was something more appropriate. It might be useful for such software to be on a web server, with secure access, so that residents can log in and view the issues.

html5 uploader + jquery drag & drop: how to store file data with FormData?

- by lauthiamkok

I am making an HTML5 drag and drop uploader with jQuery. Below is my code so far. The problem is that I get an empty array without any data. Is this line incorrect to store the file data - fd.append('file', $thisfile);?

    $('#div').on('dragover', function(e) {
        e.preventDefault();
        e.stopPropagation();
    });

    $('#div').on('dragenter', function(e) {
        e.preventDefault();
        e.stopPropagation();
    });

    $('#div').on('drop', function(e){
        if(e.originalEvent.dataTransfer){
            if(e.originalEvent.dataTransfer.files.length) {
                e.preventDefault();
                e.stopPropagation();

                // The file list.
                var fileList = e.originalEvent.dataTransfer.files;
                //console.log(fileList);

                // Loop the ajax post.
                for (var i = 0; i < fileList.length; i++) {
                    var $thisfile = fileList[i];
                    console.log($thisfile);

                    // HTML5 form data object.
                    var fd = new FormData();
                    //console.log(fd);
                    fd.append('file', $thisfile);

                    /*
                    var file = {name: fileList[i].name, type: fileList[i].type, size:fileList[i].size};
                    $.each(file, function(key, value) {
                        fd.append('file['+key+']', value);
                    })
                    */

                    $.ajax({
                        url: "upload.php",
                        type: "POST",
                        data: fd,
                        processData: false,
                        contentType: false,
                        success: function(response) {
                            // .. do something
                        },
                        error: function(jqXHR, textStatus, errorMessage) {
                            console.log(errorMessage); // Optional
                        }
                    });
                }

                /*UPLOAD FILES HERE*/
                upload(e.originalEvent.dataTransfer.files);
            }
        }
    });

    function upload(files){
        console.log('Upload '+files.length+' File(s).');
    };

Another method I tried is to turn the file data into an array inside the jQuery code:

    var file = {name: fileList[i].name, type: fileList[i].type, size:fileList[i].size};
    $.each(file, function(key, value) {
        fd.append('file['+key+']', value);
    });

But where is the tmp_name data inside e.originalEvent.dataTransfer.files[i]?

PHP:

    print_r($_POST);
    $uploaddir = './uploads/';
    $file = $uploaddir . basename($_POST['file']['name']);
    if (move_uploaded_file($_POST['file']['tmp_name'], $file)) {
        echo "success";
    } else {
        echo "error";
    }

As you can see, tmp_name is needed to upload the file via PHP.

HTML:

    <div id="div">Drop here</div>

MySQL Cluster data nodes - slow SELECTs

- by Boyan Georgiev

Hi to all. First off, I'm new to MySQL Cluster. This is my pain: I've managed to set up a MySQL Cluster with two data nodes, two SQL nodes and one management server. Everything works pretty well, except the following: my data nodes are spread across an intranet link which adds latency to communications between the data nodes. Apparently, due to MySQL Cluster's internal partitioning schemes, when my PHP application pulls data from the cluster via SELECT queries, parts of the data are pulled from both data nodes. This makes the page appear onscreen REALLY slowly. If I bring one data node offline, the data can only be pulled from the single remaining data node, and thus the final result (HTML output) appears on the screen in a very timely fashion. So, my question is this: can the data nodes/cluster be told to pull data from partitions stored only on a particular data node?

Is there any reason why someone would want to create a Core Data model programmatically?

- by mystify

I wonder in which cases it would be good to make an NSManagedObjectModel completely programmatically, with NSEntityDescription instances and all this stuff. I'm the kind of person who prefers to code programmatically, rejecting Interface Builder. But when it comes to Core Data, I have a hard time figuring out why I should kill my time NOT using the nice Xcode Data Modeler tool. And since data models are stuck to a given state (except when you want to do some ugly migration operations where things probably go wrong and users get mad, really mad), I see no big sense in a data model that's made programmatically for the purpose of changing it all the time. Did I miss something?

How to structure a Visual Studio project for the data access layer

- by Akk

I currently have a project that uses various DB access technologies, mainly for showcasing or for demos. Currently we have:

    Namespace App.Data (App.Data.dll)
        Folder NHibernate
        Folder EntityFramework
        Folder LinqToSql

The above structure is OK as we only use SQL Server as the DB. But going forward we will be including Oracle, MySQL etc. So what would be a better structure with this in mind? I thought about:

    Namespace App.Data.SqlServer (App.Data.SqlServer.dll)
        Folder NHibernate
        Folder EntityFramework
        Folder LinqToSql

Or would it just be better to have separate assemblies for each database and access technology?

    Namespace App.Data.SqlServer.NHibernate (App.Data.SqlServer.NHibernate.dll)
    Namespace App.Data.SqlServer.EntityFramework (App.Data.SqlServer.EntityFramework.dll)
    Namespace App.Data.Oracle.NHibernate (App.Data.Oracle.NHibernate.dll)
    Namespace App.Data.MySql.NHibernate (App.Data.MySql.NHibernate.dll)

RabbitMQ Management console not working

- by rrejc

I have started with RabbitMQ. I have a (Windows) machine on which I installed two RabbitMQ nodes as services - I have chosen the node name, port and service name for each of them. The services are running normally (I see that they are listening in netstat -a). I have also installed the management plugin with "rabbitmq-plugins enable rabbitmq_management" and restarted both services. But the plugin isn't running - I don't see it listening in netstat and I can't connect to the management console via the browser. Any idea what could be wrong? Is there any log to see what is going on?

Updated: when I do rabbitmq-plugins list I get:

    c:\RabbitMq\sbin>rabbitmq-plugins list
    [e] amqp_client                       3.0.1
    [ ] cowboy                            0.5.0-rmq3.0.1-git4b93c2d
    [ ] eldap                             3.0.1-gite309de4
    [e] mochiweb                          2.3.1-rmq3.0.1-gitd541e9a
    [ ] rabbitmq_auth_backend_ldap        3.0.1
    [ ] rabbitmq_auth_mechanism_ssl       3.0.1
    [ ] rabbitmq_consistent_hash_exchange 3.0.1
    [ ] rabbitmq_federation               3.0.1
    [ ] rabbitmq_federation_management    3.0.1
    [ ] rabbitmq_jsonrpc                  3.0.1
    [ ] rabbitmq_jsonrpc_channel          3.0.1
    [ ] rabbitmq_jsonrpc_channel_examples 3.0.1
    [E] rabbitmq_management               3.0.1
    [e] rabbitmq_management_agent         3.0.1
    [ ] rabbitmq_management_visualiser    3.0.1
    [e] rabbitmq_mochiweb                 3.0.1
    [ ] rabbitmq_mqtt                     3.0.1
    [ ] rabbitmq_old_federation           3.0.1
    [ ] rabbitmq_shovel                   3.0.1
    [ ] rabbitmq_shovel_management        3.0.1
    [ ] rabbitmq_stomp                    3.0.1
    [ ] rabbitmq_tracing                  3.0.1
    [ ] rabbitmq_web_stomp                3.0.1
    [ ] rabbitmq_web_stomp_examples       3.0.1
    [ ] rfc4627_jsonrpc                   3.0.1-git7ab174b
    [ ] sockjs                            0.3.3-rmq3.0.1-git92d4ba4
    [e] webmachine                        1.9.1-rmq3.0.1-git52e62bc

Combining FileStream and MemoryStream to avoid disk accesses/paging while receiving gigabytes of data?

- by w128

I'm receiving a file as a stream of byte[] data packets (total size isn't known in advance) that I need to store somewhere before processing it immediately after it's been received (I can't do the processing on the fly). Total received file size can vary from as small as 10 KB to over 4 GB.

One option for storing the received data is to use a MemoryStream, i.e. a sequence of MemoryStream.Write(bufferReceived, 0, count) calls to store the received packets. This is very simple, but obviously will result in an out-of-memory exception for large files.

An alternative option is to use a FileStream, i.e. FileStream.Write(bufferReceived, 0, count). This way, no out-of-memory exceptions will occur, but what I'm unsure about is bad performance due to disk writes (which I don't want to occur as long as plenty of memory is still available) - I'd like to avoid disk access as much as possible, but I don't know of a way to control this.

I did some testing and most of the time there seems to be little performance difference between, say, 10 000 consecutive calls of MemoryStream.Write() vs FileStream.Write(), but a lot seems to depend on buffer size and the total amount of data in question (i.e. the number of writes). Obviously, MemoryStream size reallocation is also a factor.

Does it make sense to use a combination of MemoryStream and FileStream, i.e. write to the memory stream by default, but once the total amount of data received is over e.g. 500 MB, write it to a FileStream; then read in chunks from both streams for processing the received data (first process 500 MB from the MemoryStream, dispose it, then read from the FileStream)?

Another solution is to use a custom memory stream implementation that doesn't require contiguous address space for internal array allocation (i.e. a linked list of memory streams); this way, at least on 64-bit environments, out-of-memory exceptions should no longer be an issue. Con: extra work, more room for mistakes.

So how do FileStream and MemoryStream reads/writes behave in terms of disk access and memory caching, i.e. the data size/performance balance? I would expect that as long as enough RAM is available, FileStream would internally read/write from memory (cache) anyway, and virtual memory would take care of the rest. But I don't know how often FileStream will explicitly access a disk when being written to. Any help would be appreciated.
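The "memory first, spill to disk past a threshold" idea can be illustrated outside .NET. The sketch below is in Python rather than C#, purely to show the pattern: the standard library's `tempfile.SpooledTemporaryFile` keeps data in memory until `max_size` is exceeded and then rolls over to a real temporary file. The helper name `receive_to_buffer` and the 500 MB default are assumptions made for the example, and this is not a statement about how FileStream behaves internally.

```python
import tempfile


def receive_to_buffer(packets, memory_limit=500 * 1024 * 1024):
    """Store packets in memory, spilling to a temp file past memory_limit."""
    buf = tempfile.SpooledTemporaryFile(max_size=memory_limit)
    for packet in packets:
        buf.write(packet)  # rolls over to disk once the limit is exceeded
    buf.seek(0)  # rewind so the caller can read from the start
    return buf
```

The caller reads the result the same way whether or not a rollover happened, which is the main appeal of hiding the two stores behind one stream-like object instead of processing a MemoryStream and a FileStream in sequence.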

Where is the data stored when I'm using google-app-engine on localhost?

- by zjm1126

I want to find the data on my disk. Thanks.

Are there any frameworks for data subscription and update?

- by Timothy Pratley

There is one server with multiple clients. The clients are viewing subsets of the server's entire data. If the data that a client is viewing changes, the client should be informed of the changes so that it displays the current data. Example: Two clients are viewing a list of users in an administration screen. One client adds a new user to the list and modifies the permissions of another user. The other client sees the changes propagated to its view.
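The behaviour described is essentially publish/subscribe keyed by the subset of data a client is viewing. As a rough in-process sketch only (no networking, and not any particular framework's API), it might look like this; `DataServer`, `subscribe` and `update` are hypothetical names chosen for the illustration.

```python
from collections import defaultdict


class DataServer:
    """Hypothetical in-process stand-in for the server side."""

    def __init__(self):
        self._data = {}
        self._subscribers = defaultdict(list)  # key -> list of callbacks

    def subscribe(self, key, callback):
        """Register callback(key, value) and send the current value, if any."""
        self._subscribers[key].append(callback)
        if key in self._data:
            callback(key, self._data[key])

    def update(self, key, value):
        """Store the new value and notify every subscriber of that key."""
        self._data[key] = value
        for callback in self._subscribers[key]:
            callback(key, value)
```

In the administration-screen example, both clients would subscribe to a key such as "users"; when one of them calls `update`, the other's callback fires with the new list. A real system would replace the direct callback with a message sent over the client's connection.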

Best way to migrate servers without losing any data and with no downtime(?)

- by ina

This is a methodology question from a freelancer, with a corollary on MySQL. Is there a way to migrate from an old dedicated server to a new one without losing any data in between, and with no downtime? In the past, I've had to lose MySQL data between the time when the new server goes up (i.e., all files transferred, system up and ready) and when I take the old server down (data is still written to the old server until the new one takes over). There is also a short period where both are down while DNS etc. refresh. Is there a way for MySQL/root to easily transfer all data that was updated/inserted within a certain time frame?

R: How to write out a data.frame so that I can paste it into SO for others to read?

- by John

I have a large data.frame displaying some weird properties when plotted. I'd like to ask a question about it on Stack Overflow. To do that, I'd like to write the data.frame out in a form that I can paste into SO, so that somebody else can easily run it and get it back as a data.frame object. Is there an easy way to accomplish this? Also, if it is really long, should I use a pastebin instead of pasting it directly here?

Is there a difference between transient properties defined in the data model, or in the custom subc

- by mystify

I was reading that setting the value of a transient property always results in marking the managed object as "dirty". However, what I don't get is this: If I make a subclass of NSManagedObject and use some extra properties which I don't need to be persisted, how does Core Data know about them and how can it mark the object as dirty when I access these? Again, they're not defined in the data model, so Core Data has no really good hint that they are there. Or does Core Data use some kind of introspection to analyze my custom class and figure out what properties I have in there?

C# or windows equivalent of OS X's Core Data?

- by Nektarios

I'm late to the boat and have only just now started using Core Data in OS X / Cocoa - it's incredible and is really changing the way I look at things. Is there an equivalent technology in C# or the modern Windows frameworks? i.e. having managed data types where you get saving, data management, deleting, searching all for free? Also wondering if there's anything like this on Linux.

Mass data store with SQL SERVER

- by Leo

We need to manage 10,000 GPS devices. Each GPS device uploads a GPS record every 30 seconds, and these records need to be stored in the database (MS SQL Server 2005).

    Records per GPS device per day:  24 * 60 * 2 = 2,880
    Records per day for 10,000 GPS devices:  10,000 * 2,880 = 28,800,000
    Each GPS record is approximately 160 bytes, so the amount of data per day is:  28,800,000 * 160 = 4.29 GB

We need to hold at least 3 months of GPS data in the database. My questions are:

1. Can SQL Server 2005 support storing such a large amount of data?
2. How should the data tables be planned? (All GPS data stored in one table? A table per day? A GPS data table for each GPS device?)

The GPS data:

    GPSID varchar(21),
    RecvTime datetime,
    GPSTime datetime,
    IsValid bit,
    IsNavi bit,
    Lng float,
    Lat float,
    Alt float,
    Spd smallint,
    Head smallint,
    PulseValue bigint,
    Oil float,
    TSW1 bigint,
    TSW1Mask bigint,
    TSW2 bigint,
    TSW2Mask,
    BSW bigint,
    StateText varchar(200),
    PosText varchar(200),
    UploadType tinyint
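The sizing arithmetic above can be checked, and extended to the retention period, with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the 160-byte row size from the question at face value, ignores index and page overhead, and assumes that "3 months" means 90 days.

```python
DEVICES = 10_000
UPLOAD_INTERVAL_S = 30
ROW_BYTES = 160          # approximate row size stated in the question
RETENTION_DAYS = 90      # assumption: 3 months taken as 90 days

rows_per_device_per_day = 24 * 60 * 60 // UPLOAD_INTERVAL_S   # 2,880
rows_per_day = DEVICES * rows_per_device_per_day              # 28,800,000
bytes_per_day = rows_per_day * ROW_BYTES                      # 4,608,000,000
gib_per_day = bytes_per_day / 2**30                           # about 4.29

total_rows = rows_per_day * RETENTION_DAYS                    # 2,592,000,000
total_gib = gib_per_day * RETENTION_DAYS

print(f"{rows_per_day:,} rows/day, {gib_per_day:.2f} GiB/day")
print(f"{total_rows:,} rows and {total_gib:.0f} GiB over {RETENTION_DAYS} days")
```

The 4.29 GB figure in the question matches 4,608,000,000 bytes expressed in units of 2^30 bytes. Over the assumed 90 days that comes to roughly 2.6 billion rows of raw data, which is the number any table layout would have to accommodate.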

What is the most efficient way to use Core Data?

- by Eric

I'm developing an iPad application using Core Data, and was hoping someone could clarify something about Core Data. Right now, I populate my table by making a fetch request for all of my data in viewDidLoad. I'd rather make individual fetch requests in my tableView:cellForRowAtIndexPath:. Can anyone tell me which is more efficient, and why? In other words, is it much less efficient to make lots of small requests as opposed to one big request?

What happens if a user jumps over 10 versions before updating, and every version had a new data mode

- by dontWatchMyProfile

Example: User installs app v.1.0, adds data. Then the dev submits 10 updates in 10 weeks. After 11 weeks, the user wants v.11.0 and grabs a copy from the app store. Assuming that the app has got 11 .xcdatamodel versions inside, where ***11.xcdatamodel is the current one, what would happen now, since the persistent store of the user is ages old? Would the migration happen 10 times, step by step through every migration iteration? Or does the actual migration of data (let's assume gigabytes of data) happen exactly once, after Core Data (or the persistent store coordinator) has figured out precisely what to do to go from v.1.0 to v.11.0?

What are the different methods to expedite the data retrieval from database?

- by saillu2003

How can we expedite data retrieval from the database for my web site, which is extensively updating and fetching data from the database?

Iterating over a large data set in long running Python process - memory issues?

- by user1094786

I am working on a long running Python program (a part of it is a Flask API, and the other realtime data fetcher). Both my long running processes iterate, quite often (the API one might even do so hundreds of times a second) over large data sets (second by second observations of certain economic series, for example 1-5MB worth of data or even more). They also interpolate, compare and do calculations between series etc. What techniques, for the sake of keeping my processes alive, can I practice when iterating / passing as parameters / processing these large data sets? For instance, should I use the gc module and collect manually? Any advice would be appreciated. Thanks!

WCF – interchangeable data-contract types

- by nmarun

In a WSDL-based environment, unlike the CLR world, we pass around the ‘state’ of an object and not a reference to the object. Firstly, what does ‘state’ mean, and does this also mean that we can send a struct where a class is expected (or vice versa) as long as their ‘state’ is one and the same? Let’s see.

So I have an operation contract defined as below:

    [ServiceContract]
    public interface ILearnWcfServiceExtend : ILearnWcfService
    {
        [OperationContract]
        Employee SaveEmployee(Employee employee);
    }

    [ServiceBehavior]
    public class LearnWcfService : ILearnWcfServiceExtend
    {
        public Employee SaveEmployee(Employee employee)
        {
            employee.EmployeeId = 123;
            return employee;
        }
    }

Quite a simplistic operation there (which translates to ‘absolutely no business value’). Now, the data contract Employee mentioned above is a struct.

    public struct Employee
    {
        public int EmployeeId { get; set; }

        public string FName { get; set; }
    }

After compilation and consumption of this service, my proxy (in the Reference.cs file) looks like below (I’ve ignored the rest of the details just to avoid unwanted confusion):

    public partial struct Employee : System.Runtime.Serialization.IExtensibleDataObject, System.ComponentModel.INotifyPropertyChanged

I call the service with the code below:

    private static void CallWcfService()
    {
        Employee employee = new Employee { FName = "A" };
        Console.WriteLine("IsValueType: {0}", employee.GetType().IsValueType);
        Console.WriteLine("IsClass: {0}", employee.GetType().IsClass);
        Console.WriteLine("Before calling the service: {0} - {1}", employee.EmployeeId, employee.FName);
        employee = LearnWcfServiceClient.SaveEmployee(employee);
        Console.WriteLine("Return from the service: {0} - {1}", employee.EmployeeId, employee.FName);
    }

The output is:

I now change my Employee type from a struct to a class in the proxy class and run the application:

    public partial class Employee : System.Runtime.Serialization.IExtensibleDataObject, System.ComponentModel.INotifyPropertyChanged {

The output this time is:

The state of an object refers to its composition, that is, its properties and the values of those properties, and does not depend on whether it is a reference type (class) or a value type (struct). And as shown above, we’re actually passing an object by its state and not by reference.

Continuing on the same topic of ‘type-interchangeability’, WCF treats two data contracts as equivalent if they have the same ‘wire-representation’. We can arrange this using the Name property of the DataContract and DataMember attributes.

    [DataContract]
    public struct Person
    {
        [DataMember]
        public int Id { get; set; }

        [DataMember]
        public string FirstName { get; set; }
    }

    [DataContract(Name="Person")]
    public class Employee
    {
        [DataMember(Name = "Id")]
        public int EmployeeId { get; set; }

        [DataMember(Name="FirstName")]
        public string FName { get; set; }
    }

I’ve created two data contracts with the exact same wire-representation. Just remember that the names and the types of data members need to match to be considered equivalent. The question then arises as to what gets generated in the proxy class. Despite us declaring two data contracts (Person and Employee), only one gets emitted: Person. This is because we’re saying that the Employee type has the same wire-representation as the Person type. Also, the signature of the SaveEmployee operation changes on the proxy side:

    [System.CodeDom.Compiler.GeneratedCodeAttribute("System.ServiceModel", "4.0.0.0")]
    [System.ServiceModel.ServiceContractAttribute(ConfigurationName="ServiceProxy.ILearnWcfServiceExtend")]
    public interface ILearnWcfServiceExtend
    {
        [System.ServiceModel.OperationContractAttribute(Action="http://tempuri.org/ILearnWcfServiceExtend/SaveEmployee", ReplyAction="http://tempuri.org/ILearnWcfServiceExtend/SaveEmployeeResponse")]
        ClientApplication.ServiceProxy.Person SaveEmployee(ClientApplication.ServiceProxy.Person employee);
    }

But, on the service side, SaveEmployee still accepts and returns an Employee data contract.

    [ServiceBehavior]
    public class LearnWcfService : ILearnWcfServiceExtend
    {
        public Employee SaveEmployee(Employee employee)
        {
            employee.EmployeeId = 123;
            return employee;
        }
    }

Despite all these changes, our output remains the same as the last one:

This is type-interchangeability at work!

Here’s one more thing to ponder. Our Person type is a struct and our Employee type is a class. Then how is it that the Person type got emitted as a ‘class’ in the proxy? It’s worth mentioning that WSDL describes a type called Employee and does not say whether it is a class or a struct (see the SOAP message below):

    <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                      xmlns:tem="http://tempuri.org/"
                      xmlns:ser="http://schemas.datacontract.org/2004/07/ServiceApplication">
      <soapenv:Header/>
      <soapenv:Body>
        <tem:SaveEmployee>
          <tem:employee>
            <ser:EmployeeId>?</ser:EmployeeId>
            <ser:FName>?</ser:FName>
          </tem:employee>
        </tem:SaveEmployee>
      </soapenv:Body>
    </soapenv:Envelope>

There are some differences between how ‘Add Service Reference’ and svcutil.exe generate the proxy class, but it turns out both do some kind of reflection to determine the type of the data contract and emit the code accordingly. So since the Employee type is a class, the proxy ‘Person’ type gets generated as a class.

In fact, reflecting on the svcutil.exe application, you’ll see that there are a couple of places wherein a flag actually determines a type as a class or a struct. One example is in the ExportISerializableDataContract method in the System.Runtime.Serialization.CodeExporter class. It seems these flags have a say in deciding whether the type gets emitted as a struct or a class.

This behavior is different if you use the WSDL tool though. The WSDL tool does not do any kind of reflection on the data contract / serialized type; it emits the type as a class by default. You can check this using the two command lines below:

Note to self: Remember ‘state’ and type-interchangeability when traversing through the WSDL planet!

Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

- by Jason Crease

At RedGate Software, I work on a .NET obfuscator called SmartAssembly. Various features of it use a database to store various things (exception reports, name-mappings, etc.) The user is given the option of using either a SQL-Server database (which requires them to have Microsoft SQL Server), or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to using a SQL Server database because it offers better performance and data-sharing. In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is the most popular? SQL Server or MDB?' We've collected data about this fact, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and more recently our 'Application Metrics' technology: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 28 19.0 8115 8115 MDB 114 77.6 1449 1449 (As a disclaimer, please note than SmartAssembly has far more than 132 users . This data is just a selection of one build) So, it would appear that SQL-Server is used by fewer users, but more often. Great. But here's why these numbers are useless to me: Only the original developers understand the data What does a single 'usage' of 'MDB' mean? Does this happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version or just from the UI version? Each question could skew the data 10-fold either way, and the answers only known by the developer that instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret the data unaided. Most of the data is from uninterested users About half of people who download and run a free-trial from the internet quit it almost immediately. Only a small fraction use it sufficiently to make informed choices. 
Since the MDB option is the default one, we don't know how many of those 114 were people CHOOSING to use the MDB, or how many were JUST HAPPENING to use this MDB default for their 20-second trial. This is a problem we see across all our metrics: Are people are using X because it's the default or are they using X because they want to use X? We need to segment the data further - asking what percentage of each percentage meet our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun. You can't find out why they used this feature Metrics can answer the when and what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most-important and more-complex features which people actually buy the software for, leaving just big buttons on the main page and the About-Box. "Ah, that's interesting!" rather than "Ah, that's actionable!" People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors' . But unless the statistic is both SUPRISING and ACTIONABLE, it's useless. Most metrics are collected, reviewed with lots of cooing. and then forgotten. Unless a piece-of-data could change things, it's useless collecting it. People get obsessed with significance levels The first things that lots of people do with this data is do a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!") Believe me: other causes of error/misinterpretation in your data are FAR more significant than your t-test could ever comprehend. 
Confirmation bias prevents objectivity

If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit hole of segmentation and filtering until we give up and move on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence? I don't.

There are always multiple plausible ways to interpret/action any data

Let's say we segment the above data and get this:

Post-trial users (i.e. those using a paid version after the 14-day free trial is over):

Parameter    Number of users   % of total users   Number of sessions   Number of usages
SQL Server   13                9.0                1115                 1115
MDB          5                 4.2                449                  449

Trial users:

Parameter    Number of users   % of total users   Number of sessions   Number of usages
SQL Server   15                10.0               7000                 7000
MDB          114               77.6               1000                 1000

How do you interpret this data? It's one of:

- Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB support.
- Our MDB support is so poor and buggy that our massive MDB user base doesn't buy it. Therefore, spend loads of money improving it, and think about ditching SQL Server support.
- People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are.
- We're marketing the tool wrong. The large number of MDB users represents uninformed downloaders. Tell marketing to aggressively target SQL Server users.

To choose an interpretation you need to segment again. And again. And again, and again.

Opting-out is correlated with feature-usage

Metrics tend to be opt-in. This skews the data even further. Between 5% and 30% of people choose to opt in to metrics (often called a 'customer improvement program' or something like that). Casual trial users who are uninterested in your product or company are less likely to opt in.
This group is also likely to be MDB users. How much does this skew your data by? Who knows?

It's not all doom and gloom. There are some things metrics can answer well:

- Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows?
- Minor optimizations. Is the text box big enough for average user input?
- Performance data. How long does our app take to start? How many databases does the average user have on their server?

As you can see, questions about who the user is, rather than what the user does, are easier to answer and act on.

Conclusion

- Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting.
- Data raises more questions than it answers.
- Questions about environment are the easiest to answer.
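The "fewer users, but more often" reading of the tables can be checked with a few lines of arithmetic. The sketch below is only an illustration: the figures are copied from the article's tables (the segmented ones are the article's own "let's say" numbers), and the helper name `sessions_per_user` is made up for this example.

```python
# (users, sessions) pairs copied from the tables in the article.
overall = {"SQL Server": (28, 8115), "MDB": (114, 1449)}
post_trial = {"SQL Server": (13, 1115), "MDB": (5, 449)}
trial = {"SQL Server": (15, 7000), "MDB": (114, 1000)}


def sessions_per_user(segment):
    """Return average sessions per user for each storage option in a segment."""
    return {name: sessions / users for name, (users, sessions) in segment.items()}


if __name__ == "__main__":
    for label, segment in (("overall", overall),
                           ("post-trial", post_trial),
                           ("trial", trial)):
        for name, value in sessions_per_user(segment).items():
            print(f"{label:10s} {name:10s} {value:7.1f} sessions per user")
```

A per-user rate like this says nothing about what a 'session' or 'usage' actually counts, nor about why a user picked one option over the other, which is the point the article makes.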

Data Source Connection Pool Sizing

- by Steve Felts

One of the most time-consuming procedures of a database application is establishing a connection. The connection pooling of the data source can be used to minimize this overhead. That argues for using the data source instead of accessing the database driver directly. Configuring the size of the pool in the data source is somewhere between an art and a science; this article will try to move it closer to science.

From the beginning, the WLS data source has had initial capacity and maximum capacity configuration values. When the system starts up and when it shrinks, initial capacity is used. The pool can grow to maximum capacity. Customers found that they might want to set the initial capacity to 0 (more on that later) but didn’t want the pool to shrink to 0. In WLS 10.3.6, we added minimum capacity to specify the lower limit to which a pool will shrink. If minimum capacity is not set, it defaults to the initial capacity for upward compatibility. We also did some work on the shrinking in release 10.3.4 to reduce thrashing: the algorithm that used to shrink to the maximum of the currently used connections or the initial capacity (basically, the unused connections were all released) was changed to shrink by half of the unused connections.

The simple approach to sizing the pool is to set the initial/minimum capacity to the maximum capacity. Doing this creates all connections at startup, which avoids creating connections on demand and keeps the pool stable.
However, there are a number of reasons not to take this simple approach.

When WLS is booted, the deployment of the data source includes synchronously creating the connections. The more connections that are configured in initial capacity, the longer the boot time for WLS (there have been several projects for parallel boot in WLS but none that are available). Related to creating a lot of connections at boot time is the problem of logon storms (the database gets too much work at one time). WLS has a solution for that, setting the login delay seconds on the pool, but that also increases the boot time.

There are a number of cases where it is desirable to set the initial capacity to 0. Doing so defers the overhead of creating connections out of the boot, and the database doesn’t need to be available at that point. An application may not want WLS to automatically connect to the database until it is actually needed, such as for some cold/warm failover configurations.

There are a number of cases where minimum capacity should be less than maximum capacity. Connections are generally expensive to keep around. They cause state to be kept on both the client and the server, and the state on the backend may be heavy (for example, a process). Depending on the vendor, connection usage may cost money. If the workload is not constant, database connections can be freed up by shrinking the pool when connections are not in use. When using Active GridLink, connections can be created as needed according to runtime load balancing (RLB) percentages instead of by connection load balancing (CLB) during data source deployment.

Shrinking is an effective technique for clearing the pool when connections are not in use. In addition to the obvious reason that there are times when the workload is lighter, there are some configurations where the database and/or firewall conspire to make long-unused or too-old connections no longer viable.
There are also some data source features where the connection has state and cannot be used again unless the state matches the request. Examples of this are identity-based pooling, where the connection has a particular owner, and XA affinity, where the connection is associated with a particular RAC node. At this point, WLS does not re-purpose (discard/replace) connections, and shrinking is a way to get rid of the unused existing connection and get a new one with the correct state when needed.

So far, the discussion has focused on the relationship of initial, minimum, and maximum capacity. Computing the maximum size requires some knowledge about the application and the current number of simultaneously active users, web sessions, batch programs, or whatever access patterns are common. Applications should be written to reserve and close connections only as needed, but multiple statements, if needed, should be done in one reservation (don’t get/close more often than necessary). This means that the size of the pool is likely to be significantly smaller than the number of users.

If possible, you can pick a size and see how it performs under simulated or real load. There is a high-water mark statistic (ActiveConnectionsHighCount) that tracks the maximum number of connections concurrently used. In general, you want the size to be big enough that you never run out of connections, but no bigger. It will need to deal with spikes in usage, which is where shrinking after the spike is important. Of course, the database capacity also has a big influence on the decision, since it’s important not to overload the database machine.

Planning also needs to happen if you are running in a Multi-Data Source or Active GridLink configuration and expect that the remaining nodes will take over the connections when one of the nodes in the cluster goes down. For XA affinity, additional headroom is also recommended.
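The failover planning mentioned above comes down to simple arithmetic. The following is a rough sketch and not a WLS formula: it assumes the peak connection count is spread evenly over the nodes and that the survivors share the failed node's connections equally. The function name and the example numbers are hypothetical.

```python
import math


def per_node_capacity_after_failure(peak_total_connections, node_count, failed_nodes=1):
    """Connections each surviving node must carry if the load is spread evenly.

    Assumption (not from the article): connections are redistributed equally
    across the surviving nodes.
    """
    surviving = node_count - failed_nodes
    if surviving <= 0:
        raise ValueError("at least one node must survive")
    return math.ceil(peak_total_connections / surviving)


if __name__ == "__main__":
    # Hypothetical numbers: a peak of 90 connections over 3 nodes is 30 per node
    # in normal operation, so each survivor needs more room when one node is down.
    print(per_node_capacity_after_failure(90, 3))
```

In practice the peak figure could come from a high-water mark such as ActiveConnectionsHighCount measured under load, with further headroom added where XA affinity is in use.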
In summary, setting initial and maximum capacity to be the same may be simple but there are many other factors that may be important in making the decision about sizing.
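To make the change in shrinking behaviour described earlier concrete, here is a small simulation. It is a sketch of the two rules as the article words them, not WLS source code: the function names are invented, and rounding half of the unused connections down is an assumption.

```python
def shrink_old(current, in_use, initial_capacity):
    """Older rule as described: drop to max(in-use connections, initial capacity)."""
    return min(current, max(in_use, initial_capacity))


def shrink_new(current, in_use, minimum_capacity):
    """Newer rule as described: release half of the unused connections per cycle.

    Assumptions for this sketch: half is rounded down, and the pool never goes
    below the minimum capacity or the number of connections in use.
    """
    unused = current - in_use
    target = current - unused // 2
    return min(current, max(target, minimum_capacity, in_use))


def simulate(current, in_use, minimum_capacity, cycles):
    """Pool size after each of several shrink cycles under the newer rule."""
    sizes = []
    for _ in range(cycles):
        current = shrink_new(current, in_use, minimum_capacity)
        sizes.append(current)
    return sizes


if __name__ == "__main__":
    # After a spike: 100 connections in the pool, 20 still in use, lower limit 10.
    print("old rule:", shrink_old(100, 20, 10))
    print("new rule:", simulate(100, 20, 10, cycles=3))
```

Under the older rule the pool gives up every unused connection in one step, so a second spike has to recreate them all; halving the unused connections each cycle gives them back gradually, which is the reduction in thrashing the article refers to.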
