Search Results

Search found 813 results on 33 pages for 'concurrency'.

Page 33/33 | < Previous Page | 29 30 31 32 33 

  • Concurrent Generation of Sequential Keys

    - by GenTiradentes
    I'm working on a project which generates a very large number of sequential text strings, in a very tight loop. My application makes heavy use of SIMD instruction set extensions like SSE and MMX, in other parts of the program, but the key generator is plain C++. The way my key generator works is I have a keyGenerator class, which holds a single char array that stores the current key. To get the next key, there is a function called "incrementKey," which treats the string as a number, adding one to the string, carrying where necessary. Now, the problem is, the keygen is somewhat of a bottleneck. It's fast, but it would be nice if it were faster. One of the biggest problems is that when I'm generating a set of sequential keys to be processed using my SSE2 code, I have to have the entire set stored in an array, which means I have to sequentially generate and copy 12 strings into an array, one by one, like so: char* keys[12]; for(int i = 0; i < 12; i++) { keys[i] = new char[16]; strcmp(keys[i], keygen++); } So how would you efficiently generate these plaintext strings in order? I need some ideas to help move this along. Concurrency would be nice; as my code is right now, each successive key depends on the previous one, which means that the processor can't start work on the next key until the current one has been completely generated. Here is the code relevant to the key generator: KeyGenerator.h class keyGenerator { public: keyGenerator(unsigned long long location, characterSet* charset) : location(location), charset(charset) { for(int i = 0; i < 16; i++) key[i] = 0; charsetStr = charset->getCharsetStr(); integerToKey(); } ~keyGenerator() { } inline void incrementKey() { register size_t keyLength = strlen(key); for(register char* place = key; place; place++) { if(*place == charset->maxChar) { // Overflow, reset char at place *place = charset->minChar; if(!*(place+1)) { // Carry, no space, insert char *(place+1) = charset->minChar; ++keyLength; break; } else { continue; } } else { // Space available, increment char at place if(*place == charset->charSecEnd[0]) *place = charset->charSecBegin[0]; else if(*place == charset->charSecEnd[1]) *place = charset->charSecBegin[1]; (*place)++; break; } } } inline char* operator++() // Pre-increment { incrementKey(); return key; } inline char* operator++(int) // Post-increment { memcpy(postIncrementRetval, key, 16); incrementKey(); return postIncrementRetval; } void integerToKey() { register unsigned long long num = location; if(!num) { key[0] = charsetStr[0]; } else { num++; while(num) { num--; unsigned int remainder = num % charset->length; num /= charset->length; key[strlen(key)] = charsetStr[remainder]; } } } inline unsigned long long keyToInteger() { // TODO return 0; } inline char* getKey() { return key; } private: unsigned long long location; characterSet* charset; std::string charsetStr; char key[16]; // We need a place to store the key for the post increment operation. char postIncrementRetval[16]; }; CharacterSet.h struct characterSet { characterSet() { } characterSet(unsigned int len, int min, int max, int charsec0, int charsec1, int charsec2, int charsec3) { init(length, min, max, charsec0, charsec1, charsec2, charsec3); } void init(unsigned int len, int min, int max, int charsec0, int charsec1, int charsec2, int charsec3) { length = len; minChar = min; maxChar = max; charSecEnd[0] = charsec0; charSecBegin[0] = charsec1; charSecEnd[1] = charsec2; charSecBegin[1] = charsec3; } std::string getCharsetStr() { std::string retval; for(int chr = minChar; chr != maxChar; chr++) { for(int i = 0; i < 2; i++) if(chr == charSecEnd[i]) chr = charSecBegin[i]; retval += chr; } return retval; } int minChar, maxChar; // charSec = character set section int charSecEnd[2], charSecBegin[2]; unsigned int length; };

    Read the article

  • OIM 11g : Multi-thread approach for writing custom scheduled job

    - by Saravanan V S
    In this post I have shared my experience of designing and developing an OIM schedule job that uses multi threaded approach for updating data in OIM using APIs.  I have used thread pool (in particular fixed thread pool) pattern in developing the OIM schedule job. The thread pooling pattern has noted advantages compared to thread per task approach. I have listed few of the advantage here ·         Threads are reused ·         Creation and tear-down cost of thread is reduced ·         Task execution latency is reduced ·         Improved performance ·         Controlled and efficient management of memory and resources used by threads More about java thread pool http://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html The following diagram depicts the high-level architectural diagram of the schedule job that process input from a flat file to update OIM process form data using fixed thread pool approach    The custom scheduled job shared in this post is developed to meet following requirement 1)      Need to process a CSV extract that contains identity, account identifying key and list of data to be updated on an existing OIM resource account. 2)      CSV file can contain data for multiple resources configured in OIM 3)      List of attribute to update and mapping between CSV column to OIM fields may vary between resources The following are three Java class developed for this requirement (I have given only prototype of the code that explains how to use thread pools in schedule task) CustomScheduler.java - Implementation of TaskSupport class that reads and passes the parameters configured on the schedule job to Thread Executor class. package com.oracle.oim.scheduler; import java.util.HashMap; import com.oracle.oim.bo.MultiThreadDataRecon; import oracle.iam.scheduler.vo.TaskSupport; public class CustomScheduler extends TaskSupport {      public void execute(HashMap options) throws Exception {             /*  Read Schedule Job Parameters */             String param1 = (String) options.get(“Parameter1”);             .             int noOfThread = (int) options.get(“No of Threads”);             .             String paramn = (int) options.get(“ParamterN”); /* Provide all the required input configured on schedule job to Thread Pool Executor implementation class like 1) Name of the file, 2) Delimiter 3) Header Row Numer 4) Line Escape character 5) Config and resource map lookup 6) No the thread to create */ new MultiThreadDataRecon(all_required_parameters, noOfThreads).reconcile();       }       public HashMap getAttributes() { return null; }       public void setAttributes() {       } } MultiThreadDataRecon.java – Helper class that reads data from input file, initialize the thread executor and builds the task queue. package com.oracle.oim.bo; import <required file IO classes>; import  <required java.util classes>; import  <required OIM API classes>; import <csv reader api>; public class MultiThreadDataRecon {  private int noOfThreads;  private ExecutorService threadExecutor = null;  public MetaDataRecon(<required params>, int noOfThreads)  {       //Store parameters locally       .       .       this.noOfThread = noOfThread;  }        /**        *  Initialize         */  private void init() throws Exception {       try {             // Initialize CSV file reader API objects             // Initialize OIM API objects             /* Initialize Fixed Thread Pool Executor class if no of threads                 configured is more than 1 */             if (noOfThreads > 1) {                   threadExecutor = Executors.newFixedThreadPool(noOfThreads);             } else {                   threadExecutor = Executors.newSingleThreadExecutor();             }             /* Initialize TaskProcess clas s which will be executing task                 from the Queue */                TaskProcessor.initializeConfig(params);       } catch (***Exception e) {                   // TO DO       }  }       /**        *  Method to reconcile data from CSV to OIM        */ public void reconcile() throws Exception {        try {             init();             while(<csv file has line>){                   processRow(line);             }             /* Initiate thread shutdown */             threadExecutor.shutdown();             while (!threadExecutor.isTerminated()) {                 // Wait for all task to complete.             }            } catch (Exception e) {                   // TO DO            } finally {                   try {                         //Close all the file handles                   } catch (IOException e) {                         //TO DO                   }             }       }       /**        * Method to process         */       private void processRow(String row) {             // Create task processor instance with the row data              // Following code push the task to work queue and wait for next                available thread to execute             threadExecutor.execute(new TaskProcessor(rowData));       } } TaskProcessor.java – Implementation of “Runnable” interface that executes the required business logic to update data in OIM. package com.oracle.oim.bo; import <required APIs> class TaskProcessor implements Runnable {       //Initialize required member variables       /**        * Constructor        */       public TaskProcessor(<row data>) {             // Initialize and parse csv row       }       /*       *  Method to initialize required object for task execution       */       public static void initializeConfig(<params>) {             // Process param and initialize the required configs and object       }           /*        * (non-Javadoc)        *         * @see java.lang.Runnable#run()        */            public void run() {             if (<is csv data valid>){                   processData();             }       }  /**   * Process the the received CSV input   */  private void processData() {     try{       //Find the user in OIM using the identity matching key value from CSV       // Find the account to be update from user’s account based on account identifying key on CSV       // Update the account with data from CSV       }catch(***Exception e){           //TO DO       }   } }

    Read the article

  • New hire expectations... (Am I being unreasonable?)

    - by user295841
    I work for a very small custom software shop. We currently consist me and my boss. My boss is an old FoxPro DOS developer and OOP makes him uncomfortable. He is planning on taking a back seat in the next few years to hopefully enjoy a “partial retirement”. I will be taking over the day to day operations and we are now desperately looking for more help. We tried Monster.com, Dice.com, and others a few years ago when we started our search. We had no success. We have tried outsourcing overseas (total disaster), hiring kids right out of college (mostly a disaster but that’s where I came from), interns (good for them, not so good for us) and hiring laid off “experienced” developers (there was a reason they were laid off). I have heard hiring practices discussed on podcasts, blogs, etc... and have tried a few. The “Fizz Buzz” test was a good one. One kid looked physically ill before he finally gave up. I think my problem is that I have grown so much as a developer since I started here that I now have a high standard. I hear/read very intelligent people podcasts and blogs and I know that there are lots of people out there that can do the job. I don’t want to settle for less than a “good” developer. Perhaps my expectations are unreasonable. I expect any good developer (entry level or experienced) to be billable (at least paying their own wage) in under one month. I expect any good developer to be able to be productive (at least dangerous) in any language or technology with only a few days of research/training. I expect any good developer to be able to take a project from initial customer request to completion with little or no help from others. Am I being unreasonable? What constitutes a valuable developer? What should be expected of an entry level developer? What should be expected of an experienced developer? I realize that everyone is different but there has to be some sort of expectations standard, right? I have been giving the test project below to potential canidates to weed them out. Good idea? Too much? Too little? Please let me know what you think. Thanks. Project ID: T00001 Description: Order Entry System Deadline: 1 Week Scope The scope of this project is to develop a fully function order entry system. Screen/Form design must be user friendly and promote efficient data entry and modification. User experience (Navigation, Screen/Form layouts, Look and Feel…) is at the developer’s discretion. System may be developed using any technologies that conform to the technical and system requirements. Deliverables Complete source code Database setup instructions (Scripts or restorable backup) Application installation instructions (Installer or installation procedure) Any necessary documentation Technical Requirements Server Platform – Windows XP / Windows Server 2003 / SBS Client Platform – Windows XP Web Browser (If applicable) – IE 8 Database – At developer’s discretion (Must be a relational SQL database.) Language – At developer’s discretion All data must be normalized. (+) All data must maintain referential integrity. (++) All data must be indexed for optimal performance. System must handle concurrency. System Requirements Customer Maintenance Customer records must have unique ID. Customer data will include Name, Address, Phone, etc. User must be able to perform all CRUD (Create, Read, Update, and Delete) operations on the Customer table. User must be able to enter a specific Customer ID to edit. User must be able to pull up a sortable/queryable search grid/utility to find a customer to edit. Validation must be performed prior to database commit. Customer record cannot be deleted if the customer has an order in the system. (++) Inventory Maintenance Part records must have unique ID. Part data will include Description, Price, UOM (Unit of Measure), etc. User must be able to perform all CRUD operations on the part table. User must be able to enter a specific Part ID to edit. User must be able to pull up a sortable/queryable search grid/utility to find a part to edit. Validation must be performed prior to database commit. Part record cannot be deleted if the part has been used in an order. (++) Order Entry Order records must have a unique auto-incrementing key (Order Number). Order data must be split into a header/detail structure. (+) Order can contain an infinite number of detail records. Order header data will include Order Number, Customer ID (++), Order Date, Order Status (Open/Closed), etc. Order detail data will include Part Number (++), Quantity, Price, etc. User must be able to perform all CRUD operations on the order tables. User must be able to enter a specific Order Number to edit. User must be able to pull up a sortable/queryable search grid/utility to find an order to edit. User must be able to print an order form from within the order entry form. Validation must be performed prior to database commit. Reports Customer Listing – All Customers in the system. Inventory Listing – All parts in the system. Open Order Listing – All open orders in system. Customer Order Listing – All orders for specific customer. All reports must include sorts and filter functions where applicable. Ex. Customer Listing by range of Customer IDs. Open Order Listing by date range.

    Read the article

  • Is multithreading the right way to go for my case?

    - by Julien Lebosquain
    Hello, I'm currently designing a multi-client / server application. I'm using plain good old sockets because WCF or similar technology is not what I need. Let me explain: it isn't the classical case of a client simply calling a service; all clients can 'interact' with each other by sending a packet to the server, which will then do some action, and possible re-dispatch an answer message to one or more clients. Although doable with WCF, the application will get pretty complex with hundreds of different messages. For each connected client, I'm of course using asynchronous methods to send and receive bytes. I've got the messages fully working, everything's fine. Except that for each line of code I'm writing, my head just burns because of multithreading issues. Since there could be around 200 clients connected at the same time, I chose to go the fully multithreaded way: each received message on a socket is immediately processed on the thread pool thread it was received, not on a single consumer thread. Since each client can interact with other clients, and indirectly with shared objects on the server, I must protect almost every object that is mutable. I first went with a ReaderWriterLockSlim for each resource that must be protected, but quickly noticed that there are more writes overall than reads in the server application, and switched to the well-known Monitor to simplify the code. So far, so good. Each resource is protected, I have helper classes that I must use to get a lock and its protected resource, so I can't use an object without getting a lock. Moreover, each client has its own lock that is entered as soon as a packet is received from its socket. It's done to prevent other clients from making changes to the state of this client while it has some messages being processed, which is something that will happen frequently. Now, I don't just need to protect resources from concurrent accesses. I must keep every client in sync with the server for some collections I have. One tricky part that I'm currently struggling with is the following: I have a collection of clients. Each client has its own unique ID. When a client connects, it must receive the IDs of every connected client, and each one of them must be notified of the newcomer's ID. When a client disconnects, every other client must know it so that its ID is no longer valid for them. Every client must always have, at a given time, the same clients collection as the server so that I can assume that everybody knows everybody. This way if I'm sending a message to client #1 telling "Client #2 has done something", I know that it will always be correctly interpreted: Client 1 will never wonder "but who is Client 2 anyway?". My first attempt for handling the connection of a new client (let's call it X) was this pseudo-code (remember that newClient is already locked here): lock (clients) { foreach (var client in clients) { lock (client) { client.Send("newClient with id X has connected"); } } clients.Add(newClient); newClient.Send("the list of other clients"); } Now imagine that in the same time, another client has sent a packet that translates into a message that must be broadcasted to every connected client, the pseudo-code will be something like this (remember that the current client - let's call it Y - is already locked here): lock (clients) { foreach (var client in clients) { lock (client) { client.Send("something"); } } } An obvious deadlock occurs here: on one thread X is locked, the clients lock has been entered, started looping through the clients, and at one moment must get Y's lock... which is already acquired on the second thread, itself waiting for the clients collection lock to be released! This is not the only case like this in the server application. There are other collections which must be kept in sync with the clients, some properties on a client can be changed by another one, etc. I tried other types of locks, lock-free mechanisms and a bunch of other things. Either there were obvious deadlocks when I'm using too much locks for safety, or obvious race conditions otherwise. When I finally find a good middle point between the two, it usually comes with very subtle race conditions / dead locks and other multi-threading issues... my head hurts very quickly since for any single line of code I'm writing I have to review almost the whole application to ensure everything will behave correctly with any number of threads. So here's my final question: how would you resolve this specific case, the general case, and more importantly: aren't I going the wrong way here? I have little problems with the .NET framework, C#, simple concurrency or algorithms in general. Still, I'm lost here. I know I could use only one thread processing the incoming requests and everything will be fine. However, that won't scale well at all with more clients... But I'm thinking more and more to go this simple way. What do you think? Thanks in advance to you, StackOverflow people which have taken the time to read this huge question. I really had to explain the whole context if I want to get some help.

    Read the article

  • Which workaround to use for the following SQL deadlock?

    - by Marko
    I found a SQL deadlock scenario in my application during concurrency. I belive that the two statements that cause the deadlock are (note - I'm using LINQ2SQL and DataContext.ExecuteCommand(), that's where this.studioId.ToString() comes into play): exec sp_executesql N'INSERT INTO HQ.dbo.SynchronizingRows ([StudioId], [UpdatedRowId]) SELECT @p0, [t0].[Id] FROM [dbo].[UpdatedRows] AS [t0] WHERE NOT (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[ReceivedUpdatedRows] AS [t1] WHERE ([t1].[StudioId] = @p0) AND ([t1].[UpdatedRowId] = [t0].[Id]) ))',N'@p0 uniqueidentifier',@p0='" + this.studioId.ToString() + "'; and exec sp_executesql N'INSERT INTO HQ.dbo.ReceivedUpdatedRows ([UpdatedRowId], [StudioId], [ReceiveDateTime]) SELECT [t0].[UpdatedRowId], @p0, GETDATE() FROM [dbo].[SynchronizingRows] AS [t0] WHERE ([t0].[StudioId] = @p0)',N'@p0 uniqueidentifier',@p0='" + this.studioId.ToString() + "'; The basic logic of my (client-server) application is this: Every time someone inserts or updates a row on the server side, I also insert a row into the table UpdatedRows, specifying the RowId of the modified row. When a client tries to synchronize data, it first copies all of the rows in the UpdatedRows table, that don't contain a reference row for the specific client in the table ReceivedUpdatedRows, to the table SynchronizingRows (the first statement taking part in the deadlock). Afterwards, during the synchronization I look for modified rows via lookup of the SynchronizingRows table. This step is required, otherwise if someone inserts new rows or modifies rows on the server side during synchronization I will miss them and won't get them during the next synchronization (explanation scenario to long to write here...). Once synchronization is complete, I insert rows to the ReceivedUpdatedRows table specifying that this client has received the UpdatedRows contained in the SynchronizingRows table (the second statement taking part in the deadlock). Finally I delete all rows from the SynchronizingRows table that belong to the current client. The way I see it, the deadlock is occuring on tables SynchronizingRows (abbreviation SR) and ReceivedUpdatedRows (abbreviation RUR) during steps 2 and 3 (one client is in step 2 and is inserting into SR and selecting from RUR; while another client is in step 3 inserting into RUR and selecting from SR). I googled a bit about SQL deadlocks and came to a conclusion that I have three options. Inorder to make a decision I need more input about each option/workaround: Workaround 1: The first advice given on the web about SQL deadlocks - restructure tables/queries so that deadlocks don't happen in the first place. Only problem with this is that with my IQ I don't see a way to do the synchronization logic any differently. If someone wishes to dwelve deeper into my current synchronization logic, how and why it is set up the way it is, I'll post a link for the explanation. Perhaps, with the help of someone smarter than me, it's possible to create a logic that is deadlock free. Workaround 2: The second most common advice seems to be the use of WITH(NOLOCK) hint. The problem with this is that NOLOCK might miss or duplicate some rows. Duplication is not a problem, but missing rows is catastrophic! Another option is the WITH(READPAST) hint. On the face of it, this seems to be a perfect solution. I really don't care about rows that other clients are inserting/modifying, because each row belongs only to a specific client, so I may very well skip locked rows. But the MSDN documentaion makes me a bit worried - "When READPAST is specified, both row-level and page-level locks are skipped". As I said, row-level locks would not be a problem, but page-level locks may very well be, since a page might contain rows that belong to multiple clients (including the current one). While there are lots of blog posts specifically mentioning that NOLOCK might miss rows, there seems to be none about READPAST (never) missing rows. This makes me skeptical and nervous to implement it, since there is no easy way to test it (implementing would be a piece of cake, just pop WITH(READPAST) into both statements SELECT clause and job done). Can someone confirm whether the READPAST hint can miss rows? Workaround 3: The final option is to use ALLOW_SNAPSHOT_ISOLATION and READ_COMMITED_SNAPSHOT. This would seem to be the only option to work 100% - at least I can't find any information that would contradict with it. But it is a little bit trickier to setup (I don't care much about the performance hit), because I'm using LINQ. Off the top of my head I probably need to manually open a SQL connection and pass it to the LINQ2SQL DataContext, etc... I haven't looked into the specifics very deeply. Mostly I would prefer option 2 if somone could only reassure me that READPAST will never miss rows concerning the current client (as I said before, each client has and only ever deals with it's own set of rows). Otherwise I'll likely have to implement option 3, since option 1 is probably impossible... I'll post the table definitions for the three tables as well, just in case: CREATE TABLE [dbo].[UpdatedRows]( [Id] [uniqueidentifier] NOT NULL ROWGUIDCOL DEFAULT NEWSEQUENTIALID() PRIMARY KEY CLUSTERED, [RowId] [uniqueidentifier] NOT NULL, [UpdateDateTime] [datetime] NOT NULL, ) ON [PRIMARY] GO CREATE NONCLUSTERED INDEX IX_RowId ON dbo.UpdatedRows ([RowId] ASC) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO CREATE TABLE [dbo].[ReceivedUpdatedRows]( [Id] [uniqueidentifier] NOT NULL ROWGUIDCOL DEFAULT NEWSEQUENTIALID() PRIMARY KEY NONCLUSTERED, [UpdatedRowId] [uniqueidentifier] NOT NULL REFERENCES [dbo].[UpdatedRows] ([Id]), [StudioId] [uniqueidentifier] NOT NULL REFERENCES, [ReceiveDateTime] [datetime] NOT NULL, ) ON [PRIMARY] GO CREATE CLUSTERED INDEX IX_Studios ON dbo.ReceivedUpdatedRows ([StudioId] ASC) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO CREATE TABLE [dbo].[SynchronizingRows]( [StudioId] [uniqueidentifier] NOT NULL [UpdatedRowId] [uniqueidentifier] NOT NULL REFERENCES [dbo].[UpdatedRows] ([Id]) PRIMARY KEY CLUSTERED ([StudioId], [UpdatedRowId]) ) ON [PRIMARY] GO PS! Studio = Client. PS2! I just noticed that the index definitions have ALLOW_PAGE_LOCK=ON. If I would turn it off, would that make any difference to READPAST? Are there any negative downsides for turning it off?

    Read the article

  • C programming: hashtable insertion/search

    - by Ricardo Campos
    Hello i have a problem with my hash table its implemented like this: #define HT_SIZE 10 typedef struct _list_t_ { char key[20]; char string[20]; char prevValue[20]; struct _list_t_ *next; } list_t; typedef struct _hash_table_t_ { int size; /* the size of the table */ list_t ***table; /* first */ sem_t lock; } hash_table_t; I have a Linked list with 3 pointers because i want a hash table with several partitions (shards), here is my initialization of my Hash table: hash_table_t *create_hash_table(int NUM_SERVER_THREADS, int num_shards){ hash_table_t *new_table; int j,i; if (HT_SIZE<1) return NULL; /* invalid size for table */ /* Attempt to allocate memory for the hashtable structure */ new_table = (hash_table_t*)malloc(sizeof(hash_table_t)*HT_SIZE); /* Attempt to allocate memory for the table itself */ new_table->table = (list_t ***)calloc(1,sizeof(list_t **)); /* Initialize the elements of the table */ for(j=0; j<num_shards; j++){ new_table->table[j] = (list_t **)calloc(1,sizeof(list_t *)); for(i=0; i<HT_SIZE; i++){ new_table->table[j][i] = (list_t *)calloc(1,sizeof(list_t )); } } /* Set the table's size */ new_table->size = HT_SIZE; sem_init(&new_table->lock, 0, 1); return new_table; } Here is my search function to search in the hash table list_t *lookup_string(hash_table_t *hashtable, char *key, int shardId){ list_t *list ; int hashval = hash(key); /* Go to the correct list based on the hash value and see if key is * in the list. If it is, return return a pointer to the list element. * If it isn't, the item isn't in the table, so return NULL. */ sem_wait(&hashtable->lock); for(list = hashtable->table[shardId][hashval]; list != NULL; list =list->next) { if (strcmp(key, list->key) == 0){ sem_post(&hashtable->lock); return list; } } sem_post(&hashtable->lock); return NULL; } And my insert function: char *add_string(hash_table_t *hashtable, char *str,char *key, int shardId){ list_t *new_list; list_t *current_list; unsigned int hashval = hash(key); /*printf("|%d|%d|%s|\n",hashval,shardId,key);*/ /* Lock for concurrency */ sem_wait(&hashtable->lock); /* Attempt to allocate memory for list */ new_list = (list_t*)malloc(sizeof(list_t)); /* Does item already exist? */ sem_post(&hashtable->lock); current_list = lookup_string(hashtable, key,shardId); sem_wait(&hashtable->lock); /* item already exists, don't insert it again. */ if (current_list != NULL){ strcpy(new_list->prevValue,current_list->string); strcpy(new_list->string,str); strcpy(new_list->key,key); new_list->next = hashtable->table[shardId][hashval]; hashtable->table[shardId][hashval] = new_list; sem_post(&hashtable->lock); return new_list->prevValue; } /* Insert into list */ strcpy(new_list->string,str); strcpy(new_list->key,key); new_list->next = hashtable->table[shardId][hashval]; hashtable->table[shardId][hashval] = new_list; /* Unlock */ sem_post(&hashtable->lock); return new_list->prevValue; } My main class runs some of tests by executing the insertion / reading / delete from the elements of the hash table the problem is when i have more than 4 partitions/shards the tests stop at the first reading element saying it returned the wrong value NULL on the search function, when its less than 4 it runs perfectly well and passes all the tests. You can see my main.c in here if you want to give a look: http://hostcode.sourceforge.net/view/1105 My complete Hash table code: http://hostcode.sourceforge.net/view/1103 And other functions where hash table code is executed: .c file http://hostcode.sourceforge.net/view/1104 .h file http://hostcode.sourceforge.net/view/1106 Thank for you time, i appreciate any help you can give to me this is a college important project that I'm trying to solve and I'm stuck here for 2 days.

    Read the article

  • C#/.NET Little Wonders: The Concurrent Collections (1 of 3)

    - by James Michael Hare
    Once again we consider some of the lesser known classes and keywords of C#.  In the next few weeks, we will discuss the concurrent collections and how they have changed the face of concurrent programming. This week’s post will begin with a general introduction and discuss the ConcurrentStack<T> and ConcurrentQueue<T>.  Then in the following post we’ll discuss the ConcurrentDictionary<T> and ConcurrentBag<T>.  Finally, we shall close on the third post with a discussion of the BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. A brief history of collections In the beginning was the .NET 1.0 Framework.  And out of this framework emerged the System.Collections namespace, and it was good.  It contained all the basic things a growing programming language needs like the ArrayList and Hashtable collections.  The main problem, of course, with these original collections is that they held items of type object which means you had to be disciplined enough to use them correctly or you could end up with runtime errors if you got an object of a type you weren't expecting. Then came .NET 2.0 and generics and our world changed forever!  With generics the C# language finally got an equivalent of the very powerful C++ templates.  As such, the System.Collections.Generic was born and we got type-safe versions of all are favorite collections.  The List<T> succeeded the ArrayList and the Dictionary<TKey,TValue> succeeded the Hashtable and so on.  The new versions of the library were not only safer because they checked types at compile-time, in many cases they were more performant as well.  So much so that it's Microsoft's recommendation that the System.Collections original collections only be used for backwards compatibility. So we as developers came to know and love the generic collections and took them into our hearts and embraced them.  The problem is, thread safety in both the original collections and the generic collections can be problematic, for very different reasons. Now, if you are only doing single-threaded development you may not care – after all, no locking is required.  Even if you do have multiple threads, if a collection is “load-once, read-many” you don’t need to do anything to protect that container from multi-threaded access, as illustrated below: 1: public static class OrderTypeTranslator 2: { 3: // because this dictionary is loaded once before it is ever accessed, we don't need to synchronize 4: // multi-threaded read access 5: private static readonly Dictionary<string, char> _translator = new Dictionary<string, char> 6: { 7: {"New", 'N'}, 8: {"Update", 'U'}, 9: {"Cancel", 'X'} 10: }; 11:  12: // the only public interface into the dictionary is for reading, so inherently thread-safe 13: public static char? Translate(string orderType) 14: { 15: char charValue; 16: if (_translator.TryGetValue(orderType, out charValue)) 17: { 18: return charValue; 19: } 20:  21: return null; 22: } 23: } Unfortunately, most of our computer science problems cannot get by with just single-threaded applications or with multi-threading in a load-once manner.  Looking at  today's trends, it's clear to see that computers are not so much getting faster because of faster processor speeds -- we've nearly reached the limits we can push through with today's technologies -- but more because we're adding more cores to the boxes.  With this new hardware paradigm, it is even more important to use multi-threaded applications to take full advantage of parallel processing to achieve higher application speeds. So let's look at how to use collections in a thread-safe manner. Using historical collections in a concurrent fashion The early .NET collections (System.Collections) had a Synchronized() static method that could be used to wrap the early collections to make them completely thread-safe.  This paradigm was dropped in the generic collections (System.Collections.Generic) because having a synchronized wrapper resulted in atomic locks for all operations, which could prove overkill in many multithreading situations.  Thus the paradigm shifted to having the user of the collection specify their own locking, usually with an external object: 1: public class OrderAggregator 2: { 3: private static readonly Dictionary<string, List<Order>> _orders = new Dictionary<string, List<Order>>(); 4: private static readonly _orderLock = new object(); 5:  6: public void Add(string accountNumber, Order newOrder) 7: { 8: List<Order> ordersForAccount; 9:  10: // a complex operation like this should all be protected 11: lock (_orderLock) 12: { 13: if (!_orders.TryGetValue(accountNumber, out ordersForAccount)) 14: { 15: _orders.Add(accountNumber, ordersForAccount = new List<Order>()); 16: } 17:  18: ordersForAccount.Add(newOrder); 19: } 20: } 21: } Notice how we’re performing several operations on the dictionary under one lock.  With the Synchronized() static methods of the early collections, you wouldn’t be able to specify this level of locking (a more macro-level).  So in the generic collections, it was decided that if a user needed synchronization, they could implement their own locking scheme instead so that they could provide synchronization as needed. The need for better concurrent access to collections Here’s the problem: it’s relatively easy to write a collection that locks itself down completely for access, but anything more complex than that can be difficult and error-prone to write, and much less to make it perform efficiently!  For example, what if you have a Dictionary that has frequent reads but in-frequent updates?  Do you want to lock down the entire Dictionary for every access?  This would be overkill and would prevent concurrent reads.  In such cases you could use something like a ReaderWriterLockSlim which allows for multiple readers in a lock, and then once a writer grabs the lock it blocks all further readers until the writer is done (in a nutshell).  This is all very complex stuff to consider. Fortunately, this is where the Concurrent Collections come in.  The Parallel Computing Platform team at Microsoft went through great pains to determine how to make a set of concurrent collections that would have the best performance characteristics for general case multi-threaded use. Now, as in all things involving threading, you should always make sure you evaluate all your container options based on the particular usage scenario and the degree of parallelism you wish to acheive. This article should not be taken to understand that these collections are always supperior to the generic collections. Each fills a particular need for a particular situation. Understanding what each container is optimized for is key to the success of your application whether it be single-threaded or multi-threaded. General points to consider with the concurrent collections The MSDN points out that the concurrent collections all support the ICollection interface. However, since the collections are already synchronized, the IsSynchronized property always returns false, and SyncRoot always returns null.  Thus you should not attempt to use these properties for synchronization purposes. Note that since the concurrent collections also may have different operations than the traditional data structures you may be used to.  Now you may ask why they did this, but it was done out of necessity to keep operations safe and atomic.  For example, in order to do a Pop() on a stack you have to know the stack is non-empty, but between the time you check the stack’s IsEmpty property and then do the Pop() another thread may have come in and made the stack empty!  This is why some of the traditional operations have been changed to make them safe for concurrent use. In addition, some properties and methods in the concurrent collections achieve concurrency by creating a snapshot of the collection, which means that some operations that were traditionally O(1) may now be O(n) in the concurrent models.  I’ll try to point these out as we talk about each collection so you can be aware of any potential performance impacts.  Finally, all the concurrent containers are safe for enumeration even while being modified, but some of the containers support this in different ways (snapshot vs. dirty iteration).  Once again I’ll highlight how thread-safe enumeration works for each collection. ConcurrentStack<T>: The thread-safe LIFO container The ConcurrentStack<T> is the thread-safe counterpart to the System.Collections.Generic.Stack<T>, which as you may remember is your standard last-in-first-out container.  If you think of algorithms that favor stack usage (for example, depth-first searches of graphs and trees) then you can see how using a thread-safe stack would be of benefit. The ConcurrentStack<T> achieves thread-safe access by using System.Threading.Interlocked operations.  This means that the multi-threaded access to the stack requires no traditional locking and is very, very fast! For the most part, the ConcurrentStack<T> behaves like it’s Stack<T> counterpart with a few differences: Pop() was removed in favor of TryPop() Returns true if an item existed and was popped and false if empty. PushRange() and TryPopRange() were added Allows you to push multiple items and pop multiple items atomically. Count takes a snapshot of the stack and then counts the items. This means it is a O(n) operation, if you just want to check for an empty stack, call IsEmpty instead which is O(1). ToArray() and GetEnumerator() both also take snapshots. This means that iteration over a stack will give you a static view at the time of the call and will not reflect updates. Pushing on a ConcurrentStack<T> works just like you’d expect except for the aforementioned PushRange() method that was added to allow you to push a range of items concurrently. 1: var stack = new ConcurrentStack<string>(); 2:  3: // adding to stack is much the same as before 4: stack.Push("First"); 5:  6: // but you can also push multiple items in one atomic operation (no interleaves) 7: stack.PushRange(new [] { "Second", "Third", "Fourth" }); For looking at the top item of the stack (without removing it) the Peek() method has been removed in favor of a TryPeek().  This is because in order to do a peek the stack must be non-empty, but between the time you check for empty and the time you execute the peek the stack contents may have changed.  Thus the TryPeek() was created to be an atomic check for empty, and then peek if not empty: 1: // to look at top item of stack without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (stack.TryPeek(out item)) 5: { 6: Console.WriteLine("Top item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Stack was empty."); 11: } Finally, to remove items from the stack, we have the TryPop() for single, and TryPopRange() for multiple items.  Just like the TryPeek(), these operations replace Pop() since we need to ensure atomically that the stack is non-empty before we pop from it: 1: // to remove items, use TryPop or TryPopRange to get multiple items atomically (no interleaves) 2: if (stack.TryPop(out item)) 3: { 4: Console.WriteLine("Popped " + item); 5: } 6:  7: // TryPopRange will only pop up to the number of spaces in the array, the actual number popped is returned. 8: var poppedItems = new string[2]; 9: int numPopped = stack.TryPopRange(poppedItems); 10:  11: foreach (var theItem in poppedItems.Take(numPopped)) 12: { 13: Console.WriteLine("Popped " + theItem); 14: } Finally, note that as stated before, GetEnumerator() and ToArray() gets a snapshot of the data at the time of the call.  That means if you are enumerating the stack you will get a snapshot of the stack at the time of the call.  This is illustrated below: 1: var stack = new ConcurrentStack<string>(); 2:  3: // adding to stack is much the same as before 4: stack.Push("First"); 5:  6: var results = stack.GetEnumerator(); 7:  8: // but you can also push multiple items in one atomic operation (no interleaves) 9: stack.PushRange(new [] { "Second", "Third", "Fourth" }); 10:  11: while(results.MoveNext()) 12: { 13: Console.WriteLine("Stack only has: " + results.Current); 14: } The only item that will be printed out in the above code is "First" because the snapshot was taken before the other items were added. This may sound like an issue, but it’s really for safety and is more correct.  You don’t want to enumerate a stack and have half a view of the stack before an update and half a view of the stack after an update, after all.  In addition, note that this is still thread-safe, whereas iterating through a non-concurrent collection while updating it in the old collections would cause an exception. ConcurrentQueue<T>: The thread-safe FIFO container The ConcurrentQueue<T> is the thread-safe counterpart of the System.Collections.Generic.Queue<T> class.  The concurrent queue uses an underlying list of small arrays and lock-free System.Threading.Interlocked operations on the head and tail arrays.  Once again, this allows us to do thread-safe operations without the need for heavy locks! The ConcurrentQueue<T> (like the ConcurrentStack<T>) has some departures from the non-concurrent counterpart.  Most notably: Dequeue() was removed in favor of TryDequeue(). Returns true if an item existed and was dequeued and false if empty. Count does not take a snapshot It subtracts the head and tail index to get the count.  This results overall in a O(1) complexity which is quite good.  It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. ToArray() and GetEnumerator() both take snapshots. This means that iteration over a queue will give you a static view at the time of the call and will not reflect updates. The Enqueue() method on the ConcurrentQueue<T> works much the same as the generic Queue<T>: 1: var queue = new ConcurrentQueue<string>(); 2:  3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: queue.Enqueue("Second"); 6: queue.Enqueue("Third"); For front item access, the TryPeek() method must be used to attempt to see the first item if the queue.  There is no Peek() method since, as you’ll remember, we can only peek on a non-empty queue, so we must have an atomic TryPeek() that checks for empty and then returns the first item if the queue is non-empty. 1: // to look at first item in queue without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (queue.TryPeek(out item)) 5: { 6: Console.WriteLine("First item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Queue was empty."); 11: } Then, to remove items you use TryDequeue().  Once again this is for the same reason we have TryPeek() and not Peek(): 1: // to remove items, use TryDequeue. If queue is empty returns false. 2: if (queue.TryDequeue(out item)) 3: { 4: Console.WriteLine("Dequeued first item " + item); 5: } Just like the concurrent stack, the ConcurrentQueue<T> takes a snapshot when you call ToArray() or GetEnumerator() which means that subsequent updates to the queue will not be seen when you iterate over the results.  Thus once again the code below will only show the first item, since the other items were added after the snapshot. 1: var queue = new ConcurrentQueue<string>(); 2:  3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5:  6: var iterator = queue.GetEnumerator(); 7:  8: queue.Enqueue("Second"); 9: queue.Enqueue("Third"); 10:  11: // only shows First 12: while (iterator.MoveNext()) 13: { 14: Console.WriteLine("Dequeued item " + iterator.Current); 15: } Using collections concurrently You’ll notice in the examples above I stuck to using single-threaded examples so as to make them deterministic and the results obvious.  Of course, if we used these collections in a truly multi-threaded way the results would be less deterministic, but would still be thread-safe and with no locking on your part required! For example, say you have an order processor that takes an IEnumerable<Order> and handles each other in a multi-threaded fashion, then groups the responses together in a concurrent collection for aggregation.  This can be done easily with the TPL’s Parallel.ForEach(): 1: public static IEnumerable<OrderResult> ProcessOrders(IEnumerable<Order> orderList) 2: { 3: var proxy = new OrderProxy(); 4: var results = new ConcurrentQueue<OrderResult>(); 5:  6: // notice that we can process all these in parallel and put the results 7: // into our concurrent collection without needing any external locking! 8: Parallel.ForEach(orderList, 9: order => 10: { 11: var result = proxy.PlaceOrder(order); 12:  13: results.Enqueue(result); 14: }); 15:  16: return results; 17: } Summary Obviously, if you do not need multi-threaded safety, you don’t need to use these collections, but when you do need multi-threaded collections these are just the ticket! The plethora of features (I always think of the movie The Three Amigos when I say plethora) built into these containers and the amazing way they acheive thread-safe access in an efficient manner is wonderful to behold. Stay tuned next week where we’ll continue our discussion with the ConcurrentBag<T> and the ConcurrentDictionary<TKey,TValue>. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here.   Tweet Technorati Tags: C#,.NET,Concurrent Collections,Collections,Multi-Threading,Little Wonders,BlackRabbitCoder,James Michael Hare

    Read the article

  • Developing web apps using ASP.NET MVC 3, Razor and EF Code First - Part 1

    - by shiju
    In this post, I will demonstrate web application development using ASP. NET MVC 3, Razor and EF code First. This post will also cover Dependency Injection using Unity 2.0 and generic Repository and Unit of Work for EF Code First. The following frameworks will be used for this step by step tutorial. ASP.NET MVC 3 EF Code First CTP 5 Unity 2.0 Define Domain Model Let’s create domain model for our simple web application Category class public class Category {     public int CategoryId { get; set; }     [Required(ErrorMessage = "Name Required")]     [StringLength(25, ErrorMessage = "Must be less than 25 characters")]     public string Name { get; set;}     public string Description { get; set; }     public virtual ICollection<Expense> Expenses { get; set; } }   Expense class public class Expense {             public int ExpenseId { get; set; }            public string  Transaction { get; set; }     public DateTime Date { get; set; }     public double Amount { get; set; }     public int CategoryId { get; set; }     public virtual Category Category { get; set; } } We have two domain entities - Category and Expense. A single category contains a list of expense transactions and every expense transaction should have a Category. In this post, we will be focusing on CRUD operations for the entity Category and will be working on the Expense entity with a View Model object in the later post. And the source code for this application will be refactored over time. The above entities are very simple POCO (Plain Old CLR Object) classes and the entity Category is decorated with validation attributes in the System.ComponentModel.DataAnnotations namespace. Now we want to use these entities for defining model objects for the Entity Framework 4. Using the Code First approach of Entity Framework, we can first define the entities by simply writing POCO classes without any coupling with any API or database library. This approach lets you focus on domain model which will enable Domain-Driven Development for applications. EF code first support is currently enabled with a separate API that is runs on top of the Entity Framework 4. EF Code First is reached CTP 5 when I am writing this article. Creating Context Class for Entity Framework We have created our domain model and let’s create a class in order to working with Entity Framework Code First. For this, you have to download EF Code First CTP 5 and add reference to the assembly EntitFramework.dll. You can also use NuGet to download add reference to EEF Code First.    public class MyFinanceContext : DbContext {     public MyFinanceContext() : base("MyFinance") { }     public DbSet<Category> Categories { get; set; }     public DbSet<Expense> Expenses { get; set; }         }   The above class MyFinanceContext is derived from DbContext that can connect your model classes to a database. The MyFinanceContext class is mapping our Category and Expense class into database tables Categories and Expenses using DbSet<TEntity> where TEntity is any POCO class. When we are running the application at first time, it will automatically create the database. EF code-first look for a connection string in web.config or app.config that has the same name as the dbcontext class. If it is not find any connection string with the convention, it will automatically create database in local SQL Express database by default and the name of the database will be same name as the dbcontext class. You can also define the name of database in constructor of the the dbcontext class. Unlike NHibernate, we don’t have to use any XML based mapping files or Fluent interface for mapping between our model and database. The model classes of Code First are working on the basis of conventions and we can also use a fluent API to refine our model. The convention for primary key is ‘Id’ or ‘<class name>Id’.  If primary key properties are detected with type ‘int’, ‘long’ or ‘short’, they will automatically registered as identity columns in the database by default. Primary key detection is not case sensitive. We can define our model classes with validation attributes in the System.ComponentModel.DataAnnotations namespace and it automatically enforces validation rules when a model object is updated or saved. Generic Repository for EF Code First We have created model classes and dbcontext class. Now we have to create generic repository pattern for data persistence with EF code first. If you don’t know about the repository pattern, checkout Martin Fowler’s article on Repository Let’s create a generic repository to working with DbContext and DbSet generics. public interface IRepository<T> where T : class     {         void Add(T entity);         void Delete(T entity);         T GetById(long Id);         IEnumerable<T> All();     }   RepositoryBasse – Generic Repository class public abstract class RepositoryBase<T> where T : class { private MyFinanceContext database; private readonly IDbSet<T> dbset; protected RepositoryBase(IDatabaseFactory databaseFactory) {     DatabaseFactory = databaseFactory;     dbset = Database.Set<T>(); }   protected IDatabaseFactory DatabaseFactory {     get; private set; }   protected MyFinanceContext Database {     get { return database ?? (database = DatabaseFactory.Get()); } } public virtual void Add(T entity) {     dbset.Add(entity);            }        public virtual void Delete(T entity) {     dbset.Remove(entity); }   public virtual T GetById(long id) {     return dbset.Find(id); }   public virtual IEnumerable<T> All() {     return dbset.ToList(); } }   DatabaseFactory class public class DatabaseFactory : Disposable, IDatabaseFactory {     private MyFinanceContext database;     public MyFinanceContext Get()     {         return database ?? (database = new MyFinanceContext());     }     protected override void DisposeCore()     {         if (database != null)             database.Dispose();     } } Unit of Work If you are new to Unit of Work pattern, checkout Fowler’s article on Unit of Work . According to Martin Fowler, the Unit of Work pattern "maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems." Let’s create a class for handling Unit of Work   public interface IUnitOfWork {     void Commit(); }   UniOfWork class public class UnitOfWork : IUnitOfWork {     private readonly IDatabaseFactory databaseFactory;     private MyFinanceContext dataContext;       public UnitOfWork(IDatabaseFactory databaseFactory)     {         this.databaseFactory = databaseFactory;     }       protected MyFinanceContext DataContext     {         get { return dataContext ?? (dataContext = databaseFactory.Get()); }     }       public void Commit()     {         DataContext.Commit();     } }   The Commit method of the UnitOfWork will call the commit method of MyFinanceContext class and it will execute the SaveChanges method of DbContext class.   Repository class for Category In this post, we will be focusing on the persistence against Category entity and will working on other entities in later post. Let’s create a repository for handling CRUD operations for Category using derive from a generic Repository RepositoryBase<T>.   public class CategoryRepository: RepositoryBase<Category>, ICategoryRepository     {     public CategoryRepository(IDatabaseFactory databaseFactory)         : base(databaseFactory)         {         }                } public interface ICategoryRepository : IRepository<Category> { } If we need additional methods than generic repository for the Category, we can define in the CategoryRepository. Dependency Injection using Unity 2.0 If you are new to Inversion of Control/ Dependency Injection or Unity, please have a look on my articles at http://weblogs.asp.net/shijuvarghese/archive/tags/IoC/default.aspx. I want to create a custom lifetime manager for Unity to store container in the current HttpContext.   public class HttpContextLifetimeManager<T> : LifetimeManager, IDisposable {     public override object GetValue()     {         return HttpContext.Current.Items[typeof(T).AssemblyQualifiedName];     }     public override void RemoveValue()     {         HttpContext.Current.Items.Remove(typeof(T).AssemblyQualifiedName);     }     public override void SetValue(object newValue)     {         HttpContext.Current.Items[typeof(T).AssemblyQualifiedName] = newValue;     }     public void Dispose()     {         RemoveValue();     } }   Let’s create controller factory for Unity in the ASP.NET MVC 3 application. public class UnityControllerFactory : DefaultControllerFactory { IUnityContainer container; public UnityControllerFactory(IUnityContainer container) {     this.container = container; } protected override IController GetControllerInstance(RequestContext reqContext, Type controllerType) {     IController controller;     if (controllerType == null)         throw new HttpException(                 404, String.Format(                     "The controller for path '{0}' could not be found" +     "or it does not implement IController.",                 reqContext.HttpContext.Request.Path));       if (!typeof(IController).IsAssignableFrom(controllerType))         throw new ArgumentException(                 string.Format(                     "Type requested is not a controller: {0}",                     controllerType.Name),                     "controllerType");     try     {         controller= container.Resolve(controllerType) as IController;     }     catch (Exception ex)     {         throw new InvalidOperationException(String.Format(                                 "Error resolving controller {0}",                                 controllerType.Name), ex);     }     return controller; }   }   Configure contract and concrete types in Unity Let’s configure our contract and concrete types in Unity for resolving our dependencies.   private void ConfigureUnity() {     //Create UnityContainer               IUnityContainer container = new UnityContainer()                 .RegisterType<IDatabaseFactory, DatabaseFactory>(new HttpContextLifetimeManager<IDatabaseFactory>())     .RegisterType<IUnitOfWork, UnitOfWork>(new HttpContextLifetimeManager<IUnitOfWork>())     .RegisterType<ICategoryRepository, CategoryRepository>(new HttpContextLifetimeManager<ICategoryRepository>());                 //Set container for Controller Factory                ControllerBuilder.Current.SetControllerFactory(             new UnityControllerFactory(container)); }   In the above ConfigureUnity method, we are registering our types onto Unity container with custom lifetime manager HttpContextLifetimeManager. Let’s call ConfigureUnity method in the Global.asax.cs for set controller factory for Unity and configuring the types with Unity.   protected void Application_Start() {     AreaRegistration.RegisterAllAreas();     RegisterGlobalFilters(GlobalFilters.Filters);     RegisterRoutes(RouteTable.Routes);     ConfigureUnity(); }   Developing web application using ASP.NET MVC 3 We have created our domain model for our web application and also have created repositories and configured dependencies with Unity container. Now we have to create controller classes and views for doing CRUD operations against the Category entity. Let’s create controller class for Category Category Controller   public class CategoryController : Controller {     private readonly ICategoryRepository categoryRepository;     private readonly IUnitOfWork unitOfWork;           public CategoryController(ICategoryRepository categoryRepository, IUnitOfWork unitOfWork)     {         this.categoryRepository = categoryRepository;         this.unitOfWork = unitOfWork;     }       public ActionResult Index()     {         var categories = categoryRepository.All();         return View(categories);     }     [HttpGet]     public ActionResult Edit(int id)     {         var category = categoryRepository.GetById(id);         return View(category);     }       [HttpPost]     public ActionResult Edit(int id, FormCollection collection)     {         var category = categoryRepository.GetById(id);         if (TryUpdateModel(category))         {             unitOfWork.Commit();             return RedirectToAction("Index");         }         else return View(category);                 }       [HttpGet]     public ActionResult Create()     {         var category = new Category();         return View(category);     }           [HttpPost]     public ActionResult Create(Category category)     {         if (!ModelState.IsValid)         {             return View("Create", category);         }                     categoryRepository.Add(category);         unitOfWork.Commit();         return RedirectToAction("Index");     }       [HttpPost]     public ActionResult Delete(int  id)     {         var category = categoryRepository.GetById(id);         categoryRepository.Delete(category);         unitOfWork.Commit();         var categories = categoryRepository.All();         return PartialView("CategoryList", categories);       }        }   Creating Views in Razor Now we are going to create views in Razor for our ASP.NET MVC 3 application.  Let’s create a partial view CategoryList.cshtml for listing category information and providing link for Edit and Delete operations. CategoryList.cshtml @using MyFinance.Helpers; @using MyFinance.Domain; @model IEnumerable<Category>      <table>         <tr>         <th>Actions</th>         <th>Name</th>          <th>Description</th>         </tr>     @foreach (var item in Model) {             <tr>             <td>                 @Html.ActionLink("Edit", "Edit",new { id = item.CategoryId })                 @Ajax.ActionLink("Delete", "Delete", new { id = item.CategoryId }, new AjaxOptions { Confirm = "Delete Expense?", HttpMethod = "Post", UpdateTargetId = "divCategoryList" })                           </td>             <td>                 @item.Name             </td>             <td>                 @item.Description             </td>         </tr>          }       </table>     <p>         @Html.ActionLink("Create New", "Create")     </p> The delete link is providing Ajax functionality using the Ajax.ActionLink. This will call an Ajax request for Delete action method in the CategoryCotroller class. In the Delete action method, it will return Partial View CategoryList after deleting the record. We are using CategoryList view for the Ajax functionality and also for Index view using for displaying list of category information. Let’s create Index view using partial view CategoryList  Index.chtml @model IEnumerable<MyFinance.Domain.Category> @{     ViewBag.Title = "Index"; }    <h2>Category List</h2>    <script src="@Url.Content("~/Scripts/jquery.unobtrusive-ajax.min.js")" type="text/javascript"></script>    <div id="divCategoryList">               @Html.Partial("CategoryList", Model) </div>   We can call the partial views using Html.Partial helper method. Now we are going to create View pages for insert and update functionality for the Category. Both view pages are sharing common user interface for entering the category information. So I want to create an EditorTemplate for the Category information. We have to create the EditorTemplate with the same name of entity object so that we can refer it on view pages using @Html.EditorFor(model => model) . So let’s create template with name Category. Let’s create view page for insert Category information   @model MyFinance.Domain.Category   @{     ViewBag.Title = "Save"; }   <h2>Create</h2>   <script src="@Url.Content("~/Scripts/jquery.validate.min.js")" type="text/javascript"></script> <script src="@Url.Content("~/Scripts/jquery.validate.unobtrusive.min.js")" type="text/javascript"></script>   @using (Html.BeginForm()) {     @Html.ValidationSummary(true)     <fieldset>         <legend>Category</legend>                @Html.EditorFor(model => model)               <p>             <input type="submit" value="Create" />         </p>     </fieldset> }   <div>     @Html.ActionLink("Back to List", "Index") </div> ViewStart file In Razor views, we can add a file named _viewstart.cshtml in the views directory  and this will be shared among the all views with in the Views directory. The below code in the _viewstart.cshtml, sets the Layout page for every Views in the Views folder.      @{     Layout = "~/Views/Shared/_Layout.cshtml"; }   Source Code You can download the source code from http://efmvc.codeplex.com/ . The source will be refactored on over time.   Summary In this post, we have created a simple web application using ASP.NET MVC 3 and EF Code First. We have discussed on technologies and practices such as ASP.NET MVC 3, Razor, EF Code First, Unity 2, generic Repository and Unit of Work. In my later posts, I will modify the application and will be discussed on more things. Stay tuned to my blog  for more posts on step by step application building.

    Read the article

  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.  Query tuning is not complete as soon as the query returns results quickly in the development or test environments.  In production, your query will compete for memory, CPU, locks, I/O and other resources on the server.  Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data.  In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row.  The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type.  A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL,   CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL,   CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table.  The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results.  This seems slow, considering that the table only has 50,000 rows.  Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time.  Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring.  The script below monitors STATISTICS IO output and the amount of tempdb used by the test query.  We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB.  The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row.  With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb.  The small remaining inefficiency is in reading the id column values from the clustered primary key index.  As a clustered index, it contains all the in-row data at its leaf.  The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead.  The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure.  This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only.  This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data.  The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings.  It isn’t about the internal differences between TEXT and MAX data types either.  It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.  No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance.  Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows.  5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is!  If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb.  Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough.  Tuning for logical reads and adding covering indexes is not enough.  If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on.  The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008.  They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator.  Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

    Read the article

  • CodePlex Daily Summary for Saturday, October 29, 2011

    CodePlex Daily Summary for Saturday, October 29, 2011Popular Releasespatterns & practices: Enterprise Library Contrib: Enterprise Library Contrib - 5.0 (Oct 2011): This release of Enterprise Library Contrib is based on the Microsoft patterns & practices Enterprise Library 5.0 core and contains the following: Common extensionsTypeConfigurationElement<T> - A Polymorphic Configuration Element without having to be part of a PolymorphicConfigurationElementCollection. AnonymousConfigurationElement - A Configuration element that can be uniquely identified without having to define its name explicitly. Data Access Application Block extensionsMySql Provider - ...Network Monitor Open Source Parsers: Network Monitor Parsers 3.4.2748: The Network Monitor Parsers packages contain parsers for more than 400 network protocols, including RFC based public protocols and protocols for Microsoft products defined in the Microsoft Open Specifications for Windows and SQL Server. NetworkMonitor_Parsers.msi is the base parser package which defines parsers for commonly used public protocols and protocols for Microsoft Windows. In this release, NetowrkMonitor_Parsers.msi continues to improve quality and fix bugs. It has included the fo...Duckworth Lewis Professional Edition Calculator: DLcalc 3.0: DLcalc 3.0 can perform Duckworth/Lewis Professional Edition calculations 100% accurately. It also produces over-by-over and ball-by-ball PAR score tables.Folder Bookmarks: Folder Bookmarks 2.2.0.1: In this version: Custom Icons - now you can change the icons of the bookmarks. By default, whenever an image is added, the icon is automatically changed to a thumbnail of the picture. This can be turned off in the settings (Options... > Settings) Ability to remove items from the 'Recent' category Bugfixes - 'Choose' button in 'Edit Bookmark' now works Another bug fix: another problem in the 'Edit Bookmark' windowMedia Companion: MC 3.420b Weekly: Ensure .NET 4.0 Full Framework is installed. (Available from http://www.microsoft.com/download/en/details.aspx?id=17718) Ensure the NFO ID fix is applied when transitioning from versions prior to 3.416b. (Details here) Movies Fixed: Fanart and poster scraping issues TV Shows (Re)Added: Rebuild single show Fixed: Issue when shows are moved from original location Ability to handle " for actor nicknames Crash when episode name contains "<" (does not scrape yet) Clears fanart when switch...patterns & practices - Unity: Unity 3.0 for .NET4.5 Preview: The Unity 3.0.1026.0 Preview enables Unity to work on .NET 4.5 with both the WinRT and desktop profiles. The major changes include: Unity projects updated to target .NET 4.5. Dynamic build plans modified to use compiled lambda expressions instead of Reflection.Emit Converting reflection to use the new TypeInfo for reflection. Projects updated to work with the Microsoft Visual Studio 2011 Preview Notes/Known Issues: The Microsoft.Practices.Unity.UnityServiceLocator class cannot be use...Managed Extensibility Framework: MEF 2 Preview 4: Detailed information on this release is available on the BCL team blog.Image Converter: Image Converter 0.3: New Features: - English and German support Technical Improvements: - Microsoft All Rules using Code Analysis Planned Features for future release: 1. Unit testing 2. Command line interface 3. Automatic UpdatesAcDown????? - Anime&Comic Downloader: AcDown????? v3.6: ?? ● AcDown??????????、??????,??????????????????????,???????Acfun、Bilibili、???、???、???、Tucao.cc、SF???、?????80????,???????????、?????????。 ● AcDown???????????????????????????,???,???????????????????。 ● AcDown???????C#??,????.NET Framework 2.0??。?????"Acfun?????"。 ????32??64? Windows XP/Vista/7 ????????????? ??:????????Windows XP???,?????????.NET Framework 2.0???(x86)?.NET Framework 2.0???(x64),?????"?????????"??? ??????????????,??????????: ??"AcDown?????"????????? ?? v3.6?? ??“????”...DotNetNuke® Events: 05.02.01: This release fixes any know bugs from any previous version. Events 05.02.01 will work for any DNN version 5.5.0 and up. Full details on the changes can be found at http://dnnevents.codeplex.com/workitem/list/basic Please review and rate this release... (stars are welcome)BUG FIXESAdded validation around category cookie RSS feed was missing an explicit close of the file when writing. Fixed. Added extra security into detail view .ICS Files did not include correct line folding. Fixed Cha...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.33: Add JSParser.ParseExpression method to parse JavaScript expressions rather than source-elements. Add -strict switch (CodeSettings.StrictMode) to force input code to ECMA5 Strict-mode (extra error-checking, "use strict" at top). Fixed bug when MinifyCode setting was set to false but RemoveUnneededCode was left it's default value of true.Path Copy Copy: 8.0: New version that mostly adds lots of requested features: 11340 11339 11338 11337 This version also features a more elaborate Settings UI that has several tabs. I tried to add some notes to better explain the use and purpose of the various options. The Path Copy Copy documentation is also on the way, both to explain how to develop custom plugins and to explain how to pre-configure options if you're a network admin. Stay tuned.MVC Controls Toolkit: Mvc Controls Toolkit 1.5.0: Added: The new Client Blocks feaure of Views A new "move" js method for the TreeViews The NewHtmlCreated js event to the DataGrid Improved the ChoiceList structure that now allows also the selection list of a dropdown to be chosen with a lambda expression Improved the AcceptViewHintAttribute controller filter. Now a client can specify not only the name of a View or Partial View it prefers, but also to receive just the rough data in Json format. Fixed: Issue with partial thrust Cl...Free SharePoint Master Pages: Buried Alive (Halloween) Theme: Release Notes *Created for Halloween, you will find theme file, custom css file and images. *Created by Al Roome @AlstarRoome Features: Custom styling for web part Custom background *Screenshot https://s3.amazonaws.com/kkhipple/post/sharepoint-showcase-halloween.pngDevForce Application Framework: DevForce AF 2.0.3 RTW: PrerequisitesWPF 4.0 Silverlight 4.0 DevForce 2010 6.1.3.1 Download ContentsDebug and Release Assemblies API Documentation Source code License.txt Requirements.txt Release HighlightsNew: EventAggregator event forwarding New: EntityManagerInterceptor<T> to intercept EntityManger events New: IHarnessAware to allow for ViewModel setup when executed inside of the Development Harness New: Improved design time stability New: Support for add-in development New: CoroutineFns.To...NicAudio: NicAudio 2.0.5: Minor change to accept special DTS stereo modes (LtRt, AB,...)NDepend TFS 2010 integration: version 0.5.0 beta 1: Only the activity and the VS plugin are avalaible right now. They basically work. Data types that are logged into tfs reports are subject to change. This is no big deal since data is not yet sent into the warehouse.Windows Azure Toolkit for Windows Phone: Windows Azure Toolkit for Windows Phone v1.3.1: Upgraded Windows Azure projects to Windows Azure Tools for Microsoft Visual Studio 2010 1.5 – September 2011 Upgraded the tools tools to support the Windows Phone Developer Tools RTW Update SQL Azure only scenarios to use ASP.NET Universal Providers (through the System.Web.Providers v1.0.1 NuGet package) Changed Shared Access Signature service interface to support more operations Refactored Blobs API to have a similar interface and usage to that provided by the Windows Azure SDK Stor...DotNetNuke® FAQ: 05.00.00: FAQ (Frequently Asked Questions) 05.00.00 will work for any DNN version 5.6.1 and up. It is the first version which is rewritten in C#. The scope of this update is to fix all known issues and improve user interface. Please review and rate this release... (stars are welcome)BUG FIXESManage Categories button text was not localized Edit/Add FAQ Entry: button text was not localized ENHANCEMENTSAdded an option to select the control for category display: Listbox with checkboxes (flat category ...SiteMap Editor for Microsoft Dynamics CRM 2011: SiteMap Editor (1.0.921.340): Added CodePlex and PayPal links New iconNew ProjectsAsynk: Asynk is a framework/application that allows existing applications to easily be extended with an offloaded asynchronous worker layer. Asynk is developed using C#.Blob Tower Defense: 3D tower defense game for Windows Phone 7. School project for Brno University of Technology, computer graphics class.Booz: Booz is... An extended version of the boo shell (booish2 to be precise). Offers additional commands like cd, md, ls etc. I hope this shell can be used to take the position of/surpass the native windows shell in the near future.CIMS: a sanction infomation system for sencience and technology of hustCrystalDot - Icon Collection / Pack (LGPL): .Net / Mono freundliche Varainte der Crystal-Icons von Everaldo Icon collection / pack for .NET and Mono designed by Everaldo - KDE style http://www.everaldo.com/crystal/dotetes: dotetes adalah teka teki silang tool dikembangkan dengan bahasa c#Emoe': This Project is a Windows Phone 7.1 application.Equation Inversion: Visual Studion 2008 Add-in for equation inversions.Exploring VMR Features on WEC7: This is the sample application helps you to do alpha blending the bitmap on camera streaming in Windows Embedded Compact 7 using Directshow video Renderer (VMR). It is a VS2008 based smart device project developed on C++. I have explained the sample application in the following blog link. http://www.e-consystems.com/blog/windowsce/?p=759 EzValidation: Custom validation extensions for ASP.NET MVC 3. Includes server and client side model based validation attributes for: -- Equal To -- Not Equal To -- Greater Than -- Greater Than or Equal To -- Less Than -- Less Than or Equal To Supports validating against: -- Another Model Field -- A Specific Value -- Current Date/Yesterday/Tomorrow (for Dates and Strings) Download & Install via NuGet "package-install ezvalidation"Flu.net: Flu.net is a tool that helps you creating your own fluent syntax for .NET Framework applications in a declarative fashion. It is aimed for infrastructures and other open-source projects use.For Chess Endgames: King vs. King Opposition Calculator: You must input the locations of 2 kings on a chessboard, and whose turn it is to move. The calculator will display which king has the opposition, and how it can be used or maintained.GameTrakXNA: This project aims to create a simple library to use the unique GameTrak controller within XNA and Flash.Google Speech Recognition Example: Google Speech Recognition contains a working example of application that uses google speech recognition API. App contains all necessary dlls to record, decode and send your voice request to google service and recieve a text representation of what you've said. It's developed in C#Interval Mandelbrot Explorer: Explore the Mandelbrot set using interval arithmetic.ISD training tasks: ISD training examples and tasksiTunesControlBar: The iTunesControlBar helps user control their iTunes Application while it is minimized. iTunesControlBar resides at the top of the screen, invisible when not used, and allows playback and volume control, library searches and media information without the need to bring up iTunes.iTurtle: A bunch of Powerscripts to automate server management in AD environment.M26WC - Mono 2.6 Wizard Control: Wizard which runs under Mono2.6 A fork of: http://aerowizard.codeplex.com/Microsoft Help Viewer 2: Help Viewer 2 is the help runtime for both Visual Studio 11 help and Windows 8 help. The code in this project will help you use and understand the HV2 runtime API.MONTRASEC: Monitoring Trafficking in human beings and Sexual Exploitation of Children: benchmarking for member state and EU reporting, turning the SIAMSECT templates into a user-friendly interface and reporting tool. MTF.NET Runtime: Managed Task Framework .NET Runtime The MTF.NET runtime software and resulting assemblies are required to run applications built using the Managed Task Framework.NET Professional (Visual Studio 2010 extension) software design editor. The MTF.NET team are committed to continuously improving the core MTF.NET runtime and ensuring it is always available free and fully transparent. Pandoras Box: A greenfield inversion of control project utilising the power and flexibility of expressions and preferring convention over configuration.Pass the Puzzle: Pass the Puzzle is a frantic word-guessing party game. The game displays a few letters, and the players must come up with words containing those letters. But beware: if the timer goes off, you lose! It is based on the folk party game Pass the Parcel and is written in C#.PerCiGal: Percigal is a project for the development of applications for managing your personal media library. It consists in - a windows application to use at home to catalog movies, TV series, cast and books, with the support of the Internet for information retrieval; - a web interface for viewing and cataloging everywhere your media; - an application for smartphones. Project Flying Carpet: Este jogo é um projeto para a cadeira Projeto de Jogos: Motores Jogos do curso de Jogos Digitais da Unisinos.proxy browser: sed leo Latin's Butterfly....Python Multiple Dispatch: Multiple dispatch (AKA multimethods) for Python 3 via a metaclass and type annotations.reDune: ?????????? ???? ? ????? «????????? ? ???????? ???????». ???????? ?? Dune2000 ?? Westwood ? Electronic Arts.Rereadable: Keep page from internet for read it latter.ServStop: ServStop is a .NET application that makes it easy to stop several system services at once. Now you don't have to change startup types or stop them one at a time. It has a simple list-based interface with the ability to save and load lists of user services to stop. Written in C#.SharePoint 2010 Audience Membership Workflow Activity (Full Trust): A simple SharePoint 2010 workflow activity / workflow condition to check whether the user initiating the workflow is a member of a specified audience. Farm-level .wsp solution, written in C#. Once installed, the workflow activity can be used in SharePoint Designer 2010 declarative workflows.SQL Server® to Firebird DB converter: Converts Microsoft SQL Server® database into Firebird database including entire structure and datastegitest: test projectSystem.Threading.Joins: The Joins project provides asynchronous concurrency semantics based on join calculus and modeled after the Microsoft Research C? (C Omega) project.TestAndroidGame: try dev a TestAndroidGametetribricks: block game Topographic Explorer: A project to import, convert, explore, manipulate, and save topographical maps. Looking to use C# and WPF.Trading: Under construction!!!Trombone: Trombone makes it easier for Windows Mobile Professional users to automate status reply through SMS. It's developed in Visual C# 2008.Tulsa SharePoint Interest Group: Repository for source code for the Tulsa SharePoint Interest Group's web site. The Tulsa SharePoint Interest Group is using the Community Kit for SharePoint. This project will house any modifications that are specific to our user group.World of Tanks RU tiny stats collection utilty.: Tiny utility to load players stats for World of Tanks RU server. Results saved to comma separated file.WS-Discovery Proxy: Attempt at creating general purpose WS-Discovery Proxy.Yamaha Tu?n Tr?c: This application is used to manage information for Yamaha Tu?n Tr?c

    Read the article

  • Why should you choose Oracle WebLogic 12c instead of JBoss EAP 6?

    - by Ricardo Ferreira
    In this post, I will cover some technical differences between Oracle WebLogic 12c and JBoss EAP 6, which was released a couple days ago from Red Hat. This article claims to help you in the evaluation of key points that you should consider when choosing for an Java EE application server. In the following sections, I will present to you some important aspects that most customers ask us when they are seriously evaluating for an middleware infrastructure, specially if you are considering JBoss for some reason. I would suggest that you keep the following question in mind while you are reading the points: "Why should I choose JBoss instead of WebLogic?" 1) Multi Datacenter Deployment and Clustering - D/R ("Disaster & Recovery") architecture support is embedded on the WebLogic Server 12c product. JBoss EAP 6 on the other hand has no direct D/R support included, Red Hat relies on third-part tools with higher prices. When you consider a middleware solution to host your business critical application, you should worry with every architectural aspect that are related with the solution. Fail-over support is one little aspect of a truly reliable solution. If you do not worry about D/R, your solution will not be reliable. Having said that, with Red Hat and JBoss EAP 6, you have this extra cost that will increase considerably the total cost of ownership of the solution. As we commonly hear from analysts, open-source are not so cheaper when you start seeing the big picture. - WebLogic Server 12c supports advanced LAN clustering, detection of death servers and have a common alert framework. JBoss EAP 6 on the other hand has limited LAN clustering support with no server death detection. They do not generate any alerts when servers goes down (only if you buy JBoss ON which is a separated technology, but until now does not support JBoss EAP 6) and manual intervention are required when servers goes down. In most cases, admin people must rely on "kill -9", "tail -f someFile.log" and "ps ax | grep java" commands to manage failures and clustering anomalies. - WebLogic Server 12c supports the concept of Node Manager, which is a separated process that runs on the physical | virtual servers that allows extend the administration of the cluster to WebLogic managed servers that are often distributed across multiple machines and geographic locations. JBoss EAP 6 on the other hand has no equivalent technology. Whole server instances must be managed individually. - WebLogic Server 12c Node Manager supports Coherence to boost performance when managing servers. JBoss EAP 6 on the other hand has no similar technology. There is no way to coordinate JBoss and infiniband instances provided by JBoss using high throughput and low latency protocols like InfiniBand. The Node Manager feature also allows another very important feature that JBoss EAP lacks: secure the administration. When using WebLogic Node Manager, all the administration tasks are sent to the managed servers in a secure tunel protected by a certificate, which means that the transport layer that separates the WebLogic administration console from the managed servers are secured by SSL. - WebLogic Server 12c are now integrated with OTD ("Oracle Traffic Director") which is a web server technology derived from the former Sun iPlanet Web Server. This software complements the web server support offered by OHS ("Oracle HTTP Server"). Using OTD, WebLogic instances are load-balanced by a high powerful software that knows how to handle SDP ("Socket Direct Protocol") over InfiniBand, which boost performance when used with engineered systems technologies like Oracle Exalogic Elastic Cloud. JBoss EAP 6 on the other hand only offers support to Apache Web Server with custom modules created to deal with JBoss clusters, but only across standard TCP/IP networks.  2) Application and Runtime Diagnostics - WebLogic Server 12c have diagnostics capabilities embedded on the server called WLDF ("WebLogic Diagnostic Framework") so there is no need to rely on third-part tools. JBoss EAP 6 on the other hand has no diagnostics capabilities. Their only diagnostics tool is the log generated by the application server. Admin people are encouraged to analyse thousands of log lines to find out what is going on. - WebLogic Server 12c complement WLDF with JRockit MC ("Mission Control"), which provides to administrators and developers a complete insight about the JVM performance, behavior and possible bottlenecks. WebLogic Server 12c also have an classloader analysis tool embedded, and even a log analyzer tool that enables administrators and developers to view logs of multiple servers at the same time. JBoss EAP 6 on the other hand relies on third-part tools to do something similar. Again, only log searching are offered to find out whats going on. - WebLogic Server 12c offers end-to-end traceability and monitoring available through Oracle EM ("Enterprise Manager"), including monitoring of business transactions that flows through web servers, ESBs, application servers and database servers, all of this with high deep JVM analysis and diagnostics. JBoss EAP 6 on the other hand, even using JBoss ON ("Operations Network"), which is a separated technology, does not support those features. Red Hat relies on third-part tools to provide direct Oracle database traceability across JVMs. One of those tools are Oracle EM for non-Oracle middleware that manage JBoss, Tomcat, Websphere and IIS transparently. - WebLogic Server 12c with their JRockit support offers a tool called JRockit Flight Recorder, which can give developers a complete visibility of a certain period of application production monitoring with zero extra overhead. This automatic recording allows you to deep analyse threads latency, memory leaks, thread contention, resource utilization, stack overflow damages and GC ("Garbage Collection") cycles, to observe in real time stop-the-world phenomenons, generational, reference count and parallel collects and mutator threads analysis. JBoss EAP 6 don't even dream to support something similar, even because they don't have their own JVM. 3) Application Server Administration - WebLogic Server 12c offers a complete administration console complemented with scripting and macro-like recording capabilities. A single WebLogic console can managed up to hundreds of WebLogic servers belonging to the same domain. JBoss EAP 6 on the other hand has a limited console and provides a XML centric administration. JBoss, after ten years, started the development of a rudimentary centralized administration that still leave a lot of administration tasks aside, so admin people and developers must touch scripts and XML configuration files for most advanced and even simple administration tasks. This lead applications to error prone and risky deployments. Even using JBoss ON, JBoss EAP are not able to offer decent administration features for admin people which must be high skilled in JBoss internal architecture and its managing capabilities. - Oracle EM is available to manage multiple domains, databases, application servers, operating systems and virtualization, with a complete end-to-end visibility. JBoss ON does not provide management capabilities across the complete architecture, only basic monitoring. Even deployment must be done aside JBoss ON which does no integrate well with others softwares than JBoss. Until now, JBoss ON does not supports JBoss EAP 6, so even their minimal support for JBoss are not available for JBoss EAP 6 leaving customers uncovered and subject to high skilled JBoss admin people. - WebLogic Server 12c has the same administration model whatever is the topology selected by the customer. JBoss EAP 6 on the other hand differentiates between two operational models: standalone-mode and domain-mode, that are not consistent with each other. Depending on the mode used, the administration skill is different. - WebLogic Server 12c has no point-of-failures processes, and it does not need to define any specialized server. Domain model in WebLogic is available for years (at least ten years or more) and is production proven. JBoss EAP 6 on the other hand needs special processes to garantee JBoss integrity, the PC ("Process-Controller") and the HC ("Host-Controller"). Different from WebLogic, the domain model in JBoss is quite new (one year at tops) of maturity, and need to mature considerably until start doing things like WebLogic domain model does. - WebLogic Server 12c supports parallel deployment model which enables some artifacts being deployed at the same time. JBoss EAP 6 on the other hand does not have any similar feature. Every deployment are done atomically in the containers. This means that if you have a huge EAR (an EAR of 120 MB of size for instance) and deploy onto JBoss EAP 6, this EAR will take some minutes in order to starting accept thread requests. The same EAR deployed onto WebLogic Server 12c will reduce the deployment time at least in 2X compared to JBoss. 4) Support and Upgrades - WebLogic Server 12c has patch management available. JBoss EAP 6 on the other hand has no patch management available, each JBoss EAP instance should be patched manually. To achieve such feature, you need to buy a separated technology called JBoss ON ("Operations Network") that manage this type of stuff. But until now, JBoss ON does not support JBoss EAP 6 so, in practice, JBoss EAP 6 does not have this feature. - WebLogic Server 12c supports previuous WebLogic domains without any reconfiguration since its kernel is robust and mature since its creation in 1995. JBoss EAP 6 on the other hand has a proven lack of supportability between JBoss AS 4, 5, 6 and 7. Different kernels and messaging engines were implemented in JBoss stack in the last five years reveling their incapacity to create a well architected and proven middleware technology. - WebLogic Server 12c has patch prescription based on customer configuration. JBoss EAP 6 on the other hand has no such capability. People need to create ticket supports and have their installations revised by Red Hat support guys to gain some patch prescription from them. - Oracle WebLogic Server independent of the version has 8 years of support of new patches and has lifetime release of existing patches beyond that. JBoss EAP 6 on the other hand provides patches for a specific application server version up to 5 years after the release date. JBoss EAP 4 and previous versions had only 4 years. A good question that Red Hat will argue to answer is: "what happens when you find issues after year 5"?  5) RAC ("Real Application Clusters") Support - WebLogic Server 12c ships with a specific JDBC driver to leverage Oracle RAC clustering capabilities (Fast-Application-Notification, Transaction Affinity, Fast-Connection-Failover, etc). Oracle JDBC thin driver are also available. JBoss EAP 6 on the other hand ships only the standard Oracle JDBC thin driver. Load balancing with Oracle RAC are not supported. Manual intervention in case of planned or unplanned RAC downtime are necessary. In JBoss EAP 6, situation does not reestablish automatically after downtime. - WebLogic Server 12c has a feature called Active GridLink for Oracle RAC which provides up to 3X performance on OLTP applications. This seamless integration between WebLogic and Oracle database enable more value added to critical business applications leveraging their investments in Oracle database technology and Oracle middleware. JBoss EAP 6 on the other hand has no performance gains at all, even when admin people implement some kind of connection-pooling tuning. - WebLogic Server 12c also supports transaction and web session affinity to the Oracle RAC, which provides aditional gains of performance. This is particularly interesting if you are creating a reliable solution that are distributed not only in an LAN cluster, but into a different data center. JBoss EAP 6 on the other hand has no such support. 6) Standards and Technology Support - WebLogic Server 12c is fully Java EE 6 compatible and production ready since december of 2011. JBoss EAP 6 on the other hand became fully compatible with Java EE 6 only in the community version after three months, and production ready only in a few days considering that this article was written in June of 2012. Red Hat says that they are the masters of innovation and technology proliferation, but compared with Oracle and even other proprietary vendors like IBM, they historically speaking are lazy to deliver the most newest technologies and standards adherence. - Oracle is the steward of Java, driving innovation into the platform from commercial and open-source vendors. Red Hat on the other hand does not have its own JVM and relies on third-part JVMs to complete their application server offer. 95% of Red Hat customers are using Oracle HotSpot as JVM, which means that without Oracle involvement, their support are limited exclusively to the application server layer and we all know that most problems are happens in the JVM layer. - WebLogic Server 12c supports natively JDK 7, which empower developers to explore the maximum of the Java platform productivity when writing code. This feature differentiate WebLogic from others application servers (except GlassFish that are also managed by Oracle) because the usage of JDK 7 introduce such remarkable productivity features like the "try-with-resources" enhancement, catching multiple exceptions with one try block, Strings in the switch statements, JVM improvements in terms of JDBC, I/O, networking, security, concurrency and of course, the most important feature of Java 7: native support for multiple non-Java languages. More features regarding JDK 7 can be found here. JBoss EAP 6 on the other hand does not support JDK 7 officially, they comment in their community version that "Java SE 7 can be used with JBoss 7" which does not gives you any guarantees of enterprise support for JDK 7. - Oracle WebLogic Server 12c supports integration with Spring framework allowing Spring applications to use WebLogic special transaction manager, exposing bean interfaces to WebLogic MBeans to take advantage of all WebLogic monitoring and administration advantages. JBoss EAP 6 on the other hand has no special integration with Spring. In fact, Red Hat offers a suspicious package called "JBoss Web Platform" that in theory supports Spring, but in practice this package does not offers any special integration. It is just a facility for Red Hat customers to have support from both JBoss and Spring technology using the same customer support. 7) Lightweight Development - Oracle WebLogic Server 12c and Oracle GlassFish are completely integrated and can share applications without any modifications. Starting with the 12c version, WebLogic now understands natively GlassFish deployment descriptors and specific configurations in order to offer you a truly and reliable migration path from a community Java EE application server to a enterprise middleware product like WebLogic. JBoss EAP 6 on the other hand has no support to natively reuse an existing (or still in development) application from JBoss AS community server. Users of JBoss suffer of critical issues during deployment time that includes: changing the libraries and dependencies of the application, patching the DTD or XSD deployment descriptors, refactoring of the application layers due classloading issues and anomalies, rebuilding of persistence, business and web layers due issues with "usage of the certified version of an certain dependency" or "frameworks that Red Hat potentially does not recommend" etc. If you have the culture or enterprise IT directive of developing Java EE applications using community middleware to in a certain future, transition to enterprise (supported by a vendor) middleware, Oracle WebLogic plus Oracle GlassFish offers you a more sustainable solution. - WebLogic Server 12c has a very light ZIP distribution (less than 165 MB). JBoss EAP 6 ZIP size is around 130 MB, together with JBoss ON you have more 100 MB resulting in a higher download footprint. This is particularly interesting if you plan to use automated setup of application server instances (for example, to rapidly setup a development or staging environment) using Maven or Hudson. - WebLogic Server 12c has a complete integration with Maven allowing developers to setup WebLogic domains with few commands. Tasks like downloading WebLogic, installation, domain creation, data sources deployment are completely integrated. JBoss EAP 6 on the other hand has a limited offer integration with those tools.  - WebLogic Server 12c has a startup mode called WLX that turns-off EJB, JMS and JCA containers leaving enabled only the web container with Java EE 6 web profile. JBoss EAP 6 on the other hand has no such feature, you need to disable manually the containers that you do not want to use. - WebLogic Server 12c supports fastswap, which enables you to change classes without redeployment. This is particularly interesting if you are developing patches for the application that is already deployed and you do not want to redeploy the entire application. This is the same behavior that most application servers offers to JSP pages, but with WebLogic Server 12c, you have the same feature for Java classes in general. JBoss EAP 6 on the other hand has no such support. Even JBoss EAP 5 does not support this until now. 8) JMS and Messaging - WebLogic Server 12c has a proven and high scalable JMS implementation since its initial release in 1995. JBoss EAP 6 on the other hand has a still immature technology called HornetQ, which was introduced in JBoss EAP 5 replacing everything that was implemented in the previous versions. Red Hat loves to introduce new technologies across JBoss versions, playing around with customers and their investments. And when they are asked about why they have changed the implementation and caused such a mess, their answer is always: "the previous implementation was inadequate and not aligned with the community strategy so we are creating a new a improved one". This Red Hat practice leads to uncomfortable investments that in a near future (sometimes less than a year) will be affected in someway. - WebLogic Server 12c has troubleshooting and monitoring features included on the WebLogic console and WLDF. JBoss EAP 6 on the other hand has no direct monitoring on the console, activity is reflected only on the logs, no debug logs available in case of JMS issues. - WebLogic Server 12c has extremely good performance and scalability. JBoss EAP 6 on the other hand has a JMS storage mechanism relying on Oracle database or MySQL. This means that if an issue in production happens and Red Hat affirms that an performance issue is happening due to database problems, they will not support you on the performance issue. They will orient you to call Oracle instead. - WebLogic Server 12c supports messaging enterprise features like SAF ("Store and Forward"), Distributed Queues/Topics and Foreign JMS providers support that leverage JMS implementations without compromise developer code making things completely transparent. JBoss EAP 6 on the other hand do not even dream to support such features. 9) Caching and Grid - Coherence, which is the leading and most mature data grid technology from Oracle, is available since early 2000 and was integrated with WebLogic in 2009. Coherence and WebLogic clusters can be both managed from WebLogic administrative console. Even Node Manager supports Coherence. JBoss on the other hand discontinued JBoss Cache, which was their caching implementation just like they did with the messaging implementation (JBossMQ) which was a issue for long term customers. JBoss EAP 6 ships InfiniSpan version 1.0 which is immature and lack a proven record of successful cases and reliability. - WebLogic Server 12c has a feature called ActiveCache which uses Coherence to, without any code changes, replicate HTTP sessions from both WebLogic and other application servers like JBoss, Tomcat, Websphere, GlassFish and even Microsoft IIS. JBoss EAP 6 on the other hand does have such support and even when they do in the future, they probably will support only their own application server. - Coherence can be used to manage both L1 and L2 cache levels, providing support to Oracle TopLink and others JPA compliant implementations, even Hibernate. JBoss EAP 6 and Infinispan on the other hand supports only Hibernate. And most important of all: Infinispan does not have any successful case of L1 or L2 caching level support using Hibernate, which lead us to reflect about its viability. 10) Performance - WebLogic Server 12c is certified with Oracle Exalogic Elastic Cloud and can run unchanged applications at this engineered system. This approach can benefit customers from Exalogic optimization's of both kernel and JVM layers to boost performance in terms of 10X for web, OLTP, JMS and grid applications. JBoss EAP 6 on the other hand has no investment on engineered systems: customers do not have the choice to deploy on a Java ultra fast system if their project becomes relevant and performance issues are detected. - WebLogic Server 12c maintains a performance gain across each new release: starting on WebLogic 5.1, the overall performance gain has been close to 4X, which close to a 20% gain release by release. JBoss on the other hand does not provide SPECJAppServer or SPECJEnterprise performance benchmarks. Their so called "performance gains" remains hidden in their customer environments, which lead us to think if it is true or not since we will never get access to those environments. - WebLogic Server 12c has industry performance benchmarks with submissions across platforms and configurations leading SPECJ. Oracle WebLogic leads SPECJAppServer performance in multiple categories, fitting all customer topologies like: dual-node, single-node, multi-node and multi-node with RAC. JBoss... again, does not provide any SPECJAppServer performance benchmarks. - WebLogic Server 12c has a feature called work manager which allows your application to embrace new performance levels based on critical resource utilization of the CPUs usage. Work managers prioritizes work and allocates threads based on an execution model that takes into account administrator-defined parameters and actual run-time performance and throughput. JBoss EAP 6 on the other hand has no compared feature and probably they never will. Not supporting such feature like work managers, JBoss EAP 6 forces admin people and specially developers to uncover performance gains in a intrusive way, rewriting the code and doing performance refactorings. 11) Professional Services Support - WebLogic Server 12c and any other technology sold by Oracle give customers the possibility of hire OCS ("Oracle Consulting Services") to manage critical scenarios, deployment assistance of new applications, high skilled consultancy of architecture, best practices and people allocation together with customer teams. All OCS services are available without any restrictions, having the customer bought software from Oracle or just starting their implementation before any acquisition. JBoss EAP 6 or Red Hat to be more specifically, only offers professional services if you buy subscriptions from them. If you are developing a new critical application for your business and need the help of Red Hat for a serious issue or architecture decision, they will probably say: "OK... I can help you but after you buy subscriptions from me". Red Hat also does not allows their professional services consultants to manage environments that uses community based software. They will probably force you to first buy a subscription, download their "enterprise" version and them, optionally hire their consultants. - Oracle provides you our university to educate your team into our technologies, including of course specialized trainings of WebLogic application server. At any time and location, you can hire Oracle to train your team so you get trustful knowledge according to your specific needs. Certifications for the products are also available if your technical people desire to differentiate themselves as professionals. Red Hat on the other hand have a limited pool of resources to train your team in their technologies. Basically they are selling training and certification for RHEL ("Red Hat Enterprise Linux") but if you demand more specialized training in JBoss middleware, they will probably connect you to some "certified" partner localized training since they are apparently discontinuing their education center, at least here in Brazil. They were not able to reproduce their success with RHEL education to their middleware division since they need first sell the subscriptions to after gives you specialized training. And again, they only offer you specialized training based on their enterprise version (EAP in the case of JBoss) which means that the courses will be a quite outdated. There are reports of developers that took official training's from Red Hat at this year (2012) and in a certain JBoss advanced course, Red Hat supposedly covered JBossMQ as the messaging subsystem, and even the printed material provided was based on JBossMQ since the training was created for JBoss EAP 4.3. 12) Encouraging Transparency without Ulterior Motives - WebLogic Server 12c like any other software from Oracle can be downloaded any time from anywhere, you should only possess an OTN ("Oracle Technology Network") credential and you can download any enterprise software how many times you want. And is not some kind of "trial" version. It is the official binaries that will be running for ever in your data center. Oracle does not encourages the usage of "specific versions" of our software. The binaries you buy from Oracle are the same binaries anyone in the world could download and use for testing and personal education. JBoss EAP 6 on the other hand are not available for download unless you buy a subscription and get access to the Red Hat enterprise repositories. If you need to test, learn or just start creating your application using Red Hat's middleware software, you should download it from the community website. You are not allowed to download the enterprise version that, according to Red Hat are more secure, reliable and robust. But no one of us want to start the development of a software with an unsecured, unreliable and not scalable middleware right? So what you do? You are "invited" by Red Hat to buy subscriptions from them to get access to the "cool" version of the software. - WebLogic Server 12c prices are publicly available in the Oracle website. If you want to know right now how much WebLogic will cost to your organization, just click here and get access to our price list. In the case of WebLogic, check out the "US Oracle Technology Commercial Price List". Oracle also encourages you to get in touch with a sales representative to discuss discounts that would make possible the investment into our technology. But you are not required to do this, only if you are interested in buying our technology or maybe you want to discuss some discount scenarios. JBoss EAP 6 on the other hand does not have its cost publicly available in Red Hat's website or in any other media, at least is not so easy to get such information. The only link you will possibly find in their website is a "Contact a Sales Representative" link. This is not a very good relationship between an customer and an vendor. This is not an example of transparency, mainly when the software are sold as open. In this situations, customers expects to see the software prices publicly available, so they can have the chance to decide, based on the existing features of the software, if the cost is fair or not. Conclusion Oracle WebLogic is the most mature, secure, reliable and scalable Java EE application server of the market, and have a proven record of success around the globe to prove it's majority. Don't lose the chance to discover today how WebLogic could fit your needs and sustain your global IT middleware strategy, no matter if your strategy are completely based on the Cloud or not.

    Read the article

  • Authoritative sources about Database vs. Flatfile decision

    - by FastAl
    <tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

    Read the article

  • How do I prove I should put a table of values in source code instead of a database table?

    - by FastAl
    <tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

    Read the article

< Previous Page | 29 30 31 32 33