Search Results

Search found 753 results on 31 pages for 'decisions'.

Page 28/31 | < Previous Page | 24 25 26 27 28 29 30 31  | Next Page >

  • QT vs. Net - REAL comparisons for R.A.D. projects

    - by Pirate for Profit
    Man in all these Qt vs. .NET discussions 90% these people argue about the dumbest crap. Trying to get a real comparison chart here, because I know a little about both frameworks but I don't know everything. I believe Qt and .NET both have strengths and weaknesses. This is to make a comparison that highlights these so people can make more informed decisions before embarking on a project, in the spirit of R.A.D. Event Handling In Qt the event handling system is very simple. You just emit signals when something cool happens and then catch them in slots. ie. // run some calculations, then emit valueChanged(30, false, 20.2); and then catching it, any object can make a slot to recieve that message easily void MyObj::valueChanged(int percent, bool ok, float timeRemaining). It's easy to "block" an event or "disconnect" when needed, and works seamlessly across threads... once you get the hang of it, it just seems a lot more natural and intuitive than the way the .NET event handling is set up (you know, void valueChanged(object sender, CustomEventArgs e). And I'm not just talking about syntax, because in the end the .NET anonymous delegates are the bomb. I'm also talking about in more than just reflection (because, yes, .NET obviously has much stronger reflection capabilities). I'm talking about in the way the system feels to a human being. Qt wins hands down for the simplest yet still flexible event handling system ever i m o. Plugins and such I do love some of the ease of C# compared to C++, as well as .NET's assembly architecture, even though it leads to a bunch of .dll's (there's ways to combine everything into a single exe though). That is a big bonus for modular projects, which are a PITA to import stuff in C++ as far as RAD is concerned. Database Ease of Doing Crap Also what about datasets and database manipulations. I think .net wins here but I'm not sure. Threading/Conccurency How do you guys think of the threading? In .NET, all I've ever done is make like a list of master worker threads with locks. I like QConcurrentFramework, you don't worry about locks or anything, and with the ease of the signal slot system across threads it's nice to get notified about the progress of things. QConcurrent is the simplest threading mechanism I've ever played with. Memory Usage Also what do you think of the overall memory usage comparison. Is the .NET garbage collector pretty on the ball and quick compared to the instantaneous nature of native memory management? Or does it just let programs leak up a storm and lag the computer then clean it up when it's about to really lag? Doesn't the just-in-time compiler make native code that is pretty good, like and that only happens the first time the program is run? However, I am a n00b who doesn't know what I'm talking about, please school me on the subject.

    Read the article

  • Creating a music catalog in C# and extracting first 30 seconds as soon as the first words are sung

    - by Rad
    I already read a question: Separation of singing voice from music. I don’t need this complex audio processing. I only need some detection mechanism that would detect that there is some voice/vocal playing while the music is playing (or not playing) I need to extract first 30 seconds when a vocalist starts singing along with full band music. See question 2 below. I want to create a music catalog using ASP.NET MVC 2 and Silverlight clients and C#.NET 4.0 programming language that would be front store. On the backend I would also like to create a desktop WPF/Windows application to create the music catalog from already existing music files, most of which have metadata in them ID3v1, ID3v2.3, ID3v2.4, iTunes MP4, WMA, Vorbis Comments and APE Tags etc. I would possibly like to create a web service that would allow catalog contributors to upload a zipped album and trigger metadata extraction of music data and extraction of music segments as described below. I would be happy if I achieve no. 1 below. Let's say I have 1000ths of songs in mp3 (or other formats) grouped in subfolders using some classification (Genre, Artists, Albums, Composers or other groupings). I want to create tables in DB that would organize songs so they can be searched based on different criteria (year, length, above classification or by song title, description etc) like what iTune store allows to their customers. I want to extract metadata from various formats (I will try to get songs in mp3 format, but there may be other popular formats) and allow music Catalog manager person to add missing data from either desktop or web applications. He or other contributors can upload zipped music via an HTML or Silverlight upload or WPF. Can anybody suggest open source libraries, articles, code snippets that can do that in an automatic way using .NET and possibly SQL Server DB? My main questions are these. This is an audio processing challenge. I want to extract 2 segments of music (questions 1 and 2): 1. How to extract a music segment: 1-2 seconds before a vocal starts singing and up to 30 seconds from that point in time and 2. Much more challenging is to find repeating segments (One would usually find or recognize the names of the songs and songs are usually known by these refrains. How would I go about creating a list of songs that go great together like what Genius from iTune does? Is there any characteristics of music that can be used to match songs? The goal is for people quickly scan and recognize songs i.e. associate melody, words with a title/album so they can make intelligent decisions like buying a song, create similar mood songs. Thanks, Rad

    Read the article

  • Creating a music catalog and extracting first 30 seconds as soon as the first words are sung

    - by Rad
    I already read a question: Separation of singing voice from music. I don’t need this complex audio processing. I only need some detection mechanism that would detect that there is some voice/vocal playing while the music is playing (or not playing) I need to extract first 30 seconds when a vocalist starts singing along with full band music. See question 2 below. I want to create a music catalog using ASP.NET MVC 2 and Silverlight clients and C#.NET 4.0 programming language that would be front store. On the backend I would also like to create a desktop WPF/Windows application to create the music catalog from already existing music files, most of which have metadata in them ID3v1, ID3v2.3, ID3v2.4, iTunes MP4, WMA, Vorbis Comments and APE Tags etc. I would possibly like to create a web service that would allow catalog contributors to upload a zipped album and trigger metadata extraction of music data and extraction of music segments as described below. I would be happy if I achieve no. 1 below. Let's say I have 1000ths of songs in mp3 (or other formats) grouped in subfolders using some classification (Genre, Artists, Albums, Composers or other groupings). I want to create tables in DB that would organize songs so they can be searched based on different criteria (year, length, above classification or by song title, description etc) like what iTune store allows to their customers. I want to extract metadata from various formats (I will try to get songs in mp3 format, but there may be other popular formats) and allow music Catalog manager person to add missing data from either desktop or web applications. He or other contributors can upload zipped music via an HTML or Silverlight upload or WPF. Can anybody suggest open source libraries, articles, code snippets that can do that in an automatic way using .NET and possibly SQL Server DB? My main questions are these. This is an audio processing challenge. I want to extract 2 segments of music (questions 1 and 2): 1. How to extract a music segment: 1-2 seconds before a vocal starts singing and up to 30 seconds from that point in time and 2. Much more challenging is to find repeating segments (One would usually find or recognize the names of the songs and songs are usually known by these refrains. How would I go about creating a list of songs that go great together like what Genius from iTune does? Is there any characteristics of music that can be used to match songs? The goal is for people quickly scan and recognize songs i.e. associate melody, words with a title/album so they can make intelligent decisions like buying a song, create similar mood songs.

    Read the article

  • Working with friends. Poor career choice?

    - by a_person
    Hi all, Hope you can help me solve somewhat of a moral dilemma. Some time ago, after just a few years of living in U.S. and having to take any job I could get my hands on a friend of mine submitted recommended me for an open position at the company that he was working for. I could have not been happier. I do not have a degree of any sort, however, by being passionate about CS and with constant drive for self education I've became a somewhat of a strong generalist. Every place I worked for recognized me for that quality and used me on various projects where set of technology in hand had no overlap with set of knowledge of the team members. Rapidly I've advanced to Sr. Programmer position and the trend of me following a friend from one place to another have started and continued on for a few years. My friend's goal always been to become an IT Director, mine is to become the best programmer I can be. To my knowledge I've accommodated his goals as much as I could by taking a back seat, and letting him take the lead. Fast forward to today. He's a manager, and I am on his team. I am unhappy and I in considerable amount of suffering. I am not being utilized to my potential, it's almost exact opposite, I am being micromanaged to an unhealthy extent, my decisions, and suggestions are constantly met with negative connotation. Last week I had to hear about how my friend is a better programmer than I am. My ego was ecstatic about this one /s. In addition to that working in the field of BI have exhausted itself for most parts. The only pleasure of my work is being derived from making everything as dynamic and parameter driven as possible. This is the only area where a friend of mine does not feel competent enough to actually micromanage. Because of my situation I feel a fair amount of guilt and ever growing resentment. I need your advice, maybe you've dealt with this expression of ego before, needs of self vs the needs of your friend. Is working with a friend a poor choice? Thank you for reading in.

    Read the article

  • java web application best practices

    - by Bruce
    Hi all I'm trying to figure out the optimum way to develop and release a fairly simple web application, and I'm running into several problems. I'll outline the decisions I've made, because somewhere I've clearly gone off the rails.. Hugely grateful for any help! I have what I think is a fairly simple web application. It contains a couple of jsps that reference a couple of java beans, and the usual static html, js, css and images. Decision 1) I wanted to have a clear and clean release procedure, such that I could develop on my local machine and then release reliably to a production machine. I therefore made the decision to package the application into a war file (including all the static resources), to minimize the separate bits and pieces I would need to release. So far so good? Decision 2) I wanted things on my local machine to be as similar as possible to the production environment. So in my html, for example, I may have a reference to a static file such as http://static.foo.com/file . To keep this code working seamlessly on dev and prod, I decided to put static.foo.com in my /etc/hosts when developing locally, so that all the urls work correctly without changing anything. Decision 3) I decided to use eclipse and maven to give me a best practice environment for administering and building my project. So I have a nice tight set up now, except that: Every time I want to change anything in development, like one line in an html file, I have to rebuild the entire project and then wait for tomcat to load the war before I can see if it's what I wanted. So my questions are: 1) Is there a way to connect up eclipse and tomcat so that I don't have to rebuild the war each time? ie tomcat is looking straight at my actual workspace to serve up the static files? 2)I think I'm maybe making things harder by using /etc/hosts to reflect production urls - is there a better way that doesn't involve manually changing over urls (relative urls are fine of course, but where you have many subdomains, say one for static files and one for dynamic, you have to write out the full path, surely?) 3) Is this really best practice?? How do people set things up so that they balance the requirement for an automated, all-encompassing build process on the one hand, and the speed and flexibility to be able to develop javascript and html and css quickly, as quickly as if one just pointed apache at the directory and developed live? What do people find works? Many thanks!

    Read the article

  • Database Schema Versioning Strategies

    - by Jack Ryan
    I work on a project that uses a reasonably large database, the live version weighing in at somewhere around 60-80GB. The live database is the only real definitive source of our schema, and because of its size duplicating this database is too slow to be done often. This means we have ended up developing our database schema in a pretty ad hoc way, using sql compare to migrate changes from dev dbs to the live system, and only wiping our dev dbs every month or two. I am hoping to get some pointers on how to improve our database development work flow so that we have a little more control. Some things to think about: Currently nobody is really in charge of the database schema, all developers can change it if they need to, though generally these decisions are talked about before they are done. There are stored procedures, functions, and views in the database. These should probably be dumped to files so they can be reloaded on every build. Schema changes should probably be checked in as scripts. We have started to do this recently. However all our scripts must then be numbered (because there may be dependencies between them), and must be re runnable (because our build script currently runs them all in order). This makes them hard to read because they are full of conditionals that check whether tables or columns already exist. This is a step that is often forgotten by developers. Getting a new database should be quick and easy. This is currently a big problem, it takes several hours to get a copy of last nights backup and restore it onto a dev machine. Some mechanism needs to be in place to allow developers to update static data. We have tables that contain data that is never updated through the application, but does potentially need to be changed when we do a new release (often this drives dropdowns). The whole thing needs to be runnable as part of a build script. Are there any tools that can be used to help to do this? Eventually I would like to be at a point where a new DB can be built from scratch without copying any data from the live system. I don't mind writing some scripts to glue all the steps together but each part should be easily editable so that we continue to use it rather than make changes directly on DBs.

    Read the article

  • Is 4-5 years the “Midlife Crisis” for a programming career?

    - by Jeffrey
    I’ve been programming C# professionally for a bit over 4 years now. For the past 4 years I’ve worked for a few small/medium companies ranging from “web/ads agencies”, small industry specific software shops to a small startup. I've been mainly doing "business apps" that involves using high-level programming languages (garbage collected) and my overall experience was that all of the works I’ve done could have been more professional. A lot of the things were done incorrectly (in a rush) mainly due to cost factor that people always wanted something “now” and with the smallest amount of spendable money. I kept on thinking maybe if I could work for a bigger companies or a company that’s better suited for programmers, or somewhere that's got the money and time to really build something longer term and more maintainable I may have enjoyed more in my career. I’ve never had a “mentor” that guided me through my 4 years career. I am pretty much blog / google / self taught programmer other than my bachelor IT degree. I’ve also observed another issue that most so called “senior” programmer in “my working environment” are really not that senior skill wise. They are “senior” only because they’ve been a long time programmer, but the code they write or the decisions they make are absolutely rubbish! They don't want to learn, they don't want to be better they just want to get paid and do what they've told to do which make sense and most of us are like that. Maybe that’s why they are where they are now. But I don’t want to become like them I want to be better. I’ve run into a mental state that I no longer intend to be a programmer for my future career. I started to think maybe there are better things out there to work on. The more blogs I read, the more “best practices” I’ve tried the more I feel I am drifting away from “my reality”. But I am not a great programmer otherwise I don't think I am where I am now. I think 4-5 years is a stage that can be a step forward career wise or a step out of where you are. I just wanted to hear what other have to say about what I’ve mentioned above and whether you’ve experienced similar situation in your past programming career and how you dealt with it. Thanks.

    Read the article

  • Passing arguments between classes - use public properties or pass a properties class as argument?

    - by devoured elysium
    So let's assume I have a class named ABC that will have a list of Point objects. I need to make some drawing logic with them. Each one of those Point objects will have a Draw() method that will be called by the ABC class. The Draw() method code will need info from ABC class. I can only see two ways to make them have this info: Having Abc class make public some properties that would allow draw() to make its decisions. Having Abc class pass to draw() a class full of properties. The properties in both cases would be the same, my question is what is preferred in this case. Maybe the second approach is more flexible? Maybe not? I don't see here a clear winner, but that sure has more to do with my inexperience than any other thing. If there are other good approaches, feel free to share them. Here are both cases: class Abc1 { public property a; public property b; public property c; ... public property z; public void method1(); ... public void methodn(); } and here is approach 2: class Abc2 { //here we make take down all properties public void method1(); ... public void methodn(); } class Abc2MethodArgs { //and we put them here. this class will be passed as argument to //Point's draw() method! public property a; public property b; public property c; ... public property z; } Also, if there are any "formal" names for these two approaches, I'd like to know them so I can better choose the tags/thread name, so it's more useful for searching purposes. That or feel free to edit them.

    Read the article

  • Python - 2 Questions: Editing a variable in a function and changing the order of if else statements

    - by Eric
    First of all, I should explain what I'm trying to do first. I'm creating a dungeon crawler-like game, and I'm trying to program the movement of computer characters/monsters in the map. The map is basically a Cartesian coordinate grid. The locations of characters are represented by tuples of the x and y values, (x,y). The game works by turns, and in a turn a character can only move up, down, left or right 1 space. I'm creating a very simple movement system where the character will simply make decisions to move on a turn by turn basis. Essentially a 'forgetful' movement system. A basic flow chart of what I'm intending to do: Find direction towards destination Make a priority list of movements to be done using the direction eg.('r','u','d','l') means it would try to move right first, then up, then down, then left. Try each of the possibilities following the priority order. If the first movement fails (blocked by obstacle etc.), then it would successively try the movements until the first one that is successful, then it would stop. At step 3, the way I'm trying to do it is like this: def move(direction,location): try: -snip- # Tries to move, raises the exception Movementerror if cannot move in the direction return 1 # Indicates movement successful except Movementerror: return 0 # Indicates movement unsuccessful (thus character has not moved yet) prioritylist = ('r','u','d','l') if move('r',location): pass elif move('u',location): pass elif move('d',location): pass elif move('l',location): pass else: pass In the if/else block, the program would try the first movement on the priority on the priority list. At the move function, the character would try to move. If the character is not blocked and does move, it returns 1, leading to the pass where it would stop. If the character is blocked, it returns 0, then it tries the next movement. However, this results in 2 problems: How do I edit a variable passed into a function inside the function itself, while returning if the edit is successful? I have been told that you can't edit a variable inside a function as it won't really change the value of the variable, it just makes the variable inside the function refer to something else while the original variable remain unchanged. So, the solution is to return the value and then assign the variable to the returned value. However, I want it to return another value indicating if this edit is successful, so I want to edit this variable inside the function itself. How do I do so? How do I change the order of the if/else statements to follow the order of the priority list? It needs to be able to change during runtime as the priority list can change resulting in a different order of movement to try.

    Read the article

  • Parallelism in .NET – Part 10, Cancellation in PLINQ and the Parallel class

    - by Reed
    Many routines are parallelized because they are long running processes.  When writing an algorithm that will run for a long period of time, its typically a good practice to allow that routine to be cancelled.  I previously discussed terminating a parallel loop from within, but have not demonstrated how a routine can be cancelled from the caller’s perspective.  Cancellation in PLINQ and the Task Parallel Library is handled through a new, unified cooperative cancellation model introduced with .NET 4.0. Cancellation in .NET 4 is based around a new, lightweight struct called CancellationToken.  A CancellationToken is a small, thread-safe value type which is generated via a CancellationTokenSource.  There are many goals which led to this design.  For our purposes, we will focus on a couple of specific design decisions: Cancellation is cooperative.  A calling method can request a cancellation, but it’s up to the processing routine to terminate – it is not forced. Cancellation is consistent.  A single method call requests a cancellation on every copied CancellationToken in the routine. Let’s begin by looking at how we can cancel a PLINQ query.  Supposed we wanted to provide the option to cancel our query from Part 6: double min = collection .AsParallel() .Min(item => item.PerformComputation()); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } We would rewrite this to allow for cancellation by adding a call to ParallelEnumerable.WithCancellation as follows: var cts = new CancellationTokenSource(); // Pass cts here to a routine that could, // in parallel, request a cancellation try { double min = collection .AsParallel() .WithCancellation(cts.Token) .Min(item => item.PerformComputation()); } catch (OperationCanceledException e) { // Query was cancelled before it finished } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, if the user calls cts.Cancel() before the PLINQ query completes, the query will stop processing, and an OperationCanceledException will be raised.  Be aware, however, that cancellation will not be instantaneous.  When cts.Cancel() is called, the query will only stop after the current item.PerformComputation() elements all finish processing.  cts.Cancel() will prevent PLINQ from scheduling a new task for a new element, but will not stop items which are currently being processed.  This goes back to the first goal I mentioned – Cancellation is cooperative.  Here, we’re requesting the cancellation, but it’s up to PLINQ to terminate. If we wanted to allow cancellation to occur within our routine, we would need to change our routine to accept a CancellationToken, and modify it to handle this specific case: public void PerformComputation(CancellationToken token) { for (int i=0; i<this.iterations; ++i) { // Add a check to see if we've been canceled // If a cancel was requested, we'll throw here token.ThrowIfCancellationRequested(); // Do our processing now this.RunIteration(i); } } With this overload of PerformComputation, each internal iteration checks to see if a cancellation request was made, and will throw an OperationCanceledException at that point, instead of waiting until the method returns.  This is good, since it allows us, as developers, to plan for cancellation, and terminate our routine in a clean, safe state. This is handled by changing our PLINQ query to: try { double min = collection .AsParallel() .WithCancellation(cts.Token) .Min(item => item.PerformComputation(cts.Token)); } catch (OperationCanceledException e) { // Query was cancelled before it finished } PLINQ is very good about handling this exception, as well.  There is a very good chance that multiple items will raise this exception, since the entire purpose of PLINQ is to have multiple items be processed concurrently.  PLINQ will take all of the OperationCanceledException instances raised within these methods, and merge them into a single OperationCanceledException in the call stack.  This is done internally because we added the call to ParallelEnumerable.WithCancellation. If, however, a different exception is raised by any of the elements, the OperationCanceledException as well as the other Exception will be merged into a single AggregateException. The Task Parallel Library uses the same cancellation model, as well.  Here, we supply our CancellationToken as part of the configuration.  The ParallelOptions class contains a property for the CancellationToken.  This allows us to cancel a Parallel.For or Parallel.ForEach routine in a very similar manner to our PLINQ query.  As an example, we could rewrite our Parallel.ForEach loop from Part 2 to support cancellation by changing it to: try { var cts = new CancellationTokenSource(); var options = new ParallelOptions() { CancellationToken = cts.Token }; Parallel.ForEach(customers, options, customer => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // Check for cancellation here options.CancellationToken.ThrowIfCancellationRequested(); // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } }); } catch (OperationCanceledException e) { // The loop was cancelled } Notice that here we use the same approach taken in PLINQ.  The Task Parallel Library will automatically handle our cancellation in the same manner as PLINQ, providing a clean, unified model for cancellation of any parallel routine.  The TPL performs the same aggregation of the cancellation exceptions as PLINQ, as well, which is why a single exception handler for OperationCanceledException will cleanly handle this scenario.  This works because we’re using the same CancellationToken provided in the ParallelOptions.  If a different exception was thrown by one thread, or a CancellationToken from a different CancellationTokenSource was used to raise our exception, we would instead receive all of our individual exceptions merged into one AggregateException.

    Read the article

  • Parallelism in .NET – Part 5, Partitioning of Work

    - by Reed
    When parallelizing any routine, we start by decomposing the problem.  Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element.  This process is called partitioning. Partitioning our tasks is a challenging feat.  There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle.  Trying to work the perfect balance between the two extremes is the goal for which we should aim.  Luckily, the Task Parallel Library automatically handles much of this process.  However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions. First off, I’d like to say that this is a more advanced topic.  It is perfectly acceptable to use the parallel constructs in the framework without considering the partitioning taking place.  The default behavior in the Task Parallel Library is very well-behaved, even for unusual work loads, and should rarely be adjusted.  I have found few situations where the default partitioning behavior in the TPL is not as good or better than my own hand-written partitioning routines, and recommend using the defaults unless there is a strong, measured, and profiled reason to avoid using them.  However, understanding partitioning, and how the TPL partitions your data, helps in understanding the proper usage of the TPL. I indirectly mentioned partitioning while discussing aggregation.  Typically, our systems will have a limited number of Processing Elements (PE), which is the terminology used for hardware capable of processing a stream of instructions.  For example, in a standard Intel i7 system, there are four processor cores, each of which has two potential hardware threads due to Hyperthreading.  This gives us a total of 8 PEs – theoretically, we can have up to eight operations occurring concurrently within our system. In order to fully exploit this power, we need to partition our work into Tasks.  A task is a simple set of instructions that can be run on a PE.  Ideally, we want to have at least one task per PE in the system, since fewer tasks means that some of our processing power will be sitting idle.  A naive implementation would be to just take our data, and partition it with one element in our collection being treated as one task.  When we loop through our collection in parallel, using this approach, we’d just process one item at a time, then reuse that thread to process the next, etc.  There’s a flaw in this approach, however.  It will tend to be slower than necessary, often slower than processing the data serially. The problem is that there is overhead associated with each task.  When we take a simple foreach loop body and implement it using the TPL, we add overhead.  First, we change the body from a simple statement to a delegate, which must be invoked.  In order to invoke the delegate on a separate thread, the delegate gets added to the ThreadPool’s current work queue, and the ThreadPool must pull this off the queue, assign it to a free thread, then execute it.  If our collection had one million elements, the overhead of trying to spawn one million tasks would destroy our performance. The answer, here, is to partition our collection into groups, and have each group of elements treated as a single task.  By adding a partitioning step, we can break our total work into small enough tasks to keep our processors busy, but large enough tasks to avoid overburdening the ThreadPool.  There are two clear, opposing goals here: Always try to keep each processor working, but also try to keep the individual partitions as large as possible. When using Parallel.For, the partitioning is always handled automatically.  At first, partitioning here seems simple.  A naive implementation would merely split the total element count up by the number of PEs in the system, and assign a chunk of data to each processor.  Many hand-written partitioning schemes work in this exactly manner.  This perfectly balanced, static partitioning scheme works very well if the amount of work is constant for each element.  However, this is rarely the case.  Often, the length of time required to process an element grows as we progress through the collection, especially if we’re doing numerical computations.  In this case, the first PEs will finish early, and sit idle waiting on the last chunks to finish.  Sometimes, work can decrease as we progress, since previous computations may be used to speed up later computations.  In this situation, the first chunks will be working far longer than the last chunks.  In order to balance the workload, many implementations create many small chunks, and reuse threads.  This adds overhead, but does provide better load balancing, which in turn improves performance. The Task Parallel Library handles this more elaborately.  Chunks are determined at runtime, and start small.  They grow slowly over time, getting larger and larger.  This tends to lead to a near optimum load balancing, even in odd cases such as increasing or decreasing workloads.  Parallel.ForEach is a bit more complicated, however. When working with a generic IEnumerable<T>, the number of items required for processing is not known in advance, and must be discovered at runtime.  In addition, since we don’t have direct access to each element, the scheduler must enumerate the collection to process it.  Since IEnumerable<T> is not thread safe, it must lock on elements as it enumerates, create temporary collections for each chunk to process, and schedule this out.  By default, it uses a partitioning method similar to the one described above.  We can see this directly by looking at the Visual Partitioning sample shipped by the Task Parallel Library team, and available as part of the Samples for Parallel Programming.  When we run the sample, with four cores and the default, Load Balancing partitioning scheme, we see this: The colored bands represent each processing core.  You can see that, when we started (at the top), we begin with very small bands of color.  As the routine progresses through the Parallel.ForEach, the chunks get larger and larger (seen by larger and larger stripes). Most of the time, this is fantastic behavior, and most likely will out perform any custom written partitioning.  However, if your routine is not scaling well, it may be due to a failure in the default partitioning to handle your specific case.  With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner. There is the option to use an overload of Parallel.ForEach which takes a Partitioner<T> instance.  The Partitioner<T> class is an abstract class which allows for both static and dynamic partitioning.  By overriding Partitioner<T>.SupportsDynamicPartitions, you can specify whether a dynamic approach is available.  If not, your custom Partitioner<T> subclass would override GetPartitions(int), which returns a list of IEnumerator<T> instances.  These are then used by the Parallel class to split work up amongst processors.  When dynamic partitioning is available, GetDynamicPartitions() is used, which returns an IEnumerable<T> for each partition.  If you do decide to implement your own Partitioner<T>, keep in mind the goals and tradeoffs of different partitioning strategies, and design appropriately. The Samples for Parallel Programming project includes a ChunkPartitioner class in the ParallelExtensionsExtras project.  This provides example code for implementing your own, custom allocation strategies, including a static allocator of a given chunk size.  Although implementing your own Partitioner<T> is possible, as I mentioned above, this is rarely required or useful in practice.  The default behavior of the TPL is very good, often better than any hand written partitioning strategy.

    Read the article

  • SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Introduction – Day 1 of 31

    - by pinaldave
    List of all the Interview Questions and Answers Series blogs Posts covering interview questions and answers always make for interesting reading.  Some people like the subject for their helpful hints and thought provoking subject, and others dislike these posts because they feel it is nothing more than cheating.  I’d like to discuss the pros and cons of a Question and Answer format here. Interview Questions and Answers are Helpful Just like blog posts, books, and articles, interview Question and Answer discussions are learning material.  The popular Dummy’s books or Idiots Guides are not only for “dummies,” but can help everyone relearn the fundamentals.  Question and Answer discussions can serve the same purpose.  You could call this SQL Server Fundamentals or SQL Server 101. I have administrated hundreds of interviews during my career and I have noticed that sometimes an interviewee with several years of experience lacks an understanding of the fundamentals.  These individuals have been in the industry for so long, usually working on a very specific project, that the ABCs of the business have slipped their mind. Or, when a college graduate is looking to get into the industry, he is not expected to have experience since he is just graduated. However, the new grad is expected to have an understanding of fundamentals and theory.  Sometimes after the stress of final exams and graduation, it can be difficult to remember the correct answers to interview questions, though. An interview Question and Answer discussion can be very helpful to both these individuals.  It is simply a way to go back over the building blocks of a topic.  Many times a simple review like this will help “jog” your memory, and all those previously-memorized facts will come flooding back to you.  It is not a way to re-learn a topic, but a way to remind yourself of what you already know. A Question and Answer discussion can also be a way to go over old topics in a more interesting manner.  Especially if you have been working in the industry, or taking lots of classes on the topic, everything you read can sound like a repeat of what you already know.  Going over a topic in a new format can make the material seem fresh and interesting.  And an interested mind will be more engaged and remember more in the end. Interview Questions and Answers are Harmful A common argument against a Question and Answer discussion is that it will give someone a “cheat sheet.” A new guy with relatively little experience can read the interview questions and answers, and then memorize them. When an interviewer asks him the same questions, he will repeat the answers and get the job. Honestly, is he good hire because he memorized the interview questions? Wouldn’t it be better for the interviewer to hire someone with actual experience?  The answer is not as easy as it seems – there are many different factors to be considered. If the interviewer is asking fundamentals-related questions only, he gets the answers he wants to hear, and then hires this first candidate – there is a good chance that he is hiring based on personality rather than experience.  If the interviewer is smart he will ask deeper questions, have more than one person on the interview team, and interview a variety of candidates.  If one interviewee happens to memorize some answers, it usually doesn’t mean he will automatically get the job at the expense of more qualified candidates. Another argument against interview Question and Answers is that it will give candidates a false sense of confidence, and that they will appear more qualified than they are. Well, if that is true, it will not last after the first interview when the candidate is asked difficult questions and he cannot find the answers in the list of interview Questions and Answers.  Besides, confidence is one of the best things to walk into an interview with! In today’s competitive job market, there are often hundreds of candidates applying for the same position.  With so many applicants to choose from, interviewers must make decisions about who to call back and who to hire based on their gut feeling.  One drawback to reading an interview Question and Answer article is that you might sound very boring in your interview – saying the same thing as every single candidate, and parroting answers that sound like someone else wrote them for you – because they did.  However, it is definitely better to go to an interview prepared, just make sure that you give a lot of thought to your answers to make them sound like your own voice.  Remember that you will be hired based on your skills as well as your personality, so don’t think that having all the right answers will make get you hired.  A good interviewee will be prepared, confident, and know how to stand out. My Opinion A list of interview Questions and Answers is really helpful as a refresher or for beginners. To really ace an interview, one needs to have real-world, hands-on experience with SQL Server as well. Interview questions just serve as a starter or easy read for experienced professionals. When I have to learn new technology, I often search online for interview questions and get an idea about the breadth and depth of the technology. Next Action I am going to write about interview Questions and Answers for next 30 days. I have previously written a series of interview questions and answers; now I have re-written them keeping the latest version of SQL Server and current industry progress in mind. If you have faced interesting interview questions or situations, please write to me and I will publish them as a guest post. If you want me to add few more details, leave a comment and I will make sure that I do my best to accommodate. Tomorrow we will start the interview Questions and Answers series, with a few interesting stories, best practices and guest posts. We will have a prize give-away and other awards when the series ends. List of all the Interview Questions and Answers Series blogs Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • How Oracle Data Integration Customers Differentiate Their Business in Competitive Markets

    - by Irem Radzik
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 With data being a central force in driving innovation and competing effectively, data integration has become a key IT approach to remove silos and ensure working with consistent and trusted data. Especially with the release of 12c version, Oracle Data Integrator and Oracle GoldenGate offer easy-to-use and high-performance solutions that help companies with their critical data initiatives, including big data analytics, moving to cloud architectures, modernizing and connecting transactional systems and more. In a recent press release we announced the great momentum and analyst recognition Oracle Data Integration products have achieved in the data integration and replication market. In this press release we described some of the key new features of Oracle Data Integrator 12c and Oracle GoldenGate 12c. In addition, a few from our 4500+ customers explained how Oracle’s data integration platform helped them achieve their business goals. In this blog post I would like to go over what these customers shared about their experience. Land O’Lakes is one of America’s premier member-owned cooperatives, and offers an extensive line of agricultural supplies, as well as production and business services. Rich Bellefeuille, manager, ETL & data warehouse for Land O’Lakes told us how GoldenGate helped them modernize their critical ERP system without impacting service and how they are moving to new projects with Oracle Data Integrator 12c: “With Oracle GoldenGate 11g, we've been able to migrate our enterprise-wide implementation of Oracle’s JD Edwards EnterpriseOne, ERP system, to a new database and application server platform with minimal downtime to our business. Using Oracle GoldenGate 11g we reduced database migration time from nearly 30 hours to less than 30 minutes. Given our quick success, we are considering expansion of our Oracle GoldenGate 12c footprint. We are also in the midst of deploying a solution leveraging Oracle Data Integrator 12c to manage our pricing data to handle orders more effectively and provide a better relationship with our clients. We feel we are gaining higher productivity and flexibility with Oracle's data integration products." ICON, a global provider of outsourced development services to the pharmaceutical, biotechnology and medical device industries, highlighted the competitive advantage that a solid data integration foundation brings. Diarmaid O’Reilly, enterprise data warehouse manager, ICON plc said “Oracle Data Integrator enables us to align clinical trials intelligence with the information needs of our sponsors. It helps differentiate ICON’s services in an increasingly competitive drug-development industry."  You can find more info on ICON's implementation here. A popular use case for Oracle GoldenGate’s real-time data integration is offloading operational reporting from critical transaction processing systems. SolarWorld, one of the world’s largest solar-technology producers and the largest U.S. solar panel manufacturer, implemented Oracle GoldenGate for real-time data integration of manufacturing data for fast analysis. Russ Toyama, U.S. senior database administrator for SolarWorld told us real-time data helps their operations and GoldenGate’s solution supports high performance of their manufacturing systems: “We use Oracle GoldenGate for real-time data integration into our decision support system, which performs real-time analysis for manufacturing operations to continuously improve product quality, yield and efficiency. With reliable and low-impact data movement capabilities, Oracle GoldenGate also helps ensure that our critical manufacturing systems are stable and operate with high performance."  You can watch the full interview with SolarWorld's Russ Toyama here. Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Starwood Hotels and Resorts is one of the many customers that found out how well Oracle Data Integration products work with Oracle Exadata. Gordon Light, senior director of information technology for StarWood Hotels, says they had notable performance gain in loading Oracle Exadata reporting environment: “We leverage Oracle GoldenGate to replicate data from our central reservations systems and other OLTP databases – significantly decreasing the overall ETL duration. Moving forward, we plan to use Oracle GoldenGate to help the company achieve near-real-time reporting.”You can listen about Starwood Hotels' implementation here. Many companies combine the power of Oracle GoldenGate with Oracle Data Integrator to have a single, integrated data integration platform for variety of use cases across the enterprise. Ufone is another good example of that. The leading mobile communications service provider of Pakistan has improved customer service using timely customer data in its data warehouse. Atif Aslam, head of management information systems for Ufone says: “Oracle Data Integrator and Oracle GoldenGate help us integrate information from various systems and provide up-to-date and real-time CRM data updates hourly, rather than daily. The applications have simplified data warehouse operations and allowed business users to make faster and better informed decisions to protect revenue in the fast-moving Pakistani telecommunications market.” You can read more about Ufone's use case here. In our Oracle Data Integration 12c launch webcast back in November we also heard from BT’s CTO Surren Parthab about their use of GoldenGate for moving to private cloud architecture. Surren also shared his perspectives on Oracle Data Integrator 12c and Oracle GoldenGate 12c releases. You can watch the video here. These are only a few examples of leading companies that have made data integration and real-time data access a key part of their data governance and IT modernization initiatives. They have seen real improvements in how their businesses operate and differentiate in today’s competitive markets. You can read about other customer examples in our Ebook: The Path to the Future and access resources including white papers, data sheets, podcasts and more via our Oracle Data Integration resource kit. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

    - by pinaldave
    It’s no secret that bad data leads to bad decisions and poor results.  However, how do you prevent dirty data from taking up residency in your data store?  Some might argue that it’s the responsibility of the person sending you the data.  While that may be true, in practice that will rarely hold up.  It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data.  What constitutes bad data?  There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain.  Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster.  Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1:     Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level.  Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data.  Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code.  I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example.   The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2:     Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint.   A regular expression is used to identify patterns within data.   The patterns could be something as simple as a date format or something much more complex such as a street address.  For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period.  So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be.   How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be:  ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above:  ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ .  In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do.  Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3:     Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email.  Email addresses are often entered incorrectly because people are trying to avoid spam.  While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet:  \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed:  ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4:     Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations.  To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator.  Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File.  The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified.  In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected.  This makes diagnosing errors much easier! Tip 5:    Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions.  For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes.  Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation.  It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results.  For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here.  Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Is Berkeley DB a NoSQL solution?

    - by Gregory Burd
    Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

    Read the article

  • Company Review: Google Products

    Google, Inc offers an array of products and services to all of its end-users. However their search capabilities are the foundation for Google’s current success and their primary business focus. Currently, Google offers over twenty different search applications that allow users to search the internet for books, maps, videos, images, products and much more. Their product decisions have allowed users demands to be met while focusing on the free based model. This allows users to access Google data free of charge and indirectly gives Google a strong competitive advantage of other competitors along with the accuracy of the search results. According to Google, Inc, they offer the following types of searching capabilities: Alerts Get email updates on the topics of your choice Blog Search Find blogs on your favorite topics  Books Search the full text of books  Custom Search Create a customized search experience for your community  Desktop Search and personalize your computer  Dictionary Search for definitions of words and phrases Directory Search the web, organized by topic or category Earth Explore the world from your computer Finance Business info, news and interactive charts GOOG-411 Find and connect for free with businesses from your phone  Images Search for images on the web Maps View maps and directions News Search thousands of news stories Patent Search Search the full text of US Patents Product Search Search for stuff to buy Scholar Search scholarly papers Toolbar Add a search box to your browser Trends Explore past and present search trends Videos Search for videos on the web Web Search Search billions of web pages Web Search Features Find movies, music, stocks, books and more mapping Google’s free based business model is only one way it differentiates itself from its competition. There is also a strong focus on the accuracy of search results and the speed in which they are returned to the end-user. Quality function deployment (QFD) is a structured method used to help connect user needs to the design features of a project proposed to address those needs. This method is particularly useful in accounting for needs that are not easily articulated or precisely defined according to the U. S. Department of Transportation Federal Highway Administration. Due to the fact that QFD is so customer driven Google is always in a constant state of change in attempt to reengineer its search algorithms, and other dependant systems so that end-users requirements are constantly being met. Value engineering is a key example of this, Google is constantly trying to improve all aspects of its products, improve system maintainability, and system interoperability. Bridgefield Group defines value engineering as an organized methodology that identifies and selects the lowest lifecycle cost options in design, materials and processes that achieves the desired level of performance, reliability and customer satisfaction. In addition, it seeks to remove unnecessary costs in the above areas and is often a joint effort with cross-functional internal teams and relevant suppliers. Common issues that appear when developing large scale systems like Google’s search applications include modular design of a product and/or service and providing accurate value analysis. A design approach that adheres to four fundamental tenets of cohesiveness, encapsulation, self-containment, and high binding to design a system component as an independently operable unit subject to change is how the Open System Joint Task Force defines modular design. More specifically M. S. Schmaltz defines modular software design as having a large collection of statements strung together in one partition of in-line code; we segment or divide the statements into logical groups called modules. Each module performs one or two tasks, and then passes control to another module. By breaking up the code into "bite-sized chunks", so to speak, we are able to better control the flow of data and control. This is especially true in large software systems. Value analysis is a process to evaluate products and services based on effectiveness, safety, and cost. Value analysis involves assessing the quality as well as the cost of a product or service as defined by the Healthcare Financial Management Association.  “Operations Management deals with the design and management of products, processes, services and supply chains. It considers the acquisition, development, and utilization of resources that firms need to deliver the goods and services their clients want.” (MIT,2010) Google, Inc encourages an open environment between all employees, also known as Googlers. This is reinforced by a cross-section team or cross-functional teams comprised from multiple departments assigned to every project so that every department like marketing, finance, and quality assurance has input on every project. In addition, Google is known for their openness to new ideas regardless of the status or seniority of an employee. In fact, Google allows for 20% of an employee’s time can be devoted to developing new ideas and/or pet projects. HumTech.com defines a cross-functional team as a collection of people with varied levels of skills and experience brought together to accomplish a task. As the name implies, Cross-Functional Team members come from different organizational units. Cross-Functional Teams may be permanent or ad hoc. Google’s search application product strategy primarily focuses on mass customization. This is allows Google to create a base search application and allows results to be returned to the end-users quickly based on specific parameters and search settings. In addition, they also store the data that is returned in case other desire the same results based on other end-users supplying the same customized settings. This allows Google to appear to render search results in virtually real-time to the user while allowing for complete customization of the searching criteria. Greg Vogl, a professor at Uganda Martyrs University, defines mass customization as when a business gives its customers the opportunity to tailor its products or services to the customer's specifications. The IT staff at Google play a key role in ensuring that the search application’s product strategy is maintained simply because the IT staff designs, develops, and maintains all of their proprietary applications. In fact, they also maintain all network infrastructure to ensure that it is available to all end-users. References: http://www.google.com/intl/en/options/ http://ops.fhwa.dot.gov/freight/publications/ftat_user_guide/sec5.htm http://www.bridgefieldgroup.com/bridgefieldgroup/glos9.htm#V http://www.acq.osd.mil/osjtf/termsdef.html http://www.cise.ufl.edu/~mssz/Pascal-CGS2462/prog-dsn.html http://www.hfma.org/publications/business_caring_newsletter/exclusives/Supply+and+Inventory+Terms+Defined.htm http://mitsloan.mit.edu/omg/om-definition.php http://www.humtech.com/opm/grtl/ols/ols3.cfm http://www.gregvogl.net/courses/mis1/glossary.htm

    Read the article

  • Developer’s Life – Disaster Lessons – Notes from the Field #039

    - by Pinal Dave
    [Note from Pinal]: This is a 39th episode of Notes from the Field series. What is the best solution do you have when you encounter a disaster in your organization. Now many of you would answer that in this scenario you would have another standby machine or alternative which you will plug in. Now let me ask second question – What would you do if you as an individual faces disaster?  In this episode of the Notes from the Field series database expert Mike Walsh explains a very crucial issue we face in our career, which is not technical but more to relate to human nature. Read on this may be the best blog post you might read in recent times. Howdy! When it was my turn to share the Notes from the Field last time, I took a departure from my normal technical content to talk about Attitude and Communication.(http://blog.sqlauthority.com/2014/05/08/developers-life-attitude-and-communication-they-can-cause-problems-notes-from-the-field-027/) Pinal said it was a popular topic so I hope he won’t mind if I stick with Professional Development for another of my turns at sharing some information here. Like I said last time, the “soft skills” of the IT world are often just as important – sometimes more important – than the technical skills. As a consultant with Linchpin People – I see so many situations where the professional skills I’ve gained and use are more valuable to clients than knowing the best way to tune a query. Today I want to continue talking about professional development and tell you about the way I almost got myself hit by a train – and why that matters in our day jobs. Sometimes we can learn a lot from disasters. Whether we caused them or someone else did. If you are interested in learning about some of my observations in these lessons you can see more where I talk about lessons from disasters on my blog. For now, though, onto how I almost got my vehicle hit by a train… The Train Crash That Almost Was…. My family and I own a little schoolhouse building about a 10 mile drive away from our house. We use it as a free resource for families in the area that homeschool their children – so they can have some class space. I go up there a lot to check in on the property, to take care of the trash and to do work on the property. On the way there, there is a very small Stop Sign controlled railroad intersection. There is only two small freight trains a day passing there. Actually the same train, making a journey south and then back North. That’s it. This road is a small rural road, barely ever a second car driving in the neighborhood there when I am. The stop sign is pretty much there only for the train crossing. When we first bought the building, I was up there a lot doing renovations on the property. Being familiar with the area, I am also familiar with the train schedule and know the tracks are normally free of trains. So I developed a bad habit. You see, I’d approach the stop sign and slow down as I roll through it. Sometimes I’d do a quick look and come to an “almost” stop there but keep on going. I let my impatience and complacency take over. And that is because most of the time I was going there long after the train was done for the day or in between the runs. This habit became pretty well established after a couple years of driving the route. The behavior reinforced a bit by the success ratio. I saw others doing it as well from the neighborhood when I would happen to be there around the time another car was there. Well. You already know where this ends up by the title and backstory here. A few months ago I came to that little crossing, and I started to do the normal routine. I’d pretty much stopped looking in some respects because of the pattern I’d gotten into.  For some reason I looked and heard and saw the train slowly approaching and slammed on my brakes and stopped. It was an abrupt stop, and it was close. I probably would have made it okay, but I sat there thinking about lessons for IT professionals from the situation once I started breathing again and watched the cars loaded with sand and propane slowly labored down the tracks… Here are Those Lessons… It’s easy to get stuck into a routine – That isn’t always bad. Except when it’s a bad routine. Momentum and inertia are powerful. Once you have a habit and a routine developed – it’s really hard to break that. Make sure you are setting the right routines and habits TODAY. What almost dangerous things are you doing today? How are you almost messing up your production environment today? Stop doing that. Be Deliberate – (Even when you are the only one) – Like I said – a lot of people roll through that stop sign. Perhaps the neighbors or other drivers think “why is he fully stopping and looking… The train only comes two times a day!” – they can think that all they want. Through deliberate actions and forcing myself to pay attention, I will avoid that oops again. Slow down. Take a deep breath. Be Deliberate in your job. Pay attention to the small stuff and go out of your way to be careful. It will save you later. Be Observant – Keep your eyes open. By looking around, observing the situation and understanding what your servers, databases, users and vendors are doing – you’ll notice when something is out of place. But if you don’t know what is normal, if you don’t look to make sure nothing has changed – that train will come and get you. Where can you be more observant? What warning signs are you ignoring in your environment today? In the IT world – trains are everywhere. Projects move fast. Decisions happen fast. Problems turn from a warning sign to a disaster quickly. If you get stuck in a complacent pattern of “Everything is okay, it always has been and always will be” – that’s the time that you will most likely get stuck in a bad situation. Don’t let yourself get complacent, don’t let your team get complacent. That will lead to being proactive. And a proactive environment spends less money on consultants for troubleshooting problems you should have seen ahead of time. You can spend your money and IT budget on improving for your customers. If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • User Experience Highlights in PeopleSoft and PeopleTools: Direct from Jeff Robbins

    - by mvaughan
    By Kathy Miedema, Oracle Applications User Experience  This is the fifth in a series of blog posts on the user experience (UX) highlights in various Oracle product families. The last posted interview was with Nadia Bendjedou, Senior Director, Product Strategy on upcoming Oracle E-Business Suite user experience highlights. You’ll see themes around productivity and efficiency, and get an early look at the latest mobile offerings coming through these product lines. Today’s post is on the user experience in PeopleSoft and PeopleTools. To learn more about what’s ahead, attend PeopleSoft or PeopleTools OpenWorld presentations.This interview is with Jeff Robbins, Senior Director, PeopleSoft Development. Jeff Robbins Q: How would you describe the vision you have for the user experience of PeopleSoft?A: Intuitive – Specifically, customers use PeopleSoft to help their employees do their day-to-day work, and the UI (user interface) has been helpful and assistive in that effort. If it’s not obvious what they need to do a task, then the UI isn’t working. So the application needs to make it simple for users to find information they need, complete a task, do all the things they are responsible for, and it really helps when the UI just makes sense. Productive – PeopleSoft is a tool used to support people to do their work, and a lot of users are measured by how much work they’re able to get done per hour, per day, etc. The UI needs to help them be as productive as possible, and can’t make them waste time or energy. The UI needs to reflect the type of work necessary for a task -- if it's data entry, the UI needs to assist the user to get information into the system. For analysts, the UI needs help users assess or analyze information in a particular way. Innovative – The concept of the UI being innovative is something we’ve been working on for years. It’s not just that we want to be seen as innovative, the fact is that companies are asking their employees to do more than they’ve ever asked before. More often companies want to roll out processes as employee or manager self-service, where an employee is responsible to review and maintain their own data. So we’ve had to reinvent, and ask,  “How can we modify the ways an employee interacts with our applications so that they can be more productive and efficient – even with tasks that are entirely unfamiliar?”  Our focus on innovation has forced us to design new ways for users to interact with the entire application.Q: How are the UX features you have delivered so far resonating with customers?  A: Resonating very well. We’re hearing tremendous responses from users, managers, decision-makers -- who are very happy with the improved user experience. Many of the individual features resonate well. Some have really hit home, others are better than they used to be but show us that there’s still room for improvement.A couple innovations really stand out; features that have a significant effect on how users interact with PeopleSoft.First, the deployment of PeopleSoft in a way that’s more like a consumer website with the PeopleSoft Home page and Dashboards.  This new approach is very web-centric, where users feel they’re coming to a website rather than logging into an enterprise application.  There’s lots of information from all around the organization collected in a way that feels very familiar to users. In order to do your job, you can come to this web site rather than having to learn how to log into an application and figure out a complicated menu. Companies can host these really rich web sites for employees that are home pages for accessing critical tasks and information. The UI elements of incorporating search into the whole navigation process is another hit. Rather than having to log in and choose a task from a menu, users come to the web site and begin a task by simply searching for data: themselves, another employee, a customer record, whatever.  The search results include the data along with a set of actions the user might take, completely eliminating the need to hunt through a complicated system menu. Search-centric navigation is really sitting well with customers who are trying to deploy an intuitive set of systems. Q: Are any UX highlights more popular than you expected them to be?  A: We introduced a feature called Pivot Grid in the last release, which is a combination of an interactive grid, like an Excel Pivot Table, along with a dynamic visual chart that automatically graphs the data. I wasn’t certain at first how extensively this would be used. It looked like an innovative tool, but it wasn’t clear how it would be incorporated in business process applications. The fact is that everyone who sees Pivot Grids is thrilled with that kind of interactivity.  It reflects the amount of analytical thinking customers are asking employees to do. Employees can’t just enter data any more. They must interact with it, analyze it, and make decisions. Pivot Grids fit into this way of working. Q: What can you tell us about PeopleSoft’s mobile offerings?A: A lot of customers are finding that mobile is the chief priority in their organization.  They tell us they want their employees to be able to access company information from their mobile devices.  Of course, not everyone has the same requirements, so we’re working to make sure we can help our customers accomplish what they’re trying to do.  We’ve already delivered a number of mobile features.  For instance, PeopleSoft home pages, dashboards and workcenters all work well on an iPad, straight out of the box.  We’ve delivered a number of key functions and tasks for mobile workers – those who are responsible for using a mobile device to manage inventory, for example.  Customers tell us they also need a holistic strategy, one that allows their employees to access nearly every task from a mobile device.  While we don’t expect users to do extensive data entry from their smartphone, it makes sense that they have access to company information and systems while away from their desk.  That’s where our strategy is going now.  We plan to unveil a number of new mobile offerings at OpenWorld.  Some will be available then, some shortly after. Q: What else are you working on now that you think is going to be exciting to customers at Oracle OpenWorld?A: Our next release -- the big thing is PeopleSoft 9.2, and we’ll be talking about the huge amount of work that’s gone into the next versions. A new toolset, 8.53, will be coming, and there’s a lot to talk about there, and the next generation of PeopleSoft 9.2.  We have a ton of new stuff coming.Q: What do you want PeopleSoft customers to know? A: We have been focusing on the user experience in PeopleSoft as a very high priority for the last 4 years, and it’s had interesting effects. One thing is that the application is better, more usable.  We’ve made visible improvements. Another aspect is that in customers’ minds, the PeopleSoft brand is being reinvigorated. Customers invested in PeopleSoft years ago, and then they weren’t sure where PeopleSoft was going.  This investment in the UI and overall user experience keeps PeopleSoft current, innovative and fresh.  Customers  are able to take advantage of a lot of new features, even on the older applications, simply by upgrading their PeopleTools. The interest in that ability has been tremendous. Knowing they have a lot of these features available -- right now, that’s pretty huge. There’s been a tremendous amount of positive response, just on the fact that we’re focusing on the user experience. Editor’s note: For more on PeopleSoft and PeopleTools user experience highlights, visit the Usable Apps web site.To find out more about these enhancements at Openworld, be sure to check out these sessions: GEN8928     General Session: PeopleSoft Update and Product RoadmapCON9183     PeopleSoft PeopleTools Technology Roadmap CON8932     New Functional PeopleSoft PeopleTools Capabilities for the Line-of-Business UserCON9196     PeopleSoft PeopleTools Roadmap: Mobile ApplicationsCON9186     Case Study: Delivering a Groundbreaking User Interface with PeopleSoft PeopleTools

    Read the article

  • Developer’s Life – Attitude and Communication – They Can Cause Problems – Notes from the Field #027

    - by Pinal Dave
    [Note from Pinal]: This is a 27th episode of Notes from the Field series. The biggest challenge for anyone is to understand human nature. We human have so many things on our mind at any moment of time. There are cases when what we say is not what we mean and there are cases where what we mean we do not say. We do say and things as per our mood and our agenda in mind. Sometimes there are incidents when our attitude creates confusion in the communication and we end up creating a situation which is absolutely not warranted. In this episode of the Notes from the Field series database expert Mike Walsh explains a very crucial issue we face in our career, which is not technical but more to relate to human nature. Read on this may be the best blog post you might read in recent times. In this week’s note from the field, I’m taking a slight departure from technical knowledge and concepts explained. We’ll be back to it next week, I’m sure. Pinal wanted us to explain some of the issues we bump into and how we see some of our customers arrive at problem situations and how we have helped get them back on the right track. Often it is a technical problem we are officially solving – but in a lot of cases as a consultant, we are really helping fix some communication difficulties. This is a technical blog post and not an “advice column” in a newspaper – but the longer I am a consultant, the more years I add to my experience in technology the more I learn that the vast majority of the problems we encounter have “soft skills” included in the chain of causes for the issue we are helping overcome. This is not going to be exhaustive but I hope that sharing four pieces of advice inspired by real issues starts a process of searching for places where we can be the cause of these challenges and look at fixing them in ourselves. Or perhaps we can begin looking at resolving them in teams that we manage. I’ll share three statements that I’ve either heard, read or said and talk about some of the communication or attitude challenges highlighted by the statement. 1 – “But that’s the SAN Administrator’s responsibility…” I heard that early on in my consulting career when talking with a customer who had serious corruption and no good recent backups – potentially no good backups at all. The statement doesn’t have to be this one exactly, but the attitude here is an attitude of “my job stops here, and I don’t care about the intent or principle of why I’m here.” It’s also a situation of having the attitude that as long as there is someone else to blame, I’m fine…  You see in this case, the DBA had a suspicion that the backups were not being handled right.  They were the DBA and they knew that they had responsibility to ensure SQL backups were good to go – it’s a basic requirement of a production DBA. In my “As A DBA Where Do I start?!” presentation, I argue that is job #1 of a DBA. But in this case, the thought was that there was someone else to blame. Rather than create extra work and take on responsibility it was decided to just let it be another team’s responsibility. This failed the company, the company’s customers and no one won. As technologists – we should strive to go the extra mile. If there is a lack of clarity around roles and responsibilities and we know it – we should push to get it resolved. Especially as the DBAs who should act as the advocates of the data contained in the databases we are responsible for. 2 – “We’ve always done it this way, it’s never caused a problem before!” Complacency. I have to say that many failures I’ve been paid good money to help recover from would have not happened had it been for an attitude of complacency. If any thoughts like this have entered your mind about your situation you may be suffering from it. If, while reading this, you get this sinking feeling in your stomach about that one thing you know should be fixed but haven’t done it.. Why don’t you stop and go fix it then come back.. “We should have better backups, but we’re on a SAN so we should be fine really.” “Technically speaking that could happen, but what are the chances?” “We’ll just clean that up as a fast follow” ..and so on. In the age of tightening IT budgets, increased expectations of up time, availability and performance there is no room for complacency. Our customers and business units expect – no demand – the best. Complacency says “we will give you second best or hopefully good enough and we accept the risk and know this may hurt us later. Sometimes an organization will opt for “good enough” and I agree with the concept that at times the perfect can be the enemy of the good. But when we make those decisions in a vacuum and are not reporting them up and discussing them as an organization that is different. That is us unilaterally choosing to do something less than the best and purposefully playing a game of chance. 3 – “This device must accept interference from other devices but not create any” I’ve paraphrased this one – but it’s something the Federal Communications Commission – a federal agency in the United States that regulates electronic communication – requires of all manufacturers of any device that could cause or receive interference electronically. I blogged in depth about this here (http://www.straightpathsql.com/archives/2011/07/relationship-advice-from-the-fcc/) so I won’t go into much detail other than to say this… If we all operated more on the premise that we should do our best to not be the cause of conflict, and to be less easily offended and less upset when we perceive offense life would be easier in many areas! This doesn’t always cause the issues we are called in to help out. Not directly. But where we see it is in unhealthy relationships between the various technology teams at a client. We’ll see teams hoarding knowledge, not sharing well with others and almost working against other teams instead of working with them. If you trace these problems back far enough it often stems from someone or some group of people violating this principle from the FCC. To Sum It Up Technology problems are easy to solve. At Linchpin People we help many customers get past the toughest technological challenge – and at the end of the day it is really just a repeatable process of pattern based troubleshooting, logical thinking and starting at the beginning and carefully stepping through to the end. It’s easy at the end of the day. The tough part of what we do as consultants is the people skills. Being able to help get teams working together, being able to help teams take responsibility, to improve team to team communication? That is the difficult part, and we get to use the soft skills on every engagement. Work on professional development (http://professionaldevelopment.sqlpass.org/) and see continuing improvement here, not just with technology. I can teach just about anyone how to be an excellent DBA and performance tuner, but some of these soft skills are much more difficult to teach. If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • September Independent Oracle User Group (IOUG) Regional Events:

    - by Mandy Ho
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif";} September 5, 2012 – Denver, CO Oracle 11g Database Upgrade Seminar Join Roy Swonger, Senior Director of software development at Oracle to learn about upgrading to Oracle Database 11g. Topics include: All the required preparatory steps Database upgrade strategies Post-upgrade performance analysis Helpful tips and common pitfalls to watch out for http://www.oracle.com/webapps/events/ns/EventsDetail.jsp?p_eventId=152242&src=7598177&src=7598177&Act=4 September 6, 2012 – Salt Lake City, UT Fall Symposium 2012 Plan to join us for our annual fall event on Sept 6. They day will be filled with learning and networking with tracks focused on Applications, APEX, BI, Development and DBA Topics. This event is free for UTOUG members to attend, but please register. http://www.utoug.org/apex/f?p=972:2:6686308836668467::::P2_EVENT_ID:121 September 6, 2012 – Portland, OR Oracle’s Hands on Workshop Series focused on providing Defense-in-Depth Solutions to secure data at the source, reduce risk and simplify compliance The Oracle Database Security Workshop is a one-day hands-on session for IT Managers, IT Security Architects and Oracle DBAs who are looking for solutions to address their information protection, privacy, and accountability challenges within their Oracle database environment. Most security programs offered today fail toadequately address database security. Customers continue to be challenged tosecure information against loss and protect the integrity of sensitiveinformation like critical financial data, personally identifiable information(PII) and credit card data for PCI compliance. http://nwoug.org/content.aspx?page_id=87&club_id=165905&item_id=241082 September 11, 2012 – Montreal, QC APEXposed! For APEX aficionados – join ODTUG in Montreal, September 11-12 for APEXposed! Topics will include Dynamic Actions, Plug-ins, Tuning, and Building Mobile Apps. The cost is $399 US and early registration ends August 15th. For more information: http://www.odtugapextraining.com  September 11, 2012 – Philadelphia, PA Big Data & What are we still doing wrong with Tom Kyte Tom Kyte is a Senior Technical Architect in Oracle's Server Technology Division. Tom is the Tom behind the AskTom column in Oracle Magazine and is also the author of Expert Oracle Database Architecture (Apress, 2005/2009) among other books Abstract: Big Data The term "big data" draws a lot of attention, but behind the hype there's a simple story. For decades, companies have been making business decisions based on transactional data stored in relational databases. However, beyond that critical data is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and photographs that can be mined for useful information. This presentation will take a look at what Big Data is and means - and Oracle's strategy for handling it Abstract: What are we still doing wrong? I've given many best practices presentations in the last 10 years. I've given many worst practices presentations in the last 10 years. I've seen some things change over the last ten years and many other things stay exactly the same. In this talk - we'll be taking a look at the good and the bad - what we do right and what we continue to do wrong over and over again. We'll look at why "Why" is probably the right initial answer to most any question. We'll look at how we get to "Know what we Know", and why that can be both a help and a hindrance. We'll peek at "Best Practices" and tie them into what I term "Worst Practices". In short, a talk on the good and the bad. Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif";} http://ioug.itconvergence.com/pls/apex/f?p=207:27:3669516430980563::NO September 12, 2012- New York, NY NYOUG Fall General Meeting “Trends in Database Administration and Why the Future of Database Administration is the Vdba” http://www.nyoug.org/upcoming_events.htm#General_Meeting1 September 21, 2012 – Cleveland, OH Oracle Database 11g for Developers: What You need to know or Oracle Database 11g New Features for Developers Attendees are introduced to the new and improved features of Oracle 11g (both Oracle 11g R1 and Oracle 11g R2) that directly impact application development. Special emphasis is placed on features that reduce development time, make development simpler, improve performance, or speed deployment. Specific topics include: New SQL functions, virtual columns, result caching, XML improvements, pivot statements, JDBC improvements, and PL/SQL enhancements such as compound triggers. http://www.neooug.org/ September 24, 2012 – Ottawa, ON Introduction to Oracle Spatial The free Oracle Locator functionality, and the Oracle Spatial option which dramatically extends Locator, are very useful, but poorly understood capabilities of the database. In the afternoon we will extend into additional areas selected from: storage and performance; answering business problems with spatial queries; using Oracle Maps in OBIEE; an overview and capabilities of Oracle Topology; under the covers with GeoCoding. Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif";} http://www.oug-ottawa.org/pls/htmldb/f?p=327:27:4209274028390246::NO

    Read the article

  • Lessons from rewriting POP Forums for MVC, open source-like

    - by Jeff
    It has been a ton of work, interrupted over the last two years by unemployment, moving, a baby, failing to sell houses and other life events, but it's really exciting to see POP Forums v9 coming together. I'm not even sure when I decided to really commit to it as an open source project, but working on the same team as the CodePlex folks probably had something to do with it. Moving along the roadmap I set for myself, the app is now running on a quasi-production site... we launched MouseZoom last weekend. (That's a post-beta 1 build of the forum. There's also some nifty Silverlight DeepZoom goodness on that site.)I have to make a point to illustrate just how important starting over was for me. I started this forum thing for my sites in old ASP more than ten years ago. What a mess that stuff was, including SQL injection vulnerabilities and all kinds of crap. It went to ASP.NET in 2002, but even then, it felt a little too much like script. More than a year later, in 2003, I did an honest to goodness rewrite. If you've been in this business of writing code for any amount of time, you know how much you hate what you wrote a month ago, so just imagine that with seven years in between. The subsequent versions still carried a fair amount of crap, and that's why I had to start over, to make a clean break. Mind you, much of that crap is still running on some of my production sites in a stable manner, but it's a pain in the ass to maintain.So with that clean break, there is much that I have learned. These are a few of those lessons, in no particular order...Avoid shiny object syndromeOver the years, I've embraced new things without bothering to ask myself why. I remember spending the better part of a year trying to adapt this app to use the membership and profile API's in ASP.NET, just because they were there. They didn't solve any known problem. Early on in this version, I dabbled in exotic ORM's, even though I already had the fundamental SQL that I knew worked. I bloated up the client side code with all kinds of jQuery UI and plugins just because, and it got in the way. All the new shiny can be distracting, and I've come to realize that I've allowed it to be a distraction most of my professional life.Just query what you needI've spent a lot of time over-thinking how to query data. In the SQL world, this means exotic joins, special caches, the read-update-commit loop of ORM's, etc. There are times when you have to remind yourself that you aren't Facebook, you'll never be Facebook, and that databases are in fact intended to serve data. In a lot of projects, back in the day, I used to have these big, rich data objects and pass them all over the place, through various application tiers, when in reality, all I needed was some ID from the entity. I try to be mindful of how many queries hit the database on a given request, but I don't obsess over it. I just get what I need.Don't spend too much time worrying about your unit testsIf you've looked at any of the tests for POP Forums, you might offer an audible WTF. That's OK. There's a whole lot of mocking going on. In some cases, it points out where you're doing too much, and that's good for improving your design. In other cases it shows where your design sucks. But the biggest trap of unit testing is that you worry it should be prettier. That's a waste of time. When you write a test, in many cases before the production code, the important part is that you're testing the right thing. If you have to mock up a bunch of stuff to test the outcome, so be it, but it's not wasted time. You're still doing up the typical arrange-action-assert deal, and you'll be able to read that later if you need to.Get back to your HTTP rootsASP.NET Webforms did a reasonably decent job at abstracting us away from the stateless nature of the Web. A lot of people criticize it, but I think it all worked pretty well. These days, with MVC, jQuery, REST services, and what not, we've gone back to thinking about the wire. The nuts and bolts passing between our Web browser and server matters. This doesn't make things harder, in my opinion, it makes them easier. There is something incredibly freeing about how we approach development of Web apps now. HTTP is a really simple protocol, and the stuff we push through it, in particular HTML and JSON, are pretty simple too. The debugging points are really easy to trap and trace.Premature optimization is prematureI'll go back to the data thing for a moment. I've been known to look at a particular action or use case and stress about the number of calls that are made to the database. I'm not suggesting that it's a bad thing to keep these in mind, but if you worry about it outside of the context of the actual impact, you're wasting time. For example, I query the database for last read times in a forum separately of the user and the list of forums. The impact on performance barely exists. If I put it under load, exceeding the kind of load I expect, it still barely has an impact. Then consider it only counts for logged in users. The context of this "inefficient" action is that it doesn't matter. Did I mention I won't be Facebook?Solve your own problems firstThis is another trap I've fallen into. I've often thought about what other people might need for some feature or aspect of the app. In other words, I was willing to make design decisions based on non-existent data. How stupid is that? When I decided to truly open source this thing, building for myself first was a stated design goal. This app has to server the audiences of CoasterBuzz, MouseZoom and other sites first. In this development scenario, you don't have access to mountains of usability studies or user focus groups. You have to start with what you know.I'm sure there are other points I could make too. It has been a lot of fun to work on, and I look forward to evolving the UI as time goes on. That's where I hope to see more magic in the future.

    Read the article

  • Oracle @ E20 Conference Boston - Building Social Business

    - by Michael Snow
    12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Oracle WebCenter is The Engagement Platform Powering Exceptional Experiences for Employees, Partners and Customers &amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;span id=&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;quot;XinhaEditingPostion&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;quot;&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;gt;&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;/span&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;gt;lt;p&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;gt; &amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;/p&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;gt; The way we work is changing rapidly, offering an enormous competitive advantage to those who embrace the new tools that enable contextual, agile and simplified information exchange and collaboration to distributed workforces and  networks of partners and customers. As many of you are aware, Enterprise 2.0 is the term for the technologies and business practices that liberate the workforce from the constraints of legacy communication and productivity tools like email. It provides business managers with access to the right information at the right time through a web of inter-connected applications, services and devices. Enterprise 2.0 makes accessible the collective intelligence of many, translating to a huge  competitive advantage in the form of increased innovation, productivity and agility.The Enterprise 2.0 Conference takes a strategic perspective, emphasizing the bigger picture implications of the technology and the exploration of what is at stake for organizations trying to change not only tools, but also culture and process. Beyond discussion of the "why", there will also be in-depth opportunities for learning the "how" that will help you bring Enterprise 2.0 to your business. You won't want to miss this opportunity to learn and hear from leading experts in the fields of technology for business, collaboration, culture change and collective intelligence.Oracle was a proud Gold sponsor of the Enterprise 2.0 Conference, taking place this past week in Boston. For those of you that weren't able to make it - we've made the Oracle Social Network Presentation session available here and have posted the slides below. 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • New Feature in ODI 11.1.1.6: ODI for Big Data

    - by Julien Testut
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} By Ananth Tirupattur Starting with Oracle Data Integrator 11.1.1.6.0, ODI is offering a solution to process Big Data. This post provides an overview of this feature. With all the buzz around Big Data and before getting into the details of ODI for Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data. So, what is Big Data? Big data includes: structured data (this includes data from relation data stores, xml data stores), semi-structured data (this includes data from weblogs) unstructured data (this includes data from text blob, images) Traditionally, business decisions are based on the information gathered from transactional data. For example, transactional Data from CRM applications is fed to a decision system for analysis and decision making. Products such as ODI play a key role in enabling decision systems. However, with the emergence of massive amounts of semi-structured and unstructured data it is important for decision system to include them in the analysis to achieve better decision making capability. While there is an abundance of opportunities for business for gaining competitive advantages, process of Big Data has challenges. The challenges of processing Big Data include: Volume of data Velocity of data - The high Rate at which data is generated Variety of data In order to address these challenges and convert them into opportunities, we would need an appropriate framework, platform and the right set of tools. Hadoop is an open source framework which is highly scalable, fault tolerant system, for storage and processing large amounts of data. Hadoop provides 2 key services, distributed and reliable storage called Hadoop Distributed File System or HDFS and a framework for parallel data processing called Map-Reduce. Innovations in Hadoop and its related technology continue to rapidly evolve, hence therefore, it is highly recommended to follow information on the web to keep up with latest information. Oracle's vision is to provide a comprehensive solution to address the challenges faced by Big Data. Oracle is providing the necessary Hardware, software and tools for processing Big Data Oracle solution includes: Big Data Appliance Oracle NoSQL Database Cloudera distribution for Hadoop Oracle R Enterprise- R is a statistical package which is very popular among data scientists. ODI solution for Big Data Oracle Loader for Hadoop for loading data from Hadoop to Oracle. Further details can be found here: http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html ODI Solution for Big Data: ODI’s goal is to minimize the need to understand the complexity of Hadoop framework and simplify the adoption of processing Big Data seamlessly in an enterprise. ODI is providing the capabilities for an integrated architecture for processing Big Data. This includes capability to load data in to Hadoop, process data in Hadoop and load data from Hadoop into Oracle. ODI is expanding its support for Big Data by providing the following out of the box Knowledge Modules (KMs). IKM File to Hive (LOAD DATA).Load unstructured data from File (Local file system or HDFS ) into Hive IKM Hive Control AppendTransform and validate structured data on Hive IKM Hive TransformTransform unstructured data on Hive IKM File/Hive to Oracle (OLH)Load processed data in Hive to Oracle RKM HiveReverse engineer Hive tables to generate models Using the Loading KM you can map files (local and HDFS files) to the corresponding Hive tables. For example, you can map weblog files categorized by date into a corresponding partitioned Hive table schema. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Hive control Append KM you can validate and transform data in Hive. In the below example, two source Hive tables are joined and mapped to a target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} The Hive Transform KM facilitates processing of semi-structured data in Hive. In the below example, the data from weblog is processed using a Perl script and mapped to target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Oracle Loader for Hadoop (OLH) KM you can load data from Hive table or HDFS to a corresponding table in Oracle. OLH is available as a standalone product. ODI greatly enhances OLH capability by generating the configuration and mapping files for OLH based on the configuration provided in the interface and KM options. ODI seamlessly invokes OLH when executing the scenario. In the below example, a HDFS file is mapped to a table in Oracle. Development and Deployment:The following diagram illustrates the development and deployment of ODI solution for Big Data. Using the ODI Studio on your development machine create and develop ODI solution for processing Big Data by connecting to a MySQL DB or Oracle database on a BDA machine or Hadoop cluster. Schedule the ODI scenarios to be executed on the ODI agent deployed on the BDA machine or Hadoop cluster. ODI Solution for Big Data provides several exciting new capabilities to facilitate the adoption of Big Data in an enterprise. You can find more information about the Oracle Big Data connectors on OTN. You can find an overview of all the new features introduced in ODI 11.1.1.6 in the following document: ODI 11.1.1.6 New Features Overview

    Read the article

  • Thread placement policies on NUMA systems - update

    - by Dave
    In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded node NUMA node (the node with lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect. On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creating between the processes, so I was relatively sure the effect didn't related to thread creation time, but rather that placement was a function of process membership. I this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different that what we've seen on Solaris 10. So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

    Read the article

  • Network Restructure Method for Double-NAT network

    - by Adrian
    Due to a series of poor network design decisions (mostly) made many years ago in order to save a few bucks here and there, I have a network that is decidedly sub-optimally architected. I'm looking for suggestions to improve this less-than-pleasant situation. We're a non-profit with a Linux-based IT department and a limited budget. (Note: None of the Windows equipment we have runs does anything that talks to the Internet nor do we have any Windows admins on staff.) Key points: We have a main office and about 12 remote sites that essentially double NAT their subnets with physically-segregated switches. (No VLANing and limited ability to do so with current switches) These locations have a "DMZ" subnet that are NAT'd on an identically assigned 10.0.0/24 subnet at each site. These subnets cannot talk to DMZs at any other location because we don't route them anywhere except between server and adjacent "firewall". Some of these locations have multiple ISP connections (T1, Cable, and/or DSLs) that we manually route using IP Tools in Linux. These firewalls all run on the (10.0.0/24) network and are mostly "pro-sumer" grade firewalls (Linksys, Netgear, etc.) or ISP-provided DSL modems. Connecting these firewalls (via simple unmanaged switches) is one or more servers that must be publically-accessible. Connected to the main office's 10.0.0/24 subnet are servers for email, tele-commuter VPN, remote office VPN server, primary router to the internal 192.168/24 subnets. These have to be access from specific ISP connections based on traffic type and connection source. All our routing is done manually or with OpenVPN route statements Inter-office traffic goes through the OpenVPN service in the main 'Router' server which has it's own NAT'ing involved. Remote sites only have one server installed at each site and cannot afford multiple servers due to budget constraints. These servers are all LTSP servers several 5-20 terminals. The 192.168.2/24 and 192.168.3/24 subnets are mostly but NOT entirely on Cisco 2960 switches that can do VLAN. The remainder are DLink DGS-1248 switches that I am not sure I trust well enough to use with VLANs. There is also some remaining internal concern about VLANs since only the senior networking staff person understands how it works. All regular internet traffic goes through the CentOS 5 router server which in turns NATs the 192.168/24 subnets to the 10.0.0.0/24 subnets according to the manually-configured routing rules that we use to point outbound traffic to the proper internet connection based on '-host' routing statements. I want to simplify this and ready All Of The Things for ESXi virtualization, including these public-facing services. Is there a no- or low-cost solution that would get rid of the Double-NAT and restore a little sanity to this mess so that my future replacement doesn't hunt me down? Basic Diagram for the main office: These are my goals: Public-facing Servers with interfaces on that middle 10.0.0/24 network to be moved in to 192.168.2/24 subnet on ESXi servers. Get rid of the double NAT and get our entire network on one single subnet. My understanding is that this is something we'll need to do under IPv6 anyway, but I think this mess is standing in the way.

    Read the article

< Previous Page | 24 25 26 27 28 29 30 31  | Next Page >