Search Results

Search found 7864 results on 315 pages for 'pre commit hook'.

Page 105/315 | < Previous Page | 101 102 103 104 105 106 107 108 109 110 111 112  | Next Page >

  • Parallelism in .NET – Part 3, Imperative Data Parallelism: Early Termination

    - by Reed
    Although simple data parallelism allows us to easily parallelize many of our iteration statements, there are cases that it does not handle well.  In my previous discussion, I focused on data parallelism with no shared state, and where every element is being processed exactly the same. Unfortunately, there are many common cases where this does not happen.  If we are dealing with a loop that requires early termination, extra care is required when parallelizing. Often, while processing in a loop, once a certain condition is met, it is no longer necessary to continue processing.  This may be a matter of finding a specific element within the collection, or reaching some error case.  The important distinction here is that, it is often impossible to know until runtime, what set of elements needs to be processed. In my initial discussion of data parallelism, I mentioned that this technique is a candidate when you can decompose the problem based on the data involved, and you wish to apply a single operation concurrently on all of the elements of a collection.  This covers many of the potential cases, but sometimes, after processing some of the elements, we need to stop processing. As an example, lets go back to our previous Parallel.ForEach example with contacting a customer.  However, this time, we’ll change the requirements slightly.  In this case, we’ll add an extra condition – if the store is unable to email the customer, we will exit gracefully.  The thinking here, of course, is that if the store is currently unable to email, the next time this operation runs, it will handle the same situation, so we can just skip our processing entirely.  The original, serial case, with this extra condition, might look something like the following: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) break; customer.LastEmailContact = DateTime.Now; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re processing our loop, but at any point, if we fail to send our email successfully, we just abandon this process, and assume that it will get handled correctly the next time our routine is run.  If we try to parallelize this using Parallel.ForEach, as we did previously, we’ll run into an error almost immediately: the break statement we’re using is only valid when enclosed within an iteration statement, such as foreach.  When we switch to Parallel.ForEach, we’re no longer within an iteration statement – we’re a delegate running in a method. This needs to be handled slightly differently when parallelized.  Instead of using the break statement, we need to utilize a new class in the Task Parallel Library: ParallelLoopState.  The ParallelLoopState class is intended to allow concurrently running loop bodies a way to interact with each other, and provides us with a way to break out of a loop.  In order to use this, we will use a different overload of Parallel.ForEach which takes an IEnumerable<T> and an Action<T, ParallelLoopState> instead of an Action<T>.  Using this, we can parallelize the above operation by doing: Parallel.ForEach(customers, (customer, parallelLoopState) => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) parallelLoopState.Break(); else customer.LastEmailContact = DateTime.Now; } }); There are a couple of important points here.  First, we didn’t actually instantiate the ParallelLoopState instance.  It was provided directly to us via the Parallel class.  All we needed to do was change our lambda expression to reflect that we want to use the loop state, and the Parallel class creates an instance for our use.  We also needed to change our logic slightly when we call Break().  Since Break() doesn’t stop the program flow within our block, we needed to add an else case to only set the property in customer when we succeeded.  This same technique can be used to break out of a Parallel.For loop. That being said, there is a huge difference between using ParallelLoopState to cause early termination and to use break in a standard iteration statement.  When dealing with a loop serially, break will immediately terminate the processing within the closest enclosing loop statement.  Calling ParallelLoopState.Break(), however, has a very different behavior. The issue is that, now, we’re no longer processing one element at a time.  If we break in one of our threads, there are other threads that will likely still be executing.  This leads to an important observation about termination of parallel code: Early termination in parallel routines is not immediate.  Code will continue to run after you request a termination. This may seem problematic at first, but it is something you just need to keep in mind while designing your routine.  ParallelLoopState.Break() should be thought of as a request.  We are telling the runtime that no elements that were in the collection past the element we’re currently processing need to be processed, and leaving it up to the runtime to decide how to handle this as gracefully as possible.  Although this may seem problematic at first, it is a good thing.  If the runtime tried to immediately stop processing, many of our elements would be partially processed.  It would be like putting a return statement in a random location throughout our loop body – which could have horrific consequences to our code’s maintainability. In order to understand and effectively write parallel routines, we, as developers, need a subtle, but profound shift in our thinking.  We can no longer think in terms of sequential processes, but rather need to think in terms of requests to the system that may be handled differently than we’d first expect.  This is more natural to developers who have dealt with asynchronous models previously, but is an important distinction when moving to concurrent programming models. As an example, I’ll discuss the Break() method.  ParallelLoopState.Break() functions in a way that may be unexpected at first.  When you call Break() from a loop body, the runtime will continue to process all elements of the collection that were found prior to the element that was being processed when the Break() method was called.  This is done to keep the behavior of the Break() method as close to the behavior of the break statement as possible. We can see the behavior in this simple code: var collection = Enumerable.Range(0, 20); var pResult = Parallel.ForEach(collection, (element, state) => { if (element > 10) { Console.WriteLine("Breaking on {0}", element); state.Break(); } Console.WriteLine(element); }); If we run this, we get a result that may seem unexpected at first: 0 2 1 5 6 3 4 10 Breaking on 11 11 Breaking on 12 12 9 Breaking on 13 13 7 8 Breaking on 15 15 What is occurring here is that we loop until we find the first element where the element is greater than 10.  In this case, this was found, the first time, when one of our threads reached element 11.  It requested that the loop stop by calling Break() at this point.  However, the loop continued processing until all of the elements less than 11 were completed, then terminated.  This means that it will guarantee that elements 9, 7, and 8 are completed before it stops processing.  You can see our other threads that were running each tried to break as well, but since Break() was called on the element with a value of 11, it decides which elements (0-10) must be processed. If this behavior is not desirable, there is another option.  Instead of calling ParallelLoopState.Break(), you can call ParallelLoopState.Stop().  The Stop() method requests that the runtime terminate as soon as possible , without guaranteeing that any other elements are processed.  Stop() will not stop the processing within an element, so elements already being processed will continue to be processed.  It will prevent new elements, even ones found earlier in the collection, from being processed.  Also, when Stop() is called, the ParallelLoopState’s IsStopped property will return true.  This lets longer running processes poll for this value, and return after performing any necessary cleanup. The basic rule of thumb for choosing between Break() and Stop() is the following. Use ParallelLoopState.Stop() when possible, since it terminates more quickly.  This is particularly useful in situations where you are searching for an element or a condition in the collection.  Once you’ve found it, you do not need to do any other processing, so Stop() is more appropriate. Use ParallelLoopState.Break() if you need to more closely match the behavior of the C# break statement. Both methods behave differently than our C# break statement.  Unfortunately, when parallelizing a routine, more thought and care needs to be put into every aspect of your routine than you may otherwise expect.  This is due to my second observation: Parallelizing a routine will almost always change its behavior. This sounds crazy at first, but it’s a concept that’s so simple its easy to forget.  We’re purposely telling the system to process more than one thing at the same time, which means that the sequence in which things get processed is no longer deterministic.  It is easy to change the behavior of your routine in very subtle ways by introducing parallelism.  Often, the changes are not avoidable, even if they don’t have any adverse side effects.  This leads to my final observation for this post: Parallelization is something that should be handled with care and forethought, added by design, and not just introduced casually.

    Read the article

  • Parallelism in .NET – Part 7, Some Differences between PLINQ and LINQ to Objects

    - by Reed
    In my previous post on Declarative Data Parallelism, I mentioned that PLINQ extends LINQ to Objects to support parallel operations.  Although nearly all of the same operations are supported, there are some differences between PLINQ and LINQ to Objects.  By introducing Parallelism to our declarative model, we add some extra complexity.  This, in turn, adds some extra requirements that must be addressed. In order to illustrate the main differences, and why they exist, let’s begin by discussing some differences in how the two technologies operate, and look at the underlying types involved in LINQ to Objects and PLINQ . LINQ to Objects is mainly built upon a single class: Enumerable.  The Enumerable class is a static class that defines a large set of extension methods, nearly all of which work upon an IEnumerable<T>.  Many of these methods return a new IEnumerable<T>, allowing the methods to be chained together into a fluent style interface.  This is what allows us to write statements that chain together, and lead to the nice declarative programming model of LINQ: double min = collection .Where(item => item.SomeProperty > 6 && item.SomeProperty < 24) .Min(item => item.PerformComputation()); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Other LINQ variants work in a similar fashion.  For example, most data-oriented LINQ providers are built upon an implementation of IQueryable<T>, which allows the database provider to turn a LINQ statement into an underlying SQL query, to be performed directly on the remote database. PLINQ is similar, but instead of being built upon the Enumerable class, most of PLINQ is built upon a new static class: ParallelEnumerable.  When using PLINQ, you typically begin with any collection which implements IEnumerable<T>, and convert it to a new type using an extension method defined on ParallelEnumerable: AsParallel().  This method takes any IEnumerable<T>, and converts it into a ParallelQuery<T>, the core class for PLINQ.  There is a similar ParallelQuery class for working with non-generic IEnumerable implementations. This brings us to our first subtle, but important difference between PLINQ and LINQ – PLINQ always works upon specific types, which must be explicitly created. Typically, the type you’ll use with PLINQ is ParallelQuery<T>, but it can sometimes be a ParallelQuery or an OrderedParallelQuery<T>.  Instead of dealing with an interface, implemented by an unknown class, we’re dealing with a specific class type.  This works seamlessly from a usage standpoint – ParallelQuery<T> implements IEnumerable<T>, so you can always “switch back” to an IEnumerable<T>.  The difference only arises at the beginning of our parallelization.  When we’re using LINQ, and we want to process a normal collection via PLINQ, we need to explicitly convert the collection into a ParallelQuery<T> by calling AsParallel().  There is an important consideration here – AsParallel() does not need to be called on your specific collection, but rather any IEnumerable<T>.  This allows you to place it anywhere in the chain of methods involved in a LINQ statement, not just at the beginning.  This can be useful if you have an operation which will not parallelize well or is not thread safe.  For example, the following is perfectly valid, and similar to our previous examples: double min = collection .AsParallel() .Select(item => item.SomeOperation()) .Where(item => item.SomeProperty > 6 && item.SomeProperty < 24) .Min(item => item.PerformComputation()); However, if SomeOperation() is not thread safe, we could just as easily do: double min = collection .Select(item => item.SomeOperation()) .AsParallel() .Where(item => item.SomeProperty > 6 && item.SomeProperty < 24) .Min(item => item.PerformComputation()); In this case, we’re using standard LINQ to Objects for the Select(…) method, then converting the results of that map routine to a ParallelQuery<T>, and processing our filter (the Where method) and our aggregation (the Min method) in parallel. PLINQ also provides us with a way to convert a ParallelQuery<T> back into a standard IEnumerable<T>, forcing sequential processing via standard LINQ to Objects.  If SomeOperation() was thread-safe, but PerformComputation() was not thread-safe, we would need to handle this by using the AsEnumerable() method: double min = collection .AsParallel() .Select(item => item.SomeOperation()) .Where(item => item.SomeProperty > 6 && item.SomeProperty < 24) .AsEnumerable() .Min(item => item.PerformComputation()); Here, we’re converting our collection into a ParallelQuery<T>, doing our map operation (the Select(…) method) and our filtering in parallel, then converting the collection back into a standard IEnumerable<T>, which causes our aggregation via Min() to be performed sequentially. This could also be written as two statements, as well, which would allow us to use the language integrated syntax for the first portion: var tempCollection = from item in collection.AsParallel() let e = item.SomeOperation() where (e.SomeProperty > 6 && e.SomeProperty < 24) select e; double min = tempCollection.AsEnumerable().Min(item => item.PerformComputation()); This allows us to use the standard LINQ style language integrated query syntax, but control whether it’s performed in parallel or serial by adding AsParallel() and AsEnumerable() appropriately. The second important difference between PLINQ and LINQ deals with order preservation.  PLINQ, by default, does not preserve the order of of source collection. This is by design.  In order to process a collection in parallel, the system needs to naturally deal with multiple elements at the same time.  Maintaining the original ordering of the sequence adds overhead, which is, in many cases, unnecessary.  Therefore, by default, the system is allowed to completely change the order of your sequence during processing.  If you are doing a standard query operation, this is usually not an issue.  However, there are times when keeping a specific ordering in place is important.  If this is required, you can explicitly request the ordering be preserved throughout all operations done on a ParallelQuery<T> by using the AsOrdered() extension method.  This will cause our sequence ordering to be preserved. For example, suppose we wanted to take a collection, perform an expensive operation which converts it to a new type, and display the first 100 elements.  In LINQ to Objects, our code might look something like: // Using IEnumerable<SourceClass> collection IEnumerable<ResultClass> results = collection .Select(e => e.CreateResult()) .Take(100); If we just converted this to a parallel query naively, like so: IEnumerable<ResultClass> results = collection .AsParallel() .Select(e => e.CreateResult()) .Take(100); We could very easily get a very different, and non-reproducable, set of results, since the ordering of elements in the input collection is not preserved.  To get the same results as our original query, we need to use: IEnumerable<ResultClass> results = collection .AsParallel() .AsOrdered() .Select(e => e.CreateResult()) .Take(100); This requests that PLINQ process our sequence in a way that verifies that our resulting collection is ordered as if it were processed serially.  This will cause our query to run slower, since there is overhead involved in maintaining the ordering.  However, in this case, it is required, since the ordering is required for correctness. PLINQ is incredibly useful.  It allows us to easily take nearly any LINQ to Objects query and run it in parallel, using the same methods and syntax we’ve used previously.  There are some important differences in operation that must be considered, however – it is not a free pass to parallelize everything.  When using PLINQ in order to parallelize your routines declaratively, the same guideline I mentioned before still applies: Parallelization is something that should be handled with care and forethought, added by design, and not just introduced casually.

    Read the article

  • Parallelism in .NET – Part 9, Configuration in PLINQ and TPL

    - by Reed
    Parallel LINQ and the Task Parallel Library contain many options for configuration.  Although the default configuration options are often ideal, there are times when customizing the behavior is desirable.  Both frameworks provide full configuration support. When working with Data Parallelism, there is one primary configuration option we often need to control – the number of threads we want the system to use when parallelizing our routine.  By default, PLINQ and the TPL both use the ThreadPool to schedule tasks.  Given the major improvements in the ThreadPool in CLR 4, this default behavior is often ideal.  However, there are times that the default behavior is not appropriate.  For example, if you are working on multiple threads simultaneously, and want to schedule parallel operations from within both threads, you might want to consider restricting each parallel operation to using a subset of the processing cores of the system.  Not doing this might over-parallelize your routine, which leads to inefficiencies from having too many context switches. In the Task Parallel Library, configuration is handled via the ParallelOptions class.  All of the methods of the Parallel class have an overload which accepts a ParallelOptions argument. We configure the Parallel class by setting the ParallelOptions.MaxDegreeOfParallelism property.  For example, let’s revisit one of the simple data parallel examples from Part 2: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re looping through an image, and calling a method on each pixel in the image.  If this was being done on a separate thread, and we knew another thread within our system was going to be doing a similar operation, we likely would want to restrict this to using half of the cores on the system.  This could be accomplished easily by doing: var options = new ParallelOptions(); options.MaxDegreeOfParallelism = Math.Max(Environment.ProcessorCount / 2, 1); Parallel.For(0, pixelData.GetUpperBound(0), options, row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Now, we’re restricting this routine to using no more than half the cores in our system.  Note that I included a check to prevent a single core system from supplying zero; without this check, we’d potentially cause an exception.  I also did not hard code a specific value for the MaxDegreeOfParallelism property.  One of our goals when parallelizing a routine is allowing it to scale on better hardware.  Specifying a hard-coded value would contradict that goal. Parallel LINQ also supports configuration, and in fact, has quite a few more options for configuring the system.  The main configuration option we most often need is the same as our TPL option: we need to supply the maximum number of processing threads.  In PLINQ, this is done via a new extension method on ParallelQuery<T>: ParallelEnumerable.WithDegreeOfParallelism. Let’s revisit our declarative data parallelism sample from Part 6: double min = collection.AsParallel().Min(item => item.PerformComputation()); Here, we’re performing a computation on each element in the collection, and saving the minimum value of this operation.  If we wanted to restrict this to a limited number of threads, we would add our new extension method: int maxThreads = Math.Max(Environment.ProcessorCount / 2, 1); double min = collection .AsParallel() .WithDegreeOfParallelism(maxThreads) .Min(item => item.PerformComputation()); This automatically restricts the PLINQ query to half of the threads on the system. PLINQ provides some additional configuration options.  By default, PLINQ will occasionally revert to processing a query in parallel.  This occurs because many queries, if parallelized, typically actually cause an overall slowdown compared to a serial processing equivalent.  By analyzing the “shape” of the query, PLINQ often decides to run a query serially instead of in parallel.  This can occur for (taken from MSDN): Queries that contain a Select, indexed Where, indexed SelectMany, or ElementAt clause after an ordering or filtering operator that has removed or rearranged original indices. Queries that contain a Take, TakeWhile, Skip, SkipWhile operator and where indices in the source sequence are not in the original order. Queries that contain Zip or SequenceEquals, unless one of the data sources has an originally ordered index and the other data source is indexable (i.e. an array or IList(T)). Queries that contain Concat, unless it is applied to indexable data sources. Queries that contain Reverse, unless applied to an indexable data source. If the specific query follows these rules, PLINQ will run the query on a single thread.  However, none of these rules look at the specific work being done in the delegates, only at the “shape” of the query.  There are cases where running in parallel may still be beneficial, even if the shape is one where it typically parallelizes poorly.  In these cases, you can override the default behavior by using the WithExecutionMode extension method.  This would be done like so: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .Select(i => i.PerformComputation()) .Reverse(); Here, the default behavior would be to not parallelize the query unless collection implemented IList<T>.  We can force this to run in parallel by adding the WithExecutionMode extension method in the method chain. Finally, PLINQ has the ability to configure how results are returned.  When a query is filtering or selecting an input collection, the results will need to be streamed back into a single IEnumerable<T> result.  For example, the method above returns a new, reversed collection.  In this case, the processing of the collection will be done in parallel, but the results need to be streamed back to the caller serially, so they can be enumerated on a single thread. This streaming introduces overhead.  IEnumerable<T> isn’t designed with thread safety in mind, so the system needs to handle merging the parallel processes back into a single stream, which introduces synchronization issues.  There are two extremes of how this could be accomplished, but both extremes have disadvantages. The system could watch each thread, and whenever a thread produces a result, take that result and send it back to the caller.  This would mean that the calling thread would have access to the data as soon as data is available, which is the benefit of this approach.  However, it also means that every item is introducing synchronization overhead, since each item needs to be merged individually. On the other extreme, the system could wait until all of the results from all of the threads were ready, then push all of the results back to the calling thread in one shot.  The advantage here is that the least amount of synchronization is added to the system, which means the query will, on a whole, run the fastest.  However, the calling thread will have to wait for all elements to be processed, so this could introduce a long delay between when a parallel query begins and when results are returned. The default behavior in PLINQ is actually between these two extremes.  By default, PLINQ maintains an internal buffer, and chooses an optimal buffer size to maintain.  Query results are accumulated into the buffer, then returned in the IEnumerable<T> result in chunks.  This provides reasonably fast access to the results, as well as good overall throughput, in most scenarios. However, if we know the nature of our algorithm, we may decide we would prefer one of the other extremes.  This can be done by using the WithMergeOptions extension method.  For example, if we know that our PerformComputation() routine is very slow, but also variable in runtime, we may want to retrieve results as they are available, with no bufferring.  This can be done by changing our above routine to: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.NotBuffered) .Select(i => i.PerformComputation()) .Reverse(); On the other hand, if are already on a background thread, and we want to allow the system to maximize its speed, we might want to allow the system to fully buffer the results: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.FullyBuffered) .Select(i => i.PerformComputation()) .Reverse(); Notice, also, that you can specify multiple configuration options in a parallel query.  By chaining these extension methods together, we generate a query that will always run in parallel, and will always complete before making the results available in our IEnumerable<T>.

    Read the article

  • Parallelism in .NET – Part 2, Simple Imperative Data Parallelism

    - by Reed
    In my discussion of Decomposition of the problem space, I mentioned that Data Decomposition is often the simplest abstraction to use when trying to parallelize a routine.  If a problem can be decomposed based off the data, we will often want to use what MSDN refers to as Data Parallelism as our strategy for implementing our routine.  The Task Parallel Library in .NET 4 makes implementing Data Parallelism, for most cases, very simple. Data Parallelism is the main technique we use to parallelize a routine which can be decomposed based off data.  Data Parallelism refers to taking a single collection of data, and having a single operation be performed concurrently on elements in the collection.  One side note here: Data Parallelism is also sometimes referred to as the Loop Parallelism Pattern or Loop-level Parallelism.  In general, for this series, I will try to use the terminology used in the MSDN Documentation for the Task Parallel Library.  This should make it easier to investigate these topics in more detail. Once we’ve determined we have a problem that, potentially, can be decomposed based on data, implementation using Data Parallelism in the TPL is quite simple.  Let’s take our example from the Data Decomposition discussion – a simple contrast stretching filter.  Here, we have a collection of data (pixels), and we need to run a simple operation on each element of the pixel.  Once we know the minimum and maximum values, we most likely would have some simple code like the following: for (int row=0; row < pixelData.GetUpperBound(0); ++row) { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This simple routine loops through a two dimensional array of pixelData, and calls the AdjustContrast routine on each pixel. As I mentioned, when you’re decomposing a problem space, most iteration statements are potentially candidates for data decomposition.  Here, we’re using two for loops – one looping through rows in the image, and a second nested loop iterating through the columns.  We then perform one, independent operation on each element based on those loop positions. This is a prime candidate – we have no shared data, no dependencies on anything but the pixel which we want to change.  Since we’re using a for loop, we can easily parallelize this using the Parallel.For method in the TPL: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Here, by simply changing our first for loop to a call to Parallel.For, we can parallelize this portion of our routine.  Parallel.For works, as do many methods in the TPL, by creating a delegate and using it as an argument to a method.  In this case, our for loop iteration block becomes a delegate creating via a lambda expression.  This lets you write code that, superficially, looks similar to the familiar for loop, but functions quite differently at runtime. We could easily do this to our second for loop as well, but that may not be a good idea.  There is a balance to be struck when writing parallel code.  We want to have enough work items to keep all of our processors busy, but the more we partition our data, the more overhead we introduce.  In this case, we have an image of data – most likely hundreds of pixels in both dimensions.  By just parallelizing our first loop, each row of pixels can be run as a single task.  With hundreds of rows of data, we are providing fine enough granularity to keep all of our processors busy. If we parallelize both loops, we’re potentially creating millions of independent tasks.  This introduces extra overhead with no extra gain, and will actually reduce our overall performance.  This leads to my first guideline when writing parallel code: Partition your problem into enough tasks to keep each processor busy throughout the operation, but not more than necessary to keep each processor busy. Also note that I parallelized the outer loop.  I could have just as easily partitioned the inner loop.  However, partitioning the inner loop would have led to many more discrete work items, each with a smaller amount of work (operate on one pixel instead of one row of pixels).  My second guideline when writing parallel code reflects this: Partition your problem in a way to place the most work possible into each task. This typically means, in practice, that you will want to parallelize the routine at the “highest” point possible in the routine, typically the outermost loop.  If you’re looking at parallelizing methods which call other methods, you’ll want to try to partition your work high up in the stack – as you get into lower level methods, the performance impact of parallelizing your routines may not overcome the overhead introduced. Parallel.For works great for situations where we know the number of elements we’re going to process in advance.  If we’re iterating through an IList<T> or an array, this is a typical approach.  However, there are other iteration statements common in C#.  In many situations, we’ll use foreach instead of a for loop.  This can be more understandable and easier to read, but also has the advantage of working with collections which only implement IEnumerable<T>, where we do not know the number of elements involved in advance. As an example, lets take the following situation.  Say we have a collection of Customers, and we want to iterate through each customer, check some information about the customer, and if a certain case is met, send an email to the customer and update our instance to reflect this change.  Normally, this might look something like: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } } Here, we’re doing a fair amount of work for each customer in our collection, but we don’t know how many customers exist.  If we assume that theStore.GetLastContact(customer) and theStore.EmailCustomer(customer) are both side-effect free, thread safe operations, we could parallelize this using Parallel.ForEach: Parallel.ForEach(customers, customer => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } }); Just like Parallel.For, we rework our loop into a method call accepting a delegate created via a lambda expression.  This keeps our new code very similar to our original iteration statement, however, this will now execute in parallel.  The same guidelines apply with Parallel.ForEach as with Parallel.For. The other iteration statements, do and while, do not have direct equivalents in the Task Parallel Library.  These, however, are very easy to implement using Parallel.ForEach and the yield keyword. Most applications can benefit from implementing some form of Data Parallelism.  Iterating through collections and performing “work” is a very common pattern in nearly every application.  When the problem can be decomposed by data, we often can parallelize the workload by merely changing foreach statements to Parallel.ForEach method calls, and for loops to Parallel.For method calls.  Any time your program operates on a collection, and does a set of work on each item in the collection where that work is not dependent on other information, you very likely have an opportunity to parallelize your routine.

    Read the article

  • Parallelism in .NET – Part 4, Imperative Data Parallelism: Aggregation

    - by Reed
    In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel.  Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation. Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria.  This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately.  The Task Parallel Library has tools to assist in this synchronization. The main issue with aggregation when parallelizing a routine is that you need to handle synchronization of data.  Since multiple threads will need to write to a shared portion of data.  Suppose, for example, that we wanted to parallelize a simple loop that looked for the minimum value within a dataset: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This seems like a good candidate for parallelization, but there is a problem here.  If we just wrap this into a call to Parallel.ForEach, we’ll introduce a critical race condition, and get the wrong answer.  Let’s look at what happens here: // Buggy code! Do not use! double min = double.MaxValue; Parallel.ForEach(collection, item => { double value = item.PerformComputation(); min = System.Math.Min(min, value); }); This code has a fatal flaw: min will be checked, then set, by multiple threads simultaneously.  Two threads may perform the check at the same time, and set the wrong value for min.  Say we get a value of 1 in thread 1, and a value of 2 in thread 2, and these two elements are the first two to run.  If both hit the min check line at the same time, both will determine that min should change, to 1 and 2 respectively.  If element 1 happens to set the variable first, then element 2 sets the min variable, we’ll detect a min value of 2 instead of 1.  This can lead to wrong answers. Unfortunately, fixing this, with the Parallel.ForEach call we’re using, would require adding locking.  We would need to rewrite this like: // Safe, but slow double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach(collection, item => { double value = item.PerformComputation(); lock(syncObject) min = System.Math.Min(min, value); }); This will potentially add a huge amount of overhead to our calculation.  Since we can potentially block while waiting on the lock for every single iteration, we will most likely slow this down to where it is actually quite a bit slower than our serial implementation.  The problem is the lock statement – any time you use lock(object), you’re almost assuring reduced performance in a parallel situation.  This leads to two observations I’ll make: When parallelizing a routine, try to avoid locks. That being said: Always add any and all required synchronization to avoid race conditions. These two observations tend to be opposing forces – we often need to synchronize our algorithms, but we also want to avoid the synchronization when possible.  Looking at our routine, there is no way to directly avoid this lock, since each element is potentially being run on a separate thread, and this lock is necessary in order for our routine to function correctly every time. However, this isn’t the only way to design this routine to implement this algorithm.  Realize that, although our collection may have thousands or even millions of elements, we have a limited number of Processing Elements (PE).  Processing Element is the standard term for a hardware element which can process and execute instructions.  This typically is a core in your processor, but many modern systems have multiple hardware execution threads per core.  The Task Parallel Library will not execute the work for each item in the collection as a separate work item. Instead, when Parallel.ForEach executes, it will partition the collection into larger “chunks” which get processed on different threads via the ThreadPool.  This helps reduce the threading overhead, and help the overall speed.  In general, the Parallel class will only use one thread per PE in the system. Given the fact that there are typically fewer threads than work items, we can rethink our algorithm design.  We can parallelize our algorithm more effectively by approaching it differently.  Because the basic aggregation we are doing here (Min) is communitive, we do not need to perform this in a given order.  We knew this to be true already – otherwise, we wouldn’t have been able to parallelize this routine in the first place.  With this in mind, we can treat each thread’s work independently, allowing each thread to serially process many elements with no locking, then, after all the threads are complete, “merge” together the results. This can be accomplished via a different set of overloads in the Parallel class: Parallel.ForEach<TSource,TLocal>.  The idea behind these overloads is to allow each thread to begin by initializing some local state (TLocal).  The thread will then process an entire set of items in the source collection, providing that state to the delegate which processes an individual item.  Finally, at the end, a separate delegate is run which allows you to handle merging that local state into your final results. To rewriting our routine using Parallel.ForEach<TSource,TLocal>, we need to provide three delegates instead of one.  The most basic version of this function is declared as: public static ParallelLoopResult ForEach<TSource, TLocal>( IEnumerable<TSource> source, Func<TLocal> localInit, Func<TSource, ParallelLoopState, TLocal, TLocal> body, Action<TLocal> localFinally ) The first delegate (the localInit argument) is defined as Func<TLocal>.  This delegate initializes our local state.  It should return some object we can use to track the results of a single thread’s operations. The second delegate (the body argument) is where our main processing occurs, although now, instead of being an Action<T>, we actually provide a Func<TSource, ParallelLoopState, TLocal, TLocal> delegate.  This delegate will receive three arguments: our original element from the collection (TSource), a ParallelLoopState which we can use for early termination, and the instance of our local state we created (TLocal).  It should do whatever processing you wish to occur per element, then return the value of the local state after processing is completed. The third delegate (the localFinally argument) is defined as Action<TLocal>.  This delegate is passed our local state after it’s been processed by all of the elements this thread will handle.  This is where you can merge your final results together.  This may require synchronization, but now, instead of synchronizing once per element (potentially millions of times), you’ll only have to synchronize once per thread, which is an ideal situation. Now that I’ve explained how this works, lets look at the code: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Although this is a bit more complicated than the previous version, it is now both thread-safe, and has minimal locking.  This same approach can be used by Parallel.For, although now, it’s Parallel.For<TLocal>.  When working with Parallel.For<TLocal>, you use the same triplet of delegates, with the same purpose and results. Also, many times, you can completely avoid locking by using a method of the Interlocked class to perform the final aggregation in an atomic operation.  The MSDN example demonstrating this same technique using Parallel.For uses the Interlocked class instead of a lock, since they are doing a sum operation on a long variable, which is possible via Interlocked.Add. By taking advantage of local state, we can use the Parallel class methods to parallelize algorithms such as aggregation, which, at first, may seem like poor candidates for parallelization.  Doing so requires careful consideration, and often requires a slight redesign of the algorithm, but the performance gains can be significant if handled in a way to avoid excessive synchronization.

    Read the article

  • Parallelism in .NET – Part 11, Divide and Conquer via Parallel.Invoke

    - by Reed
    Many algorithms are easily written to work via recursion.  For example, most data-oriented tasks where a tree of data must be processed are much more easily handled by starting at the root, and recursively “walking” the tree.  Some algorithms work this way on flat data structures, such as arrays, as well.  This is a form of divide and conquer: an algorithm design which is based around breaking up a set of work recursively, “dividing” the total work in each recursive step, and “conquering” the work when the remaining work is small enough to be solved easily. Recursive algorithms, especially ones based on a form of divide and conquer, are often a very good candidate for parallelization. This is apparent from a common sense standpoint.  Since we’re dividing up the total work in the algorithm, we have an obvious, built-in partitioning scheme.  Once partitioned, the data can be worked upon independently, so there is good, clean isolation of data. Implementing this type of algorithm is fairly simple.  The Parallel class in .NET 4 includes a method suited for this type of operation: Parallel.Invoke.  This method works by taking any number of delegates defined as an Action, and operating them all in parallel.  The method returns when every delegate has completed: Parallel.Invoke( () => { Console.WriteLine("Action 1 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 2 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 3 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); } ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Running this simple example demonstrates the ease of using this method.  For example, on my system, I get three separate thread IDs when running the above code.  By allowing any number of delegates to be executed directly, concurrently, the Parallel.Invoke method provides us an easy way to parallelize any algorithm based on divide and conquer.  We can divide our work in each step, and execute each task in parallel, recursively. For example, suppose we wanted to implement our own quicksort routine.  The quicksort algorithm can be designed based on divide and conquer.  In each iteration, we pick a pivot point, and use that to partition the total array.  We swap the elements around the pivot, then recursively sort the lists on each side of the pivot.  For example, let’s look at this simple, sequential implementation of quicksort: public static void QuickSort<T>(T[] array) where T : IComparable<T> { QuickSortInternal(array, 0, array.Length - 1); } private static void QuickSortInternal<T>(T[] array, int left, int right) where T : IComparable<T> { if (left >= right) { return; } SwapElements(array, left, (left + right) / 2); int last = left; for (int current = left + 1; current <= right; ++current) { if (array[current].CompareTo(array[left]) < 0) { ++last; SwapElements(array, last, current); } } SwapElements(array, left, last); QuickSortInternal(array, left, last - 1); QuickSortInternal(array, last + 1, right); } static void SwapElements<T>(T[] array, int i, int j) { T temp = array[i]; array[i] = array[j]; array[j] = temp; } Here, we implement the quicksort algorithm in a very common, divide and conquer approach.  Running this against the built-in Array.Sort routine shows that we get the exact same answers (although the framework’s sort routine is slightly faster).  On my system, for example, I can use framework’s sort to sort ten million random doubles in about 7.3s, and this implementation takes about 9.3s on average. Looking at this routine, though, there is a clear opportunity to parallelize.  At the end of QuickSortInternal, we recursively call into QuickSortInternal with each partition of the array after the pivot is chosen.  This can be rewritten to use Parallel.Invoke by simply changing it to: // Code above is unchanged... SwapElements(array, left, last); Parallel.Invoke( () => QuickSortInternal(array, left, last - 1), () => QuickSortInternal(array, last + 1, right) ); } This routine will now run in parallel.  When executing, we now see the CPU usage across all cores spike while it executes.  However, there is a significant problem here – by parallelizing this routine, we took it from an execution time of 9.3s to an execution time of approximately 14 seconds!  We’re using more resources as seen in the CPU usage, but the overall result is a dramatic slowdown in overall processing time. This occurs because parallelization adds overhead.  Each time we split this array, we spawn two new tasks to parallelize this algorithm!  This is far, far too many tasks for our cores to operate upon at a single time.  In effect, we’re “over-parallelizing” this routine.  This is a common problem when working with divide and conquer algorithms, and leads to an important observation: When parallelizing a recursive routine, take special care not to add more tasks than necessary to fully utilize your system. This can be done with a few different approaches, in this case.  Typically, the way to handle this is to stop parallelizing the routine at a certain point, and revert back to the serial approach.  Since the first few recursions will all still be parallelized, our “deeper” recursive tasks will be running in parallel, and can take full advantage of the machine.  This also dramatically reduces the overhead added by parallelizing, since we’re only adding overhead for the first few recursive calls.  There are two basic approaches we can take here.  The first approach would be to look at the total work size, and if it’s smaller than a specific threshold, revert to our serial implementation.  In this case, we could just check right-left, and if it’s under a threshold, call the methods directly instead of using Parallel.Invoke. The second approach is to track how “deep” in the “tree” we are currently at, and if we are below some number of levels, stop parallelizing.  This approach is a more general-purpose approach, since it works on routines which parse trees as well as routines working off of a single array, but may not work as well if a poor partitioning strategy is chosen or the tree is not balanced evenly. This can be written very easily.  If we pass a maxDepth parameter into our internal routine, we can restrict the amount of times we parallelize by changing the recursive call to: // Code above is unchanged... SwapElements(array, left, last); if (maxDepth < 1) { QuickSortInternal(array, left, last - 1, maxDepth); QuickSortInternal(array, last + 1, right, maxDepth); } else { --maxDepth; Parallel.Invoke( () => QuickSortInternal(array, left, last - 1, maxDepth), () => QuickSortInternal(array, last + 1, right, maxDepth)); } We no longer allow this to parallelize indefinitely – only to a specific depth, at which time we revert to a serial implementation.  By starting the routine with a maxDepth equal to Environment.ProcessorCount, we can restrict the total amount of parallel operations significantly, but still provide adequate work for each processing core. With this final change, my timings are much better.  On average, I get the following timings: Framework via Array.Sort: 7.3 seconds Serial Quicksort Implementation: 9.3 seconds Naive Parallel Implementation: 14 seconds Parallel Implementation Restricting Depth: 4.7 seconds Finally, we are now faster than the framework’s Array.Sort implementation.

    Read the article

  • Queued Loadtest to remove Concurrency issues using Shared Data Service in OpenScript

    - by stefan.thieme(at)oracle.com
    Queued Processing to remove Concurrency issues in Loadtest ScriptsSome scripts act on information returned by the server, e.g. act on first item in the returned list of pending tasks/actions. This may lead to concurrency issues if the virtual users simulated in a load test scenario are not synchronized in some way.As the load test cases should be carried out in a comparable and straight forward manner simply cancel a transaction in case a collision occurs is clearly not an option. In case you increase the number of virtual users this approach would lead to a high number of requests for the early steps in your transaction (e.g. login, retrieve list of action points, assign an action point to the virtual user) but later steps would be rarely visited successfully or at all, depending on the application logic.A way to tackle this problem is to enqueue the virtual users in a Shared Data Service queue. Only the first virtual user in this queue will be allowed to carry out the critical steps (retrieve list of action points, assign an action point to the virtual user) in your transaction at any one time.Once a virtual user has passed the critical path it will dequeue himself from the head of the queue and continue with his actions. This does theoretically allow virtual users to run in parallel all steps of the transaction which are not part of the critical path.In practice it has been seen this is rarely the case, though it does not allow adding more than N users to perform a transaction without causing delays due to virtual users waiting in the queue. N being the time of the total transaction divided by the sum of the time of all critical steps in this transaction.While this problem can be circumvented by allowing multiple queues to act on individual segments of the list of actions, e.g. per country filter, ends with 0..9 filter, etc.This would require additional handling of these additional queues of slots for the virtual users at the head of the queue in order to maintain the mutually exclusive access to the first element in the list returned by the server at any one time of the load test. Such an improved handling of multiple queues and/or multiple slots is above the subject of this paper.Shared Data Services Pre-RequisitesStart WebLogic Server to host Shared Data ServicesYou will have to make sure that your WebLogic server is installed and started. Shared Data Services may not work if you installed only the minimal installation package for OpenScript. If however you installed the default package including OLT and OTM, you may follow the instructions below to start and verify WebLogic installation.To start the WebLogic Server deployed underneath of Oracle Load Testing and/or Oracle Test Manager you can go to your Start menu, Oracle Application Testing Suite and select the Restart Oracle Application Testing Suite Application Service entry from the Tools submenu.To verify the service has been started you can run the Microsoft Management Console for Services by Selecting Run from the Start Menu and entering services.msc. Look for the entry that reads Oracle Application Testing Suite Application Service, once it has changed it status from Starting to Started you can proceed to verify the login. Please note that this may take several minutes, I would say up to 10 minutes depending on the strength of your CPU horse-power.Verify WebLogic Server user credentialsYou will have to make sure that your WebLogic Server is installed and started. Next open the Oracle WebLogic Server Adminstration Console on http://localhost:8088/console.It may take a while until the application is deployed and started. It may display the following until the Administration Console has been deployed on the fly.Afterwards you can login using the username oats and the password that you selected during install time for your Application Testing Suite administrative purposes.This will bring up the Home page of you WebLogic Server. You have actually verified that you are able to login with these credentials already. However if you want to check the details, navigate to Security Realms, myrealm, Users and Groups tab.Here you could add users to your WebLogic Server which could be used in the later steps. Details on the Groups required for such a custom user to work are exceeding this quick overview and have to be selected with the WebLogic Server Adminstration Guide in mind.Shared Data Services pre-requisites for Load testingOpenScript Preferences have to be set to enable Encryption and provide a default Shared Data Service Connection for Playback.These are pre-requisites you want to use for load testing with Shared Data Services.Please note that the usage of the Connection Parameters (individual directive in the script) for Shared Data Services did not playback reliably in the current version 9.20.0370 of Oracle Load Testing (OLT) and encryption of credentials still seemed to be mandatory as well.General Encryption settingsSelect OpenScript Preferences from the View menu and navigate to the General, Encryption entry in the tree on the left. Select the Encrypt script data option from the list and enter the same password that you used for securing your WebLogic Server Administration Console.Enable global shared data access credentialsSelect OpenScript Preferences from the View menu and navigate to the Playback, Shared Data entry in the tree on the left. Enable the global shared data access credentials and enter the Address, User name and Password determined for your WebLogic Server to host Shared Data Services.Please note, that you may want to replace the localhost in Address with the hosts realname in case you plan to run load tests with Loadtest Agents running on remote systems.Queued Processing of TransactionsEnable Shared Data Services Module in Script PropertiesThe Shared Data Services Module has to be enabled for each Script that wants to employ the Shared Data Service Queue functionality in OpenScript. It can be enabled under the Script menu selecting Script Properties. On the Script Properties Dialog select the Modules section and check Shared Data to enable Shared Data Service Module for your script. Checking the Shared Data Services option will effectively add a line to your script code that adds the sharedData ScriptService to your script class of IteratingVUserScript.@ScriptService oracle.oats.scripting.modules.sharedData.api.SharedDataService sharedData;Record your scriptRecord your script as usual and then add the following things for Queue handling in the Initialize code block, before the first step and after the last step of your critical path and in the Finalize code block.The java code to be added at individual locations is explained in the following sections in full detail.Create a Shared Data Queue in InitializeTo create a Shared Data Queue go to the Java view of your script and enter the following statements to the initialize() code block.info("Create queueA with life time of 120 minutes");sharedData.createQueue("queueA", 120);This will create an instantiation of the Shared Data Queue object named queueA which is maintained for upto 120 minutes.If you want to use the code for multiple scripts, make sure to use a different queue name for each one here and in the subsequent steps. You may even consider to use a dynamic queueName based on filters of your result list being concurrently accessed.Prepare a unique id for each IterationIn order to keep track of individual virtual users in our queue we need to create a unique identifier from the virtual user id and the used username right after retrieving the next record from our databank file.getDatabank("Usernames").getNextDatabankRecord();getVariables().set("usernameValue1","VU_{{@vuid}}_{{@iterationnum}}_{{db.Usernames.Username}}_{{@timestamp}}_{{@random(10000)}}");String usernameValue = getVariables().get("usernameValue1");info("Now running virtual user " + usernameValue);As you can see from the above code block, we have set the OpenScript variable usernameValue1 to VU_{{@vuid}}_{{@iterationnum}}_{{db.Usernames.Username}}_{{@timestamp}}_{{@random(10000)}} which is a concatenation of the virtual user id and the iterationnumber for general uniqueness; as well as the username from our databank, the timestamp and a random number for making it further unique and ease spotting of errors.Not all of these fields are actually required to make it really unique, but adding the queue name may also be considered to help troubleshoot multiple queues.The value is then retrieved with the getVariables.get() method call and assigned to the usernameValue String used throughout the script.Please note that moving the getDatabank("Usernames").getNextDatabankRecord(); call to the initialize block was later considered to remove concurrency of multiple virtual users running with the same userid and therefor accessing the same "My Inbox" in step 6. This will effectively give each virtual user a userid from the databank file. Make sure you have enough userids to remove this second hurdle.Enqueue and attend Queue before Critical PathTo maintain the right order of virtual users being allowed into the critical path of the transaction the following pseudo step has to be added in front of the first critical step. In the case of this example this is right in front of the step where we retrieve the list of actions from which we select the first to be assigned to us.beginStep("[0] Waiting in the Queue", 0);{info("Enqueued virtual user " + usernameValue + " at the end of queueA");sharedData.offerLast("queueA", usernameValue);info("Wait until the user is the first in queueA");String queueValue1 = null;do {// we wait for at least 0.7 seconds before we check the head of the// queue. This is the time it takes one user to move through the// critical path, i.e. pass steps [5] Enter country and [6] Assign// to meThread.sleep(700);queueValue1 = (String) sharedData.peekFirst("queueA");info("The first user in queueA is currently: '" + queueValue1 + "' " + queueValue1.getClass() + " length " + queueValue1.length() );info("The current user is '"+ usernameValue + "' " + usernameValue.getClass() + " length " + usernameValue.length() + ": indexOf " + usernameValue.indexOf(queueValue1) + " equals " + usernameValue.equals(queueValue1) );} while ( queueValue1.indexOf(usernameValue) < 0 );info("Now the user is the first in queueA");}endStep();This will enqueue the username to the tail of our Queue. It will will wait for at least 700 milliseconds, the time it takes for one user to exit the critical path and then compare the head of our queue with it's username. This last step will be repeated while the two are not equal (indexOf less than zero). If they are equal the indexOf will yield a value of zero or larger and we will perform the critical steps.Dequeue after Critical PathAfter the virtual user has left the critical path and complete its last step the following code block needs to dequeue the virtual user. In the case of our example this is right after the action has been actually assigned to the virtual user. This will allow the next virtual user to retrieve the list of actions still available and in turn let him make his selection/assignment.info("Get and remove the current user from the head of queueA");String pollValue1 = (String) sharedData.pollFirst("queueA");The current user is removed from the head of the queue. The next one will now be able to match his username against the head of the queue.Clear and Destroy Queue for FinishWhen the script has completed, it should clear and destroy the queue. This code block can be put in the finish block of your script and/or in a separate script in order to clear and remove the queue in case you have spotted an error or want to reset the queue for some reason.info("Clear queueA");sharedData.clearQueue("queueA");info("Destroy queueA");sharedData.destroyQueue("queueA");The users waiting in queueA are cleared and the queue is destroyed. If you have scripts still executing they will be caught in a loop.I found it better to maintain a separate Reset Queue script which contained only the following code in the initialize() block. I use to call this script to make sure the queue is cleared in between multiple Loadtest runs. This script could also even be added as the first in a larger scenario, which would execute it only once at very start of the Loadtest and make sure the queues do not contain any stale entries.info("Create queueA with life time of 120 minutes");sharedData.createQueue("queueA", 120);info("Clear queueA");sharedData.clearQueue("queueA");This will create a Shared Data Queue instance of queueA and clear all entries from this queue.Monitoring QueueWhile creating the scripts it was useful to monitor the contents, i.e. the current first user in the Queue. The following code block will make sure the Shared Data Queue is accessible in the initialize() block.info("Create queueA with life time of 120 minutes");sharedData.createQueue("queueA", 120);In the run() block the following code will continuously monitor the first element of the Queue and write an informational message with the current username Value to the Result window.info("Monitor the first users in queueA");String queueValue1 = null;do {queueValue1 = (String) sharedData.peekFirst("queueA");if (queueValue1 != null)info("The first user in queueA is currently: '" + queueValue1 + "' " + queueValue1.getClass() + " length " + queueValue1.length() );} while ( true );This script can be run from OpenScript parallel to a loadtest performed by the Oracle Load Test.However it is not recommend to run this in a production loadtest as the performance impact is unknown. Accessing the Queue's head with the peekFirst() method has been reported with about 2 seconds response time by both OpenScript and OTL. It is advised to log a Service Request to see if this could be lowered in future releases of Application Testing Suite, as the pollFirst() and even offerLast() writing to the tail of the Queue usually returned after an average 0.1 seconds.Debugging QueueWhile debugging the scripts the following was useful to remove single entries from its head, i.e. the current first user in the Queue. The following code block will make sure the Shared Data Queue is accessible in the initialize() block.info("Create queueA with life time of 120 minutes");sharedData.createQueue("queueA", 120);In the run() block the following code will remove the first element of the Queue and write an informational message with the current username Value to the Result window.info("Get and remove the current user from the head of queueA");String pollValue1 = (String) sharedData.pollFirst("queueA");info("The first user in queueA was currently: '" + pollValue1 + "' " + pollValue1.getClass() + " length " + pollValue1.length() );ReferencesOracle Functional Testing OpenScript User's Guide Version 9.20 [E15488-05]Chapter 17 Using the Shared Data Modulehttp://download.oracle.com/otn/nt/apptesting/oats-docs-9.21.0030.zipOracle Fusion Middleware Oracle WebLogic Server Administration Console Online Help 11g Release 1 (10.3.4) [E13952-04]Administration Console Online Help - Manage users and groupshttp://download.oracle.com/docs/cd/E17904_01/apirefs.1111/e13952/taskhelp/security/ManageUsersAndGroups.htm

    Read the article

  • Building the Ultimate SharePoint 2010 Development Environment

    - by Manesh Karunakaran
    It’s been more than a month since SharePoint 2010 RTMed. And a lot of people have downloaded and set up their very own SharePoint 2010 development rigs. And quite a few people have written blogs about setting up good development environments, there is even an MSDN article on it. Two of the blogs worth noting are from MVPs Sahil Malik and Wictor Wilén. Make sure that you check these out as well. Part of the bad side-effects of being a geek is the need to do the technical stuff the best way possible (pragmatic or otherwise), but the problem with this is that what is considered “best” is relative. Precisely the reason why you are reading this post now. Most of the posts that I read are out dated/need updations or are using the wrong OS’es or virtualization solutions (again, opinions vary) or using them the wrong way. Here’s a developer’s view of Building the Ultimate SharePoint 2010 Development Rig. If you are a sales guy, it’s time to close this window. Confusion 1: Which Host Operating System and Virtualization Solution to use? This point has been beaten to death in numerous blog posts in the past, if you have time to invest, read this excellent post by our very own SharePoint Joel on this subject. But if you are planning to build the Ultimate Development Rig, then Windows Server 2008 R2 with Hyper-V is the option that you should be looking at. I have been using this as my primary OS for about 6-7 months now, and I haven’t had any Driver issue or Application compatibility issue. In my experience all the Windows 7 drivers work fine with WIN2008 R2 also. You can enable Aero for eye candy (and the Windows 7 look and feel) and except for a few things like the Hibernation support (which a can be enabled if you really want it), Windows Server 2008 R2, is the best Workstation OS that I have used till date. But frankly the answer to this question of which OS to use depends primarily on one question - Are you willing to change your primary OS? If the answer to that is ‘Yes’, then Windows 2008 R2 with Hyper-V is the best option, if not look at vmWare or VirtualBox, both are equally good. Those who are familiar with a Virtual PC background might prefer Sun VirtualBox. Besides, these provide support for running 64 bit guest machines on 32 bit hosts if the underlying hardware is truly 64 bit. See my earlier post on this. Since we are going to make the ultimate rig, we will use Windows Server 2008 R2 with Hyper-V, for reasons mentioned above. Confusion 2: Should I use a multi-(virtual) server set up? A lot of people use multiple servers for their development environments - like Wictor Wilén is suggesting - one server hosting the Active directory, one hosting SharePoint Server and another one for SQL Server. True, this mimics the production environment the best possible way, but as somebody who has fallen for this set up earlier, I can tell you that you don’t really get anything by doing this. Microsoft has done well to ensure that if you can do it on one machine, you can do it in a farm environment as well. Besides, when you run multiple Server class machine instances in parallel, there are a lot of unwanted processor cycles wasted for no good use. In my personal experience, as somebody who needs to switch between MOSS 2007/SharePoint 2010 environments from time to time, the best possible solution is to Make the host Windows Server 2008 R2 machine your Domain Controller (AD Server) Make all your Virtual Guest OS’es join this domain. Have each Individual Guest OS Image have it’s own local SQL Server instance. The advantages are that you can reuse the users and groups in each of the Guest operating systems, you can manage the users in one place, AD is light weight and doesn't take too much resources on your host machine and also having separate SQL instances for each of the Development images gives you maximum flexibility in terms of configuration, for example your SharePoint rigs can have simpler DB configurations, compared to your MS BI blast pits. Confusion 3: Which Operating System should I use to run SharePoint 2010 Now that’s a no brainer. Use Windows 2008 R2 as your Guest OS. When you are building the ultimate rig, why compromise? If you are planning to run Windows Server 2008 as your Guest OS, there are a few patches that you need to install at different times during the installation, for that follow the steps mentioned here Okay now that we have made our choices, let’s get to the interesting part of building the rig, Step 1: Prepare the host machine – Install Windows Server 2008 R2 Install Windows Server 2008 R2 on your best Desktop/Laptop. If you have read this far, I am quite sure that you are somebody who can install an OS on your own, so go ahead and do that. Make sure that you run the compatibility wizard before you go ahead and nuke your current OS. There are plenty of blogs telling you how to make a good Windows 2008 R2 Workstation that feels and behaves like a Windows 7 machine, follow one and once you are done, head to Step 2. Step 2: Configure the host machine as a Domain Controller Before we begin this, let me tell you, this step is completely optional, you don’t really need to do this, you can simply use the local users on the Guest machines instead, but if this is a much cleaner approach to manage users and groups if you run multiple guest operating systems.  This post neatly explains how to configure your Windows Server 2008 R2 host machine as a Domain Controller. Follow those simple steps and you are good to go. If you are not able to get it to work, try this. Step 3: Prepare the guest machine – Install Windows Server 2008 R2 Open Hyper-V Manager Choose to Create a new Guest Operating system Allocate at least 2 GB of Memory to the Guest OS Choose the Windows 2008 R2 Installation Media Start the Virtual Machine to commence installation. Once the Installation is done, Activate the OS. Step 4: Make the Guest operating systems Join the Domain This step is quite simple, just follow these steps below, Fire up Hyper-V Manager, open your Guest OS Click on Start, and Right click on ‘Computer’ and choose ‘Properties’ On the window that pops-up, click on ‘Change Settings’ On the ‘System Properties’ Window that comes up, Click on the ‘Change’ button Now a window named ‘Computer Name/Domain Changes’ opens up, In the text box titled Domain, type in the Domain name from Step 2. Click Ok and windows will show you the welcome to domain message and ask you to restart the machine, click OK to restart. If the addition to domain fails, that means that you have not set up networking in Hyper-V for the Guest OS to communicate with the Host. To enable it, follow the steps I had mentioned in this post earlier. Step 5: Install SQL Server 2008 R2 on the Guest Machine SQL Server 2008 R2 gets installed with out hassle on Windows Server 2008 R2. SQL Server 2008 needs SP2 to work properly on WIN2008 R2. Also SQL Server 2008 R2 allows you to directly add PowerPivot support to SharePoint. Choose to install in SharePoint Integrated Mode in Reporting Server Configuration. Step 6: Install KB971831 and SharePoint 2010 Pre-requisites Now install the WCF Hotfix for Microsoft Windows (KB971831) from this location, and SharePoint 2010 Pre-requisites from the SP2010 Installation media. Step 7: Install and Configure SharePoint 2010 Install SharePoint 2010 from the installation media, after the installation is complete, you are prompted to start the SharePoint Products and Technologies Configuration Wizard. If you are using a local instance of Microsoft SQL Server 2008, install the Microsoft SQL Server 2008 KB 970315 x64 before starting the wizard. If your development environment uses a remote instance of Microsoft SQL Server 2008 or if it has a pre-existing installation of Microsoft SQL Server 2008 on which KB 970315 x64 has already been applied, this step is not necessary. With the wizard open, do the following: Install SQL Server 2008 KB 970315 x64. After the Microsoft SQL Server 2008 KB 970315 x64 installation is finished, complete the wizard. Alternatively, you can choose not to run the wizard by clearing the SharePoint Products and Technologies Configuration Wizard check box and closing the completed installation dialog box. Install SQL Server 2008 KB 970315 x64, and then manually start the SharePoint Products and Technologies Configuration Wizard by opening a Command Prompt window and executing the following command: C:\Program Files\Common Files\Microsoft Shared Debug\Web Server Extensions\14\BIN\psconfigui.exe The SharePoint Products and Technologies Configuration Wizard may fail if you are using a computer that is joined to a domain but that is not connected to a domain controller. Step 8: Install Visual Studio 2010 and SharePoint 2010 SDK Install Visual Studio 2010 Download and Install the Microsoft SharePoint 2010 SDK Step 9: Install PowerPivot for SharePoint and Configure Reporting Services Pop-In the SQLServer 2008 R2 installation media once again and install PowerPivot for SharePoint. This will get added as another instance named POWERPIVOT. Configure Reporting Services by following the steps mentioned here, if you need to get down to the details on how the integration between SharePoint 2010 and SQL Server 2008 R2 works, see Working Together: SQL Server 2008 R2 Reporting Services Integration in SharePoint 2010 an excellent article by Alan Le Marquand Step 10: Download and Install Sample Databases for Microsoft SQL Server 2008R2 SharePoint 2010 comes with a lot of cool stuff like PerformancePoint Services and BCS, if you need to try these out, you need to have data in your databases. So if you want to save yourself the trouble of creating sample data for your PerformancePoint and BCS experiments, download and install Sample Databases for Microsoft SQL Server 2008R2 from CodePlex. And you are done! Fire up your Visual Studio 2010 and Start Coding away!!

    Read the article

  • Surface RT: To Be Or Not To Be (Part 1)

    - by smehaffie
    So the Surface RT has been out for 9 months and Microsoft just declared a $900 million dollar write-down. So how did this happen and what does it mean for Microsoft’s efforts to break into the tablet market? I have been thinking a lot about most of the information below since the Surface product line was released. If you are looking for a “Microsoft Is Dead” story, then don’t read any further. But if you want an honest look at what I think led Microsoft to this point and what I think can be done to make Surface RT devices better, then please continue reading. What Led Microsoft To The $900 Million Write-Down Surface Unveiling:Microsoft totally missed the boat when they unveiled the Surface product line on June 18th, 2012. Microsoft should’ve been ready to post the specifications of both devices that night. Microsoft should’ve had a site up and running right after the event so people could pre-order the devices. This would have given them a good idea what the interest was in each device.  They could also have used this data to make a better estimate for the number of units to to have available for the launch and beyond.  They also lost out on taking advantage of the excitement generated by the Surface RT and Surface Pro announcement. They could have thrown in a free touch keyboard to anyone who pre-ordered. The advertising should have started right after the announcement and gotten bigger as launch day approached. Push for as many pre-order as possible and build excitement for the launch. Actual Launch (Surface RT): By this time all excitement was gone from the initial announcement, except for the Micorsoft faithful. Microsoft should have been ready to sell the Surface in as many markets as possible at launch. The limited market release was a real letdown for a lot of people.  A limited release right after the initial announce is understandable, but not at the official launch of the product. Microsoft overpriced the device and now they are lowering it to what it should have been to start with. The $349 price is within the range I suggested it should be at before pricing was announced. (Surface Tablets: The Price Must Be Right). Limited ordering options online was also a killer. User should have been able to buy the base unit of each device and then add on whatever keyboard they wanted to (this applies more to the Surface Pro).  There should have also been a place where users could order any additional add-ins that they wanted to buy (covers, extra power supplies, etc.) Marketing was better and the dancing “Click In” commercial was cool, but the ads comparing the iPad with Siri should have been on the air from day one of the announcement (or at least the launch).  Consumers want to know why you tablet is better, not just that is has a clickable keyboard and built-in kickstand. They could have also compared it to some of the other mid-range tablets if they had not overprices it to begin with. Stock Applications (Mail, People, Calendar, Music, Video, Reader and IE): This is where Microsoft really blew it. They had all the time in the world to make these applications the best of breed and instead we got applications that seemed thrown together.  Some updates have made these application better, but they are all still lacking in features that should have been there from day one. This did not help to enhance a new users experience any. ** I will admit that the applications that were data driven were first class citizen’s and that makes it even more perplexing why MS could knock it out of the park with the Weather, Travel, Finance, Bing, etc.) and fail so miserably on the core applications users would use the most on a tablet. Desktop on Tablet: The desktop just is so out of place on the tablet  I understand it was needed for Office but think it would have been better to not have the desktop in Windows RT, but instead open up the Office applications in full screen mode, in a desktop shell (same goes for  IE11).That way the user wouldn’t realize they are leaving Metro and going to the desktop. The other option would have been to just not include Office on Windows RT devices. Instead they could have made awesome Widows Store Apps for Word, Excel, OneNote and PowerPoint. In addition, they could have made the stock Mail, People, and Calendar applications contain all the functions that Outlook gives desktop users. Having some of the settings in desktop mode and others under “Change PC Settings” made Windows RT seemed unfinished and rushed to market. What Can Be Done To Make Windows RT Based Tablets Better (At least in my opinion) Either eliminate the desktop all together from Windows RT or at least make the user experience better by hiding the fact the user is running Office/IE in the desktop. Personally I ‘d like them to totally get rid of it and just make awesome Windows Store Application version of Word, Excel PowerPoint & OneNote.  This might also make the OS smaller and give the user more available disk space. I doubt there will ever be a Windows Store App versions of Office, but I still think it is a good idea. Make is so users can easily direct their documents, picture, videos and music to their extra storage and can access these files from the standard libraries.  A user should not have to create a VM on their microSD card or create symbolic links to get this to work properly. Most consumers would not be able to do this. Then users get frustrated when they run out or room on their main storage because nothing is automatically save to their microSD card when saved to libraries.  This is a major bug that needs to be fixed, otherwise Microsoft’s selling point of having a microSD slot is worthless. Allows users to uninstall and re-install any of the Office product that come with the Surface. That way people can free up storage space by uninstalling the Office applications they do not need. Everyone’s needs are different, so make the options flexible. Don’t take up storage space for applications the user will not use. Make the Core applications the “Cream of the Crop” Windows App Store applications. The should set the bar for all other Store applications. Improve performance as much as possible, if it seems to be sluggish on a tablet consumer will not buy it. They need to price the next line of Surface product very aggressive to undercut not only iPad but also Android low end tablets (Nook, Kindle Fire, and Nexus, etc.) Give developers incentives to write quality applications for the devices. Don’t reward developers for cranking out cookie cutter, low quality applications. I’d even suggest Microsoft consider implementing some new store certification guideline to stop these type of applications being published. Allow users to easily move the recover disk “partition between their microSD card and main storage. My Predictions for the Surface RT and Windows RT I honestly think even with all the missteps MS has made since the announcement  about the Surface product line, that they are on the right path. I was excited the Surface tablets when they were announced, and I still am. The truth be told, Windows 8 on a tablet (aka: Windows RT) is better than both iOS and Android. My nephew who is an Apple fan boy told me after he saw and used Windows 8 (he got the beta running on his iPad), that Windows 8 kicked Apples butt as a tablet OS. So there is hope for all Windows RT based tablets. I agree with my nephew and that is why whenever anyone asks me about my Surface, I love showing it off and recommend it. The 6 keys to gaining market share in the tablet market are; Aggressive pricing by both Microsoft and their OEM’s Good quality devices put out by Microsoft and their OEM’s (there are some out there, but not enough) Marketing, Marketing, Marketing from both Microsoft and their OEM’s (Need more ads showing why windows based tablets are better than iPads and Android tablets) Getting Widows tablets in retails stores all over, and giving sales people incentive to sell them. Consumers like to try electronics out before they buy them, and most will listen to what the sales person suggest. Microsoft needs sales people in retail stores directing people to buy windows based tablets over iPads and Android tablets. I think the Microsoft Stores within Best Buy is a good start, but they also need to get prominent displays in Walmart, Target, etc.. Release a smaller form factor Surface, Hopefully the 8”-10” next generation Surface is not a rumor. Make “Surface” the brand name for all Microsoft tablets and hybrid devices that they come out with. They cannot change the name with each new release.  Make Surface synonymous with quality, the same way that iPad  is for Apple. Well, that is my 2 cents on the subject. Let me know your thoughts by leaving a comment below. Soon to follow will be my thought on the Surface Pro, so keep an eye out for it. var addthis_pub="smehaffie"; var addthis_options="email, print, digg, slashdot, delicious, twitter, live, myspace, facebook, google, stumbleupon, newsvine";

    Read the article

  • Using the OAM Mobile & Social SDK to secure native mobile apps - Part 2 : OAM Mobile & Social Server configuration

    - by kanishkmahajan
    Objective  In the second part of this blog post I'll now cover configuration of OAM to secure our sample native apps developed using the iOS SDK. First, here are some key server side concepts: Application Profiles: An application profile is a logical representation of your application within OAM server. It could be a web (html/javascript) or native (iOS or Android) application. Applications may have different requirements for AuthN/AuthZ, and therefore each application that interacts with OAM Mobile & Social REST services must be uniquely defined. Service Providers: Service providers represent the back end services that are accessed by applications. With OAM Mobile & Social these services are in the areas of authentication, authorization and user profile access. A Service Provider then defines a type or class of service for authentication, authorization or user profiles. For example, the JWTAuthentication provider performs authentication and returns JWT (JSON Web Tokens) to the application. In contrast, the OAMAuthentication also provides authentication but uses OAM SSO tokens Service Profiles:  A Service Profile is a logical envelope that defines a service endpoint URL for a service provider for the OAM Mobile & Social Service. You can create multiple service profiles for a service provider to define token capabilities and service endpoints. Each service provider instance requires atleast one corresponding service profile.The  OAM Mobile & Social Service includes a pre-configured service profile for each pre-configured service provider. Service Domains: Service domains bind together application profiles and service profiles with an optional security handler. So now let's configure the OAM server. Additional details are in the OAM Documentation and this post simply provides an outline of configuration tasks required to configure OAM for securing native apps.  Configuration  Create The Application Profile Log on to the Oracle Access Management console and from System Configuration -> Mobile and Social -> Mobile Services, select "Create" under Application Profiles. You would do this  step twice - once for each of the native apps - AvitekInventory and AvitekScheduler. Enter the parameters for the new Application profile: Name:  The application name. In this example we use 'InventoryApp' for the AvitekInventory app and 'SchedulerApp' for the AvitekScheduler app. The application name configured here must match the application name in the settings for the deployed iOS application. BaseSecret: Enter a password here. This does not need to match any existing password. It is used as an encryption key between the client and the OAM server.  Mobile Configuration: Enable this checkbox for any mobile applications. This enables the SDK to collect and send Mobile specific attributes to the OAM server.  Webview: Controls the type of browser that the iOS application will use. The embedded browser (default) will render the browser within the application. External will use the system standalone browser. External can sometimes be preferable for debugging URLScheme: The URL scheme associated with the iOS apps that is also used as a custom URL scheme to register O/S handlers that will take control when OAM transfers control to device. For the AvitekInventory and the AvitekScheduler apps I used osa:// and client:// respectively. You set this scheme in Xcode while developing your iOS Apps under Info->URL Types.  Bundle Identifier : The fully qualified name of your iOS application. You typically set this when you create a new Xcode project or under General->Identity in Xcode. For the AvitekInventory and AvitekScheduler apps these were com.us.oracle.AvitekInventory and com.us.oracle.AvitekScheduler respectively.  Create The Service Domain Select create under Service domains. Create a name for your domain (AvitekDomain is what I've used). The name configured must match the service domain set in the iOS application settings. Under "Application Profile Selection" click the browse button. Choose the application profiles that you created in the previous step one by one. Set the InventoryApp as the SSO agent (with an automatic priority of 1) and the SchedulerApp as the SSO client. This associates these applications with this service domain and configures them in a 'circle of trust'.  Advance to the next page of the wizard to configure the services for this domain. For this example we will use the following services:  Authentication:   This will use the JWT (JSON Web Token) format authentication provider. The iOS application upon successful authentication will receive a signed JWT token from OAM Mobile & Social service. This token will be used in subsequent calls to OAM. Use 'MobileOAMAuthentication' here. Authorization:  The authorization provider. The SDK makes calls to this provider endpoint to obtain authorization decisions on resource requests. Use 'OAMAuthorization' here. User Profile Service:  This is the service that provides user profile services (attribute lookup, attribute modification). It can be any directory configured as a data source in OAM.  And that's it! We're done configuring our native apps. In the next section, let's look at some additional features that were mentioned in the earlier post that are automated by the SDK for the app developer i.e. these are areas that require no additional coding by the app developer when developing with the SDK as they only require server side configuration: Additional Configuration  Offline Authentication Select this option in the service domain configuration to allow users to log in and authenticate to the application locally. Clear the box to block users from authenticating locally. Strong Authentication By simply selecting the OAAMSecurityHandlerPlugin while configuring mobile related Service Domains, the OAM Mobile&Social service allows sophisticated device and client application registration logic as well as the advanced risk and fraud analysis logic found in OAAM to be applied to mobile authentication. Let's look at some scenarios where the OAAMSecurityHandlerPlugin gets used. First, when we configure OAM and OAAM to integrate together using the TAP scheme, then that integration kicks off by selecting the OAAMSecurityHandlerPlugin in the mobile service domain. This is how the mobile device is now prompted for KBA,OTP etc depending on the TAP scheme integration and the OAM users registered in the OAAM database. Second, when we configured the service domain, there were claim attributes there that are already pre-configured in OAM Mobile&Social service and we simply accepted the default values- these are the set of attributes that will be fetched from the device and passed to the server during registration/authentication as device profile attributes. When a mobile application requests a token through the Mobile Client SDK, the SDK logic will send the Device Profile attributes as a part of an HTTP request. This set of Device Profile attributes enhances security by creating an audit trail for devices that assists device identification. When the OAAM Security Plug-in is used, a particular combination of Device Profile attribute values is treated as a device finger print, known as the Digital Finger Print in the OAAM Administration Console. Each finger print is assigned a unique fingerprint number. Each OAAM session is associated with a finger print and the finger print makes it possible to log (and audit) the devices that are performing authentication and token acquisition. Finally, if the jail broken option is selected while configuring an application profile, the SDK detects a device is jail broken based on configured policy and if the OAAM handler is configured the plug-in can allow or block access to client device depending on the OAAM policy as well as detect blacklisted, lost or stolen devices and send a wipeout command that deletes all the mobile &social relevant data and blocks the device from future access. 1024x768 Social Logins Finally, let's complete this post by adding configuration to configure social logins for mobile applications. Although the Avitek sample apps do not demonstrate social logins this would be an ideal exercise for you based on the sample code provided in the earlier post. I'll cover the server side configuration here (with Facebook as an example) and you can retrofit the code to accommodate social logins by following the steps outlined in "Invoking Authentication Services" and add code in LoginViewController and maybe create a new delegate - AvitekRPDelegate based on the description in the previous post. So, here all you will need to do is configure an application profile for social login, configure a new service domain that uses the social login application profile, register the app on Facebook and finally configure the Facebook OAuth provider in OAM with those settings. Navigate to Mobile and Social, click on "Internet Identity Services" and create a new application profile. Here are the relevant parameters for the new application profile (-also we're not registering the social user in OAM with this configuration below, however that is a key feature as well): Name:  The application name. This must match the name of the of mobile application profile created for your application under Mobile Services. We used InventoryApp for this example. SharedSecret: Enter a password here. This does not need to match any existing password. It is used as an encryption key between the client and the OAM Mobile and Social service.  Mobile Application Return URL: After the Relying Party (social) login, the OAM Mobile & Social service will redirect to the iOS application using this URI. This is defined under Info->URL type and we used 'osa', so we define this here as 'osa://' Login Type: Choose to allow only internet identity authentication for this exercise. Authentication Service Endpoint : Make sure that /internetidentityauthentication is selected. Login to http://developers.facebook.com using your Facebook account and click on Apps and register the app as InventoryApp. Note that the consumer key and API secret gets generated automatically by the Facebook OAuth server. Navigate back to OAM and under Mobile and Social, click on "Internet Identity Services" and edit the Facebook OAuth Provider. Add the consumer key and API secret from the Facebook developers site to the Facebook OAuth Provider: Navigate to Mobile Services. Click on New to create a new service domain. In this example we call the domain "AvitekDomainRP". The type should be 'Mobile Application' and the application credential type 'User Token'. Add the application "InventoryApp" to the domain. Advance the next page of the wizard. Select the  default service profiles but ensure that the Authentication Service is set to 'InternetIdentityAuthentication'. Finish the creation of the service domain.

    Read the article

  • Does ModSecurity 2.7.1 work with ASP.NET MVC 3?

    - by autonomatt
    I'm trying to get ModSecurity 2.7.1 to work with an ASP.NET MVC 3 website. The installation ran without errors and looking at the event log, ModSecurity is starting up successfully. I am using the modsecurity.conf-recommended file to set the basic rules. The problem I'm having is that whenever I am POSTing some form data, it doesn't get through to the controller action (or model binder). I have SecRuleEngine set to DetectionOnly. I have SecRequestBodyAccess set to On. With these settings, the body of the POST never reaches the controller action. If I set SecRequestBodyAccess to Off it works, so it's definitely something to do with how ModSecurity forwards the body data. The ModSecurity debug shows the following (looks to me as if all passed through): Second phase starting (dcfg 94b750). Input filter: Reading request body. Adding request argument (BODY): name "[0].IsSelected", value "on" Adding request argument (BODY): name "[0].Quantity", value "1" Adding request argument (BODY): name "[0].VariantSku", value "047861" Adding request argument (BODY): name "[1].Quantity", value "0" Adding request argument (BODY): name "[1].VariantSku", value "047862" Input filter: Completed receiving request body (length 115). Starting phase REQUEST_BODY. Recipe: Invoking rule 94c620; [file "*********************"] [line "54"] [id "200001"]. Rule 94c620: SecRule "REQBODY_ERROR" "!@eq 0" "phase:2,auditlog,id:200001,t:none,log,deny,status:400,msg:'Failed to parse request body.',logdata:%{reqbody_error_msg},severity:2" Transformation completed in 0 usec. Executing operator "!eq" with param "0" against REQBODY_ERROR. Operator completed in 0 usec. Rule returned 0. Recipe: Invoking rule 5549c38; [file "*********************"] [line "75"] [id "200002"]. Rule 5549c38: SecRule "MULTIPART_STRICT_ERROR" "!@eq 0" "phase:2,auditlog,id:200002,t:none,log,deny,status:44,msg:'Multipart request body failed strict validation: PE %{REQBODY_PROCESSOR_ERROR}, BQ %{MULTIPART_BOUNDARY_QUOTED}, BW %{MULTIPART_BOUNDARY_WHITESPACE}, DB %{MULTIPART_DATA_BEFORE}, DA %{MULTIPART_DATA_AFTER}, HF %{MULTIPART_HEADER_FOLDING}, LF %{MULTIPART_LF_LINE}, SM %{MULTIPART_MISSING_SEMICOLON}, IQ %{MULTIPART_INVALID_QUOTING}, IP %{MULTIPART_INVALID_PART}, IH %{MULTIPART_INVALID_HEADER_FOLDING}, FL %{MULTIPART_FILE_LIMIT_EXCEEDED}'" Transformation completed in 0 usec. Executing operator "!eq" with param "0" against MULTIPART_STRICT_ERROR. Operator completed in 0 usec. Rule returned 0. Recipe: Invoking rule 554bd70; [file "********************"] [line "80"] [id "200003"]. Rule 554bd70: SecRule "MULTIPART_UNMATCHED_BOUNDARY" "!@eq 0" "phase:2,auditlog,id:200003,t:none,log,deny,status:44,msg:'Multipart parser detected a possible unmatched boundary.'" Transformation completed in 0 usec. Executing operator "!eq" with param "0" against MULTIPART_UNMATCHED_BOUNDARY. Operator completed in 0 usec. Rule returned 0. Recipe: Invoking rule 554cbe0; [file "*********************************"] [line "94"] [id "200004"]. Rule 554cbe0: SecRule "TX:/^MSC_/" "!@streq 0" "phase:2,log,auditlog,id:200004,t:none,deny,msg:'ModSecurity internal error flagged: %{MATCHED_VAR_NAME}'" Rule returned 0. Hook insert_filter: Adding input forwarding filter (r 5541fc0). Hook insert_filter: Adding output filter (r 5541fc0). Initialising logging. Starting phase LOGGING. Recording persistent data took 0 microseconds. Audit log: Ignoring a non-relevant request. I can't see anything unusual in Fiddler. I'm using a ViewModel in the parameters of my action. No data is bound if SecRequestBodyAccess is set to On. I'm even logging all the Request.Form.Keys and values via log4net, but not getting any values there either. I'm starting to wonder if ModSecurity actually works with ASP.NET MVC or if there is some conflict with the ModSecurity http Module and the model binder kicking in. Does anyone have any suggestions or can anyone confirm they have ModSecurity working with an ASP.NET MVC website?

    Read the article

  • Cocos2d-xna memory management for WP8

    - by Arkiliknam
    I recently upgraded to VS2012 and try my in dev game out on the new WP8 emulators but was dismayed to find out the emulator now crashes and throws an out of memory exception during my sprite loading procedure (funnily, it still works in WP7 emulators and on my WP7). Regardless of whether the problem is the emulator or not, I want to get a clear understanding of how I should be managing memory in the game. My game consists of a character whom has 4 or more different animations. Each animation consists of 4 to 7 frames. On top of that, the character has up to 8 stackable visualization modifications (eg eye type, nose type, hair type, clothes type). Pre memory issue, I preloaded all textures for each animation frame and customization and created animate action out of them. The game then plays animations using the customizations applied to that current character. I re-looked at this implementation when I received the out of memory exceptions and have started playing with RenderTexture instead, so instead of pre loading all possible textures, it on loads textures needed for the character, renders them onto a single texture, from which the animation is built. This means the animations use 1/8th of the sprites they were before. I thought this would solve my issue, but it hasn't. Here's a snippet of my code: var characterTexture = CCRenderTexture.Create((int)width, (int)height); characterTexture.BeginWithClear(0, 0, 0, 0); // stamp a body onto my texture var bodySprite = MethodToCreateSpecificSprite(); bodySprite.Position = centerPoint; bodySprite.Visit(); bodySprite.Cleanup(); bodySprite = null; // stamp eyes, nose, mouth, clothes, etc... characterTexture.End(); As you can see, I'm calling CleanUp and setting the sprite to null in the hope of releasing the memory, though I don't believe this is the right way, nor does it seem to work... I also tried using SharedTextureCache to load textures before Stamping my texture out, and then clearing the SharedTextureCache with: CCTextureCache.SharedTextureCache.RemoveAllTextures(); But this didn't have an effect either. Any tips on what I'm not doing? I used VS to do a memory profile of the emulation causing the crash. Both WP7.1 and WP8 emulators peak at about 150mb of usage. WP8 crashes and throws an out of memory exception. Each customisation/frame is 15kb at the most. Lets say there are 8 layers of customisation = 120kb but I render then onto one texture which I would assume is only 15kb again. Each animation is 8 frames at the most. That's 15kb for 1 texture, or 960kb for 8 textures of customisation. There are 4 animation sets. That's 60Kb for 4 sets of 1 texture, or 3.75MB for 4 sets of 8 textures of customisation. So even if its storing every layer, its 3.75MB.... no where near the 150mb breaking point my profiler seems to suggest :( WP 7.1 Memory Profile (max 150MB) WP8 Memory Profile (max 150MB and crashes)

    Read the article

  • GZip/Deflate Compression in ASP.NET MVC

    - by Rick Strahl
    A long while back I wrote about GZip compression in ASP.NET. In that article I describe two generic helper methods that I've used in all sorts of ASP.NET application from WebForms apps to HttpModules and HttpHandlers that require gzip or deflate compression. The same static methods also work in ASP.NET MVC. Here are the two routines:/// <summary> /// Determines if GZip is supported /// </summary> /// <returns></returns> public static bool IsGZipSupported() { string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if (!string.IsNullOrEmpty(AcceptEncoding) && (AcceptEncoding.Contains("gzip") || AcceptEncoding.Contains("deflate"))) return true; return false; } /// <summary> /// Sets up the current page or handler to use GZip through a Response.Filter /// IMPORTANT: /// You have to call this method before any output is generated! /// </summary> public static void GZipEncodePage() { HttpResponse Response = HttpContext.Current.Response; if (IsGZipSupported()) { string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if (AcceptEncoding.Contains("gzip")) { Response.Filter = new System.IO.Compression.GZipStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.Headers.Remove("Content-Encoding"); Response.AppendHeader("Content-Encoding", "gzip"); } else { Response.Filter = new System.IO.Compression.DeflateStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.Headers.Remove("Content-Encoding"); Response.AppendHeader("Content-Encoding", "deflate"); } } // Allow proxy servers to cache encoded and unencoded versions separately Response.AppendHeader("Vary", "Content-Encoding"); } The first method checks whether the client sending the request includes the accept-encoding for either gzip or deflate, and if if it does it returns true. The second function uses IsGzipSupported() to decide whether it should encode content and uses an Response Filter to do its job. Basically response filters look at the Response output stream as it's written and convert the data flowing through it. Filters are a bit tricky to work with but the two .NET filter streams for GZip and Deflate Compression make this a snap to implement. In my old code and even now in MVC I can always do:public ActionResult List(string keyword=null, int category=0) { WebUtils.GZipEncodePage(); …} to encode my content. And that works just fine. The proper way: Create an ActionFilterAttribute However in MVC this sort of thing is typically better handled by an ActionFilter which can be applied with an attribute. So to be all prim and proper I created an CompressContentAttribute ActionFilter that incorporates those two helper methods and which looks like this:/// <summary> /// Attribute that can be added to controller methods to force content /// to be GZip encoded if the client supports it /// </summary> public class CompressContentAttribute : ActionFilterAttribute { /// <summary> /// Override to compress the content that is generated by /// an action method. /// </summary> /// <param name="filterContext"></param> public override void OnActionExecuting(ActionExecutingContext filterContext) { GZipEncodePage(); } /// <summary> /// Determines if GZip is supported /// </summary> /// <returns></returns> public static bool IsGZipSupported() { string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if (!string.IsNullOrEmpty(AcceptEncoding) && (AcceptEncoding.Contains("gzip") || AcceptEncoding.Contains("deflate"))) return true; return false; } /// <summary> /// Sets up the current page or handler to use GZip through a Response.Filter /// IMPORTANT: /// You have to call this method before any output is generated! /// </summary> public static void GZipEncodePage() { HttpResponse Response = HttpContext.Current.Response; if (IsGZipSupported()) { string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if (AcceptEncoding.Contains("gzip")) { Response.Filter = new System.IO.Compression.GZipStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.Headers.Remove("Content-Encoding"); Response.AppendHeader("Content-Encoding", "gzip"); } else { Response.Filter = new System.IO.Compression.DeflateStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.Headers.Remove("Content-Encoding"); Response.AppendHeader("Content-Encoding", "deflate"); } } // Allow proxy servers to cache encoded and unencoded versions separately Response.AppendHeader("Vary", "Content-Encoding"); } } It's basically the same code wrapped into an ActionFilter attribute, which intercepts requests MVC requests to Controller methods and lets you hook up logic before and after the methods have executed. Here I want to override OnActionExecuting() which fires before the Controller action is fired. With the CompressContentAttribute created, it can now be applied to either the controller as a whole:[CompressContent] public class ClassifiedsController : ClassifiedsBaseController { … } or to one of the Action methods:[CompressContent] public ActionResult List(string keyword=null, int category=0) { … } The former applies compression to every action method, while the latter is selective and only applies it to the individual action method. Is the attribute better than the static utility function? Not really, but it is the standard MVC way to hook up 'filter' content and that's where others are likely to expect to set options like this. In fact,  you have a bit more control with the utility function because you can conditionally apply it in code, but this is actually much less likely in MVC applications than old WebForms apps since controller methods tend to be more focused. Compression Caveats Http compression is very cool and pretty easy to implement in ASP.NET but you have to be careful with it - especially if your content might get transformed or redirected inside of ASP.NET. A good example, is if an error occurs and a compression filter is applied. ASP.NET errors don't clear the filter, but clear the Response headers which results in some nasty garbage because the compressed content now no longer matches the headers. Another issue is Caching, which has to account for all possible ways of compression and non-compression that the content is served. Basically compressed content and caching don't mix well. I wrote about several of these issues in an old blog post and I recommend you take a quick peek before diving into making every bit of output Gzip encoded. None of these are show stoppers, but you have to be aware of the issues. Related Posts GZip Compression with ASP.NET Content ASP.NET GZip Encoding Caveats© Rick Strahl, West Wind Technologies, 2005-2012Posted in ASP.NET  MVC   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • Redaction in AutoVue

    - by [email protected]
    As the trend to digitize all paper assets continues, so does the push to digitize all the processes around these assets. One such process is redaction - removing sensitive or classified information from documents. While for some this may conjure up thoughts of old CIA documents filled with nothing but blacked out pages, there are actually many uses for redaction today beyond military and government. Many companies have a need to remove names, phone numbers, social security numbers, credit card numbers, etc. from documents that are being scanned in and/or released to the public or less privileged users - insurance companies, banks and legal firms are a few examples. The process of digital redaction actually isn't that far from the old paper method: Step 1. Find a folder with a big red stamp on it labeled "TOP SECRET" Step 2. Make a copy of that document, since some folks still need to access the original contents Step 3. Black out the text or pages you want to hide Step 4. Release or distribute this new 'redacted' copy So where does a solution like AutoVue come in? Well, we've really been doing all of these things for years! 1. With AutoVue's VueLink integration and iSDK, we can integrate to virtually any content management system and view documents of almost any format with a single click. Finding the document and opening it in AutoVue: CHECK! 2. With AutoVue's markup capabilities, adding filled boxes (or other shapes) around certain text is a no-brainer. You can even leverage AutoVue's powerful APIs to automate the addition of markups over certain text or pre-defined regions using our APIs. Black out the text you want to hide: CHECK! 3. With AutoVue's conversion capabilities, you can 'burn-in' the comments into a new file, either as a TIFF, JPEG or PDF document. Burning-in the redactions avoids slip-ups like the recent (well-publicized) TSA one. Through our tight integrations, the newly created copies can be directly checked into the content management system with no manual intervention. Make a copy of that document: CHECK! 4. Again, leveraging AutoVue's integrations, we can now define rules in the system based on a user's privileges. An 'authorized' user wishing to view the document from the repository will get exactly that - no redactions. An 'unauthorized' user, when requesting to view that same document, can get redirected to open the redacted copy of the same document. Release or distribute the new 'redacted' copy: CHECK! See this movie (WMV format, 2mins, 20secs, no audio) for a quick illustration of AutoVue's redaction capabilities. It shows how redactions can be added based on text searches, manual input or pre-defined templates/regions. Let us know what you think in the comments. And remember - this is all in our flagship AutoVue product - no additional software required!

    Read the article

  • New Communications Industry Data Model with "Factory Installed" Predictive Analytics using Oracle Da

    - by charlie.berger
    Oracle Introduces Oracle Communications Data Model to Provide Actionable Insight for Communications Service Providers   We've integrated pre-installed analytical methodologies with the new Oracle Communications Data Model to deliver automated, simple, yet powerful predictive analytics solutions for customers.  Churn, sentiment analysis, identifying customer segments - all things that can be anticipated and hence, preconcieved and implemented inside an applications.  Read on for more information! TM Forum Management World, Nice, France - 18 May 2010 News Facts To help communications service providers (CSPs) manage and analyze rapidly growing data volumes cost effectively, Oracle today introduced the Oracle Communications Data Model. With the Oracle Communications Data Model, CSPs can achieve rapid time to value by quickly implementing a standards-based enterprise data warehouse that features communications industry-specific reporting, analytics and data mining. The combination of the Oracle Communications Data Model, Oracle Exadata and the Oracle Business Intelligence (BI) Foundation represents the most comprehensive data warehouse and BI solution for the communications industry. Also announced today, Hong Kong Broadband Network enhanced their data warehouse system, going live on Oracle Communications Data Model in three months. The leading provider increased its subscriber base by 37 percent in six months and reduced customer churn to less than one percent. Product Details Oracle Communications Data Model provides industry-specific schema and embedded analytics that address key areas such as customer management, marketing segmentation, product development and network health. CSPs can efficiently capture and monitor critical data and transform it into actionable information to support development and delivery of next-generation services using: More than 1,300 industry-specific measurements and key performance indicators (KPIs) such as network reliability statistics, provisioning metrics and customer churn propensity. Embedded OLAP cubes for extremely fast dimensional analysis of business information. Embedded data mining models for sophisticated trending and predictive analysis. Support for multiple lines of business, such as cable, mobile, wireline and Internet, which can be easily extended to support future requirements. With Oracle Communications Data Model, CSPs can jump start the implementation of a communications data warehouse in line with communications-industry standards including the TM Forum Information Framework (SID), formerly known as the Shared Information Model. Oracle Communications Data Model is optimized for any Oracle Database 11g platform, including Oracle Exadata, which can improve call data record query performance by 10x or more. Supporting Quotes "Oracle Communications Data Model covers a wide range of business areas that are relevant to modern communications service providers and is a comprehensive solution - with its data model and pre-packaged templates including BI dashboards, KPIs, OLAP cubes and mining models. It helps us save a great deal of time in building and implementing a customized data warehouse and enables us to leverage the advanced analytics quickly and more effectively," said Yasuki Hayashi, executive manager, NTT Comware Corporation. "Data volumes will only continue to grow as communications service providers expand next-generation networks, deploy new services and adopt new business models. They will increasingly need efficient, reliable data warehouses to capture key insights on data such as customer value, network value and churn probability. With the Oracle Communications Data Model, Oracle has demonstrated its commitment to meeting these needs by delivering data warehouse tools designed to fill communications industry-specific needs," said Elisabeth Rainge, program director, Network Software, IDC. "The TM Forum Conformance Mark provides reassurance to customers seeking standards-based, and therefore, cost-effective and flexible solutions. TM Forum is extremely pleased to work with Oracle to certify its Oracle Communications Data Model solution. Upon successful completion, this certification will represent the broadest and most complete implementation of the TM Forum Information Framework to date, with more than 130 aggregate business entities," said Keith Willetts, chairman and chief executive officer, TM Forum. Supporting Resources Oracle Communications Oracle Communications Data Model Data Sheet Oracle Communications Data Model Podcast Oracle Data Warehousing Oracle Communications on YouTube Oracle Communications on Delicious Oracle Communications on Facebook Oracle Communications on Twitter Oracle Communications on LinkedIn Oracle Database on Twitter The Data Warehouse Insider Blog

    Read the article

  • What are the advantages of version control systems that version each file separately?

    - by Mike Daniels
    Over the past few years I have worked with several different version control systems. For me, one of the fundamental differences between them has been whether they version files individually (each file has its own separate version numbering and history) or the repository as a whole (a "commit" or version represents a snapshot of the whole repository). Some "per-file" version control systems: CVS ClearCase Visual SourceSafe Some "whole-repository" version control systems: SVN Git Mercurial In my experience, the per-file version control systems have only led to problems, and require much more configuration and maintenance to use correctly (for example, "config specs" in ClearCase). I've had many instances of a co-worker changing an unrelated file and breaking what would ideally be an isolated line of development. What are the advantages of these per-file version control systems? What problems do "whole-repository" version control systems have that per-file version control systems do not?

    Read the article

  • What are the advantages of version control systems that version each file separately?

    - by Mike Daniels
    Over the past few years I have worked with several different version control systems. For me, one of the fundamental differences between them has been whether they version files individually (each file has its own separate version numbering and history) or the repository as a whole (a "commit" or version represents a snapshot of the whole repository). Some "per-file" version control systems: CVS ClearCase Visual SourceSafe Some "whole-repository" version control systems: SVN Git Mercurial In my experience, the per-file version control systems have only led to problems, and require much more configuration and maintenance to use correctly (for example, "config specs" in ClearCase). I've had many instances of a co-worker changing an unrelated file and breaking what would ideally be an isolated line of development. What are the advantages of these per-file version control systems? What problems do "whole-repository" version control systems have that per-file version control systems do not?

    Read the article

  • Tricks and Optimizations for you Sitecore website

    - by amaniar
    When working with Sitecore there are some optimizations/configurations I usually repeat in order to make my app production ready. Following is a small list I have compiled from experience, Sitecore documentation, communicating with Sitecore Engineers etc. This is not supposed to be technically complete and might not be fit for all environments.   Simple configurations that can make a difference: 1) Configure Sitecore Caches. This is the most straight forward and sure way of increasing the performance of your website. Data and item cache sizes (/databases/database/ [id=web] ) should be configured as needed. You may start with a smaller number and tune them as needed. <cacheSizes hint="setting"> <data>300MB</data> <items>300MB</items> <paths>5MB</paths> <standardValues>5MB</standardValues> </cacheSizes> Tune the html, registry etc cache sizes for your website.   <cacheSizes> <sites> <website> <html>300MB</html> <registry>1MB</registry> <viewState>10MB</viewState> <xsl>5MB</xsl> </website> </sites> </cacheSizes> Tune the prefetch cache settings under the App_Config/Prefetch/ folder. Sample /App_Config/Prefetch/Web.Config: <configuration> <cacheSize>300MB</cacheSize> <!--preload items that use this template--> <template desc="mytemplate">{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}</template> <!--preload this item--> <item desc="myitem">{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX }</item> <!--preload children of this item--> <children desc="childitems">{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}</children> </configuration> Break your page into sublayouts so you may cache most of them. Read the caching configuration reference: http://sdn.sitecore.net/upload/sitecore6/sc62keywords/cache_configuration_reference_a4.pdf   2) Disable Analytics for the Shell Site <site name="shell" virtualFolder="/sitecore/shell" physicalFolder="/sitecore/shell" rootPath="/sitecore/content" startItem="/home" language="en" database="core" domain="sitecore" loginPage="/sitecore/login" content="master" contentStartItem="/Home" enableWorkflow="true" enableAnalytics="false" xmlControlPage="/sitecore/shell/default.aspx" browserTitle="Sitecore" htmlCacheSize="2MB" registryCacheSize="3MB" viewStateCacheSize="200KB" xslCacheSize="5MB" />   3) Increase the Check Interval for the MemoryMonitorHook so it doesn’t run every 5 seconds (default). <hook type="Sitecore.Diagnostics.MemoryMonitorHook, Sitecore.Kernel"> <param desc="Threshold">800MB</param> <param desc="Check interval">00:05:00</param> <param desc="Minimum time between log entries">00:01:00</param> <ClearCaches>false</ClearCaches> <GarbageCollect>false</GarbageCollect> <AdjustLoadFactor>false</AdjustLoadFactor> </hook>   4) Set Analytics.PeformLookup (Sitecore.Analytics.config) to false if your environment doesn’t have access to the internet or you don’t intend to use reverse DNS lookup. <setting name="Analytics.PerformLookup" value="false" />   5) Set the value of the “Media.MediaLinkPrefix” setting to “-/media”: <setting name="Media.MediaLinkPrefix" value="-/media" /> Add the following line to the customHandlers section: <customHandlers> <handler trigger="-/media/" handler="sitecore_media.ashx" /> <handler trigger="~/media/" handler="sitecore_media.ashx" /> <handler trigger="~/api/" handler="sitecore_api.ashx" /> <handler trigger="~/xaml/" handler="sitecore_xaml.ashx" /> <handler trigger="~/icon/" handler="sitecore_icon.ashx" /> <handler trigger="~/feed/" handler="sitecore_feed.ashx" /> </customHandlers> Link: http://squad.jpkeisala.com/2011/10/sitecore-media-library-performance-optimization-checklist/   6) Performance counters should be disabled in production if not being monitored <setting name="Counters.Enabled" value="false" />   7) Disable Item/Memory/Timing threshold warnings. Due to the nature of this component, it brings no value in production. <!--<processor type="Sitecore.Pipelines.HttpRequest.StartMeasurements, Sitecore.Kernel" />--> <!--<processor type="Sitecore.Pipelines.HttpRequest.StopMeasurements, Sitecore.Kernel"> <TimingThreshold desc="Milliseconds">1000</TimingThreshold> <ItemThreshold desc="Item count">1000</ItemThreshold> <MemoryThreshold desc="KB">10000</MemoryThreshold> </processor>—>   8) The ContentEditor.RenderCollapsedSections setting is a hidden setting in the web.config file, which by default is true. Setting it to false will improve client performance for authoring environments. <setting name="ContentEditor.RenderCollapsedSections" value="false" />   9) Add a machineKey section to your Web.Config file when using a web farm. Link: http://msdn.microsoft.com/en-us/library/ff649308.aspx   10) If you get errors in the log files similar to: WARN Could not create an instance of the counter 'XXX.XXX' (category: 'Sitecore.System') Exception: System.UnauthorizedAccessException Message: Access to the registry key 'Global' is denied. Make sure the ApplicationPool user is a member of the system “Performance Monitor Users” group on the server.   11) Disable WebDAV configurations on the CD Server if not being used. More: http://sitecoreblog.alexshyba.com/2011/04/disable-webdav-in-sitecore.html   12) Change Log4Net settings to only log Errors on content delivery environments to avoid unnecessary logging. <root> <priority value="ERROR" /> <appender-ref ref="LogFileAppender" /> </root>   13) Disable Analytics for any content item that doesn’t add value. For example a page that redirects to another page.   14) When using Web User Controls avoid registering them on the page the asp.net way: <%@ Register Src="~/layouts/UserControls/MyControl.ascx" TagName="MyControl" TagPrefix="uc2" %> Use Sublayout web control instead – This way Sitecore caching could be leveraged <sc:Sublayout ID="ID" Path="/layouts/UserControls/MyControl.ascx" Cacheable="true" runat="server" />   15) Avoid querying for all children recursively when all items are direct children. Sitecore.Context.Database.SelectItems("/sitecore/content/Home//*"); //Use: Sitecore.Context.Database.GetItem("/sitecore/content/Home");   16) On IIS — you enable static & dynamic content compression on CM and CD More: http://technet.microsoft.com/en-us/library/cc754668%28WS.10%29.aspx   17) Enable HTTP Keep-alive and content expiration in IIS.   18) Use GUID’s when accessing items and fields instead of names or paths. Its faster and wont break your code when things get moved or renamed. Context.Database.GetItem("{324DFD16-BD4F-4853-8FF1-D663F6422DFF}") Context.Item.Fields["{89D38A8F-394E-45B0-826B-1A826CF4046D}"]; //is better than Context.Database.GetItem("/Home/MyItem") Context.Item.Fields["FieldName"]   Hope this helps.

    Read the article

  • And the Winners of Fusion Middleware Innovation Awards in Data Integration are…

    - by Irem Radzik
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;} At OpenWorld, we announced the winners of Fusion Middleware Innovation Awards 2012. Raymond James and Morrison Supermarkets were selected for the data integration category for their innovative use of Oracle’s data integration products and the great results they have achieved. In this blog I would like to briefly introduce you to these award winning projects. Raymond James is a diversified financial services company, which provides financial planning, wealth management, investment banking, and asset management. They are using Oracle GoldenGate and Oracle Data Integrator to feed their operational data store (ODS), which supports application services across the enterprise. A major requirement for their project was low data latency, as key decisions are made based on the data in the ODS. They were able to fulfill this requirement due to the Oracle Data Integrator’s integrated solution with Oracle GoldenGate. Oracle GoldenGate captures changed data from different systems including Oracle Database, HP NonStop and Microsoft SQL Server into a single data store on SQL Server 2008. Oracle Data Integrator provides data transformations for the ODS. Leveraging ODI’s integration with GoldenGate, Raymond James now sees a 9 second median latency (from source commit to ODS target commit). The ODS solution delivers high quality, accurate data for consuming applications such as Raymond James’ next generation client and portfolio management systems as well as real-time operational reporting. It enables timely information for making better decisions. There are more benefits Raymond James achieved with this implementation of Oracle’s data integration solution. The software developers and architects of this solution, Tim Garrod and Ryan Fonnett, have told us during their presentation at OpenWorld that they also reduced application complexity significantly while improving developer productivity through trusted operational services. They were able to utilize CDC to generate alerts for business users, and for applications (for example for cache hydration mechanisms). One cool innovation example among many in this project is that using ODI's flexible architecture, Tim and Ryan could build 24/7 self-healing processes. And these processes have hardly failed. Integration processes fixes the errors itself. Pretty amazing; and a great solution for environments that need such reliability and availability. (You can see Tim and Ryan’s photo with the Innovation Award above.) The other winner of this year in the data integration category, Morrison Supermarkets, is the UK’s 4th largest grocery retailer. The company has been migrating all their legacy applications on to a new-world application set based on Oracle and consolidating all BI on to a single Oracle platform. The company recently implemented Oracle Exadata as the data warehouse engine and uses Oracle Business Intelligence EE. Their goal with deploying GoldenGate and ODI was to provide BI data to the enterprise in a way that it also supports operational decision making requirements from a wide range of Oracle based ERP applications such as E-Business Suite, PeopleSoft, Oracle Retail Suite. They use GoldenGate’s log-based change data capture capabilities and Oracle Data Integrator to populate the Oracle Retail Data Model. The electronic point of sale (EPOS) integration solution they built processes over 80 million transactions/day at busy periods in near real time (15 mins). It provides valuable insight to Retail and Commercial teams for both intra-day and historical trend analysis. As I mentioned in yesterday’s blog, the right data integration platform can transform the business. Here is another example: The point-of-sale integration enabled the grocery chain to optimize its stock management, leading to another award: Morrisons won the Grocer 33 award in 2012 - beating all other major UK supermarkets in product availability. Congratulations, Morrisons,on another award! Celebrating the innovation and the success of our customers with Oracle’s data integration products was definitely a highlight of Oracle OpenWorld for me. I look forward to hearing more from Raymond James, Morrisons, and the other customers that presented their data integration projects at OpenWorld, on how they are creating more value for their organizations.

    Read the article

  • When to implement: Together with or after the source product?

    - by Jeremy Oosthuizen
    Somebody recently relayed a prospect's question to me: How hard would it be to implement OUBI after the source product (CC&B, WAM or NMS) has already been implemented? Fact is that MOST non-OUBI Data Warehouse / Business Intelligence implementations take place after the source application(s) are in place and hopefully stable. If an organization decides that they need better reporting and management information, then the logical path (see The Data Warehouse Institute's Data Warehouse Maturity Model) is to a Data Warehouse -- no matter when their last applications were implemented. If there is a pre-built Data Warehouse for their specific application, or even for the desired business process in their industry, they're in luck. Else they have to design and build from scratch, using a toolset. The implementation of a toolset is unlike the implementation of OUBI which, like OBI Apps, contain pre-built ETL routines and user content. Much has been written before about the advantages of that. So, because OUBI is designed specifically for Oracle Utilities transactional products, we often implement them in parallel -- with OUBI lagging a little behind by necessity, like Reporting. Customers know from the start they're going to need the solution, and therefore purchase the products at the same time. My biggest argument FOR a parallel installation/implementation of OUBI with the source product is two-fold: - There could be things (which is the technical term for data elements) that customers figure out they need when implementing OUBI, which are often easier added to the source product's implementation project, than to add later; - OUBI's ETL often points out errors (severe or not) with converted data, which are easier to fix during the source product's implementation project, or it may even be impossible to fix afterwards. The Conversion routines sometimes miss these errors, because the source system can live with the not-quite-perfect converted data. If the data can't be properly extracted, i.e. the proper Dimensions linked to the Facts, then it can't get into OUBI. That means it can't be analyzed effectively along with the rest of the organization's data. Then there is also the throw-away-work argument, which may be significant. The operational / transactional system cannot go live without reports on Day 1. A lot of those reports would be taken care of by the implementation of OUBI. If OUBI is implemented after go-live, those reports STILL have to be built during the source product's implementation project, but they become throw-away after the OUBI implementation. I have sometimes been told that it is better to implement OUBI after the source product, because it cuts down on scope and risk for the source product's implementation project. All I can say to that, is bah humbug. No, seriously, given the arguments above, planning has to include the OUBI implementation and it has to be managed properly -- just like any other implementation. If so, it should not add any risk and it should be included in the scope from the start. The answer to the prospect's question is therefore that it is not that much more difficult; after all, most DW/BI implemenations are done like that. They just have to consider the points above.

    Read the article

  • Unit testing is… well, flawed.

    - by Dewald Galjaard
    Hey someone had to say it. I clearly recall my first IT job. I was appointed Systems Co-coordinator for a leading South African retailer at store level. Don’t get me wrong, there is absolutely nothing wrong with an honest day’s labor and in fact I highly recommend it, however I’m obliged to refer to the designation cautiously; in reality all I had to do was monitor in-store prices and two UNIX front line controllers. If anything went wrong – I only had to phone it in… Luckily that wasn’t all I did. My duties extended to some other interesting annual occurrence – stock take. Despite a bit more curious affair, it was still a tedious process that took weeks of preparation and several nights to complete.  Then also I remember that no matter how elaborate our planning was, the entire exercise would be rendered useless if we couldn’t get the basics right – that being the act of counting. Sounds simple right? We’ll with a store which could potentially carry over tens of thousands of different items… we’ll let’s just say I believe that’s when I first became a coffee addict. In those days the act of counting stock was a very humble process. Nothing like we have today. A staff member would be assigned a bin or shelve filled with items he or she had to sort then count. Thereafter they had to record their findings on a complementary piece of paper. Every night I would manage several teams. Each team was divided into two groups - counters and auditors. Both groups had the same task, only auditors followed shortly on the heels of the counters, recounting stock levels, making sure the original count correspond to their findings. It was a simple yet hugely responsible orchestration of people and thankfully there was one fundamental and golden rule I could always abide by to ensure things run smoothly – No-one was allowed to audit their own work. Nope, not even on nights when I didn’t have enough staff available. This meant I too at times had to get up there and get counting, or have the audit stand over until the next evening. The reason for this was obvious - late at night and with so much to do we were prone to make some mistakes, then on the recount, without a fresh set of eyes, you were likely to repeat the offence. Now years later this rule or guideline still holds true as we develop software (as far removed as software development from counting stock may be). For some reason it is a fundamental guideline we’re simply ignorant of. We write our code, we write our tests and thus commit the same horrendous offence. Yes, the procedure of writing unit tests as practiced in most development houses today – is flawed. Most if not all of the tests we write today exercise application logic – our logic. They are based on the way we believe an application or method should/may/will behave or function. As we write our tests, our unit tests mirror our best understanding of the inner workings of our application code. Unfortunately these tests will therefore also include (or be unaware of) any imperfections and errors on our part. If your logic is flawed as you write your initial code, chances are, without a fresh set of eyes, you will commit the same error second time around too. Not even experience seems to be a suitable solution. It certainly helps to have deeper insight, but is that really the answer we should be looking for? Is that really failsafe? What about code review? Code review is certainly an answer. You could have one developer coding away and another (or team) making sure the logic is sound. The practice however has its obvious drawbacks. Firstly and mainly it is resource intensive and from what I’ve seen in most development houses, given heavy deadlines, this guideline is seldom adhered to. Hardly ever do we have the resources, money or time readily available. So what other options are out there? A quest to find some solution revealed a project by Microsoft Research called PEX. PEX is a framework which creates several test scenarios for each method or class you write, automatically. Think of it as your own personal auditor. Within a few clicks the framework will auto generate several unit tests for a given class or method and save them to a single project. PEX help to audit your work. It lends a fresh set of eyes to any project you’re working on and best of all; it is cost effective and fast. Check them out at http://research.microsoft.com/en-us/projects/pex/ In upcoming posts we’ll dive deeper into how it works and how it can help you.   Certainly there are more similar frameworks out there and I would love to hear from you. Please share your experiences and insights.

    Read the article

  • Oracle OpenWorld Preview: Oracle Social Network Technical Tour

    - by kellsey.ruppel
      Originally posted by Jake Kuramoto on The Apps Lab blog. Yesterday, I told you about the Oracle Social Network Developer Challenge we’ll be hosting at OpenWorld (@oracleopenworld) next week. If you’re attending OpenWorld or JavaOne (@javaoneconf) and want to get hands-on experience with Oracle Social Network and show off your coding chops, this is the event for you. Go ahead and register. I’ll wait. But wait, there’s more. If you’re not sure you’ll have the time for the Challenge, don’t want to embarrass anyone with your awesome skills, have a hectic schedule and can’t commit, or just want to learn more about Oracle Social Network and how to extend it, then the Oracle Social Network Technical Tour is for you. Read Jake's originally entry to learn more about The Tour!

    Read the article

  • Building vs. Buying a Master Data Management Solution

    - by david.butler(at)oracle.com
    Many organizations prefer to build their own MDM solutions. The argument is that they know their data quality issues and their data better than anyone. Plus a focused solution will cost less in the long run then a vendor supplied general purpose product. This is not unreasonable if you think of MDM as a point solution for a particular data quality problem. But this approach carries significant risk. We now know that organizations achieve significant competitive advantages when they deploy MDM as a strategic enterprise wide solution: with the most common best practice being to deploy a tactical MDM solution and grow it into a full information architecture. A build your own approach most certainly will not scale to a larger architecture unless it is done correctly with the larger solution in mind. It is possible to build a home grown point MDM solution in such a way that it will dovetail into broader MDM architectures. A very good place to start is to use the same basic technologies that Oracle uses to build its own MDM solutions. Start with the Oracle 11g database to create a flexible, extensible and open data model to hold the master data and all needed attributes. The Oracle database is the most flexible, highly available and scalable database system on the market. With its Real Application Clusters (RAC) it can even support the mixed OLTP and BI workloads that represent typical MDM data access profiles. Use Oracle Data Integration (ODI) for batch data movement between applications, MDM data stores, and the BI layer. Use Oracle Golden Gate for more real-time data movement. Use Oracle's SOA Suite for application integration with its: BPEL Process Manager to orchestrate MDM connections to business processes; Identity Management for managing users; WS Manager for managing web services; Business Intelligence Enterprise Edition for analytics; and JDeveloper for creating or extending the MDM management application. Oracle utilizes these technologies to build its MDM Hubs.  Customers who build their own MDM solution using these components will easily migrate to Oracle provided MDM solutions when the home grown solution runs out of gas. But, even with a full stack of open flexible MDM technologies, creating a robust MDM application can be a daunting task. For example, a basic MDM solution will need: a set of data access methods that support master data as a service as well as direct real time access as well as batch loads and extracts; a data migration service for initial loads and periodic updates; a metadata management capability for items such as business entity matrixed relationships and hierarchies; a source system management capability to fully cross-reference business objects and to satisfy seemingly conflicting data ownership requirements; a data quality function that can find and eliminate duplicate data while insuring correct data attribute survivorship; a set of data quality functions that can manage structured and unstructured data; a data quality interface to assist with preventing new errors from entering the system even when data entry is outside the MDM application itself; a continuing data cleansing function to keep the data up to date; an internal triggering mechanism to create and deploy change information to all connected systems; a comprehensive role based data security system to control and monitor data access, update rights, and maintain change history; a flexible business rules engine for managing master data processes such as privacy and data movement; a user interface to support casual users and data stewards; a business intelligence structure to support profiling, compliance, and business performance indicators; and an analytical foundation for directly analyzing master data. Oracle's pre-built MDM Hub solutions are full-featured 3-tier Internet applications designed to participate in the full Oracle technology stack or to run independently in other open IT SOA environments. Building MDM solutions from scratch can take years. Oracle's pre-built MDM solutions can bring quality data to the enterprise in a matter of months. But if you must build, at lease build with the world's best technology stack in a way that simplifies the eventual upgrade to Oracle MDM and to the full enterprise wide information architecture that it enables.

    Read the article

  • Oracle Big Data Software Downloads

    - by Mike.Hallett(at)Oracle-BI&EPM
    Companies have been making business decisions for decades based on transactional data stored in relational databases. Beyond that critical data, is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and photographs that can be mined for useful information. Oracle offers a broad integrated portfolio of products to help you acquire and organize these diverse data sources and analyze them alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data Connectors Downloads here, includes: Oracle SQL Connector for Hadoop Distributed File System Release 2.1.0 Oracle Loader for Hadoop Release 2.1.0 Oracle Data Integrator Companion 11g Oracle R Connector for Hadoop v 2.1 Oracle Big Data Documentation The Oracle Big Data solution offers an integrated portfolio of products to help you organize and analyze your diverse data sources alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data, Release 2.2.0 - E41604_01 zip (27.4 MB) Integrated Software and Big Data Connectors User's Guide HTML PDF Oracle Data Integrator (ODI) Application Adapter for Hadoop Apache Hadoop is designed to handle and process data that is typically from data sources that are non-relational and data volumes that are beyond what is handled by relational databases. Typical processing in Hadoop includes data validation and transformations that are programmed as MapReduce jobs. Designing and implementing a MapReduce job usually requires expert programming knowledge. However, when you use Oracle Data Integrator with the Application Adapter for Hadoop, you do not need to write MapReduce jobs. Oracle Data Integrator uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. Employing familiar and easy-to-use tools and pre-configured knowledge modules (KMs), the application adapter provides the following capabilities: Loading data into Hadoop from the local file system and HDFS Performing validation and transformation of data within Hadoop Loading processed data from Hadoop to an Oracle database for further processing and generating reports Oracle Database Loader for Hadoop Oracle Loader for Hadoop is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. It pre-partitions the data if necessary and transforms it into a database-ready format. Oracle Loader for Hadoop is a Java MapReduce application that balances the data across reducers to help maximize performance. Oracle R Connector for Hadoop Oracle R Connector for Hadoop is a collection of R packages that provide: Interfaces to work with Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle database tables Predictive analytic techniques, written in R or Java as Hadoop MapReduce jobs, that can be applied to data in HDFS files You install and load this package as you would any other R package. Using simple R functions, you can perform tasks such as: Access and transform HDFS data using a Hive-enabled transparency layer Use the R language for writing mappers and reducers Copy data between R memory, the local file system, HDFS, Hive, and Oracle databases Schedule R programs to execute as Hadoop MapReduce jobs and return the results to any of those locations Oracle SQL Connector for Hadoop Distributed File System Using Oracle SQL Connector for HDFS, you can use an Oracle Database to access and analyze data residing in Hadoop in these formats: Data Pump files in HDFS Delimited text files in HDFS Hive tables For other file formats, such as JSON files, you can stage the input in Hive tables before using Oracle SQL Connector for HDFS. Oracle SQL Connector for HDFS uses external tables to provide Oracle Database with read access to Hive tables, and to delimited text files and Data Pump files in HDFS. Related Documentation Cloudera's Distribution Including Apache Hadoop Library HTML Oracle R Enterprise HTML Oracle NoSQL Database HTML Recent Blog Posts Big Data Appliance vs. DIY Price Comparison Big Data: Architecture Overview Big Data: Achieve the Impossible in Real-Time Big Data: Vertical Behavioral Analytics Big Data: In-Memory MapReduce Flume and Hive for Log Analytics Building Workflows in Oozie

    Read the article

  • SQL SERVER – Planned and Unplanned Availablity Group Failovers – Notes from the Field #031

    - by Pinal Dave
    [Note from Pinal]: This is a new episode of Notes from the Fields series. AlwaysOn is a very complex subject and not everyone knows many things about this. The matter of the fact is there is very little information available on this subject online and not everyone knows everything about this. This is why when a very common question related to AlwaysOn comes, people get confused. In this episode of the Notes from the Field series database expert John Sterrett (Group Principal at Linchpin People) explains a very common issue DBAs and Developer faces in their career and is related to Planned and Unplanned Availablity Group Failovers. Linchpin People are database coaches and wellness experts for a data driven world. Read the experience of John in his own words. Whenever a disaster occurs it will be a stressful scenario regardless of how small or big the disaster is. This gets multiplied when it is your first time working with newer technology or the first time you are going through a disaster without a proper run book. Today, were going to help you establish a run book for creating a planned failover with availability groups. To make today’s session simple were going to have two instances of SQL Server 2012 included in an availability group and walk through the steps of doing an unplanned failover.  We will focus on using the user interface and T-SQL to complete the failovers. We are going to use a two replica Availability Group where each replica is in another location. Therefore, we will be covering Asynchronous (non automatic failover) the following is a breakdown of our availability group utilized today. Seeing the following screen might be scary the first time you come across an unplanned failover.  It looks like our test database used in this Availability Group is not functional and it currently isn’t. The database status is not synchronizing which makes sense because the primary replica went down so it couldn’t synchronize. With that said, we can still failover and make it functional while we troubleshoot why we lost our primary replica. To start we are going to right click on the availability group that needs to be restarted and select failover. This will bring up the following wizard, which will walk you through several steps needed to complete the failover using the graphical user interface provided with SQL Server Management Studio (SSMS). You are going to see warning messages simply because we are in Asynchronous commit mode and can not guarantee ‘no data loss’ when we do failover. Just incase you missed it; you get another screen warning you about potential data loss because we are in Asynchronous mode. Next we get to connect to the specific replica we want to become the primary replica after the failover occurs. In our case, we only have two replicas so this is trivial. In order to failover, it’s required to connect to the replica that will become primary.  The following screen shows that the connection has been made successfully. Next, you will see the final summary screen. Once again, this reminds you that the failover action will cause data loss as were using Asynchronous commit mode due to the distance between instances used for disaster recovery. Finally, once the failover is completed you will see the following screen. If you followed along this long you might be wondering what T-SQL scripts are generated for clicking through all the sections of the wizard. If you have used Database Mirroring in the past you might be surprised.  It’s not too different, which makes sense because the data is being replicated via SQL Server endpoints just like the good old database mirroring. Now were going to take a look at how to do a failover with just T-SQL. First, were going to need to open a new query window and run our query in SQLCMD mode. Just incase you haven’t used SQLCMD mode before we will show you how to enable it below. Now you can run the following statement. Notice, we connect to the replica we want to become primary after failover and specify to force failover to allow data loss. We can use the following script to failback over when our primary instance comes back online. -- YOU MUST EXECUTE THE FOLLOWING SCRIPT IN SQLCMD MODE. :Connect SQL2012PROD1 ALTER AVAILABILITY GROUP [AGSQL2] FORCE_FAILOVER_ALLOW_DATA_LOSS; GO Are your servers running at optimal speed or are you facing any SQL Server Performance Problems? If you want to get started with the help of experts read more over here: Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

< Previous Page | 101 102 103 104 105 106 107 108 109 110 111 112  | Next Page >