processing efficiency - Page 98

Which jar has JBox2d's p5 package

- by Brantley Blanchard

Using eclipse, I'm trying to write a simple hello world program in processing that simply draws a rectangle on the screen then has gravity drop it as seen in this Tutorial. The problem is that when I try to import the p5 package, it's not resolving so I can't declare my Physics object. I tried two things. Download the zip, unzip it, then import the 3 jars (library, serialization, & testbed) a. import org.jbox2d.p5.*; doesn't resolve but the others do b. Physics physics; doesn't resolve Download the older standalone testbed jar then import it a. Physics physics; doesn't resolve; Here is basically where I'm starting import org.jbox2d.util.nonconvex.*; import org.jbox2d.dynamics.contacts.*; import org.jbox2d.testbed.*; import org.jbox2d.collision.*; import org.jbox2d.common.*; import org.jbox2d.dynamics.joints.*; import org.jbox2d.p5.*; import org.jbox2d.dynamics.*; import processing.core.PApplet; public class MyFirstJBox2d extends PApplet { Physics physics; public void setup() { size(640,480); frameRate(60); initScene(); } public void draw() { background(0); if (keyPressed) { //Reset everything physics.destroy(); initScene(); } } public void initScene() { physics = new Physics(this, width, height); physics.setDensity(1.0f); physics.createRect(300,200,340,300); } }

Read the article

Conventions for search result scoring

- by DeaconDesperado

I assume this type of question is more on-topic here than on regular SO. I have been working on a search feature for my team's web application and have had a lot of success building a multithreaded, "divide and conquer" processing system to work through a large amount of fulltext. Our problem domain is pretty specific. Users of the app generate posts, and as a general rule, posts that are more recent are considered to be of greater relevance. Some of the data we are trying to extract from search is very specific (user's feelings about specific items or things) and we are using python nltk to do named-entity extraction to find interesting likely query terms. Essentially we look for descriptive adjective-noun pairs and generate a general picture of a user's expressed sentiment as a list of tokens. This search is intended as an internal tool for our team to draw out a local picture of sentiments like "soggy pizza." There's some machine learning in there too to do entity resolution on terms like "soggy" to all manner of adjectives expressing nastiness. My problem is I am at a loss for how to go about scoring these results. The text being searched is split up into tokens in a list, so my initial approach would be to normalize a float score between 0.0-1.0 generated off of how far into the list the terms appear and how often they are repeated (a later mention of the term being worth less, earlier more, greater frequency-greater score, etc.) A certain amount of weight could be given to the timestamp as well, though I am not certain how to calculate this. I am curious if anyone has had to solve a similar problem in a search relevance grading between appreciable metrics (frequency, term location/colocation, recency) and if there are and guidelines for how to weight each. I should mention as well that the final fallback procedure in the search is to pipe the query to Sphinx, which has its own scoring practices. Sphinx operates as the last resort in case our application specific processing can't find any eligible candidates.

Read the article

Wednesday at Oracle OpenWorld 2012 - Must See Session: “Event-Driven Patterns and Best Practices: Even More Important with Big Data”

- by Lionel Dubreuil

Don’t miss this “CON8636 - Event-Driven Patterns and Best Practices: Even More Important with Big Data“ session: Speakers: Faisal Nazir - Senior Solutions Architect, Motorola Shinichiro Takahashi - Senior Manager, Service Platform Department, NTT DOCOMO, INC. Robin Smith - Product Management/Strategy Director - Oracle Event Processing, Oracle Date: Wednesday, Oct 3 Time: 10:15 AM - 11:15 AM Location: Moscone South - 310 As the demand for big data analytics and integration grows across all industries, this session focuses on the role of the Oracle event-driven solution platform in delivering vital real-time integrated analysis intelligence to the data streams consumed and emitted from these large distributed data stores. Objectives for this session are to: Increase awareness of Oracle Event Processing, showcasing tight alignment with big data solutions Highlight emerging usage patterns in relation to streaming event data and distributed data stores Show a significant Oracle competitive advantage over IBM solutions advertised in this domain Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif";}

Read the article

I have just upgraded to 13.10 and i can not install any programs

- by jason malitz

I upgraded to Ubuntu 13.10 last night and i tried to install empathy chat client and this is what I see after the failed down load installArchives() failed: (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55% (Reading database ... 60% (Reading database ... 65% (Reading database ... 70% (Reading database ... 75% (Reading database ... 80% (Reading database ... 85% (Reading database ... 90% (Reading database ... 95% (Reading database ... 100% (Reading database ... 397719 files and directories currently installed.) Removing xserver-common-lts-raring ... Removing 'diversion of /usr/lib/xorg/protocol.txt to /usr/lib/xorg/protocol-precise.txt by xserver-common-lts-raring' dpkg-divert: error: rename involves overwriting `/usr/lib/xorg/protocol.txt' with different file `/usr/lib/xorg/protocol-precise.txt', not allowed dpkg: error processing xserver-common-lts-raring (--remove): subprocess installed post-removal script returned error exit status 2 No apport report written because MaxReports is reached already Errors were encountered while processing: xserver-common-lts-raring Error in function: So how do I fix this issue

Read the article

Software Architecture: How to divide work to a network of computers?

- by Morpork

Imagine a scenario as follows: Lets say you have a central computer which generates a lot of data. This data must go through some processing, which unfortunately takes longer than to generate. In order for the processing to catch up with real time, we plug in more slave computers. Further, we must take into account the possibility of slaves dropping out of the network mid-job as well as additional slaves being added. The central computer should ensure that all jobs are finished to its satisfaction, and that jobs dropped by a slave are retasked to another. The main question is: What approach should I use to achieve this? But perhaps the following would help me arrive at an answer: Is there a name or design pattern to what I am trying to do? What domain of knowledge do I need to achieve the goal of getting these computers to talk to each other? (eg. will a database, which I have some knowledge of, be enough or will this involve sockets, which I have yet to have knowledge of?) Are there any examples of such a system? The main question is a bit general so it would be good to have a starting point/reference point. Note I am assuming constraints of c++ and windows so solutions pointing in that direction would be appreciated.

Read the article

256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

- by Alan Smith

For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation delete the Azure deployment. The standing joke with the audience was that it was that it was a “$2 demo”, as the compute charges for running the 16 instances for an hour was $1.92, factor in the bandwidth charges and it’s a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing, you pay for what you use, and if you need massive compute power for a short period of time using Windows Azure can work out very cost effective. The “$2 demo” was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a “$2 demo” to a “$30 demo”. The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour. This article will take a run through how I achieved this. Ray Tracing Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90’s with companies like Pixar creating feature length computer animations, and also the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate if the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image. Pin-Board Toys Having very little artistic talent and a basic understanding of maths I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I’ve always liked the pin-board desktop toys that become popular in the 80’s and when I was working as a 3D animator back in the 90’s I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months. PolyRay Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours processing on a 486, produced a high quality ray-traced image. The following is an example of a basic PolyRay scene file. background Midnight_Blue static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005> define wooden texture { noise surface { ambient 0.2 diffuse 0.7 specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1 lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7 } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } } viewpoint { from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60 resolution 640, 480 aspect 1.6 image_format 0 } light <-10, 30, 20> light <-10, 30, -20> object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden } object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome } After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue } In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used. The position of the pin can be set based on the color of the pixel at the appropriate position in the image. When the Windows Azure logo is used to set the Z coordinate of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure Logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not be the same as the depth of the objects from the camera. In order to simulate the pin board accurately a series of frames from a depth camera could be used. Windows Kinect The Kenect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kenect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developers perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK, the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode to allow depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions. Creating a Depth Field Animation The depth field animation used to set the positions of the pin in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence JPEG files that will be used to animate the pins in the animation the Start and Stop buttons are used to start and stop the image recording. En example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin. Rendering a Test Frame In order to test the creation of frames and get an approximation of the time required to render each frame a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board. The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster that an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour. Windows Azure Worker Roles The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server. Number of Servers Cost 1 $500 16 $8,000 256 $128,000 As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor intensive compute tasks. With the scalability available in Windows Azure a job that takes 256 hours to complete could be perfumed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below. Number of Worker Roles Render Time Cost 1 256 hours $30.72 16 16 hours $30.72 256 1 hour $30.72 Using worker roles in Windows Azure provides the same cost for the 256 hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles. Creating a Render Farm in Windows Azure The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components: · On-Premise o Windows Kinect – Used combined with the Kinect Explorer to create a stream of depth images. o Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue. o Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process. o Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete. · Windows Azure o Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment. The architecture of each worker role is shown below. The worker role is configured to use local storage, which provides file storage on the worker role instance that can be use by the applications to render the image and transform the format of the image. The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" > <WorkerRole name="CloudRayWorkerRole" vmsize="Small"> <Imports> </Imports> <ConfigurationSettings> <Setting name="DataConnectionString" /> </ConfigurationSettings> <LocalResources> <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" /> </LocalResources> </WorkerRole> </ServiceDefinition> The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1. The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2. PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3. DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4. The JPG file is uploaded from local storage to the images blob container. 5. A message is placed on the images queue to indicate a new image is available for download. 6. The job message is deleted from the job queue. 7. The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() { // Set environment variables string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation); string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation); LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder"); string localStorageRootPath = rayStorage.RootPath; JobQueue jobQueue = new JobQueue("renderjobs"); JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs"); CloudRayBlob sceneBlob = new CloudRayBlob("scenes"); CloudRayBlob imageBlob = new CloudRayBlob("images"); RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource(); Frames = 0; while (true) { // Get the render job from the queue CloudQueueMessage jobMsg = jobQueue.Get(); if (jobMsg != null) { // Get the file details string sceneFile = jobMsg.AsString; string tgaFile = sceneFile.Replace(".pi", ".tga"); string jpgFile = sceneFile.Replace(".pi", ".jpg"); string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile); string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile); string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile); // Copy the scene file to local storage sceneBlob.DownloadFile(sceneFilePath); // Run the ray tracer. string polyrayArguments = string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath); Process polyRayProcess = new Process(); polyRayProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath); polyRayProcess.StartInfo.Arguments = polyrayArguments; polyRayProcess.Start(); polyRayProcess.WaitForExit(); // Convert the image string dtaArguments = string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath)); Process dtaProcess = new Process(); dtaProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath); dtaProcess.StartInfo.Arguments = dtaArguments; dtaProcess.Start(); dtaProcess.WaitForExit(); // Upload the image to blob storage imageBlob.UploadFile(jpgFilePath); // Add a download job. downloadQueue.Add(jpgFile); // Delete the render job message jobQueue.Delete(jobMsg); Frames++; } else { Thread.Sleep(1000); } // Log the worker role activity. roleLifecycleDataSource.Alive ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames); } } Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage. public class RoleLifecycle : TableServiceEntity { public string ServerName { get; set; } public string Status { get; set; } public DateTime StartTime { get; set; } public DateTime EndTime { get; set; } public long SecondsRunning { get; set; } public DateTime LastActiveTime { get; set; } public int Frames { get; set; } public string Comment { get; set; } public RoleLifecycle() { } public RoleLifecycle(string roleName) { PartitionKey = roleName; RowKey = Utils.GetAscendingRowKey(); Status = "Started"; StartTime = DateTime.UtcNow; LastActiveTime = StartTime; EndTime = StartTime; SecondsRunning = 0; Frames = 0; } } A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="16" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames. Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation. With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes. At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="256" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames. Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds. We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes. The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in distributing the load across the 256 worker role instances. The first 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each. Completed Animation I’ve uploaded the completed animation to YouTube, a low resolution preview is shown below. Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc Effective Use of Resources According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes CPU to render, this works out at 152 hours of compute time, rounded up to the nearest hour. As the usage for the worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been more optimal to start the application with maybe 200 worker roles, and utilized the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively. Grid Computing Scenarios Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective. · Windows Azure can provide massive compute power, on demand, in a matter of minutes. · The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution. · Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget. · No charges for inbound data transfer makes the uploading of large data sets to Windows Azure Storage services cost effective. (Transaction charges still apply.) Tips for using Windows Azure for Grid Computing Scenarios I found the implementation of a render farm using Windows Azure a fairly simple scenario to implement. I was impressed by ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes, in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure. · Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective. The design I have used in this scenario could easily scale to many thousands of worker role instances. · Windows Azure accounts are typically limited to 20 cores. If you need to use more than this, a call to support and a credit card check will be required. · Be aware of how the billing model works. You will be charged for worker role instances for the full clock our in which the instance is deployed. Schedule the workload to start just after the clock hour has started. · Monitor the utilization of the resources you are provisioning, ensure that you are not paying for worker roles that are idle. · If you are deploying third party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective. · Third party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a startup task and possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles. · Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored, if the scaling algorithms are not optimal it could get very expensive!

Read the article

Developer Dashboard in SharePoint 2010

- by jcortez

Introducing the Developer Dashboard As a SharePoint developer (or IT Professional), how many times have you had the pleasure of figuring out why a particular page on your site is taking too long to render? I'm sure one of the techniques you have employed in troubleshooting is the process of elimination - removing individual web parts from the page hoping to identify which web part is misbehaving. One of the new features of SharePoint 2010 is the Developer Dashboard. This dashboard provides tracing and performance information that can be useful when you are trying to troubleshoot pages that are loading too slow. The Developer Dashboard is turned off by default and I'll go over 3 different ways to display it. Here is a screenshot of what the Developer Dashboard looks like when displayed at the bottom of the page: You can see on the left side the different events that fired during the page processing pipeline and how long these events took. This is where you will see individual web parts being processed and how long it took to complete (obviously the kind of processing depends on what the web part does). On the right side you would see the different database calls issued through the SharePoint Object Model to process the page. You will notice that each of these database queries are actually a hyperlink and clicking on it displays a pop-up window that shows the actual SQL Query Text, the Call Stack that triggered the database call, and the IO statistics of that query. Enabling the Developer Dashboard Option 1: Managed Code The Developer Dashboard is a farm-wide setting and the code above won't work if it is used within a web part hosted on any non-Central Admin site. The SPDeveloperDashboardLevel enum has three possible values: On, Off, and OnDemand. Setting it to On will always display the Developer Dashboard at the bottom of the page. Setting it Off will hide the Developer Dashboard. Setting it to OnDemand will add an icon at the top right corner of the page (see screenshot below) where a Site Collection Admin can toggle the display of the Developer Dashboard for a particular site collection. In my opinion, OnDemand is the best setting when troubleshooting a page or during development since a Site Collection Admin can turn it on or off and for a particular site only. The first cool thing about this is that the Site Collection Admin that turned it on will be the only one to see the Developer Dashboard output. Everyday users won't see the Developer Dashboard output even if it was turned on by a Site Collection Admin. If you need more flexibility on who gets to see the Developer Dashboard output, you can set the SPDeveloperDashboardSettings.RequiredPermissions to control which group of users will have the permission to see the output. Option 2: Using stsadm Using stsadm, you can run the following command to configure the Developer Dashboard: STSADM –o setproperty –pn developer-dashboard –pv OnDemand To successfully execute this command, be sure you that are running as a Farm Admin. Option 3: Using PowerShell For all scripts in SharePoint 2010, I prefer writing them as PowerShell scripts. Though the stsadm command is less verbose, the PowerShell equivalent is pretty straightforward and uses the SharePoint Object Model: You can of course parameterized the value that gets assigned to the DisplayLevel property so you can turn it On, Off or OnDemand depending on the parameter. Events and the Developer Dashboard Now, don't assume that all the code inside your web part or page will show up in the Developer Dashboard complete with all the great troubleshooting information. Only a finite set of events are monitored by default (for a web part it will events in the base web part class). Let's say you have a click event that could take some time, for example a web service call. And you want to include troubleshooting information for this event in the Developer Dashboard. Enter SPMonitoredScope which is also a new feature in SharePoint 2010. In SharePoint 2010, everything is executed within a "Monitored Scope". And each scope has a set of "Monitors" that measures and counts calls and timings which appears in the Developer Dashboard. Below is an example on how to get your custom code to get included in the Developer Dashboard by wrapping it inside a new monitored scope: The code above would include your new scope "My long web service call" into the Developer Dashboard and would log the time it took to complete processing. In my opinion, wrapping your custom code in a SPMonitoredScope is a SharePoint development best practice since it provides you visibility and a better understanding on the performance of your components.

Read the article

BizTalk host throttling – Singleton pattern and High database size

- by S.E.R.

Originally posted on: http://geekswithblogs.net/SERivas/archive/2013/06/30/biztalk-host-throttling-ndash-singleton-pattern-and-high-database-size.aspxI have worked for some days around the singleton pattern (for those unfamiliar with it, read this post by Victor Fehlberg) and have come across a few very interesting posts, among which one dealt with performance issues (here, also by Victor Fehlberg). Simply put: if you have an orchestration which implements the singleton pattern, then performances will continuously decrease as the orchestration receives and consumes messages, and that behavior is more obvious when the orchestration never ends (ie : it keeps looping and never terminates or completes). As I experienced the same kind of problem (actually I was alerted by SCOM, which told me that the host was being throttled because of High database size), I thought it would be a good idea to dig a little bit a see what happens deep inside BizTalk and thus understand the reasons for this behavior. NOTE: in this article, I will focus on this High database size throttling condition. I will try and work on the other conditions in some not too distant future… Test conditions The singleton orchestration For the purpose of this study, I have created the following orchestration, which is a very basic implementation of a singleton that piles up incoming messages, then does something else when a certain timeout has been reached without receiving another message: Throttling settings I have two distinct hosts : one that hosts the receive port (basic FILE port) : Ports_ReceiveHostone that hosts the orchestration : ProcessingHost In order to emphasize the throttling mechanism, I have modified the throttling settings for each of these hosts are as follows (all other parameters are set to the default value): [Throttling thresholds] Message count in database: 500 (default value : 50000) Evolution of performance counters when submitting messages Since we are investigating the High database size throttling condition, here are the performance counter that we should take a look at (all of them are in the BizTalk:Message Agent performance object): Database sizeHigh database sizeMessage delivery throttling stateMessage publishing throttling stateMessage delivery delay (ms)Message publishing delay (ms)Message delivery throttling state durationMessage publishing throttling state duration (If you are not used to Perfmon, I strongly recommend that you start using it right now: it is a wonderful tool that allows you to open the hood and see what is going on inside BizTalk – and other systems) Database size It is quite obvious that we will start by watching the database size and high database size counters, just to see when the first reaches the configured threshold (500) and when the second rings the alarm. NOTE : During this test I submitted 600 messages, one message at a time every 10ms to see the evolution of the counters we have previously selected. It might not show very well on this screenshot, but here is what happened: From 15:46:50 to 15:47:50, the database size for the Ports_ReceiveHost host (blue line) kept growing until it reached a maximum of 504.At 15:47:50, the high database size alert fires At first I was surprised by this result: why is it the database size of the receiving host that keeps growing since it is the processing host that piles up messages? Actually, it makes total sense. This counter measures the size of the database queue that is being filled by the host, not consumed. Therefore, the high database size alert is raised on the host that fills the queue: Ports_ReceiveHost. More information is available on the Public MPWiki page. Now, looking at the Message publishing throttling state for the receiving host (green line), we can see that a throttling condition has been reached at 15:47:50: We can also see that the Message publishing delay(ms) (blue line) has begun growing slowly from this point. All of this explains why performances keep decreasing when a singleton keeps processing new messages: the database size grows and when it has exceeded the Message count in database threshold, the host is throttled and the publishing delay keeps increasing. Digging further So, what happens to the database queue then? Is it flushed some day or does it keep growing and growing indefinitely? The real question being: will the host be throttled forever because of this singleton? To answer this question, I set the Message count in database threshold to 20 (this value is very low in order not to wait for too long, otherwise I certainly would have fallen asleep in front of my screen) and I submitted 30 messages. The test was started at 18:26. At 18:56 (ie : exactly 30min later) the throttling was stopped and the database size was divided by 2. 30 min later again, the database size had dropped to almost zero: I guess I’ll have to find some documentation and do some more testing before I sort this out! My guess is that some maintenance job is at work here, though I cannot tell which one Digging even further If we take a look at the Message delivery throttling state counter for the processing host, we can see that this host was also throttled during the submission of the 600 documents: The value for the counter was 1, meaning that Message delivery incoming rate for the host instance exceeds the Message delivery outgoing rate * the specified Rate overdrive factor (percent) value. We will see this another day… :) A last word Let’s end this article with a warning: DO NOT CHANGE THE THROTTLING SETTINGS LIGHTLY! The temptation can be great to just bypass throttling by setting very high values for each parameter (or zero in some cases, which simply disables throttling). Nevertheless, always keep in mind that this mechanism is here for a very good reason: prevent your BizTalk infrastructure from exploding!! So whatever you do with those settings, do a lot of testing and benchmarking!

Read the article

ca-certificates-java fails when trying to install openjdk-6-jre

- by Jonas

I use a VPS with Ubuntu Server 10.10 x64. I want to use Java and run the command sudo apt-get install openjdk-6-jre but it fails because the installation encounted errors while processing ca-certificates-java. I have tried to install the failed package with: sudo apt-get install ca-certificates-java How can I solve this? I have run sudo apt-get update and sudo apt-get upgrade but I get the same errors after that. I have also installed Ubuntu Server x64 on a VirtualBox, but the two Ubuntu Server 10.10 has different kernel versions (2.6.35 on VirtualBox and 2.6.18 on my VPS). And on VirtualBox I can install Jetty without any problems. The VPS is a fresh install of Ubuntu Server 10.10 x64, the first command I was running was sudo apt-get install openjdk-6-jre. When I run sudo apt-get install ca-certificates-java I get this message: Reading package lists... Done Building dependency tree Reading state information... Done ca-certificates-java is already the newest version. 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. 1 not fully installed or removed. After this operation, 0B of additional disk space will be used. Do you want to continue [Y/n]? Here I press Y then I get this message: Setting up ca-certificates-java (20100412) ... creating /etc/ssl/certs/java/cacerts... error adding brasil.gov.br/brasil.gov.br.crt error adding cacert.org/cacert.org.crt error adding debconf.org/ca.crt error adding gouv.fr/cert_igca_dsa.crt error adding gouv.fr/cert_igca_rsa.crt error adding mozilla/ABAecom_=sub.__Am._Bankers_Assn.=_Root_CA.crt error adding mozilla/AOL_Time_Warner_Root_Certification_Authority_1.crt error adding mozilla/AOL_Time_Warner_Root_Certification_Authority_2.crt error adding mozilla/AddTrust_External_Root.crt error adding mozilla/AddTrust_Low-Value_Services_Root.crt error adding mozilla/AddTrust_Public_Services_Root.crt error adding mozilla/AddTrust_Qualified_Certificates_Root.crt error adding mozilla/America_Online_Root_Certification_Authority_1.crt error adding mozilla/America_Online_Root_Certification_Authority_2.crt error adding mozilla/Baltimore_CyberTrust_Root.crt error adding mozilla/COMODO_Certification_Authority.crt error adding mozilla/COMODO_ECC_Certification_Authority.crt error adding mozilla/Camerfirma_Chambers_of_Commerce_Root.crt error adding mozilla/Camerfirma_Global_Chambersign_Root.crt error adding mozilla/Certplus_Class_2_Primary_CA.crt error adding mozilla/Certum_Root_CA.crt error adding mozilla/Comodo_AAA_Services_root.crt error adding mozilla/Comodo_Secure_Services_root.crt error adding mozilla/Comodo_Trusted_Services_root.crt error adding mozilla/DST_ACES_CA_X6.crt error adding mozilla/DST_Root_CA_X3.crt error adding mozilla/DigiCert_Assured_ID_Root_CA.crt error adding mozilla/DigiCert_Global_Root_CA.crt error adding mozilla/DigiCert_High_Assurance_EV_Root_CA.crt error adding mozilla/DigiNotar_Root_CA.crt error adding mozilla/Digital_Signature_Trust_Co._Global_CA_1.crt error adding mozilla/Digital_Signature_Trust_Co._Global_CA_2.crt error adding mozilla/Digital_Signature_Trust_Co._Global_CA_3.crt error adding mozilla/Digital_Signature_Trust_Co._Global_CA_4.crt error adding mozilla/Entrust.net_Global_Secure_Personal_CA.crt error adding mozilla/Entrust.net_Global_Secure_Server_CA.crt error adding mozilla/Entrust.net_Premium_2048_Secure_Server_CA.crt error adding mozilla/Entrust.net_Secure_Personal_CA.crt error adding mozilla/Entrust.net_Secure_Server_CA.crt error adding mozilla/Entrust_Root_Certification_Authority.crt error adding mozilla/Equifax_Secure_CA.crt error adding mozilla/Equifax_Secure_Global_eBusiness_CA.crt error adding mozilla/Equifax_Secure_eBusiness_CA_1.crt error adding mozilla/Equifax_Secure_eBusiness_CA_2.crt error adding mozilla/Firmaprofesional_Root_CA.crt error adding mozilla/GTE_CyberTrust_Global_Root.crt error adding mozilla/GTE_CyberTrust_Root_CA.crt error adding mozilla/GeoTrust_Global_CA.crt error adding mozilla/GeoTrust_Global_CA_2.crt error adding mozilla/GeoTrust_Primary_Certification_Authority.crt error adding mozilla/GeoTrust_Universal_CA.crt error adding mozilla/GeoTrust_Universal_CA_2.crt error adding mozilla/GlobalSign_Root_CA.crt error adding mozilla/GlobalSign_Root_CA_-_R2.crt error adding mozilla/Go_Daddy_Class_2_CA.crt error adding mozilla/IPS_CLASE1_root.crt error adding mozilla/IPS_CLASE3_root.crt error adding mozilla/IPS_CLASEA1_root.crt error adding mozilla/IPS_CLASEA3_root.crt error adding mozilla/IPS_Chained_CAs_root.crt error adding mozilla/IPS_Servidores_root.crt error adding mozilla/IPS_Timestamping_root.crt error adding mozilla/NetLock_Business_=Class_B=_Root.crt error adding mozilla/NetLock_Express_=Class_C=_Root.crt error adding mozilla/NetLock_Notary_=Class_A=_Root.crt error adding mozilla/NetLock_Qualified_=Class_QA=_Root.crt error adding mozilla/Network_Solutions_Certificate_Authority.crt error adding mozilla/QuoVadis_Root_CA.crt error adding mozilla/QuoVadis_Root_CA_2.crt error adding mozilla/QuoVadis_Root_CA_3.crt error adding mozilla/RSA_Root_Certificate_1.crt error adding mozilla/RSA_Security_1024_v3.crt error adding mozilla/RSA_Security_2048_v3.crt error adding mozilla/SecureTrust_CA.crt error adding mozilla/Secure_Global_CA.crt error adding mozilla/Security_Communication_Root_CA.crt error adding mozilla/Sonera_Class_1_Root_CA.crt error adding mozilla/Sonera_Class_2_Root_CA.crt error adding mozilla/Staat_der_Nederlanden_Root_CA.crt error adding mozilla/Starfield_Class_2_CA.crt error adding mozilla/StartCom_Certification_Authority.crt error adding mozilla/StartCom_Ltd..crt error adding mozilla/SwissSign_Gold_CA_-_G2.crt error adding mozilla/SwissSign_Platinum_CA_-_G2.crt error adding mozilla/SwissSign_Silver_CA_-_G2.crt error adding mozilla/Swisscom_Root_CA_1.crt error adding mozilla/TC_TrustCenter__Germany__Class_2_CA.crt error adding mozilla/TC_TrustCenter__Germany__Class_3_CA.crt error adding mozilla/TDC_Internet_Root_CA.crt error adding mozilla/TDC_OCES_Root_CA.crt error adding mozilla/TURKTRUST_Certificate_Services_Provider_Root_1.crt error adding mozilla/TURKTRUST_Certificate_Services_Provider_Root_2.crt error adding mozilla/Taiwan_GRCA.crt error adding mozilla/Thawte_Personal_Basic_CA.crt error adding mozilla/Thawte_Personal_Freemail_CA.crt error adding mozilla/Thawte_Personal_Premium_CA.crt error adding mozilla/Thawte_Premium_Server_CA.crt error adding mozilla/Thawte_Server_CA.crt error adding mozilla/Thawte_Time_Stamping_CA.crt error adding mozilla/UTN-USER_First-Network_Applications.crt error adding mozilla/UTN_DATACorp_SGC_Root_CA.crt error adding mozilla/UTN_USERFirst_Email_Root_CA.crt error adding mozilla/UTN_USERFirst_Hardware_Root_CA.crt error adding mozilla/ValiCert_Class_1_VA.crt error adding mozilla/ValiCert_Class_2_VA.crt error adding mozilla/VeriSign_Class_3_Public_Primary_Certification_Authority_-_G5.crt error adding mozilla/Verisign_Class_1_Public_Primary_Certification_Authority.crt error adding mozilla/Verisign_Class_1_Public_Primary_Certification_Authority_-_G2.crt error adding mozilla/Verisign_Class_1_Public_Primary_Certification_Authority_-_G3.crt error adding mozilla/Verisign_Class_2_Public_Primary_Certification_Authority.crt error adding mozilla/Verisign_Class_2_Public_Primary_Certification_Authority_-_G2.crt error adding mozilla/Verisign_Class_2_Public_Primary_Certification_Authority_-_G3.crt error adding mozilla/Verisign_Class_3_Public_Primary_Certification_Authority.crt error adding mozilla/Verisign_Class_3_Public_Primary_Certification_Authority_-_G2.crt error adding mozilla/Verisign_Class_3_Public_Primary_Certification_Authority_-_G3.crt error adding mozilla/Verisign_Class_4_Public_Primary_Certification_Authority_-_G2.crt error adding mozilla/Verisign_Class_4_Public_Primary_Certification_Authority_-_G3.crt error adding mozilla/Verisign_RSA_Secure_Server_CA.crt error adding mozilla/Verisign_Time_Stamping_Authority_CA.crt error adding mozilla/Visa_International_Global_Root_2.crt error adding mozilla/Visa_eCommerce_Root.crt error adding mozilla/WellsSecure_Public_Root_Certificate_Authority.crt error adding mozilla/Wells_Fargo_Root_CA.crt error adding mozilla/XRamp_Global_CA_Root.crt error adding mozilla/beTRUSTed_Root_CA-Baltimore_Implementation.crt error adding mozilla/beTRUSTed_Root_CA.crt error adding mozilla/beTRUSTed_Root_CA_-_Entrust_Implementation.crt error adding mozilla/beTRUSTed_Root_CA_-_RSA_Implementation.crt error adding mozilla/thawte_Primary_Root_CA.crt error adding signet.pl/signet_ca1_pem.crt error adding signet.pl/signet_ca2_pem.crt error adding signet.pl/signet_ca3_pem.crt error adding signet.pl/signet_ocspklasa2_pem.crt error adding signet.pl/signet_ocspklasa3_pem.crt error adding signet.pl/signet_pca2_pem.crt error adding signet.pl/signet_pca3_pem.crt error adding signet.pl/signet_rootca_pem.crt error adding signet.pl/signet_tsa1_pem.crt error adding spi-inc.org/spi-ca-2003.crt error adding spi-inc.org/spi-cacert-2008.crt error adding telesec.de/deutsche-telekom-root-ca-2.crt failed (VM used: java-6-openjdk). dpkg: error processing ca-certificates-java (--configure): subprocess installed post-installation script returned error exit status 1 Errors were encountered while processing: ca-certificates-java E: Sub-process /usr/bin/dpkg returned an error code (1) Update I also get a problem when running java -version: Error occurred during initialization of VM Could not reserve enough space for object heap Could not create the Java virtual machine. My VPS had 128MB of Memory, I changed to 256MB but got the same problem. Then I changed to 512MB and got the same problem. I found a related post on a forum: Sub-process /usr/bin/dpkg returned an error code (1) And I tried: sudo apt-get clean sudo apt-get --reinstall install openjdk-6-jre sudo dpkg --configure -a But I got the same problem, even when I'm using 512MB of Memory. Any suggestions?

Read the article

Handling Trailing Delimiters in HL7 Messages

- by Thomas Canter

Applies to: BizTalk Server 2006 with the HL7 1.3 Accelerator Outline of the problem Trailing Delimiters are empty values at the end of an object in a HL7 ER7 formatted message. Examples: Empty Field NTE|P| NTE|P|| Empty component ORC|1|725^ Empty Subcomponent ORC|1|||||27& Empty repeat OBR|1||||||||027~ Trailing delimiters indicate the following object exists and is empty, which is quite different from null, null is an explicit value indicated by a pair of double quotes -> "". The BizTalk HL7 Accelerator by default does not allow trailing delimiters. There are three methods to allow trailing delimiters. NOTE: All Schemas always allow trailing delimiters in the MSH Segment Using party identifiers MSH3.1 – Receive/inbound processing, using this value as a party allows you to configure the system to allow inbound trailing delimiters. MSH5.1 – Send/outbound processing, using this value as a party allows you to configure the system to allow outbound trailing delimiters. Generally, if you allow inbound trailing delimiters, unless you are willing to programmatically remove all trailing delimiters, then you need to configure the send to allow trailing delimiters. Add the appropriate parties to the BizTalk Parties list from these two fields in your message stream. Open the BizTalk HL7 Configuration tool and for each party check the "Allow trailing delimiters (separators)" check box on the Validation tab. Disadvantage – Each MSH3.1 and MSH5.1 value must be represented in the parties list and configured. Advantage – granular control over system behavior for each inbound/outbound system. Using instance properties of a pipeline used in a send port or receive location. Open the BizTalk Server Administration console locate the send port or receive location that contains the BTAHL72XReceivePipeline or BTAHL72XSendPipeline pipeline. Open the properties To the right of the pipeline selected locate the […] ellipses button In the property list, locate the "TrailingDelimiterAllowed" property and set it to True. Advantage – All messages through a particular Send Port or Receive Location will allow trailing delimiters. Disadvantage – Must configure each Send Port or Receive Location. No granular control over which remote parties will send or receive messages with trailing delimiters. Using a custom pipeline that uses a pre-configured BTA HL7 Pipeline component. Use Visual Studio to construct a custom receive and send pipeline using the appropriate assembler or dissasembler. Set the component property to "TrailingDelimitersAllowed" to True Compile and deploy the custom pipeline Use the custom pipeline instead of the standard pipeline for all HL7 message processing Advantage – All messages using the custom pipeline will automatically allow trailing delimiters. Disadvantage – Requires custom coding and development to create and deploy the custom pipeline. No granular control over which remote parties will send or receive messages with trailing delimiters. What does a Trailing Delimiter do to the XML Schema? Allowing trailing delimiters does not have the impact often expected in the actual XML Schema.The Schema reproduces the message with no data loss.Thus, the message when represented in XML must contain the extra fields, in order to reproduce the outbound message.Thus, a trialing delimiter results in an empty XML field.Trailing Delmiters are not stripped from the inbound message. Example:<PID_21>44172</PID_21><PID_21>9257</PID_21> -> the original maximum number of repeats<PID_21></PID_21> -> The empty repeated field Allowing trailing delimiters not remove the trailing delimiters from the message, it simply suppresses the check that will cause the message to fail parse with trailing delimiters. When can you not fix the problem by enabling trailing delimiters Each object in a message must have a location in the target BTAHL7 schema for its content to reside.If you have more objects in the message than are contained at that location, then enabling trailing delimiters will not resolve the problem. The schema must be extended to accommodate the empty message content.Examples: Extra Field NTE|P||||Only 4 fields in NTE Segment, the 4th field exists, but is empty. Extra component PID|1|1523|47^^^^^^^Only 5 components in a CX data type, the 5th component exists, but is empty Extra subcomponent ORC|1|||||27&&Only 2 subcomponents in a CQ data type, the 3rd subcomponent is empty, but exists. Extra Repeat PID|1||||||||||||||||||||4419~5217~Only 2 repeats allowed for the field "Mother's identifier", the repeat is empty, but exists. In each of these cases, you must locate the failing object and extend the type to allow an additional object of that type. FieldAdd a field of ST to the end of the segment with a suitable name in the segments_nnn.xsd Component Create a new Custom CX data type (i.e. CX_XtraComp) in the datatypes_nnn.xsd and add a new component to the custom CX data type. Update the field in the segments_nnn.xsd file to use the custom data type instead of the standard datatype. Subcomponent Create a new Custom CQ data type that accepts an additional TS value at the end of the data type. Create a custom TQ data type that uses the new custom CQ data type as the first subcomponent. Modify the ORC segment to use the new CQ data type at ORC.7 instead of the standard CQ data type. RepeatModify the Field definition for PID.21 in the segments_nnn.xsd to allow more repeats in the field.

Read the article

Kernel, dpkg, sudo and apt-get corrupted

- by TECH4JESUS

Here are some errors that I am getting: 1) A proper configuration for Firestarter was not found. If you are running Firestarter from the directory you built it in, run make install-data-local to install a configuration, or simply make install to install the whole program. Firestarter will now close. root@p:/# firestarter ** (firestarter:5890): WARNING **: The connection is closed (firestarter:5890): GnomeUI-WARNING **: While connecting to session manager: None of the authentication protocols specified are supported. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. (firestarter:5890): GConf-WARNING **: Client failed to connect to the D-BUS daemon: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. ^C 2) Also I cannot apt-get install sudo root@p:/# apt-get install sudo Reading package lists... Done Building dependency tree Reading state information... Done sudo is already the newest version. The following packages were automatically installed and are no longer required: gir1.2-rb-3.0 gir1.2-gstreamer-0.10 libntfs10 python-mako libdmapsharing-3.0-2 rhythmbox-data libx264-116 rhythmbox libiso9660-7 librhythmbox-core5 libvpx0 libmatroska4 gir1.2-gst-plugins-base-0.10 rhythmbox-mozilla rhythmbox-plugin-zeitgeist libattica0 libgpac0.4.5 python-markupsafe libmusicbrainz4c2a rhythmbox-plugin-cdrecorder rhythmbox-plugins libaudiofile0 Use 'apt-get autoremove' to remove them. 0 upgraded, 0 newly installed, 0 to remove and 18 not upgraded. 9 not fully installed or removed. Need to get 0 B/76.3 MB of archives. After this operation, 0 B of additional disk space will be used. Do you want to continue [Y/n]? Y /bin/sh: 1: /usr/sbin/dpkg-preconfigure: not found (Reading database ... 495741 files and directories currently installed.) Preparing to replace linux-image-3.2.0-24-generic 3.2.0-24.39 (using .../linux-image-3.2.0-24-generic_3.2.0-24.39_amd64.deb) ... dpkg (subprocess): unable to execute old pre-removal script (/var/lib/dpkg/info/linux-image-3.2.0-24-generic.prerm): No such file or directory dpkg: warning: subprocess old pre-removal script returned error exit status 2 dpkg - trying script from the new package instead ... dpkg (subprocess): unable to execute new pre-removal script (/var/lib/dpkg/tmp.ci/prerm): No such file or directory dpkg: error processing /var/cache/apt/archives/linux-image-3.2.0-24-generic_3.2.0-24.39_amd64.deb (--unpack): subprocess new pre-removal script returned error exit status 2 dpkg (subprocess): unable to execute installed post-installation script (/var/lib/dpkg/info/linux-image-3.2.0-24-generic.postinst): No such file or directory dpkg: error while cleaning up: subprocess installed post-installation script returned error exit status 2 Preparing to replace linux-image-3.2.0-25-generic 3.2.0-25.40 (using .../linux-image-3.2.0-25-generic_3.2.0-25.40_amd64.deb) ... dpkg (subprocess): unable to execute old pre-removal script (/var/lib/dpkg/info/linux-image-3.2.0-25-generic.prerm): No such file or directory dpkg: warning: subprocess old pre-removal script returned error exit status 2 dpkg - trying script from the new package instead ... dpkg (subprocess): unable to execute new pre-removal script (/var/lib/dpkg/tmp.ci/prerm): No such file or directory dpkg: error processing /var/cache/apt/archives/linux-image-3.2.0-25-generic_3.2.0-25.40_amd64.deb (--unpack): subprocess new pre-removal script returned error exit status 2 dpkg (subprocess): unable to execute installed post-installation script (/var/lib/dpkg/info/linux-image-3.2.0-25-generic.postinst): No such file or directory dpkg: error while cleaning up: subprocess installed post-installation script returned error exit status 2 Errors were encountered while processing: /var/cache/apt/archives/linux-image-3.2.0-24-generic_3.2.0-24.39_amd64.deb /var/cache/apt/archives/linux-image-3.2.0-25-generic_3.2.0-25.40_amd64.deb E: Sub-process /usr/bin/dpkg returned an error code (1)

Read the article

Recovering from apt-get upgrade gone wrong due to a full disk

- by Peter

I was performing an apt-get upgrade on an Ubuntu 12.04.5 LTS box that hadn't been updated in a little while and the upgrade failed due to 'No space left on device'. After a little while I worked out space meant inodes and I have freed some up but unfortunately things have been left something askew. I have tried manually installing the old versions of packages mentioned using dpkg -i but that doesn't help. I have tried apt-get upgrade and apt-get -f install to no avail. Results are below. Any ideas how to fix things up? FIXED: Installing the earlier versions again manually via dpkg -i and then apt-get -f install has done the trick. Not sure why this didn't work the first time. The packages in question are listed below but they will presumably vary. libssl1.0.0_1.0.1-4ubuntu5.14_i386.deb linux-headers-3.2.0-64-generic-pae_3.2.0-64.97_i386.deb linux-image-generic-pae_3.2.0.64.76_i386.deb linux-headers-3.2.0-64_3.2.0-64.97_all.deb linux-headers-generic-pae_3.2.0.64.76_i386.deb root@unlinked:/tmp# apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done You might want to run ‘apt-get -f install’ to correct these. The following packages have unmet dependencies. libssl-dev : Depends: libssl1.0.0 (= 1.0.1-4ubuntu5.14) but 1.0.1-4ubuntu5.17 is installed linux-generic-pae : Depends: linux-image-generic-pae (= 3.2.0.64.76) but 3.2.0.67.79 is installed Depends: linux-headers-generic-pae (= 3.2.0.64.76) but 3.2.0.67.79 is installed E: Unmet dependencies. Try using -f. root@unlinked:/tmp# apt-get -f install Reading package lists... Done Building dependency tree Reading state information... Done Correcting dependencies... Done The following packages were automatically installed and are no longer required: linux-headers-3.2.0-43-generic-pae linux-headers-3.2.0-38-generic-pae linux-headers-3.2.0-41-generic-pae linux-headers-3.2.0-36-generic-pae linux-headers-3.2.0-63-generic-pae linux-headers-3.2.0-58-generic-pae linux-headers-3.2.0-60-generic-pae linux-headers-3.2.0-55-generic-pae linux-headers-3.2.0-40 linux-headers-3.2.0-41 linux-headers-3.2.0-36 linux-headers-3.2.0-37 linux-headers-3.2.0-43 linux-headers-3.2.0-38 linux-headers-3.2.0-44 linux-headers-3.2.0-39 linux-headers-3.2.0-45 linux-headers-3.2.0-51 linux-headers-3.2.0-52 linux-headers-3.2.0-53 linux-headers-3.2.0-48 linux-headers-3.2.0-54 linux-headers-3.2.0-60 linux-headers-3.2.0-55 linux-headers-3.2.0-61 linux-headers-3.2.0-56 linux-headers-3.2.0-57 linux-headers-3.2.0-63 linux-headers-3.2.0-58 linux-headers-3.2.0-59 linux-headers-3.2.0-52-generic-pae linux-headers-3.2.0-44-generic-pae linux-headers-3.2.0-39-generic-pae linux-headers-3.2.0-37-generic-pae linux-headers-3.2.0-59-generic-pae linux-headers-3.2.0-61-generic-pae linux-headers-3.2.0-56-generic-pae linux-headers-3.2.0-53-generic-pae linux-headers-3.2.0-48-generic-pae linux-headers-3.2.0-45-generic-pae linux-headers-3.2.0-40-generic-pae linux-headers-3.2.0-57-generic-pae linux-headers-3.2.0-54-generic-pae linux-headers-3.2.0-51-generic-pae Use 'apt-get autoremove' to remove them. The following extra packages will be installed: libssl-dev linux-generic-pae The following packages will be upgraded: libssl-dev linux-generic-pae 2 to upgrade, 0 to newly install, 0 to remove and 0 not to upgrade. 2 not fully installed or removed. Need to get 0 B/1,427 kB of archives. After this operation, 1,024 B of additional disk space will be used. Do you want to continue [Y/n]? y dpkg: dependency problems prevent configuration of libssl-dev: libssl-dev depends on libssl1.0.0 (= 1.0.1-4ubuntu5.14); however: Version of libssl1.0.0 on system is 1.0.1-4ubuntu5.17. dpkg: error processing libssl-dev (--configure): dependency problems - leaving unconfigured No apport report written because the error message indicates it's a follow-up error from a previous failure. dpkg: dependency problems prevent configuration of linux-generic-pae: linux-generic-pae depends on linux-image-generic-pae (= 3.2.0.64.76); however: Version of linux-image-generic-pae on system is 3.2.0.67.79. linux-generic-pae depends on linux-headers-generic-pae (= 3.2.0.64.76); however: Version of linux-headers-generic-pae on system is 3.2.0.67.79. dpkg: error processing linux-generic-pae (--configure): dependency problems - leaving unconfigured No apport report written because the error message indicates it's a follow-up error from a previous failure. Errors were encountered while processing: libssl-dev linux-generic-pae E: Sub-process /usr/bin/dpkg returned an error code (1)

Read the article

Mixed Emotions: Humans React to Natural Language Computer

- by Applications User Experience

There was a big event in Silicon Valley on Tuesday, November 15. Watson, the natural language computer developed at IBM Watson Research Center in Yorktown Heights, New York, and its inventor and principal research investigator, David Ferrucci, were guests at the Computer History Museum in Mountain View, California for another round of the television game Jeopardy. You may have read about or watched on YouTube how Watson beat Ken Jennings and Brad Rutter, two top Jeopardy competitors, last February. This time, Watson swept the floor with two Silicon Valley high-achievers, one a venture capitalist with a background in math, computer engineering, and physics, and the other a technology and finance writer well-versed in all aspects of culture and humanities. Watson is the product of the DeepQA research project, which attempts to create an artificially intelligent computing system through advances in natural language processing (NLP), among other technologies. NLP is a computing strategy that seeks to provide answers by processing large amounts of unstructured data contained in multiple large domains of human knowledge. There are several ways to perform NLP, but one way to start is by recognizing key words, then processing contextual cues associated with the keyword concepts so that you get many more “smart” (that is, human-like) deductions, rather than a series of “dumb” matches. Jeopardy questions often require more than key word matching to get the correct answer; typically several pieces of information put together, often from vastly different categories, to come up with a satisfactory word string solution that can be rephrased as a question. Smarter than your average search engine, but is it as smart as a human? Watson was especially fast at descrambling mixed-up state capital names, and recalling and pairing movie titles where one started and the other ended in the same word (e.g., Billion Dollar Baby Boom, where both titles used the word Baby). David said they had basically removed the variable of how fast Watson hit the buzzer compared to human contestants, but frustration frequently appeared on the faces of the contestants beaten to the punch by Watson. David explained that top Jeopardy winners like Jennings achieved their success with a similar strategy, timing their buzz to the end of the reading of the clue, and “running the board”, being first to respond on about 60% of the clues. Similar results for Watson. It made sense that Watson would be good at the technical and scientific stuff, so I figured the venture capitalist was toast. But I thought for sure Watson would lose to the writer in categories such as pop culture, wines and foods, and other humanities. Surprisingly, it held its own. I was amazed it could recognize a word definition of a syllogism in the category of philosophy. So what was the audience reaction to all of this? We started out expecting our formidable human contestants to easily run some of their categories; however, they started off on the wrong foot with the state capitals which Watson could unscramble so efficiently. By the end of the first round, contestants and the audience were feeling a little bit, well, …. deflated. Watson was winning by about $13,000, and the humans had gone into negative dollars. The IBM host said he was going to “slow Watson down a bit,” and the humans came back with respectable scores in Double Jeopardy. This was partially thanks to a very sympathetic audience (and host, also a human) providing “group-think” on many questions, especially baseball ‘s most valuable players, which by the way, couldn’t have been hard because even I knew them. Yes, that’s right, the humans cheated. Since Watson could speak but not hear us (it didn’t have speech recognition capability), it was probably unaware of this. In Final Jeopardy, the single question had to do with law. I was sure Watson would blow this one, but all contestants were able to answer correctly about a copyright law. In a career devoted to making computers more helpful to people, I think I may have seen how a computer can do too much. I’m not sure I’d want to work side-by-side with a Watson doing my job. Certainly listening and empathy are important traits we humans still have over Watson. While there was great enthusiasm in the packed room of computer scientists and their friends for this standing-room-only show, I think it made several of us uneasy (especially the poor human contestants whose egos were soundly bashed in the first round). This computer system, by the way , only took 4 years to program. David Ferrucci mentioned several practical uses for Watson, including medical diagnoses and legal strategies. Are you “the expert” in your job? Imagine NLP computing on an Oracle database. This may be the user interface of the future to enable users to better process big data. How do you think you’d like it? Postscript: There were three little boys sitting in front of me in the very first row. They looked, how shall I say it, … unimpressed!

Read the article

SPARC T4-4 Delivers World Record First Result on PeopleSoft Combined Benchmark

- by Brian

Oracle's SPARC T4-4 servers running Oracle's PeopleSoft HCM 9.1 combined online and batch benchmark achieved World Record 18,000 concurrent users while executing a PeopleSoft Payroll batch job of 500,000 employees in 43.32 minutes and maintaining online users response time at < 2 seconds. This world record is the first to run online and batch workloads concurrently. This result was obtained with a SPARC T4-4 server running Oracle Database 11g Release 2, a SPARC T4-4 server running PeopleSoft HCM 9.1 application server and a SPARC T4-2 server running Oracle WebLogic Server in the web tier. The SPARC T4-4 server running the application tier used Oracle Solaris Zones which provide a flexible, scalable and manageable virtualization environment. The average CPU utilization on the SPARC T4-2 server in the web tier was 17%, on the SPARC T4-4 server in the application tier it was 59%, and on the SPARC T4-4 server in the database tier was 35% (online and batch) leaving significant headroom for additional processing across the three tiers. The SPARC T4-4 server used for the database tier hosted Oracle Database 11g Release 2 using Oracle Automatic Storage Management (ASM) for database files management with I/O performance equivalent to raw devices. This is the first three tier mixed workload (online and batch) PeopleSoft benchmark also processing PeopleSoft payroll batch workload. Performance Landscape PeopleSoft HR Self-Service and Payroll Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-2 (db) 18,000 0.944 0.503 43.32 64 Configuration Summary Application Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 512 GB memory 5 x 300 GB SAS internal disks 1 x 100 GB and 2 x 300 GB internal SSDs 2 x 10 Gbe HBA Oracle Solaris 11 11/11 PeopleTools 8.52 PeopleSoft HCM 9.1 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Java Platform, Standard Edition Development Kit 6 Update 32 Database Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 256 GB memory 3 x 300 GB SAS internal disks Oracle Solaris 11 11/11 Oracle Database 11g Release 2 Web Tier Configuration: 1 x SPARC T4-2 server with 2 x SPARC T4 processors, 2.85 GHz 256 GB memory 2 x 300 GB SAS internal disks 1 x 100 GB internal SSD Oracle Solaris 11 11/11 PeopleTools 8.52 Oracle WebLogic Server 10.3.4 Java Platform, Standard Edition Development Kit 6 Update 32 Storage Configuration: 1 x Sun Server X2-4 as a COMSTAR head for data 4 x Intel Xeon X7550, 2.0 GHz 128 GB memory 1 x Sun Storage F5100 Flash Array (80 flash modules) 1 x Sun Storage F5100 Flash Array (40 flash modules) 1 x Sun Fire X4275 as a COMSTAR head for redo logs 12 x 2 TB SAS disks with Niwot Raid controller Benchmark Description This benchmark combines PeopleSoft HCM 9.1 HR Self Service online and PeopleSoft Payroll batch workloads to run on a unified database deployed on Oracle Database 11g Release 2. The PeopleSoft HRSS benchmark kit is a Oracle standard benchmark kit run by all platform vendors to measure the performance. It's an OLTP benchmark where DB SQLs are moderately complex. The results are certified by Oracle and a white paper is published. PeopleSoft HR SS defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consist of 14 scenarios which emulate users performing typical HCM transactions such as viewing paycheck, promoting and hiring employees, updating employee profile and other typical HCM application transactions. All these transactions are well-defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. This benchmark metric is the weighted average response search/save time for all the transactions. The PeopleSoft 9.1 Payroll (North America) benchmark demonstrates system performance for a range of processing volumes in a specific configuration. This workload represents large batch runs typical of a ERP environment during a mass update. The benchmark measures five application business process run times for a database representing large organization. They are Paysheet Creation, Payroll Calculation, Payroll Confirmation, Print Advice forms, and Create Direct Deposit File. The benchmark metric is the cumulative elapsed time taken to complete the Paysheet Creation, Payroll Calculation and Payroll Confirmation business application processes. The benchmark metrics are taken for each respective benchmark while running simultaneously on the same database back-end. Specifically, the payroll batch processes are started when the online workload reaches steady state (the maximum number of online users) and overlap with online transactions for the duration of the steady state. Key Points and Best Practices Two Oracle PeopleSoft Domain sets with 200 application servers each on a SPARC T4-4 server were hosted in 2 separate Oracle Solaris Zones to demonstrate consolidation of multiple application servers, ease of administration and performance tuning. Each Oracle Solaris Zone was bound to a separate processor set, each containing 15 cores (total 120 threads). The default set (1 core from first and third processor socket, total 16 threads) was used for network and disk interrupt handling. This was done to improve performance by reducing memory access latency by using the physical memory closest to the processors and offload I/O interrupt handling to default set threads, freeing up cpu resources for Application Servers threads and balancing application workload across 240 threads. See Also Oracle PeopleSoft Benchmark White Papers oracle.com SPARC T4-2 Server oracle.com OTN SPARC T4-4 Server oracle.com OTN PeopleSoft Enterprise Human Capital Management oracle.com OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Disclosure Statement Oracle's PeopleSoft HR and Payroll combined benchmark, www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 09/30/2012.

Read the article

Trying to install Team viewer on Ubuntu 12.04

- by Teknikk

I recently got Ubuntu installed on my server, I wanted to install TeamViewer so i could easy manage the virtual machines, However, I get errors when installing it from App store?, And I also get errors, but more detailed on the terminal. Error output: tek@tek-G53SW:~/Download$ sudo dpkg -i ipts teamviewer_linux_x64.deb dpkg: error processing ipts (--install): cannot access archive: No such file or directory (Reading database ... 142115 files and directories currently installed.) Preparing to replace teamviewer7 7.0.9360 (using teamviewer_linux_x64.deb) ... Unpacking replacement teamviewer7 ... dpkg: dependency problems prevent configuration of teamviewer7: teamviewer7 depends on libc6-i386 (>= 2.7); however: Package libc6-i386 is not installed. teamviewer7 depends on lib32asound2; however: Package lib32asound2 is not installed. teamviewer7 depends on lib32z1; however: Package lib32z1 is not installed. teamviewer7 depends on ia32-libs; however: Package ia32-libs is not installed. dpkg: error processing teamviewer7 (--install): dependency problems - leaving unconfigured Errors were encountered while processing: ipts teamviewer7 I tried to install it manually, but with no luck, I heard some others has this problems. I am running Ubuntu 12.04 x64. Error @ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs : tek@tek-G53SW:~/Download$ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs Reading package lists... Done Building dependency tree Reading state information... Done You might want to run 'apt-get -f install' to correct these: The following packages have unmet dependencies: ia32-libs : Depends: ia32-libs-multiarch E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution). tek@tek-G53SW:~/Download$ More errors tek@tek-G53SW:~/Download$ sudo apt-get -f install [sudo] password for tek: Reading package lists... Done Building dependency tree Reading state information... Done Correcting dependencies... Done The following packages will be REMOVED: teamviewer7 0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded. 1 not fully installed or removed. After this operation, 81.9 MB disk space will be freed. Do you want to continue [Y/n]? y (Reading database ... 142441 files and directories currently installed.) Removing teamviewer7 ... tek@tek-G53SW:~/Download$ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs Reading package lists... Done Building dependency tree Reading state information... Done lib32z1 is already the newest version. libc6-i386 is already the newest version. lib32asound2 is already the newest version. Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: ia32-libs : Depends: ia32-libs-multiarch E: Unable to correct problems, you have held broken packages. tek@tek-G53SW:~/Download$ sudo apt-get install ia32-libs-multiarch Reading package lists... Done Building dependency tree Reading state information... Done Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: ia32-libs-multiarch:i386 : Depends: gstreamer0.10-plugins-good:i386 but it is not going to be installed Depends: gtk2-engines:i386 but it is not going to be installed Depends: gtk2-engines-murrine:i386 but it is not going to be installed Depends: gtk2-engines-pixbuf:i386 but it is not going to be installed Depends: gtk2-engines-oxygen:i386 but it is not going to be installed Depends: ibus-gtk:i386 but it is not going to be installed Depends: libcanberra-gtk-module:i386 but it is not going to be installed Depends: libcups2:i386 but it is not going to be installed Depends: libcupsimage2:i386 but it is not going to be installed Depends: libfontconfig1:i386 but it is not going to be installed Depends: libgail-common:i386 but it is not going to be installed Depends: libgphoto2-2:i386 but it is not going to be installed Depends: libgtk2.0-0:i386 but it is not going to be installed Depends: libnss3:i386 but it is not going to be installed Depends: libqt4-opengl:i386 but it is not going to be installed Depends: libqt4-qt3support:i386 but it is not going to be installed Depends: libqt4-scripttools:i386 but it is not going to be installed Depends: libqt4-svg:i386 but it is not going to be installed Depends: libqtgui4:i386 but it is not going to be installed Depends: libqtwebkit4:i386 but it is not going to be installed Depends: librsvg2-common:i386 but it is not going to be installed Depends: libsane:i386 but it is not going to be installed E: Unable to correct problems, you have held broken packages. tek@tek-G53SW:~/Download$

Read the article

Common usecases and techniques when integrating a 3rd party application with Oracle Sales Cloud

- by asantaga

Over the last year or so I've see a lot of partners migrating and integrate their applications with Oracle Sales Cloud. Interestingly I'd say 60% of the partners use the same set of design patterns over and over again. Most of the time I see that they want to embed their application into Oracle Sales Cloud, within a tab usually, perhaps click on a link to their application (passing some piece of data + credentials) and then within their application update sales cloud again using webservices. Here are some examples of the different use-cases I've seen , and how partners are embedding their applications into Sales Cloud, NB : The following examples use the "Desktop" User Interface rather than the Newer "Simplified User Interface", I'll update the sample application soon but the integration patterns are precisely the same Use Case 1 : Navigator "Link out" to third party application This is an example of where the developer has added a link to the global navigator and this links out to the 3rd Party Application. Typically one doesn't pass any contextual data with the exception of perhaps user credentials, or better still JWT Token. Techniques Used Adding Link to Menu Item Using JWT Token in Sales Cloud Use Case 2 : Application Embedded within the Sales Cloud Dashboard Within the Oracle Sales Cloud application there is a tab called "Sales", within this tab its possible to embed a SubTab and embed a iFrame pointing to your application. To do this the developer simply needs to edit the page in customization mode, add the tab and then add the iFrame, simples! The developer can pass credentials/JWT Token and some other pieces of data but not object data (ie the current OpportunityID etc) Techniques Used Adding a page to the dashboard Using JWT Token in Sales Cloud Use Case 3 : Embedding a Tab and Context Linking out from a Sales Cloud object to the 3rd party application In this usecase the developer embeds two components into Oracle Sales Cloud. The first is a SubTab showing summary data to the user (a quote in our case) and then secondly a hyperlink, (although it could be a button) which when clicked navigates the user to the 3rd party application. In this case the developer almost always passes context specific data (i.e. the opportunityId) and a security token (username password combo or JWT Token). The third party application usually takes the data, perhaps queries more data using the Sales Cloud SOAP/WebService interface and then displays the resulting mashup to the user for further processing. When the user has finished their work in the 3rd party application they normally navigate back to Oracle Sales Cloud using what's called a "DeepLink", ie taking them back to the object [opportunity in our case] they came from. This image visually shows a "Happy Path" a user may follow, and combines linking out to an application , webservice calls and deep linking back to Sales Cloud. Techniques Used Extending a SalesCloud application with a custom button Using JWT Token in Sales Cloud Extending Oracle Sales Cloud [Opportnity] with a custom tab exposing External Content Retrieving Data from Oracle Sales cloud using WebServices Coding some groovy script to generate the URLs required (Doc 1571200.1 on MyOracle Support) DeepLinking to specific Oracle Sales Cloud Pages (Doc 1516151.1 on My Oracle Support) Use-Case 4 : Server Side processing/synchronization This usecase focuses on the Server Side processing of data, in this case synchronizing data. Here the 3rd party application is running on a "timer", e.g. cron or similar, and when triggered it queries data from Oracle Sales Cloud, then it queries data from the 3rd party application, determines the deltas and then inserts the data where required. Specifically here we are calling Oracle Sales Cloud using SOAP/WebServices and the 3rd party application is being communicated to using the REST API, for Oracle Sales Cloud one would use standard JAX-WS WebService calls and for REST one would use the JAX-RS api and perhap the Jackson api for managing JSON objects.. This is a very common use case and one which specifically lends itself to using the Oracle Java Cloud Service as the ideal application server where to host the mediator between the two applications. Techniques Used Using JWT Token in Sales Cloud Integrating with the Oracle Java Cloud Service Retrieving Data from Oracle Sales cloud using WebServices General Resources The above is just a small set of techniques and use-cases which are used today. There are plenty of other sources of documentation and resources available on the internet but to get you started here are a few of my favourite places Sales Cloud General Documentation Sales Cloud Customize Tab is useful for general customization of Sales Cloud Sales Cloud Integration Tab focuses on the 3rd party integration techniques Official Oracle Fusion Developer Relations Blog Official Oracle Fusion Developer Relations YouTube Channel Enjoy integrating!

Read the article

Imaging: Paper Paper Everywhere, but None Should be in Sight

- by Kellsey Ruppel

Author: Vikrant Korde, Technical Architect, Aurionpro's Oracle Implementation Services team My wedding photos are stored in several empty shoeboxes. Yes...I got married before digital photography was mainstream...which means I'm old. But my parents are really old. They have shoeboxes filled with vacation photos on slides (I doubt many of you have even seen a home slide projector...and I hope you never do!). Neither me nor my parents should have shoeboxes filled with any form of photographs whatsoever. They should obviously live in the digital world...with no physical versions in sight (other than a few framed on our walls). Businesses grapple with similar challenges. But instead of shoeboxes, they have file cabinets and warehouses jam packed with paper invoices, legal documents, human resource files, material safety data sheets, incident reports, and the list goes on and on. In fact, regulatory and compliance rules govern many industries, requiring that this paperwork is available for any number of years. It's a real challenge...especially trying to find archived documents quickly and many times with no backup. Which brings us to a set of technologies called Image Process Management (or simply Imaging or Image Processing) that are transforming these antiquated, paper-based processes. Oracle's WebCenter Content Imaging solution is a combination of their WebCenter suite, which offers a robust set of content and document management features, and their Business Process Management (BPM) suite, which helps to automate business processes through the definition of workflows and business rules. Overall, the solution provides an enterprise-class platform for end-to-end management of document images within transactional business processes. It's a solution that provides all of the capabilities needed - from document capture and recognition, to imaging and workflow - to effectively transform your ‘shoeboxes’ of files into digitally managed assets that comply with strict industry regulations. The terminology can be quite overwhelming if you're new to the space, so we've provided a summary of the primary components of the solution below, along with a short description of the two paths that can be executed to load images of scanned documents into Oracle's WebCenter suite. WebCenter Imaging (WCI): the electronic document repository that provides security, annotations, and search capabilities, and is the primary user interface for managing work items in the imaging solution SOA & BPM Suites (workflow): provide business process management capabilities, including human tasks, workflow management, service integration, and all other standard SOA features. It's interesting to note that there a number of 'jumpstart' processes available to help accelerate the integration of business applications, such as the accounts payable invoice processing solution for E-Business Suite that facilitates the processing of large volumes of invoices WebCenter Enterprise Capture (WEC): expedites the capture process of paper documents to digital images, offering high volume scanning and importing from email, and allows for flexible indexing options WebCenter Forms Recognition (WFR): automatically recognizes, categorizes, and extracts information from paper documents with greatly reduced human intervention WebCenter Content: the backend content server that provides versioning, security, and content storage There are two paths that can be executed to send data from WebCenter Capture to WebCenter Imaging, both of which are described below: 1. Direct Flow - This is the simplest and quickest way to push an image scanned from WebCenter Enterprise Capture (WEC) to WebCenter Imaging (WCI), using the bare minimum metadata. The WEC activities are defined below: The paper document is scanned (or imported from email). The scanned image is indexed using a predefined indexing profile. The image is committed directly into the process flow 2. WFR (WebCenter Forms Recognition) Flow - This is the more complex process, during which data is extracted from the image using a series of operations including Optical Character Recognition (OCR), Classification, Extraction, and Export. This process creates three files (Tiff, XML, and TXT), which are fed to the WCI Input Agent (the high speed import/filing module). The WCI Input Agent directory is a standard ingestion method for adding content to WebCenter Imaging, the process for doing so is described below: WEC commits the batch using the respective commit profile. A TIFF file is created, passing data through the file name by including values separated by "_" (underscores). WFR completes OCR, classification, extraction, export, and pulls the data from the image. In addition to the TIFF file, which contains the document image, an XML file containing the extracted data, and a TXT file containing the metadata that will be filled in WCI, are also created. All three files are exported to WCI's Input agent directory. Based on previously defined "input masks", the WCI Input Agent will pick up the seeding file (often the TXT file). Finally, the TIFF file is pushed in UCM and a unique web-viewable URL is created. Based on the mapping data read from the TXT file, a new record is created in the WCI application. Although these processes may seem complex, each Oracle component works seamlessly together to achieve a high performing and scalable platform. The solution has been field tested at some of the largest enterprises in the world and has transformed millions and millions of paper-based documents to more easily manageable digital assets. For more information on how an Imaging solution can help your business, please contact [email protected] (for U.S. West inquiries) or [email protected] (for U.S. East inquiries). About the Author: Vikrant is a Technical Architect in Aurionpro's Oracle Implementation Services team, where he delivers WebCenter-based Content and Imaging solutions to Fortune 1000 clients. With more than twelve years of experience designing, developing, and implementing Java-based software solutions, Vikrant was one of the founding members of Aurionpro's WebCenter-based offshore delivery team. He can be reached at [email protected].

Read the article

Mixing Forms and Token Authentication in a single ASP.NET Application (the Details)

- by Your DisplayName here!

The scenario described in my last post works because of the design around HTTP modules in ASP.NET. Authentication related modules (like Forms authentication and WIF WS-Fed/Sessions) typically subscribe to three events in the pipeline – AuthenticateRequest/PostAuthenticateRequest for pre-processing and EndRequest for post-processing (like making redirects to a login page). In the pre-processing stage it is the modules’ job to determine the identity of the client based on incoming HTTP details (like a header, cookie, form post) and set HttpContext.User and Thread.CurrentPrincipal. The actual page (in the ExecuteHandler event) “sees” the identity that the last module has set. So in our case there are three modules in effect: FormsAuthenticationModule (AuthenticateRequest, EndRequest) WSFederationAuthenticationModule (AuthenticateRequest, PostAuthenticateRequest, EndRequest) SessionAuthenticationModule (AuthenticateRequest, PostAuthenticateRequest) So let’s have a look at the different scenario we have when mixing Forms auth and WS-Federation. Anoymous request to unprotected resource This is the easiest case. Since there is no WIF session cookie or a FormsAuth cookie, these modules do nothing. The WSFed module creates an anonymous ClaimsPrincipal and calls the registered ClaimsAuthenticationManager (if any) to transform it. The result (by default an anonymous ClaimsPrincipal) gets set. Anonymous request to FormsAuth protected resource This is the scenario where an anonymous user tries to access a FormsAuth protected resource for the first time. The principal is anonymous and before the page gets rendered, the Authorize attribute kicks in. The attribute determines that the user needs authentication and therefor sets a 401 status code and ends the request. Now execution jumps to the EndRequest event, where the FormsAuth module takes over. The module then converts the 401 to a redirect (302) to the forms login page. If authentication is successful, the login page sets the FormsAuth cookie. FormsAuth authenticated request to a FormsAuth protected resource Now a FormsAuth cookie is present, which gets validated by the FormsAuth module. This cookie gets turned into a GenericPrincipal/FormsIdentity combination. The WS-Fed module turns the principal into a ClaimsPrincipal and calls the registered ClaimsAuthenticationManager. The outcome of that gets set on the context. Anonymous request to STS protected resource This time the anonymous user tries to access an STS protected resource (a controller decorated with the RequireTokenAuthentication attribute). The attribute determines that the user needs STS authentication by checking the authentication type on the current principal. If this is not Federation, the redirect to the STS will be made. After successful authentication at the STS, the STS posts the token back to the application (using WS-Federation syntax). Postback from STS authentication After the postback, the WS-Fed module finds the token response and validates the contained token. If successful, the token gets transformed by the ClaimsAuthenticationManager, and the outcome is a) stored in a session cookie, and b) set on the context. STS authenticated request to an STS protected resource This time the WIF Session authentication module kicks in because it can find the previously issued session cookie. The module re-hydrates the ClaimsPrincipal from the cookie and sets it. FormsAuth and STS authenticated request to a protected resource This is kind of an odd case – e.g. the user first authenticated using Forms and after that using the STS. This time the FormsAuth module does its work, and then afterwards the session module stomps over the context with the session principal. In other words, the STS identity wins. What about roles? A common way to set roles in ASP.NET is to use the role manager feature. There is a corresponding HTTP module for that (RoleManagerModule) that handles PostAuthenticateRequest. Does this collide with the above combinations? No it doesn’t! When the WS-Fed module turns existing principals into a ClaimsPrincipal (like it did with the FormsIdentity), it also checks for RolePrincipal (which is the principal type created by role manager), and turns the roles in role claims. Nice! But as you can see in the last scenario above, this might result in unnecessary work, so I would rather recommend consolidating all role work (and other claims transformations) into the ClaimsAuthenticationManager. In there you can check for the authentication type of the incoming principal and act accordingly. HTH

Read the article

LINQ and ArcObjects

- by Marko Apfel

Motivation LINQ (language integrated query) is a component of the Microsoft. NET Framework since version 3.5. It allows a SQL-like query to various data sources such as SQL, XML etc. Like SQL also LINQ to SQL provides a declarative notation of problem solving – i.e. you don’t need describe in detail how a task could be solved, you describe what to be solved at all. This frees the developer from error-prone iterator constructs. Ideally, of course, would be to access features with this way. Then this construct is conceivable: var largeFeatures = from feature in features where (feature.GetValue("SHAPE_Area").ToDouble() > 3000) select feature; or its equivalent as a lambda expression: var largeFeatures = features.Where(feature => (feature.GetValue("SHAPE_Area").ToDouble() > 3000)); This requires an appropriate provider, which manages the corresponding iterator logic. This is easier than you might think at first sight - you have to deliver only the desired entities as IEnumerable<IFeature>. LINQ automatically establishes a state machine in the background, whose execution is delayed (deferred execution) - when you are really request entities (foreach, Count (), ToList (), ..) an instantiation processing takes place, although it was already created at a completely different place. Especially in multiple iteration through entities in the first debuggings you are rubbing your eyes when the execution pointer jumps magically back in the iterator logic. Realization A very concise logic for constructing IEnumerable<IFeature> can be achieved by running through a IFeatureCursor. You return each feature via yield. For an easier usage I have put the logic in an extension method Getfeatures() for IFeatureClass: public static IEnumerable<IFeature> GetFeatures(this IFeatureClass featureClass, IQueryFilter queryFilter, RecyclingPolicy policy) { IFeatureCursor featureCursor = featureClass.Search(queryFilter, RecyclingPolicy.Recycle == policy); IFeature feature; while (null != (feature = featureCursor.NextFeature())) { yield return feature; } //this is skipped in unit tests with cursor-mock if (Marshal.IsComObject(featureCursor)) { Marshal.ReleaseComObject(featureCursor); } } So you can now easily generate the IEnumerable<IFeature>: IEnumerable<IFeature> features = _featureClass.GetFeatures(RecyclingPolicy.DoNotRecycle); You have to be careful with the recycling cursor. After a delayed execution in the same context it is not a good idea to re-iterated on the features. In this case only the content of the last (recycled) features is provided and all the features are the same in the second set. Therefore, this expression would be critical: largeFeatures.ToList(). ForEach(feature => Debug.WriteLine(feature.OID)); because ToList() iterates once through the list and so the the cursor was once moved through the features. So the extension method ForEach() always delivers the same feature. In such situations, you must not use a recycling cursor. Repeated executions of ForEach() is not a problem, because for every time the state machine is re-instantiated and thus the cursor runs again - that's the magic already mentioned above. Perspective Now you can also go one step further and realize your own implementation for the interface IEnumerable<IFeature>. This requires that only the method and property to access the enumerator have to be programmed. In the enumerator himself in the Reset() method you organize the re-executing of the search. This could be archived with an appropriate delegate in the constructor: new FeatureEnumerator<IFeatureclass>(_featureClass, featureClass => featureClass.Search(_filter, isRecyclingCursor)); which is called in Reset(): public void Reset() { _featureCursor = _resetCursor(_t); } In this manner, enumerators for completely different scenarios could be implemented, which are used on the client side completely identical like described above. Thus cursors, selection sets, etc. merge into a single matter and the reusability of code is increasing immensely. On top of that in automated unit tests an IEnumerable could be mocked very easily - a major step towards better software quality. Conclusion Nevertheless, caution should be exercised with these constructs in performance-relevant queries. Because of managing a state machine in the background, a lot of overhead is created. The processing costs additional time - about 20 to 100 percent. In addition, working without a recycling cursor is fast a performance gap. However declarative LINQ code is much more elegant, flawless and easy to maintain than manually iterating, compare and establish a list of results. The code size is reduced according to experience an average of 75 to 90 percent! So I like to wait a few milliseconds longer. As so often it has to be balanced between maintainability and performance - which for me is gaining in priority maintainability. In times of multi-core processors, the processing time of most business processes is anyway not dominated by code execution but by waiting for user input. Demo source code The source code for this prototype with several unit tests, you can download here: https://github.com/esride-apf/Linq2ArcObjects. .

Read the article

Industry perspectives on managing content

- by aahluwalia

Earlier this week I was noodling over a topic for my first blog post. My intention for this blog is to bring a practitioner's perspective on ECM to the community; to share and collaborate on best practices and approaches that address today's business problems. Reviewing my past 14 years of experience with web technologies, I wondered what topic would serve as a good "conversation starter". During this time, I received a call from a friend who was seeking insights on how content management applies to specific industries. She approached me because she vaguely remembered that I had worked in the Health Insurance industry in the recent past. She wanted me to tell her about the specific business needs of this industry. She was in for quite a surprise as she found out that I had spent the better part of a decade managing content within the Health Insurance industry and I discovered a great topic for my first blog post! I offer some insights from Health Insurance and invite my fellow practitioners to share their insights from other industries. What does content management mean to these industries? What can solution providers be aware of when offering solutions to these industries? The United States health care system relies heavily on private health insurance, which is the primary source of coverage for approximately 58% Americans. In the late 19th century, "accident insurance" began to be available, which operated much like modern disability insurance. In the late 20th century, traditional disability insurance evolved into modern health insurance programs. The first thing a solution provider must be aware of about the Health Insurance industry is that it tends to be transaction intensive. They are the ones who manage and administer our health plans and process our claims when we visit our health care providers. It helps to keep in mind that they are in the business of delivering health insurance and not technology. You may find the mindset conservative in comparison to the IT industry, however, the Health Insurance industry has benefited and will continue to benefit from the efficiency that technology brings to traditionally paper-driven processes. We are all aware of the impact that Healthcare reform bill has had a significant impact on the Health Insurance industry. They are under a great deal of pressure to explore ways to reduce their administrative costs and increase operational efficiency. Overall, administrative costs of health insurance include the insurer's cost to administer the health plan, the costs borne by employers, health-care providers, governments and individual consumers. Inefficiencies plague health insurance, owing largely to the absence of standardized processes across the industry. To achieve this, industry leaders have come together to establish standards and invest in initiatives to help their healthcare provider partners transition to the next generation of healthcare technology. The move to online services and paperless explanation of benefits are some manifestations of technological advancements in health insurance. Several companies have adopted Toyota's LEAN methodology or Six Sigma principles to improve quality, reduce waste and excessive costs, thereby increasing the value of their plan offerings. A growing number of health insurance companies have transformed their business systems in the past decade alone and adopted some form of content management to reduce the costs involved in administering health plans. The key strategy has been to convert paper documents and forms into electronic formats, automate the content development process and securely distribute content to various audiences via diverse marketing channels, including web and mobile. Enterprise content management solutions can enable document capture of claim forms, manage digital assets, integrate with Enterprise Resource Planning (ERP) and Human Capital Management (HCM) solutions, build Business Process Management (BPM) processes, define retention and disposition instructions to comply with state and federal regulations and allow eBusiness and Marketing departments to develop and deliver web content to multiple websites, mobile devices and portals. Content can be shared securely within and outside the organization using Information Rights Management. At the end of the day, solution providers who can translate strategic goals into solutions that maximize process automation, increase ease of use and minimize IT overhead are likely to be successful in today's health insurance environment.

Read the article

Why Ultra-Low Power Computing Will Change Everything

- by Tori Wieldt

The ARM TechCon keynote "Why Ultra-Low Power Computing Will Change Everything" was anything but low-powered. The speaker, Dr. Johnathan Koomey, knows his subject: he is a Consulting Professor at Stanford University, worked for more than two decades at Lawrence Berkeley National Laboratory, and has been a visiting professor at Stanford University, Yale University, and UC Berkeley's Energy and Resources Group. His current focus is creating a standard (computations per kilowatt hour) and measuring computer energy consumption over time. The trends are impressive: energy consumption has halved every 1.5 years for the last 60 years. Battery life has made roughly a 10x improvement each decade since 1960. It's these improvements that have made laptops and cell phones possible. What does the future hold? Dr. Koomey said that in the past, the race by chip manufacturers was to create the fastest computer, but the priorities have now changed. New computers are tiny, smart, connected and cheap. "You can't underestimate the importance of a shift in industry focus from raw performance to power efficiency for mobile devices," he said. There is also a confluence of trends in computing, communications, sensors, and controls. The challenge is how to reduce the power requirements for these tiny devices. Alternate sources of power that are being explored are light, heat, motion, and even blood sugar. The University of Michigan has produced a miniature sensor that harnesses solar energy and could last for years without needing to be replaced. Also, the University of Washington has created a sensor that scavenges power from existing radio and TV signals.Specific devices designed for a purpose are much more efficient than general purpose computers. With all these sensors, instead of big data, developers should focus on nano-data, personalized information that will adjust the lights in a room, a machine, a variable sign, etc.Dr. Koomey showed some examples:The Proteus Digital Health Feedback System, an ingestible sensor that transmits when a patient has taken their medicine and is powered by their stomach juices. (Gives "powered by you" a whole new meaning!) Streetline Parking Systems, that provide real-time data about available parking spaces. The information can be sent to your phone or update parking signs around the city to point to areas with available spaces. Less driving around looking for parking spaces!The BigBelly trash system that uses solar power, compacts trash, and sends a text message when it is full. This dramatically reduces the number of times a truck has to come to pick up trash, freeing up resources and slashing fuel costs. This is a classic example of the efficiency of moving "bits not atoms." But researchers are approaching the physical limits of sensors, Dr. Kommey explained. With the current rate of technology improvement, they'll reach the three-atom transistor by 2041. Once they hit that wall, it will force a revolution they way we do computing. But wait, researchers at Purdue University and the University of New South Wales are both working on a reliable one-atom transistors! Other researchers are working on "approximate computing" that will reduce computing requirements drastically. So it's unclear where the wall actually is. In the meantime, as Dr. Koomey promised, ultra-low power computing will change everything.

Read the article

WebCenter Innovation Award Winners

- by Michael Snow

Of course, here on our WebCenter blog – we’d like to highlight and brag about our great WebCenter winners. The 2012 WebCenter Innovation Award Winners University of Louisville Location: Louisville, KY, USA Industry: Higher Education Fusion Middleware Products: WebCenter Portal, WebCenter Content, JDeveloper, WebLogic, Oracle BI, Oracle IdM University of Louisville is a state supported research university Statewide Informatics Network to improve public health The University of Louisville has implemented WebCenter as part of the LOUI (Louisville Informatics Institute) Initiative, a Statewide Informatics Network, which will improve public healthcare and lower cost through the use of novel technology and next generation analytics, decision support and innovative outcomes-based payment systems. ---------- News Limited Country/Region: Australia Industry: News/Media FMW Products: WebCenter Sites Single platform running websites for 50% of Australia's newspapers News Corp is running half of Australia's newspaper websites on this shared platform powered by Oracle WebCenter Sites and have overtaken their nearest competitors and are now leading in terms of monthly page impressions. At peak they have over 250 editors on the system publishing in real-time.Sites include: www.newsspace.com.au, www.news.com.au, www.theaustralian.com.au and many others ------ Life Technologies Corp. Country/Region: Carlsbad, CA, USAIndustry: Life SciencesFMW Products: WebCenter Portal, SOA Suite Life Technologies Corp. is a global biotechnology tools company dedicated to improving the human condition with innovative life science products. They were awarded an innovation award for their solution utilizing WebCenter Portal for remotely monitoring & repairing biotech instruments. They deployed WebCenter as a portal that accesses Life Technologies cloud based service monitoring system where all customer deployed instruments can be remotely monitored and proactively repaired. The portal provides alerts from these cloud based monitoring services directly to the customer and to Life Technologies Field Engineers. The Portal provides insight into the instruments and services customers purchased for the purpose of analyzing and anticipating future customer needs and creating targeted sales and service programs. ----- China Mobile Jiangsu China Mobile Jiangsu is one of the biggest subsidiaries of China Mobile. It has over 25,000 employees and 40 million mobile subscribers. Country/Region: Jiangsu, China Industry: Telecommunications FMW Products: WebCenter Portal, WebCenter Content, JDeveloper, SOA Suite, IdM They were awarded an Innovation Award for their new employee platform powered by WebCenter Portal is designed to serve their 25,000+ employees and help them drive collaboration & productivity. JSMCC (Chian Mobile Jiangsu) Employee Enterprise Portal and Collaboration Platform. It is one of the China Mobile’s most important IT innovation projects. The new platform is designed to serve for JSMCC’s 25000+ employees and to help them improve the working efficiency, changing their traditional working mode to social ways, encouraging employees on business collaboration and innovation. The solution is built on top of Oracle WebCenter Portal Framework and WebCenter Spaces while also leveraging Weblogic Server, UCM, OID, OAM, SES, IRM and Oracle Database 11g. By providing rich collaboration services, knowledge management services, sensitive document protection services, unified user identity management services, unified information search services and personalized information integration capabilities, the working efficiency of JSMCC employees has been greatly improved. Main Functionality : Information portal, office automation integration, personal space, group space, team collaboration with web2.0 services, unified search engine for multiple data sources, document management and protection. SSO for multiple platforms. -------- LADWP – Los Angeles Department for Water and Power Los Angeles Department of Water and Power (LADWP) is the largest public utility company in United States with over 1.6 Million customers. LADWP provides water and power for millions of residential & commercial customers in Southern California. LADWP also bills most of these customers for sanitation services provided by another city department. Country/Region: US – Los Angeles, CA Industry: Public Utility FMW Products: WebCenter Portal, WebCenter Content, JDeveloper, SOA Suite, IdM The new infrastructure consists of: Oracle WebCenter Portal including mobile portal Oracle WebCenter Content for Content Management and Digital Asset Management (DAM) Oracle OAM (IDM, OVD, OAM) integrated with AD for enterprise identity management Oracle Siebel for CRM Oracle DB Oracle SOA Suite for integration of various subsystems and back end systems The new portal's features include: Complete Graphical redesign based on best practices in UI Design for high usability Customer Self Service implemented through MyAccount (Bill Pay, Payment History, Bill History, Usage Analysis, Service Request Management) Financial Assistance Programs (CRM, WebCenter) Customer Rebate Programs (CRM, WebCenter) Turn On/Off/Transfer of services (Commercial & Residential) Outage Reporting eNotification (SMS, email) Multilingual (English & Spanish) – using WebCenter multi-language support Section 508 (ADA) Compliant Search – Using WebCenter SES (Secured Enterprise Search) Distributed Authorship in WebCenter Content Mobile Access (any Mobile Browser)

Read the article

An Introduction to ASP.NET Web API

- by Rick Strahl

Microsoft recently released ASP.NET MVC 4.0 and .NET 4.5 and along with it, the brand spanking new ASP.NET Web API. Web API is an exciting new addition to the ASP.NET stack that provides a new, well-designed HTTP framework for creating REST and AJAX APIs (API is Microsoft’s new jargon for a service, in case you’re wondering). Although Web API ships and installs with ASP.NET MVC 4, you can use Web API functionality in any ASP.NET project, including WebForms, WebPages and MVC or just a Web API by itself. And you can also self-host Web API in your own applications from Console, Desktop or Service applications. If you're interested in a high level overview on what ASP.NET Web API is and how it fits into the ASP.NET stack you can check out my previous post: Where does ASP.NET Web API fit? In the following article, I'll focus on a practical, by example introduction to ASP.NET Web API. All the code discussed in this article is available in GitHub: https://github.com/RickStrahl/AspNetWebApiArticle [republished from my Code Magazine Article and updated for RTM release of ASP.NET Web API] Getting Started To start I’ll create a new empty ASP.NET application to demonstrate that Web API can work with any kind of ASP.NET project. Although you can create a new project based on the ASP.NET MVC/Web API template to quickly get up and running, I’ll take you through the manual setup process, because one common use case is to add Web API functionality to an existing ASP.NET application. This process describes the steps needed to hook up Web API to any ASP.NET 4.0 application. Start by creating an ASP.NET Empty Project. Then create a new folder in the project called Controllers. Add a Web API Controller Class Once you have any kind of ASP.NET project open, you can add a Web API Controller class to it. Web API Controllers are very similar to MVC Controller classes, but they work in any kind of project. Add a new item to this folder by using the Add New Item option in Visual Studio and choose Web API Controller Class, as shown in Figure 1. Figure 1: This is how you create a new Controller Class in Visual Studio Make sure that the name of the controller class includes Controller at the end of it, which is required in order for Web API routing to find it. Here, the name for the class is AlbumApiController. For this example, I’ll use a Music Album model to demonstrate basic behavior of Web API. The model consists of albums and related songs where an album has properties like Name, Artist and YearReleased and a list of songs with a SongName and SongLength as well as an AlbumId that links it to the album. You can find the code for the model (and the rest of these samples) on Github. To add the file manually, create a new folder called Model, and add a new class Album.cs and copy the code into it. There’s a static AlbumData class with a static CreateSampleAlbumData() method that creates a short list of albums on a static .Current that I’ll use for the examples. Before we look at what goes into the controller class though, let’s hook up routing so we can access this new controller. Hooking up Routing in Global.asax To start, I need to perform the one required configuration task in order for Web API to work: I need to configure routing to the controller. Like MVC, Web API uses routing to provide clean, extension-less URLs to controller methods. Using an extension method to ASP.NET’s static RouteTable class, you can use the MapHttpRoute() (in the System.Web.Http namespace) method to hook-up the routing during Application_Start in global.asax.cs shown in Listing 1.using System; using System.Web.Routing; using System.Web.Http; namespace AspNetWebApi { public class Global : System.Web.HttpApplication { protected void Application_Start(object sender, EventArgs e) { RouteTable.Routes.MapHttpRoute( name: "AlbumVerbs", routeTemplate: "albums/{title}", defaults: new { symbol = RouteParameter.Optional, controller="AlbumApi" } ); } } } This route configures Web API to direct URLs that start with an albums folder to the AlbumApiController class. Routing in ASP.NET is used to create extensionless URLs and allows you to map segments of the URL to specific Route Value parameters. A route parameter, with a name inside curly brackets like {name}, is mapped to parameters on the controller methods. Route parameters can be optional, and there are two special route parameters – controller and action – that determine the controller to call and the method to activate respectively. HTTP Verb Routing Routing in Web API can route requests by HTTP Verb in addition to standard {controller},{action} routing. For the first examples, I use HTTP Verb routing, as shown Listing 1. Notice that the route I’ve defined does not include an {action} route value or action value in the defaults. Rather, Web API can use the HTTP Verb in this route to determine the method to call the controller, and a GET request maps to any method that starts with Get. So methods called Get() or GetAlbums() are matched by a GET request and a POST request maps to a Post() or PostAlbum(). Web API matches a method by name and parameter signature to match a route, query string or POST values. In lieu of the method name, the [HttpGet,HttpPost,HttpPut,HttpDelete, etc] attributes can also be used to designate the accepted verbs explicitly if you don’t want to follow the verb naming conventions. Although HTTP Verb routing is a good practice for REST style resource APIs, it’s not required and you can still use more traditional routes with an explicit {action} route parameter. When {action} is supplied, the HTTP verb routing is ignored. I’ll talk more about alternate routes later. When you’re finished with initial creation of files, your project should look like Figure 2. Figure 2: The initial project has the new API Controller Album model Creating a small Album Model Now it’s time to create some controller methods to serve data. For these examples, I’ll use a very simple Album and Songs model to play with, as shown in Listing 2. public class Song { public string AlbumId { get; set; } [Required, StringLength(80)] public string SongName { get; set; } [StringLength(5)] public string SongLength { get; set; } } public class Album { public string Id { get; set; } [Required, StringLength(80)] public string AlbumName { get; set; } [StringLength(80)] public string Artist { get; set; } public int YearReleased { get; set; } public DateTime Entered { get; set; } [StringLength(150)] public string AlbumImageUrl { get; set; } [StringLength(200)] public string AmazonUrl { get; set; } public virtual List<Song> Songs { get; set; } public Album() { Songs = new List<Song>(); Entered = DateTime.Now; // Poor man's unique Id off GUID hash Id = Guid.NewGuid().GetHashCode().ToString("x"); } public void AddSong(string songName, string songLength = null) { this.Songs.Add(new Song() { AlbumId = this.Id, SongName = songName, SongLength = songLength }); } } Once the model has been created, I also added an AlbumData class that generates some static data in memory that is loaded onto a static .Current member. The signature of this class looks like this and that's what I'll access to retrieve the base data:public static class AlbumData { // sample data - static list public static List<Album> Current = CreateSampleAlbumData(); /// <summary> /// Create some sample data /// </summary> /// <returns></returns> public static List<Album> CreateSampleAlbumData() { … }} You can check out the full code for the data generation online. Creating an AlbumApiController Web API shares many concepts of ASP.NET MVC, and the implementation of your API logic is done by implementing a subclass of the System.Web.Http.ApiController class. Each public method in the implemented controller is a potential endpoint for the HTTP API, as long as a matching route can be found to invoke it. The class name you create should end in Controller, which is how Web API matches the controller route value to figure out which class to invoke. Inside the controller you can implement methods that take standard .NET input parameters and return .NET values as results. Web API’s binding tries to match POST data, route values, form values or query string values to your parameters. Because the controller is configured for HTTP Verb based routing (no {action} parameter in the route), any methods that start with Getxxxx() are called by an HTTP GET operation. You can have multiple methods that match each HTTP Verb as long as the parameter signatures are different and can be matched by Web API. In Listing 3, I create an AlbumApiController with two methods to retrieve a list of albums and a single album by its title .public class AlbumApiController : ApiController { public IEnumerable<Album> GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); return albums; } public Album GetAlbum(string title) { var album = AlbumData.Current .SingleOrDefault(alb => alb.AlbumName.Contains(title)); return album; }} To access the first two requests, you can use the following URLs in your browser: http://localhost/aspnetWebApi/albumshttp://localhost/aspnetWebApi/albums/Dirty%20Deeds Note that you’re not specifying the actions of GetAlbum or GetAlbums in these URLs. Instead Web API’s routing uses HTTP GET verb to route to these methods that start with Getxxx() with the first mapping to the parameterless GetAlbums() method and the latter to the GetAlbum(title) method that receives the title parameter mapped as optional in the route. Content Negotiation When you access any of the URLs above from a browser, you get either an XML or JSON result returned back. The album list result for Chrome 17 and Internet Explorer 9 is shown Figure 3. Figure 3: Web API responses can vary depending on the browser used, demonstrating Content Negotiation in action as these two browsers send different HTTP Accept headers. Notice that the results are not the same: Chrome returns an XML response and IE9 returns a JSON response. Whoa, what’s going on here? Shouldn’t we see the same result in both browsers? Actually, no. Web API determines what type of content to return based on Accept headers. HTTP clients, like browsers, use Accept headers to specify what kind of content they’d like to see returned. Browsers generally ask for HTML first, followed by a few additional content types. Chrome (and most other major browsers) ask for: Accept: text/html, application/xhtml+xml,application/xml; q=0.9,*/*;q=0.8 IE9 asks for: Accept: text/html, application/xhtml+xml, */* Note that Chrome’s Accept header includes application/xml, which Web API finds in its list of supported media types and returns an XML response. IE9 does not include an Accept header type that works on Web API by default, and so it returns the default format, which is JSON. This is an important and very useful feature that was missing from any previous Microsoft REST tools: Web API automatically switches output formats based on HTTP Accept headers. Nowhere in the server code above do you have to explicitly specify the output format. Rather, Web API determines what format the client is requesting based on the Accept headers and automatically returns the result based on the available formatters. This means that a single method can handle both XML and JSON results.. Using this simple approach makes it very easy to create a single controller method that can return JSON, XML, ATOM or even OData feeds by providing the appropriate Accept header from the client. By default you don’t have to worry about the output format in your code. Note that you can still specify an explicit output format if you choose, either globally by overriding the installed formatters, or individually by returning a lower level HttpResponseMessage instance and setting the formatter explicitly. More on that in a minute. Along the same lines, any content sent to the server via POST/PUT is parsed by Web API based on the HTTP Content-type of the data sent. The same formats allowed for output are also allowed on input. Again, you don’t have to do anything in your code – Web API automatically performs the deserialization from the content. Accessing Web API JSON Data with jQuery A very common scenario for Web API endpoints is to retrieve data for AJAX calls from the Web browser. Because JSON is the default format for Web API, it’s easy to access data from the server using jQuery and its getJSON() method. This example receives the albums array from GetAlbums() and databinds it into the page using knockout.js.$.getJSON("albums/", function (albums) { // make knockout template visible $(".album").show(); // create view object and attach array var view = { albums: albums }; ko.applyBindings(view); }); Figure 4 shows this and the next example’s HTML output. You can check out the complete HTML and script code at http://goo.gl/Ix33C (.html) and http://goo.gl/tETlg (.js). Figu Figure 4: The Album Display sample uses JSON data loaded from Web API. The result from the getJSON() call is a JavaScript object of the server result, which comes back as a JavaScript array. In the code, I use knockout.js to bind this array into the UI, which as you can see, requires very little code, instead using knockout’s data-bind attributes to bind server data to the UI. Of course, this is just one way to use the data – it’s entirely up to you to decide what to do with the data in your client code. Along the same lines, I can retrieve a single album to display when the user clicks on an album. The response returns the album information and a child array with all the songs. The code to do this is very similar to the last example where we pulled the albums array:$(".albumlink").live("click", function () { var id = $(this).data("id"); // title $.getJSON("albums/" + id, function (album) { ko.applyBindings(album, $("#divAlbumDialog")[0]); $("#divAlbumDialog").show(); }); }); Here the URL looks like this: /albums/Dirty%20Deeds, where the title is the ID captured from the clicked element’s data ID attribute. Explicitly Overriding Output Format When Web API automatically converts output using content negotiation, it does so by matching Accept header media types to the GlobalConfiguration.Configuration.Formatters and the SupportedMediaTypes of each individual formatter. You can add and remove formatters to globally affect what formats are available and it’s easy to create and plug in custom formatters.The example project includes a JSONP formatter that can be plugged in to provide JSONP support for requests that have a callback= querystring parameter. Adding, removing or replacing formatters is a global option you can use to manipulate content. It’s beyond the scope of this introduction to show how it works, but you can review the sample code or check out my blog entry on the subject (http://goo.gl/UAzaR). If automatic processing is not desirable in a particular Controller method, you can override the response output explicitly by returning an HttpResponseMessage instance. HttpResponseMessage is similar to ActionResult in ASP.NET MVC in that it’s a common way to return an abstract result message that contains content. HttpResponseMessage s parsed by the Web API framework using standard interfaces to retrieve the response data, status code, headers and so on[MS2] . Web API turns every response – including those Controller methods that return static results – into HttpResponseMessage instances. Explicitly returning an HttpResponseMessage instance gives you full control over the output and lets you mostly bypass WebAPI’s post-processing of the HTTP response on your behalf. HttpResponseMessage allows you to customize the response in great detail. Web API’s attention to detail in the HTTP spec really shows; many HTTP options are exposed as properties and enumerations with detailed IntelliSense comments. Even if you’re new to building REST-based interfaces, the API guides you in the right direction for returning valid responses and response codes. For example, assume that I always want to return JSON from the GetAlbums() controller method and ignore the default media type content negotiation. To do this, I can adjust the output format and headers as shown in Listing 4.public HttpResponseMessage GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); // Create a new HttpResponse with Json Formatter explicitly var resp = new HttpResponseMessage(HttpStatusCode.OK); resp.Content = new ObjectContent<IEnumerable<Album>>( albums, new JsonMediaTypeFormatter()); // Get Default Formatter based on Content Negotiation //var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); resp.Headers.ConnectionClose = true; resp.Headers.CacheControl = new CacheControlHeaderValue(); resp.Headers.CacheControl.Public = true; return resp; } This example returns the same IEnumerable<Album> value, but it wraps the response into an HttpResponseMessage so you can control the entire HTTP message result including the headers, formatter and status code. In Listing 4, I explicitly specify the formatter using the JsonMediaTypeFormatter to always force the content to JSON. If you prefer to use the default content negotiation with HttpResponseMessage results, you can create the Response instance using the Request.CreateResponse method:var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); This provides you an HttpResponse object that's pre-configured with the default formatter based on Content Negotiation. Once you have an HttpResponse object you can easily control most HTTP aspects on this object. What's sweet here is that there are many more detailed properties on HttpResponse than the core ASP.NET Response object, with most options being explicitly configurable with enumerations that make it easy to pick the right headers and response codes from a list of valid codes. It makes HTTP features available much more discoverable even for non-hardcore REST/HTTP geeks. Non-Serialized Results The output returned doesn’t have to be a serialized value but can also be raw data, like strings, binary data or streams. You can use the HttpResponseMessage.Content object to set a number of common Content classes. Listing 5 shows how to return a binary image using the ByteArrayContent class from a Controller method. [HttpGet] public HttpResponseMessage AlbumArt(string title) { var album = AlbumData.Current.FirstOrDefault(abl => abl.AlbumName.StartsWith(title)); if (album == null) { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found")); return resp; } // kinda silly - we would normally serve this directly // but hey - it's a demo. var http = new WebClient(); var imageData = http.DownloadData(album.AlbumImageUrl); // create response and return var result = new HttpResponseMessage(HttpStatusCode.OK); result.Content = new ByteArrayContent(imageData); result.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg"); return result; } The image retrieval from Amazon is contrived, but it shows how to return binary data using ByteArrayContent. It also demonstrates that you can easily return multiple types of content from a single controller method, which is actually quite common. If an error occurs - such as a resource can’t be found or a validation error – you can return an error response to the client that’s very specific to the error. In GetAlbumArt(), if the album can’t be found, we want to return a 404 Not Found status (and realistically no error, as it’s an image). Note that if you are not using HTTP Verb-based routing or not accessing a method that starts with Get/Post etc., you have to specify one or more HTTP Verb attributes on the method explicitly. Here, I used the [HttpGet] attribute to serve the image. Another option to handle the error could be to return a fixed placeholder image if no album could be matched or the album doesn’t have an image. When returning an error code, you can also return a strongly typed response to the client. For example, you can set the 404 status code and also return a custom error object (ApiMessageError is a class I defined) like this:return Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found") ); If the album can be found, the image will be returned. The image is downloaded into a byte[] array, and then assigned to the result’s Content property. I created a new ByteArrayContent instance and assigned the image’s bytes and the content type so that it displays properly in the browser. There are other content classes available: StringContent, StreamContent, ByteArrayContent, MultipartContent, and ObjectContent are at your disposal to return just about any kind of content. You can create your own Content classes if you frequently return custom types and handle the default formatter assignments that should be used to send the data out . Although HttpResponseMessage results require more code than returning a plain .NET value from a method, it allows much more control over the actual HTTP processing than automatic processing. It also makes it much easier to test your controller methods as you get a response object that you can check for specific status codes and output messages rather than just a result value. Routing Again Ok, let’s get back to the image example. Using the original routing we have setup using HTTP Verb routing there's no good way to serve the image. In order to return my album art image I’d like to use a URL like this: http://localhost/aspnetWebApi/albums/Dirty%20Deeds/image In order to create a URL like this, I have to create a new Controller because my earlier routes pointed to the AlbumApiController using HTTP Verb routing. HTTP Verb based routing is great for representing a single set of resources such as albums. You can map operations like add, delete, update and read easily using HTTP Verbs. But you cannot mix action based routing into a an HTTP Verb routing controller - you can only map HTTP Verbs and each method has to be unique based on parameter signature. You can't have multiple GET operations to methods with the same signature. So GetImage(string id) and GetAlbum(string title) are in conflict in an HTTP GET routing scenario. In fact, I was unable to make the above Image URL work with any combination of HTTP Verb plus Custom routing using the single Albums controller. There are number of ways around this, but all involve additional controllers. Personally, I think it’s easier to use explicit Action routing and then add custom routes if you need to simplify your URLs further. So in order to accommodate some of the other examples, I created another controller – AlbumRpcApiController – to handle all requests that are explicitly routed via actions (/albums/rpc/AlbumArt) or are custom routed with explicit routes defined in the HttpConfiguration. I added the AlbumArt() method to this new AlbumRpcApiController class. For the image URL to work with the new AlbumRpcApiController, you need a custom route placed before the default route from Listing 1.RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); Now I can use either of the following URLs to access the image: Custom route: (/albums/rpc/{title}/image)http://localhost/aspnetWebApi/albums/PowerAge/image Action route: (/albums/rpc/action/{title})http://localhost/aspnetWebAPI/albums/rpc/albumart/PowerAge Sending Data to the Server To send data to the server and add a new album, you can use an HTTP POST operation. Since I’m using HTTP Verb-based routing in the original AlbumApiController, I can implement a method called PostAlbum()to accept a new album from the client. Listing 6 shows the Web API code to add a new album.public HttpResponseMessage PostAlbum(Album album) { if (!this.ModelState.IsValid) { // my custom error class var error = new ApiMessageError() { message = "Model is invalid" }; // add errors into our client error model for client foreach (var prop in ModelState.Values) { var modelError = prop.Errors.FirstOrDefault(); if (!string.IsNullOrEmpty(modelError.ErrorMessage)) error.errors.Add(modelError.ErrorMessage); else error.errors.Add(modelError.Exception.Message); } return Request.CreateResponse<ApiMessageError>(HttpStatusCode.Conflict, error); } // update song id which isn't provided foreach (var song in album.Songs) song.AlbumId = album.Id; // see if album exists already var matchedAlbum = AlbumData.Current .SingleOrDefault(alb => alb.Id == album.Id || alb.AlbumName == album.AlbumName); if (matchedAlbum == null) AlbumData.Current.Add(album); else matchedAlbum = album; // return a string to show that the value got here var resp = Request.CreateResponse(HttpStatusCode.OK, string.Empty); resp.Content = new StringContent(album.AlbumName + " " + album.Entered.ToString(), Encoding.UTF8, "text/plain"); return resp; } The PostAlbum() method receives an album parameter, which is automatically deserialized from the POST buffer the client sent. The data passed from the client can be either XML or JSON. Web API automatically figures out what format it needs to deserialize based on the content type and binds the content to the album object. Web API uses model binding to bind the request content to the parameter(s) of controller methods. Like MVC you can check the model by looking at ModelState.IsValid. If it’s not valid, you can run through the ModelState.Values collection and check each binding for errors. Here I collect the error messages into a string array that gets passed back to the client via the result ApiErrorMessage object. When a binding error occurs, you’ll want to return an HTTP error response and it’s best to do that with an HttpResponseMessage result. In Listing 6, I used a custom error class that holds a message and an array of detailed error messages for each binding error. I used this object as the content to return to the client along with my Conflict HTTP Status Code response. If binding succeeds, the example returns a string with the name and date entered to demonstrate that you captured the data. Normally, a method like this should return a Boolean or no response at all (HttpStatusCode.NoConent). The sample uses a simple static list to hold albums, so once you’ve added the album using the Post operation, you can hit the /albums/ URL to see that the new album was added. The client jQuery code to call the POST operation from the client with jQuery is shown in Listing 7. var id = new Date().getTime().toString(); var album = { "Id": id, "AlbumName": "Power Age", "Artist": "AC/DC", "YearReleased": 1977, "Entered": "2002-03-11T18:24:43.5580794-10:00", "AlbumImageUrl": http://ecx.images-amazon.com/images/…, "AmazonUrl": http://www.amazon.com/…, "Songs": [ { "SongName": "Rock 'n Roll Damnation", "SongLength": 3.12}, { "SongName": "Downpayment Blues", "SongLength": 4.22 }, { "SongName": "Riff Raff", "SongLength": 2.42 } ] } $.ajax( { url: "albums/", type: "POST", contentType: "application/json", data: JSON.stringify(album), processData: false, beforeSend: function (xhr) { // not required since JSON is default output xhr.setRequestHeader("Accept", "application/json"); }, success: function (result) { // reload list of albums page.loadAlbums(); }, error: function (xhr, status, p3, p4) { var err = "Error"; if (xhr.responseText && xhr.responseText[0] == "{") err = JSON.parse(xhr.responseText).message; alert(err); } }); The code in Listing 7 creates an album object in JavaScript to match the structure of the .NET Album class. This object is passed to the $.ajax() function to send to the server as POST. The data is turned into JSON and the content type set to application/json so that the server knows what to convert when deserializing in the Album instance. The jQuery code hooks up success and failure events. Success returns the result data, which is a string that’s echoed back with an alert box. If an error occurs, jQuery returns the XHR instance and status code. You can check the XHR to see if a JSON object is embedded and if it is, you can extract it by de-serializing it and accessing the .message property. REST standards suggest that updates to existing resources should use PUT operations. REST standards aside, I’m not a big fan of separating out inserts and updates so I tend to have a single method that handles both. But if you want to follow REST suggestions, you can create a PUT method that handles updates by forwarding the PUT operation to the POST method:public HttpResponseMessage PutAlbum(Album album) { return PostAlbum(album); } To make the corresponding $.ajax() call, all you have to change from Listing 7 is the type: from POST to PUT. Model Binding with UrlEncoded POST Variables In the example in Listing 7 I used JSON objects to post a serialized object to a server method that accepted an strongly typed object with the same structure, which is a common way to send data to the server. However, Web API supports a number of different ways that data can be received by server methods. For example, another common way is to use plain UrlEncoded POST values to send to the server. Web API supports Model Binding that works similar (but not the same) as MVC's model binding where POST variables are mapped to properties of object parameters of the target method. This is actually quite common for AJAX calls that want to avoid serialization and the potential requirement of a JSON parser on older browsers. For example, using jQUery you might use the $.post() method to send a new album to the server (albeit one without songs) using code like the following:$.post("albums/",{AlbumName: "Dirty Deeds", YearReleased: 1976 … },albumPostCallback); Although the code looks very similar to the client code we used before passing JSON, here the data passed is URL encoded values (AlbumName=Dirty+Deeds&YearReleased=1976 etc.). Web API then takes this POST data and maps each of the POST values to the properties of the Album object in the method's parameter. Although the client code is different the server can both handle the JSON object, or the UrlEncoded POST values. Dynamic Access to POST Data There are also a few options available to dynamically access POST data, if you know what type of data you're dealing with. If you have POST UrlEncoded values, you can dynamically using a FormsDataCollection:[HttpPost] public string PostAlbum(FormDataCollection form) { return string.Format("{0} - released {1}", form.Get("AlbumName"),form.Get("RearReleased")); } The FormDataCollection is a very simple object, that essentially provides the same functionality as Request.Form[] in ASP.NET. Request.Form[] still works if you're running hosted in an ASP.NET application. However as a general rule, while ASP.NET's functionality is always available when running Web API hosted inside of an ASP.NET application, using the built in classes specific to Web API makes it possible to run Web API applications in a self hosted environment outside of ASP.NET. If your client is sending JSON to your server, and you don't want to map the JSON to a strongly typed object because you only want to retrieve a few simple values, you can also accept a JObject parameter in your API methods:[HttpPost] public string PostAlbum(JObject jsonData) { dynamic json = jsonData; JObject jalbum = json.Album; JObject juser = json.User; string token = json.UserToken; var album = jalbum.ToObject<Album>(); var user = juser.ToObject<User>(); return String.Format("{0} {1} {2}", album.AlbumName, user.Name, token); } There quite a few options available to you to receive data with Web API, which gives you more choices for the right tool for the job. Unfortunately one shortcoming of Web API is that POST data is always mapped to a single parameter. This means you can't pass multiple POST parameters to methods that receive POST data. It's possible to accept multiple parameters, but only one can map to the POST content - the others have to come from the query string or route values. I have a couple of Blog POSTs that explain what works and what doesn't here: Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API Handling Delete Operations Finally, to round out the server API code of the album example we've been discussin, here’s the DELETE verb controller method that allows removal of an album by its title:public HttpResponseMessage DeleteAlbum(string title) { var matchedAlbum = AlbumData.Current.Where(alb => alb.AlbumName == title) .SingleOrDefault(); if (matchedAlbum == null) return new HttpResponseMessage(HttpStatusCode.NotFound); AlbumData.Current.Remove(matchedAlbum); return new HttpResponseMessage(HttpStatusCode.NoContent); } To call this action method using jQuery, you can use:$(".removeimage").live("click", function () { var $el = $(this).parent(".album"); var txt = $el.find("a").text(); $.ajax({ url: "albums/" + encodeURIComponent(txt), type: "Delete", success: function (result) { $el.fadeOut().remove(); }, error: jqError }); } Note the use of the DELETE verb in the $.ajax() call, which routes to DeleteAlbum on the server. DELETE is a non-content operation, so you supply a resource ID (the title) via route value or the querystring. Routing Conflicts In all requests with the exception of the AlbumArt image example shown so far, I used HTTP Verb routing that I set up in Listing 1. HTTP Verb Routing is a recommendation that is in line with typical REST access to HTTP resources. However, it takes quite a bit of effort to create REST-compliant API implementations based only on HTTP Verb routing only. You saw one example that didn’t really fit – the return of an image where I created a custom route albums/{title}/image that required creation of a second controller and a custom route to work. HTTP Verb routing to a controller does not mix with custom or action routing to the same controller because of the limited mapping of HTTP verbs imposed by HTTP Verb routing. To understand some of the problems with verb routing, let’s look at another example. Let’s say you create a GetSortableAlbums() method like this and add it to the original AlbumApiController accessed via HTTP Verb routing:[HttpGet] public IQueryable<Album> SortableAlbums() { var albums = AlbumData.Current; // generally should be done only on actual queryable results (EF etc.) // Done here because we're running with a static list but otherwise might be slow return albums.AsQueryable(); } If you compile this code and try to now access the /albums/ link, you get an error: Multiple Actions were found that match the request. HTTP Verb routing only allows access to one GET operation per parameter/route value match. If more than one method exists with the same parameter signature, it doesn’t work. As I mentioned earlier for the image display, the only solution to get this method to work is to throw it into another controller. Because I already set up the AlbumRpcApiController I can add the method there. First, I should rename the method to SortableAlbums() so I’m not using a Get prefix for the method. This also makes the action parameter look cleaner in the URL - it looks less like a method and more like a noun. I can then create a new route that handles direct-action mapping:RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); As I am explicitly adding a route segment – rpc – into the route template, I can now reference explicit methods in the Web API controller using URLs like this: http://localhost/AspNetWebApi/rpc/SortableAlbums Error Handling I’ve already done some minimal error handling in the examples. For example in Listing 6, I detected some known-error scenarios like model validation failing or a resource not being found and returning an appropriate HttpResponseMessage result. But what happens if your code just blows up or causes an exception? If you have a controller method, like this:[HttpGet] public void ThrowException() { throw new UnauthorizedAccessException("Unauthorized Access Sucka"); } You can call it with this: http://localhost/AspNetWebApi/albums/rpc/ThrowException The default exception handling displays a 500-status response with the serialized exception on the local computer only. When you connect from a remote computer, Web API throws back a 500 HTTP Error with no data returned (IIS then adds its HTML error page). The behavior is configurable in the GlobalConfiguration:GlobalConfiguration .Configuration .IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Never; If you want more control over your error responses sent from code, you can throw explicit error responses yourself using HttpResponseException. When you throw an HttpResponseException the response parameter is used to generate the output for the Controller action. [HttpGet] public void ThrowError() { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.BadRequest, new ApiMessageError("Your code stinks!")); throw new HttpResponseException(resp); } Throwing an HttpResponseException stops the processing of the controller method and immediately returns the response you passed to the exception. Unlike other Exceptions fired inside of WebAPI, HttpResponseException bypasses the Exception Filters installed and instead just outputs the response you provide. In this case, the serialized ApiMessageError result string is returned in the default serialization format – XML or JSON. You can pass any content to HttpResponseMessage, which includes creating your own exception objects and consistently returning error messages to the client. Here’s a small helper method on the controller that you might use to send exception info back to the client consistently:private void ThrowSafeException(string message, HttpStatusCode statusCode = HttpStatusCode.BadRequest) { var errResponse = Request.CreateResponse<ApiMessageError>(statusCode, new ApiMessageError() { message = message }); throw new HttpResponseException(errResponse); } You can then use it to output any captured errors from code:[HttpGet] public void ThrowErrorSafe() { try { List<string> list = null; list.Add("Rick"); } catch (Exception ex) { ThrowSafeException(ex.Message); } } Exception Filters Another more global solution is to create an Exception Filter. Filters in Web API provide the ability to pre- and post-process controller method operations. An exception filter looks at all exceptions fired and then optionally creates an HttpResponseMessage result. Listing 8 shows an example of a basic Exception filter implementation.public class UnhandledExceptionFilter : ExceptionFilterAttribute { public override void OnException(HttpActionExecutedContext context) { HttpStatusCode status = HttpStatusCode.InternalServerError; var exType = context.Exception.GetType(); if (exType == typeof(UnauthorizedAccessException)) status = HttpStatusCode.Unauthorized; else if (exType == typeof(ArgumentException)) status = HttpStatusCode.NotFound; var apiError = new ApiMessageError() { message = context.Exception.Message }; // create a new response and attach our ApiError object // which now gets returned on ANY exception result var errorResponse = context.Request.CreateResponse<ApiMessageError>(status, apiError); context.Response = errorResponse; base.OnException(context); } } Exception Filter Attributes can be assigned to an ApiController class like this:[UnhandledExceptionFilter] public class AlbumRpcApiController : ApiController or you can globally assign it to all controllers by adding it to the HTTP Configuration's Filters collection:GlobalConfiguration.Configuration.Filters.Add(new UnhandledExceptionFilter()); The latter is a great way to get global error trapping so that all errors (short of hard IIS errors and explicit HttpResponseException errors) return a valid error response that includes error information in the form of a known-error object. Using a filter like this allows you to throw an exception as you normally would and have your filter create a response in the appropriate output format that the client expects. For example, an AJAX application can on failure expect to see a JSON error result that corresponds to the real error that occurred rather than a 500 error along with HTML error page that IIS throws up. You can even create some custom exceptions so you can differentiate your own exceptions from unhandled system exceptions - you often don't want to display error information from 'unknown' exceptions as they may contain sensitive system information or info that's not generally useful to users of your application/site. This is just one example of how ASP.NET Web API is configurable and extensible. Exception filters are just one example of how you can plug-in into the Web API request flow to modify output. Many more hooks exist and I’ll take a closer look at extensibility in Part 2 of this article in the future. Summary Web API is a big improvement over previous Microsoft REST and AJAX toolkits. The key features to its usefulness are its ease of use with simple controller based logic, familiar MVC-style routing, low configuration impact, extensibility at all levels and tight attention to exposing and making HTTP semantics easily discoverable and easy to use. Although none of the concepts used in Web API are new or radical, Web API combines the best of previous platforms into a single framework that’s highly functional, easy to work with, and extensible to boot. I think that Microsoft has hit a home run with Web API. Related Resources Where does ASP.NET Web API fit? Sample Source Code on GitHub Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API Creating a JSONP Formatter for ASP.NET Web API Removing the XML Formatter from ASP.NET Web API Applications© Rick Strahl, West Wind Technologies, 2005-2012Posted in Web Api Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article

How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article

C#/.NET Little Wonders: The Useful But Overlooked Sets

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. Today we will be looking at two set implementations in the System.Collections.Generic namespace: HashSet<T> and SortedSet<T>. Even though most people think of sets as mathematical constructs, they are actually very useful classes that can be used to help make your application more performant if used appropriately. A Background From Math In mathematical terms, a set is an unordered collection of unique items. In other words, the set {2,3,5} is identical to the set {3,5,2}. In addition, the set {2, 2, 4, 1} would be invalid because it would have a duplicate item (2). In addition, you can perform set arithmetic on sets such as: Intersections: The intersection of two sets is the collection of elements common to both. Example: The intersection of {1,2,5} and {2,4,9} is the set {2}. Unions: The union of two sets is the collection of unique items present in either or both set. Example: The union of {1,2,5} and {2,4,9} is {1,2,4,5,9}. Differences: The difference of two sets is the removal of all items from the first set that are common between the sets. Example: The difference of {1,2,5} and {2,4,9} is {1,5}. Supersets: One set is a superset of a second set if it contains all elements that are in the second set. Example: The set {1,2,5} is a superset of {1,5}. Subsets: One set is a subset of a second set if all the elements of that set are contained in the first set. Example: The set {1,5} is a subset of {1,2,5}. If We’re Not Doing Math, Why Do We Care? Now, you may be thinking: why bother with the set classes in C# if you have no need for mathematical set manipulation? The answer is simple: they are extremely efficient ways to determine ownership in a collection. For example, let’s say you are designing an order system that tracks the price of a particular equity, and once it reaches a certain point will trigger an order. Now, since there’s tens of thousands of equities on the markets, you don’t want to track market data for every ticker as that would be a waste of time and processing power for symbols you don’t have orders for. Thus, we just want to subscribe to the stock symbol for an equity order only if it is a symbol we are not already subscribed to. Every time a new order comes in, we will check the list of subscriptions to see if the new order’s stock symbol is in that list. If it is, great, we already have that market data feed! If not, then and only then should we subscribe to the feed for that symbol. So far so good, we have a collection of symbols and we want to see if a symbol is present in that collection and if not, add it. This really is the essence of set processing, but for the sake of comparison, let’s say you do a list instead: 1: // class that handles are order processing service 2: public sealed class OrderProcessor 3: { 4: // contains list of all symbols we are currently subscribed to 5: private readonly List<string> _subscriptions = new List<string>(); 6: 7: ... 8: } Now whenever you are adding a new order, it would look something like: 1: public PlaceOrderResponse PlaceOrder(Order newOrder) 2: { 3: // do some validation, of course... 4: 5: // check to see if already subscribed, if not add a subscription 6: if (!_subscriptions.Contains(newOrder.Symbol)) 7: { 8: // add the symbol to the list 9: _subscriptions.Add(newOrder.Symbol); 10: 11: // do whatever magic is needed to start a subscription for the symbol 12: } 13: 14: // place the order logic! 15: } What’s wrong with this? In short: performance! Finding an item inside a List<T> is a linear - O(n) – operation, which is not a very performant way to find if an item exists in a collection. (I used to teach algorithms and data structures in my spare time at a local university, and when you began talking about big-O notation you could immediately begin to see eyes glossing over as if it was pure, useless theory that would not apply in the real world, but I did and still do believe it is something worth understanding well to make the best choices in computer science). Let’s think about this: a linear operation means that as the number of items increases, the time that it takes to perform the operation tends to increase in a linear fashion. Put crudely, this means if you double the collection size, you might expect the operation to take something like the order of twice as long. Linear operations tend to be bad for performance because they mean that to perform some operation on a collection, you must potentially “visit” every item in the collection. Consider finding an item in a List<T>: if you want to see if the list has an item, you must potentially check every item in the list before you find it or determine it’s not found. Now, we could of course sort our list and then perform a binary search on it, but sorting is typically a linear-logarithmic complexity – O(n * log n) - and could involve temporary storage. So performing a sort after each add would probably add more time. As an alternative, we could use a SortedList<TKey, TValue> which sorts the list on every Add(), but this has a similar level of complexity to move the items and also requires a key and value, and in our case the key is the value. This is why sets tend to be the best choice for this type of processing: they don’t rely on separate keys and values for ordering – so they save space – and they typically don’t care about ordering – so they tend to be extremely performant. The .NET BCL (Base Class Library) has had the HashSet<T> since .NET 3.5, but at that time it did not implement the ISet<T> interface. As of .NET 4.0, HashSet<T> implements ISet<T> and a new set, the SortedSet<T> was added that gives you a set with ordering. HashSet<T> – For Unordered Storage of Sets When used right, HashSet<T> is a beautiful collection, you can think of it as a simplified Dictionary<T,T>. That is, a Dictionary where the TKey and TValue refer to the same object. This is really an oversimplification, but logically it makes sense. I’ve actually seen people code a Dictionary<T,T> where they store the same thing in the key and the value, and that’s just inefficient because of the extra storage to hold both the key and the value. As it’s name implies, the HashSet<T> uses a hashing algorithm to find the items in the set, which means it does take up some additional space, but it has lightning fast lookups! Compare the times below between HashSet<T> and List<T>: Operation HashSet<T> List<T> Add() O(1) O(1) at end O(n) in middle Remove() O(1) O(n) Contains() O(1) O(n) Now, these times are amortized and represent the typical case. In the very worst case, the operations could be linear if they involve a resizing of the collection – but this is true for both the List and HashSet so that’s a less of an issue when comparing the two. The key thing to note is that in the general case, HashSet is constant time for adds, removes, and contains! This means that no matter how large the collection is, it takes roughly the exact same amount of time to find an item or determine if it’s not in the collection. Compare this to the List where almost any add or remove must rearrange potentially all the elements! And to find an item in the list (if unsorted) you must search every item in the List. So as you can see, if you want to create an unordered collection and have very fast lookup and manipulation, the HashSet is a great collection. And since HashSet<T> implements ICollection<T> and IEnumerable<T>, it supports nearly all the same basic operations as the List<T> and can use the System.Linq extension methods as well. All we have to do to switch from a List<T> to a HashSet<T> is change our declaration. Since List and HashSet support many of the same members, chances are we won’t need to change much else. 1: public sealed class OrderProcessor 2: { 3: private readonly HashSet<string> _subscriptions = new HashSet<string>(); 4: 5: // ... 6: 7: public PlaceOrderResponse PlaceOrder(Order newOrder) 8: { 9: // do some validation, of course... 10: 11: // check to see if already subscribed, if not add a subscription 12: if (!_subscriptions.Contains(newOrder.Symbol)) 13: { 14: // add the symbol to the list 15: _subscriptions.Add(newOrder.Symbol); 16: 17: // do whatever magic is needed to start a subscription for the symbol 18: } 19: 20: // place the order logic! 21: } 22: 23: // ... 24: } 25: Notice, we didn’t change any code other than the declaration for _subscriptions to be a HashSet<T>. Thus, we can pick up the performance improvements in this case with minimal code changes. SortedSet<T> – Ordered Storage of Sets Just like HashSet<T> is logically similar to Dictionary<T,T>, the SortedSet<T> is logically similar to the SortedDictionary<T,T>. The SortedSet can be used when you want to do set operations on a collection, but you want to maintain that collection in sorted order. Now, this is not necessarily mathematically relevant, but if your collection needs do include order, this is the set to use. So the SortedSet seems to be implemented as a binary tree (possibly a red-black tree) internally. Since binary trees are dynamic structures and non-contiguous (unlike List and SortedList) this means that inserts and deletes do not involve rearranging elements, or changing the linking of the nodes. There is some overhead in keeping the nodes in order, but it is much smaller than a contiguous storage collection like a List<T>. Let’s compare the three: Operation HashSet<T> SortedSet<T> List<T> Add() O(1) O(log n) O(1) at end O(n) in middle Remove() O(1) O(log n) O(n) Contains() O(1) O(log n) O(n) The MSDN documentation seems to indicate that operations on SortedSet are O(1), but this seems to be inconsistent with its implementation and seems to be a documentation error. There’s actually a separate MSDN document (here) on SortedSet that indicates that it is, in fact, logarithmic in complexity. Let’s put it in layman’s terms: logarithmic means you can double the collection size and typically you only add a single extra “visit” to an item in the collection. Take that in contrast to List<T>’s linear operation where if you double the size of the collection you double the “visits” to items in the collection. This is very good performance! It’s still not as performant as HashSet<T> where it always just visits one item (amortized), but for the addition of sorting this is a good thing. Consider the following table, now this is just illustrative data of the relative complexities, but it’s enough to get the point: Collection Size O(1) Visits O(log n) Visits O(n) Visits 1 1 1 1 10 1 4 10 100 1 7 100 1000 1 10 1000 Notice that the logarithmic – O(log n) – visit count goes up very slowly compare to the linear – O(n) – visit count. This is because since the list is sorted, it can do one check in the middle of the list, determine which half of the collection the data is in, and discard the other half (binary search). So, if you need your set to be sorted, you can use the SortedSet<T> just like the HashSet<T> and gain sorting for a small performance hit, but it’s still faster than a List<T>. Unique Set Operations Now, if you do want to perform more set-like operations, both implementations of ISet<T> support the following, which play back towards the mathematical set operations described before: IntersectWith() – Performs the set intersection of two sets. Modifies the current set so that it only contains elements also in the second set. UnionWith() – Performs a set union of two sets. Modifies the current set so it contains all elements present both in the current set and the second set. ExceptWith() – Performs a set difference of two sets. Modifies the current set so that it removes all elements present in the second set. IsSupersetOf() – Checks if the current set is a superset of the second set. IsSubsetOf() – Checks if the current set is a subset of the second set. For more information on the set operations themselves, see the MSDN description of ISet<T> (here). What Sets Don’t Do Don’t get me wrong, sets are not silver bullets. You don’t really want to use a set when you want separate key to value lookups, that’s what the IDictionary implementations are best for. Also sets don’t store temporal add-order. That is, if you are adding items to the end of a list all the time, your list is ordered in terms of when items were added to it. This is something the sets don’t do naturally (though you could use a SortedSet with an IComparer with a DateTime but that’s overkill) but List<T> can. Also, List<T> allows indexing which is a blazingly fast way to iterate through items in the collection. Iterating over all the items in a List<T> is generally much, much faster than iterating over a set. Summary Sets are an excellent tool for maintaining a lookup table where the item is both the key and the value. In addition, if you have need for the mathematical set operations, the C# sets support those as well. The HashSet<T> is the set of choice if you want the fastest possible lookups but don’t care about order. In contrast the SortedSet<T> will give you a sorted collection at a slight reduction in performance. Technorati Tags: C#,.Net,Little Wonders,BlackRabbitCoder,ISet,HashSet,SortedSet

Search Results

Search found 7107 results on 285 pages for 'processing efficiency'.

Page 98/285 | < Previous Page | 94 95 96 97 98 99 100 101 102 103 104 105 | Next Page >

- by Brantley Blanchard

- by DeaconDesperado

- by Lionel Dubreuil

- by jason malitz

- by Morpork

- by Alan Smith

- by jcortez

- by S.E.R.

- by Jonas

- by Thomas Canter

- by TECH4JESUS

- by Peter

- by Applications User Experience

- by Brian

- by Teknikk

- by asantaga

- by Kellsey Ruppel

- by Your DisplayName here!

- by Marko Apfel

- by aahluwalia

- by Tori Wieldt

- by Michael Snow

- by Rick Strahl

- by rchrd

- by James Michael Hare

< Previous Page | 94 95 96 97 98 99 100 101 102 103 104 105 | Next Page >