Search Results

Search found 8429 results on 338 pages for 'batch processing'.

Page 130/338 | < Previous Page | 126 127 128 129 130 131 132 133 134 135 136 137  | Next Page >

  • Sql Table Refactoring Challenge

    Ive been working a bit on cleaning up a large table to make it more efficient.  I pretty much know what I need to do at this point, but I figured Id offer up a challenge for my readers, to see if they can catch everything I have as well as to see if Ive missed anything.  So to that end, I give you my table: CREATE TABLE [dbo].[lq_ActivityLog]( [ID] [bigint] IDENTITY(1,1) NOT NULL, [PlacementID] [int] NOT NULL, [CreativeID] [int] NOT NULL, [PublisherID] [int] NOT NULL, [CountryCode] [nvarchar](10) NOT NULL, [RequestedZoneID] [int] NOT NULL, [AboveFold] [int] NOT NULL, [Period] [datetime] NOT NULL, [Clicks] [int] NOT NULL, [Impressions] [int] NOT NULL, CONSTRAINT [PK_lq_ActivityLog2] PRIMARY KEY CLUSTERED ( [Period] ASC, [PlacementID] ASC, [CreativeID] ASC, [PublisherID] ASC, [RequestedZoneID] ASC, [AboveFold] ASC, [CountryCode] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY] And now some assumptions and additional information: The table has 200,000,000 rows currently PlacementID ranges from 1 to 5000 and should support at least 50,000 CreativeID ranges from 1 to 5000 and should support at least 50,000 PublisherID ranges from 1 to 500 and should support at least 50,000 CountryCode is a 2-character ISO standard (e.g. US) and there is a country table with an integer ID already.  There are < 300 rows. RequestedZoneID ranges from 1 to 100 and should support at least 50,000 AboveFold has values of 1, 0, or 1 only. Period is a date (no time). Clicks range from 0 to 5000. Impressions range from 0 to 5000000. The table is currently write-mostly.  Its primary purpose is to log advertising activity as quickly as possible.  Nothing in the rest of the system reads from it except for batch jobs that pull the data into summary tables. Heres the current information on the database tables size: Design Goals This table has been in use for about 5 years and has performed very well during that time.  The only complaints we have are that it is quite large and also there are occasionally timeouts for queries that reference it, particularly when batch jobs are pulling data from it.  Any changes should be made with an eye toward keeping write performance optimal  while trying to reduce space and improve read performance / eliminate timeouts during read operations. Refactor There are, I suggest to you, some glaringly obvious optimizations that can be made to this table.  And Im sure there are some ninja tweaks known to SQL gurus that would be a big help as well.  Ill post my own suggested changes in a follow-up post for now feel free to comment with your suggestions. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Install Script Question

    - by Michael Dobbins
    Sorry if this has been discussed, wasn't sure what to use to search for the particular question I had. I'm installing sickrage and couchpotato and headphones, and will be doing multiple installs I made a small batch script just to install everything but what I don't know how to do is automatically make them auto start on boot. Since they all require a file to be created I do not know how to make this file auto create and populate what needs to be put into it.

    Read the article

  • How to handle status columns in designing tables

    - by altsyset
    How to handle multiple statuses for a table entry, for example an item table may have an active, inactive, fast moving, and/or batch statuses. And I wanted to handle them in single column with VARCHAR type. Also I might set each of those attributes as a boolean with different columns. But I am not sure what consequences this might lead to. So if you have experienced such situations which one would be the best way to handle it?

    Read the article

  • How can I add a character and enemies to a game that uses Parallax Scrolling? [on hold]

    - by Homer_Simpson
    I use the following code to create Parallax Scrolling: http://www.david-gouveia.com/portfolio/2d-camera-with-parallax-scrolling-in-xna/ Parallax Scrolling is working but I don't know how to add the player and the enemies. I tried to add a player class to the existing code, but if the player moves, then the camera isn't pointing at the player. The player leaves the camera viewport after a few seconds. I use the following code(as described in the tutorial), but it's not working: // Updates my camera to lock on the character _camera.LookAt(player.Playerposition); What can I do so that the player is always the center of the camera? How should I add the character and the enemies to the game? Should I create a layer for the character and the enemies? For example: new Layer(_camera) { Parallax = new Vector2(0.9f, 1.0f) } At the moment, I don't use a layer for the player and I don't have implemented the enemies because I don't know how to do that. My player class: public class Player { Texture2D Playertex; public Vector2 Playerposition = new Vector2(400, 240); private Game1 game1; public Player(Game1 game) { game1 = game; } public void Load(ContentManager content) { Playertex = content.Load<Texture2D>("8bitmario"); TouchPanel.EnabledGestures = GestureType.HorizontalDrag; } public void Update(GameTime gameTime) { while (TouchPanel.IsGestureAvailable) { GestureSample gs = TouchPanel.ReadGesture(); switch (gs.GestureType) { case GestureType.HorizontalDrag: Playerposition.X += 3f; break; } } } public void Render(SpriteBatch batch) { batch.Draw(Playertex, new Vector2(Playerposition.X - Playertex.Width / 2, Playerposition.Y - Playertex.Height / 2), Color.White); } } In Game1, I update the player and camera class: protected override void Update(GameTime gameTime) { // Updates my character's position player.Update(gameTime); // Updates my camera to lock on the character _camera.LookAt(player.Playerposition); base.Update(gameTime); } protected override void Draw(GameTime gameTime) { GraphicsDevice.Clear(Color.CornflowerBlue); foreach (Layer layer in _layers) layer.Draw(spriteBatch); spriteBatch.Begin(SpriteSortMode.Deferred, null, null, null, null, null, _camera.GetViewMatrix(new Vector2(0.0f, 0.0f))); player.Render(spriteBatch); spriteBatch.End(); base.Draw(gameTime); }

    Read the article

  • Monitor Exchange Email Address and run scripts

    - by WernerCD
    Okay... Not sure how "out there" this thought is... Right now to send a pager message (aka text message), a user logs into our AS400... logs into the program... enters user name and message and hit's F10 to send. With a little looking, it seems that you can run remote commands to the AS400 via FTP. So I'm working on building a script (batch or otherwise) that, given two parameters (user, message), will FTP into the AS400 and run a remote command: c:\>ftp server user: admin password: ***** ftp> quote rcmd SNDPGRMSG TOPGR(JDOE) MSG('This is a Test') ftp> quit So... what I want to do is setup an email account on our Exchange server Monitor the account for incoming mail upon receipt of incoming mail, parse it... say for example subject is defined as "Recipient" and email text is defined as "Pager message" run a batch that uses the above mentioned TOPGR and MSG as parameters... via FTP to the AS400 mark email as "read" The main thing I'm not sure about is monitoring an exchange account and running a script on incoming emails. I'm sure what I want to do is possible... but where would I start? EDIT: Clarification The main reasons for using this four part system are logging (messages sent via this are logged and reported by the AS400 program) and the existing scheduler for redirecting pages (For example, the weekly on-call person = TOPGR(oncall) gets updated by the AS400 program). I'm also trying to remove duplicate work. If I can get this setup working, I can redirect pages from OTHER systems into this one. I then don't have to update 2, soon to be 3, systems with current phone numbers, carriers, on-call schedules, etc. System #2 and #3 can just "email" [email protected].

    Read the article

  • Backing up SQL NetApp Snapshots using TSM

    - by WerkkreW
    In our environment we have a 3 node SQL 2005 Cluster which is on NetApp storage. We are currently using SMSQL (NetApp SnapManager for SQL) to take Snapshot backups of the data. This works great, but due to some audit requirements we are also forced to maintain some copies on tape. We have used NDMP in other places across the enterprise but we do not want to use it in this specific instance. Basically what I need to do is, get the most recent snapshot copy of the databases on tape, via Tivoli Storage Manager (TSM). What I have done is, obtained a basic Windows Server 2003 VM with SnapDrive installed, which is SAN attached and zoned to the NetApp, and I have written a batch file to do the following: Mount the latest __RECENT snapshot lun to the host, using a specific drive letter Perform a TSM based incremental backup Dis-mount the LUN This seems to work fine, except sometimes the LUN's do not mount due to some sort of timeout. Also, due to my limited knowledge of windows batch scripting, I have no way to monitor the success or failure of these backups since I do not know how to send a valid return code back to the TSM scheduling service. Is there a more efficient/elegant way to accomplish this without NDMP?

    Read the article

  • How to make gpg2 to flush the stream?

    - by Vi
    I want to get some slowly flowing data saved in encrypted form at the device which can be turned off abruptly. But gpg2 seems to not to flush it's output frequently and I get broken files when I try to read such truncated file. vi@vi-notebook:~$ cat asdkfgmafl asdkfgmafl ggggg ggggg 2342 2342 cat behaves normally. I see the output right after input. vi@vi-notebook:~$ gpg2 -er _Vi --batch ?pE??x...(more binary data here)....???-??.... asdfsadf 22223 sdfsdfasf Still no data... Still no output... ^C gpg: signal Interrupt caught ... exiting vi@vi-notebook:~$ gpg2 -er _Vi --batch /tmp/qqq skdmfasldf gkvmdfwwerwer zfzdfdsfl ^\ gpg: signal Quit caught ... exiting Quit vi@vi-notebook:~$ gpg2 < /tmp/qqq You need a passphrase to unlock the secret key for user: "Vitaly Shukela (_Vi) " 2048-bit ELG key, ID 78F446CA, created 2008-01-06 (main key ID 1735A052) gpg: [don't know]: 1st length byte missing vi@vi-notebook:~$ # Where is my "skdmfasldf" How to make gpg2 to handle such case? I want it to put enough output to reconstruct each incoming chunk of input. (Also fsyncing after each output can be benefitial as an additional option). Should I use other tool (I need pubkey encryption).

    Read the article

  • regsvr32 fails on Windows Server 2003 - LoadLibaray ("my.dll") failed

    - by John Mo
    The DLL in question is a VB6 web class. When deploying, the old one needs to be unregistered and the new one registered. This has worked for many years but has failed now with the following error in a message box: LoadLibrary("c:...\my.dll") failed - The filename, directory name, or volume label syntax is incorrect. I don't think there's a problem with the DLL being (un)registered because it's failing to unregister the existing DLL that was at one time successfully registered. I don't think there's a filename or path problem. Regsvr32 is executed using a batch file where the full path to the DLL is specified. The path has not changed. The batch file has not changed. The DLL has not changed (at least not the old one I need to unregister). It's a programming problem to me because I need to release an updated DLL, but I'm thinking it's maybe an admin problem best suited to the talents of the serverfault audience. I'm hoping somebody knows about a Windows Update or something that might hose regsvr32. Thank you.

    Read the article

  • Backing up SQL NetApp Snapshots using TSM

    - by WerkkreW
    In our environment we have a 3 node SQL 2005 Cluster which is on NetApp storage. We are currently using SMSQL (NetApp SnapManager for SQL) to take Snapshot backups of the data. This works great, but due to some audit requirements we are also forced to maintain some copies on tape. We have used NDMP in other places across the enterprise but we do not want to use it in this specific instance. Basically what I need to do is, get the most recent snapshot copy of the databases on tape, via Tivoli Storage Manager (TSM). What I have done is, obtained a basic Windows Server 2003 VM with SnapDrive installed, which is SAN attached and zoned to the NetApp, and I have written a batch file to do the following: Mount the latest __RECENT snapshot lun to the host, using a specific drive letter Perform a TSM based incremental backup Dis-mount the LUN This seems to work fine, except sometimes the LUN's do not mount due to some sort of timeout. Also, due to my limited knowledge of windows batch scripting, I have no way to monitor the success or failure of these backups since I do not know how to send a valid return code back to the TSM scheduling service. Is there a more efficient/elegant way to accomplish this without NDMP?

    Read the article

  • esxi 5 monitoring

    - by user134880
    I'm new to esxi/vmware world. Have few question about monitoring esxi host. Now I'm using trial versian esxi 5.1 I can see performance charts in vShpere client. Cool. But I want to export this raw data (raw - I mean cvs,txt or some format which I can parse later) to other server to be able to parse this data later and create custom charts. (please do not advise to try vCenter, I need custom charts etc.) I could run esxtop in batch mode and use this data... But... How does vShpere client performance charts work? Where client takes data for charts? So if I will use esxtop batch mode it will add extra load to server. Is it possible to use same source as vSphere client use for charts? There is /var/lib/vmware/hostd/stats/hostAgentStats-20.stats file. Its looks like it is binary format. As I understood it is exactly data which I need?!?! Any ideas how to parse it? Thanks! PS: maybe some one know where to find, if there is one, info about processes running on esxi host?

    Read the article

  • Postgres pgpass windows - not working

    - by Scott
    DB: Postgres 9.0 Client: Windows 7 Server Windows 2008, 64bit I'm trying to connect remotely to a postgres instance for purposes of performing a pg_dump to my local machine. Everything works from my client machine, except that I need to provide a password at the password prompt, and I'd ultimately like to batch this with a script. I've followed the instructions here: http://www.postgresql.org/docs/current/static/libpq-pgpass.html but it's not working. To recap, I've created a file on the client (and tried the server as well): C:/Users/postgres/AppData/postgresql/pgpass.conf, where postgresql is the db user. The file has one line with the following data: *:5432:*postgres:[mypassword] (also tried explicit ip/dbname values, all asterisks, and every combination in between. (I've also tried replacing each '*' with [localhost|myip] and [mydatabasename] respectively. From my client machine, I connect using: pg_dump -h [myip] -U postgres -w [mydbname] [mylocaldumpfile] I'm presuming that I need to provide the '-w' switch in order to ignore password prompt, at which point it should look in the AppData directory on the server. It just comes back with "connection to database failed: fe_sendauth: no password supplied. Any insights are appreciated. As a hack workaround, if there was a way I could tell the windows batch file on my client machine to inject the password at the postgres prompt, that would work as well. Thanks.

    Read the article

  • issue in installing postgresql 9.3.4 on Windows server 2003 x64

    - by randydom
    Hello i really did all what i know to install the PostgreSQL 9.3.4 on my windows 2003 server x64, but i'm always stopped with this error : please see the error : http://oi57.tinypic.com/s4tb8i.jpg I really don't know what to do , if i click OK then when i go to the windows services list i don't find the PostgreSQL service so i can't Start the service . can any one please help me to install it correctly . PS: i've followed all steps in the : wiki.postgresql.org/wiki/Troubleshooting_Installation many thanks . here's the installer log * where i get " Failed to initialise the database cluster with initdb " : Called IsVistaOrNewer()... 'winmgmts' object initialized... Version:5.2 MajorVersion:5 Ensuring we can write to the data directory (using cacls): Executing batch file 'rad22ADE.bat'... processed dir: C:\Program Files\PostgreSQL\9.2\data Executing batch file 'rad22ADE.bat'... The files belonging to this database system will be owned by user "Administrator". This user must also own the server process. The database cluster will be initialized with locale "English_United States.1252". The default text search configuration will be set to "english". fixing permissions on existing directory C:/Program Files/PostgreSQL/9.2/data ... initdb: could not change permissions of directory "C:/Program Files/PostgreSQL/9.2/data": Permission denied Called Die(Failed to initialise the database cluster with initdb)... Failed to initialise the database cluster with initdb Script stderr: Program ended with an error exit code Error running cscript //NoLogo "C:\Program Files\PostgreSQL\9.2/installer/server/initcluster.vbs" "NT AUTHORITY\NetworkService" "postgres" "****" "C:\Program Files\PostgreSQL\9.2" "C:\Program Files\PostgreSQL\9.2\data" 5432 "DEFAULT" 0 : Program ended with an error exit code Problem running post-install step. Installation may not complete correctly The database cluster initialisation failed. Creating Uninstaller Creating uninstaller 25% Creating uninstaller 50% Creating uninstaller 75% Creating uninstaller 100% Installation completed Log finished 05/02/2014 at 04:04:04

    Read the article

  • How to make gpg2 flush the stream?

    - by Vi
    I want to get some slowly flowing data saved in encrypted form at the device which can be turned off abruptly. But gpg2 seems to not to flush it's output frequently and I get broken files when I try to read such truncated file. vi@vi-notebook:~$ cat asdkfgmafl asdkfgmafl ggggg ggggg 2342 2342 cat behaves normally. I see the output right after input. vi@vi-notebook:~$ gpg2 -er _Vi --batch ?pE??x...(more binary data here)....???-??.... asdfsadf 22223 sdfsdfasf Still no data... Still no output... ^C gpg: signal Interrupt caught ... exiting vi@vi-notebook:~$ gpg2 -er _Vi --batch /tmp/qqq skdmfasldf gkvmdfwwerwer zfzdfdsfl ^\ gpg: signal Quit caught ... exiting Quit vi@vi-notebook:~$ gpg2 " 2048-bit ELG key, ID 78F446CA, created 2008-01-06 (main key ID 1735A052) gpg: [don't know]: 1st length byte missing vi@vi-notebook:~$ # Where is my "skdmfasldf" How to make gpg2 to handle such case? I want it to put enough output to reconstruct each incoming chunk of input. (Also fsyncing after each output can be benefitial as an additional option). Should I use other tool (I need pubkey encryption).

    Read the article

  • Postgres pgpass windows - not working

    - by Scott
    DB: Postgres 9.0 Client: Windows 7 Server Windows 2008, 64bit I'm trying to connect remotely to a postgres instance for purposes of performing a pg_dump to my local machine. Everything works from my client machine, except that I need to provide a password at the password prompt, and I'd ultimately like to batch this with a script. I've followed the instructions here: http://www.postgresql.org/docs/current/static/libpq-pgpass.html but it's not working. To recap, I've created a file on the client (and tried the server as well): C:/Users/postgres/AppData/postgresql/pgpass.conf, where postgresql is the db user. The file has one line with the following data: *:5432:*postgres:[mypassword] (also tried explicit ip/dbname values, all asterisks, and every combination in between. (I've also tried replacing each '*' with [localhost|myip] and [mydatabasename] respectively. From my client machine, I connect using: pg_dump -h [myip] -U postgres -w [mydbname] [mylocaldumpfile] I'm presuming that I need to provide the '-w' switch in order to ignore password prompt, at which point it should look in the AppData directory on the server. It just comes back with "connection to database failed: fe_sendauth: no password supplied. Any insights are appreciated. As a hack workaround, if there was a way I could tell the windows batch file on my client machine to inject the password at the postgres prompt, that would work as well. Thanks.

    Read the article

  • Convert MPG w/ AC3 audio to something else - on a Mac

    - by anonymous coward
    I'm helping with a small volunteer media team, and they have several .mpg videos that don't appear to have sound when played in QuickTime, iTunes, Real Player, etc, on the local Mac machine. I was able to hear audio after transferring one of the movies to a Windows machine that had VLC media player on it. Through VLC I was able to discover that the audio stream is a52 / AC3 format. We use Autodesk Cleaner in our normal workflow of converting the format of our videos to FLV, but for some reason it's unable to convert this particular batch of videos (well, the video converts fine, but with no audio). Obviously, it seems that there's a codec issue here, but I'm not sure how to correct it. (I'm not extremely familiar with Macs, and/or Autodesk Cleaner). I've seen the Perian codec pack, but I'm not sure that having the codecs on the system will enable Cleaner to convert these videos (particularly the audio stream, since the video converts fine). Is there something obvious that I'm overlooking, or will we have to use something else for this particular batch of videos? If so, what?

    Read the article

  • Powershell Script Scheduled Task Stopped Running (Could not Start)

    - by Hatsune Yuki
    I'm running a scheduled task (for Powershell Script) on Windows 2003 Server. I believe the script works fine. The task is scheduled to run every 10 minutes from 7:00am to 11:50pm everyday. However, it never gets to run more for than a day. It always stops some time in the afternoon (between 2pm and 6pm). I'm not sure exactly what happened but I always get the error The attempt to log on to the account associated with the task failed, therefore, the task did not run. The specific error is: 0x80070569: Logon failure: the user has not been granted the requested logon type at this computer. Verify that the task's Run-as name and password are valid and try again. It seems like most people with this error are saying that they need to make user "logon as a batch job". However, this option is greyed-out for me. I search for other places where users have similar problems but the solutions are not written in detail (some of them have something to do with GPO). I've only used the basic features of Windows Server and I have no clue how to get to the place they are referring to. Can someone please confirm whether "logon as a batch job" is indeed a solution and provide a detailed walkthrough on how to solve my problem? Thanks. p.s. someone suggested the website http://technet.microsoft.com/en-us/library/cc755659(v=ws.10) I tried to followed the method for web server with domain. However, got stuck on the 6th step where it mentions Group Policy Object. I don't know where it is.

    Read the article

  • Windows preventing running of Telnet client

    - by palswim
    At first, I had issues because Windows 7 doesn't install the Telnet client by default (also, SuperUser has a thread). So, after installing it (and restarting, like Windows asked, though completely unnecessary), I opened a command prompt, and went to run my new Telnet program. I enter telnet, and receive: C:\Users\[USER]>telnet 'telnet' is not recognized as an internal or external command, operable program or batch file. "That's odd," I think to myself. So, in Windows explorer, I navigate to \Windows\System32 and see telnet.exe sitting in that folder. If I double-click on the executable file, the Telnet command prompt opens for me without a problem. So, I return to my Windows Command Prompt, and enter: C:\Users\[USER]>\Windows\System32\telnet.exe '\Windows\System32\telnet.exe' is not recognized as an internal or external command, operable program or batch file. And then (grep comes from cygwin): C:\Users\ryan\Desktop>dir \Windows\System32 | grep telnet Nothing. I've disabled UAC and have no idea why my Command Prompt is lying to me. Anyone experience something similar? To recap: In Windows 7, I have installed Telnet and can see it in my System32 folder, but cannot run it via a Command Prompt.

    Read the article

  • Using GPO to collect data about VMware view activity

    - by MoSiAc
    Our security group wants us to begin logging data for external access to our view enviroment. At first we thought that view security would be logging all source ip's that are external in nature so if for some reason there is an intrusion we would have record of it there. Of course our firewall logs all that information but correlating it to view is sketchy at best with our current implementation. We know on viewdesktops there is a set of keys in VolitateEnviroment that contains stuff such as source ip and username, etc. We have a script in place that, when run as a logon script attached to a user account in AD collects the information as we need it. If we have a GPO run the same script the information does not get collected. We feel like there is a piece of the puzzle we're missing but we don't know what. If anyone knows what we're forgetting or misconfiguring that would be great, or if you have a better way of us collecting external source ip's for view specifically we'd be interested in that as well. Thanks, EDIT CODE Batch script to dump to text file @echo off timeout 20 echo %computername%/%username% %time% %date% c:\vdi\vmware.txt echo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c:\vdi\vmware.txt reg query "HKEY_CURRENT_USER\Volatile Environment" /v "ViewClient_LoggedOn_Username"c:\vdi\vmware.txt reg query "HKEY_CURRENT_USER\Volatile Environment" /v "ViewClient_IP_Address"c:\vdi\vmware.txt echo.c:\vdi\vmware.txt VB Script to display values Const HKEY_CURRENT_USER = &H80000001 Set wmiLocator=CreateObject("WbemScripting.SWbemLocator") Set wmiNameSpace = wmiLocator.ConnectServer(".", "root\default") Set objRegistry = wmiNameSpace.Get("StdRegProv") sPath = "Volatile Environment" lRC = objRegistry.GetStringValue(HKEY_CURRENT_USER, sPath, "ViewClien_Machine_Name", vMachine) lRC = objRegistry.GetStringValue(HKEY_CURRENT_USER, sPath, "ViewClien_IP_Address", vIP) lRC = objRegistry.GetStringValue(HKEY_CURRENT_USER, sPath, "ViewClien_MAC_Address", vMAC) msgbox "The Remote Device Name is " & vMachine & " @ " & vIP & " (" & vMAC & ") " he wanted me to mention that the batch file actually runs and I can see it counting down when I reconnect but it does not grab the registry values.

    Read the article

  • An Introduction to ASP.NET Web API

    - by Rick Strahl
    Microsoft recently released ASP.NET MVC 4.0 and .NET 4.5 and along with it, the brand spanking new ASP.NET Web API. Web API is an exciting new addition to the ASP.NET stack that provides a new, well-designed HTTP framework for creating REST and AJAX APIs (API is Microsoft’s new jargon for a service, in case you’re wondering). Although Web API ships and installs with ASP.NET MVC 4, you can use Web API functionality in any ASP.NET project, including WebForms, WebPages and MVC or just a Web API by itself. And you can also self-host Web API in your own applications from Console, Desktop or Service applications. If you're interested in a high level overview on what ASP.NET Web API is and how it fits into the ASP.NET stack you can check out my previous post: Where does ASP.NET Web API fit? In the following article, I'll focus on a practical, by example introduction to ASP.NET Web API. All the code discussed in this article is available in GitHub: https://github.com/RickStrahl/AspNetWebApiArticle [republished from my Code Magazine Article and updated for RTM release of ASP.NET Web API] Getting Started To start I’ll create a new empty ASP.NET application to demonstrate that Web API can work with any kind of ASP.NET project. Although you can create a new project based on the ASP.NET MVC/Web API template to quickly get up and running, I’ll take you through the manual setup process, because one common use case is to add Web API functionality to an existing ASP.NET application. This process describes the steps needed to hook up Web API to any ASP.NET 4.0 application. Start by creating an ASP.NET Empty Project. Then create a new folder in the project called Controllers. Add a Web API Controller Class Once you have any kind of ASP.NET project open, you can add a Web API Controller class to it. Web API Controllers are very similar to MVC Controller classes, but they work in any kind of project. Add a new item to this folder by using the Add New Item option in Visual Studio and choose Web API Controller Class, as shown in Figure 1. Figure 1: This is how you create a new Controller Class in Visual Studio   Make sure that the name of the controller class includes Controller at the end of it, which is required in order for Web API routing to find it. Here, the name for the class is AlbumApiController. For this example, I’ll use a Music Album model to demonstrate basic behavior of Web API. The model consists of albums and related songs where an album has properties like Name, Artist and YearReleased and a list of songs with a SongName and SongLength as well as an AlbumId that links it to the album. You can find the code for the model (and the rest of these samples) on Github. To add the file manually, create a new folder called Model, and add a new class Album.cs and copy the code into it. There’s a static AlbumData class with a static CreateSampleAlbumData() method that creates a short list of albums on a static .Current that I’ll use for the examples. Before we look at what goes into the controller class though, let’s hook up routing so we can access this new controller. Hooking up Routing in Global.asax To start, I need to perform the one required configuration task in order for Web API to work: I need to configure routing to the controller. Like MVC, Web API uses routing to provide clean, extension-less URLs to controller methods. Using an extension method to ASP.NET’s static RouteTable class, you can use the MapHttpRoute() (in the System.Web.Http namespace) method to hook-up the routing during Application_Start in global.asax.cs shown in Listing 1.using System; using System.Web.Routing; using System.Web.Http; namespace AspNetWebApi { public class Global : System.Web.HttpApplication { protected void Application_Start(object sender, EventArgs e) { RouteTable.Routes.MapHttpRoute( name: "AlbumVerbs", routeTemplate: "albums/{title}", defaults: new { symbol = RouteParameter.Optional, controller="AlbumApi" } ); } } } This route configures Web API to direct URLs that start with an albums folder to the AlbumApiController class. Routing in ASP.NET is used to create extensionless URLs and allows you to map segments of the URL to specific Route Value parameters. A route parameter, with a name inside curly brackets like {name}, is mapped to parameters on the controller methods. Route parameters can be optional, and there are two special route parameters – controller and action – that determine the controller to call and the method to activate respectively. HTTP Verb Routing Routing in Web API can route requests by HTTP Verb in addition to standard {controller},{action} routing. For the first examples, I use HTTP Verb routing, as shown Listing 1. Notice that the route I’ve defined does not include an {action} route value or action value in the defaults. Rather, Web API can use the HTTP Verb in this route to determine the method to call the controller, and a GET request maps to any method that starts with Get. So methods called Get() or GetAlbums() are matched by a GET request and a POST request maps to a Post() or PostAlbum(). Web API matches a method by name and parameter signature to match a route, query string or POST values. In lieu of the method name, the [HttpGet,HttpPost,HttpPut,HttpDelete, etc] attributes can also be used to designate the accepted verbs explicitly if you don’t want to follow the verb naming conventions. Although HTTP Verb routing is a good practice for REST style resource APIs, it’s not required and you can still use more traditional routes with an explicit {action} route parameter. When {action} is supplied, the HTTP verb routing is ignored. I’ll talk more about alternate routes later. When you’re finished with initial creation of files, your project should look like Figure 2.   Figure 2: The initial project has the new API Controller Album model   Creating a small Album Model Now it’s time to create some controller methods to serve data. For these examples, I’ll use a very simple Album and Songs model to play with, as shown in Listing 2. public class Song { public string AlbumId { get; set; } [Required, StringLength(80)] public string SongName { get; set; } [StringLength(5)] public string SongLength { get; set; } } public class Album { public string Id { get; set; } [Required, StringLength(80)] public string AlbumName { get; set; } [StringLength(80)] public string Artist { get; set; } public int YearReleased { get; set; } public DateTime Entered { get; set; } [StringLength(150)] public string AlbumImageUrl { get; set; } [StringLength(200)] public string AmazonUrl { get; set; } public virtual List<Song> Songs { get; set; } public Album() { Songs = new List<Song>(); Entered = DateTime.Now; // Poor man's unique Id off GUID hash Id = Guid.NewGuid().GetHashCode().ToString("x"); } public void AddSong(string songName, string songLength = null) { this.Songs.Add(new Song() { AlbumId = this.Id, SongName = songName, SongLength = songLength }); } } Once the model has been created, I also added an AlbumData class that generates some static data in memory that is loaded onto a static .Current member. The signature of this class looks like this and that's what I'll access to retrieve the base data:public static class AlbumData { // sample data - static list public static List<Album> Current = CreateSampleAlbumData(); /// <summary> /// Create some sample data /// </summary> /// <returns></returns> public static List<Album> CreateSampleAlbumData() { … }} You can check out the full code for the data generation online. Creating an AlbumApiController Web API shares many concepts of ASP.NET MVC, and the implementation of your API logic is done by implementing a subclass of the System.Web.Http.ApiController class. Each public method in the implemented controller is a potential endpoint for the HTTP API, as long as a matching route can be found to invoke it. The class name you create should end in Controller, which is how Web API matches the controller route value to figure out which class to invoke. Inside the controller you can implement methods that take standard .NET input parameters and return .NET values as results. Web API’s binding tries to match POST data, route values, form values or query string values to your parameters. Because the controller is configured for HTTP Verb based routing (no {action} parameter in the route), any methods that start with Getxxxx() are called by an HTTP GET operation. You can have multiple methods that match each HTTP Verb as long as the parameter signatures are different and can be matched by Web API. In Listing 3, I create an AlbumApiController with two methods to retrieve a list of albums and a single album by its title .public class AlbumApiController : ApiController { public IEnumerable<Album> GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); return albums; } public Album GetAlbum(string title) { var album = AlbumData.Current .SingleOrDefault(alb => alb.AlbumName.Contains(title)); return album; }} To access the first two requests, you can use the following URLs in your browser: http://localhost/aspnetWebApi/albumshttp://localhost/aspnetWebApi/albums/Dirty%20Deeds Note that you’re not specifying the actions of GetAlbum or GetAlbums in these URLs. Instead Web API’s routing uses HTTP GET verb to route to these methods that start with Getxxx() with the first mapping to the parameterless GetAlbums() method and the latter to the GetAlbum(title) method that receives the title parameter mapped as optional in the route. Content Negotiation When you access any of the URLs above from a browser, you get either an XML or JSON result returned back. The album list result for Chrome 17 and Internet Explorer 9 is shown Figure 3. Figure 3: Web API responses can vary depending on the browser used, demonstrating Content Negotiation in action as these two browsers send different HTTP Accept headers.   Notice that the results are not the same: Chrome returns an XML response and IE9 returns a JSON response. Whoa, what’s going on here? Shouldn’t we see the same result in both browsers? Actually, no. Web API determines what type of content to return based on Accept headers. HTTP clients, like browsers, use Accept headers to specify what kind of content they’d like to see returned. Browsers generally ask for HTML first, followed by a few additional content types. Chrome (and most other major browsers) ask for: Accept: text/html, application/xhtml+xml,application/xml; q=0.9,*/*;q=0.8 IE9 asks for: Accept: text/html, application/xhtml+xml, */* Note that Chrome’s Accept header includes application/xml, which Web API finds in its list of supported media types and returns an XML response. IE9 does not include an Accept header type that works on Web API by default, and so it returns the default format, which is JSON. This is an important and very useful feature that was missing from any previous Microsoft REST tools: Web API automatically switches output formats based on HTTP Accept headers. Nowhere in the server code above do you have to explicitly specify the output format. Rather, Web API determines what format the client is requesting based on the Accept headers and automatically returns the result based on the available formatters. This means that a single method can handle both XML and JSON results.. Using this simple approach makes it very easy to create a single controller method that can return JSON, XML, ATOM or even OData feeds by providing the appropriate Accept header from the client. By default you don’t have to worry about the output format in your code. Note that you can still specify an explicit output format if you choose, either globally by overriding the installed formatters, or individually by returning a lower level HttpResponseMessage instance and setting the formatter explicitly. More on that in a minute. Along the same lines, any content sent to the server via POST/PUT is parsed by Web API based on the HTTP Content-type of the data sent. The same formats allowed for output are also allowed on input. Again, you don’t have to do anything in your code – Web API automatically performs the deserialization from the content. Accessing Web API JSON Data with jQuery A very common scenario for Web API endpoints is to retrieve data for AJAX calls from the Web browser. Because JSON is the default format for Web API, it’s easy to access data from the server using jQuery and its getJSON() method. This example receives the albums array from GetAlbums() and databinds it into the page using knockout.js.$.getJSON("albums/", function (albums) { // make knockout template visible $(".album").show(); // create view object and attach array var view = { albums: albums }; ko.applyBindings(view); }); Figure 4 shows this and the next example’s HTML output. You can check out the complete HTML and script code at http://goo.gl/Ix33C (.html) and http://goo.gl/tETlg (.js). Figu Figure 4: The Album Display sample uses JSON data loaded from Web API.   The result from the getJSON() call is a JavaScript object of the server result, which comes back as a JavaScript array. In the code, I use knockout.js to bind this array into the UI, which as you can see, requires very little code, instead using knockout’s data-bind attributes to bind server data to the UI. Of course, this is just one way to use the data – it’s entirely up to you to decide what to do with the data in your client code. Along the same lines, I can retrieve a single album to display when the user clicks on an album. The response returns the album information and a child array with all the songs. The code to do this is very similar to the last example where we pulled the albums array:$(".albumlink").live("click", function () { var id = $(this).data("id"); // title $.getJSON("albums/" + id, function (album) { ko.applyBindings(album, $("#divAlbumDialog")[0]); $("#divAlbumDialog").show(); }); }); Here the URL looks like this: /albums/Dirty%20Deeds, where the title is the ID captured from the clicked element’s data ID attribute. Explicitly Overriding Output Format When Web API automatically converts output using content negotiation, it does so by matching Accept header media types to the GlobalConfiguration.Configuration.Formatters and the SupportedMediaTypes of each individual formatter. You can add and remove formatters to globally affect what formats are available and it’s easy to create and plug in custom formatters.The example project includes a JSONP formatter that can be plugged in to provide JSONP support for requests that have a callback= querystring parameter. Adding, removing or replacing formatters is a global option you can use to manipulate content. It’s beyond the scope of this introduction to show how it works, but you can review the sample code or check out my blog entry on the subject (http://goo.gl/UAzaR). If automatic processing is not desirable in a particular Controller method, you can override the response output explicitly by returning an HttpResponseMessage instance. HttpResponseMessage is similar to ActionResult in ASP.NET MVC in that it’s a common way to return an abstract result message that contains content. HttpResponseMessage s parsed by the Web API framework using standard interfaces to retrieve the response data, status code, headers and so on[MS2] . Web API turns every response – including those Controller methods that return static results – into HttpResponseMessage instances. Explicitly returning an HttpResponseMessage instance gives you full control over the output and lets you mostly bypass WebAPI’s post-processing of the HTTP response on your behalf. HttpResponseMessage allows you to customize the response in great detail. Web API’s attention to detail in the HTTP spec really shows; many HTTP options are exposed as properties and enumerations with detailed IntelliSense comments. Even if you’re new to building REST-based interfaces, the API guides you in the right direction for returning valid responses and response codes. For example, assume that I always want to return JSON from the GetAlbums() controller method and ignore the default media type content negotiation. To do this, I can adjust the output format and headers as shown in Listing 4.public HttpResponseMessage GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); // Create a new HttpResponse with Json Formatter explicitly var resp = new HttpResponseMessage(HttpStatusCode.OK); resp.Content = new ObjectContent<IEnumerable<Album>>( albums, new JsonMediaTypeFormatter()); // Get Default Formatter based on Content Negotiation //var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); resp.Headers.ConnectionClose = true; resp.Headers.CacheControl = new CacheControlHeaderValue(); resp.Headers.CacheControl.Public = true; return resp; } This example returns the same IEnumerable<Album> value, but it wraps the response into an HttpResponseMessage so you can control the entire HTTP message result including the headers, formatter and status code. In Listing 4, I explicitly specify the formatter using the JsonMediaTypeFormatter to always force the content to JSON.  If you prefer to use the default content negotiation with HttpResponseMessage results, you can create the Response instance using the Request.CreateResponse method:var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); This provides you an HttpResponse object that's pre-configured with the default formatter based on Content Negotiation. Once you have an HttpResponse object you can easily control most HTTP aspects on this object. What's sweet here is that there are many more detailed properties on HttpResponse than the core ASP.NET Response object, with most options being explicitly configurable with enumerations that make it easy to pick the right headers and response codes from a list of valid codes. It makes HTTP features available much more discoverable even for non-hardcore REST/HTTP geeks. Non-Serialized Results The output returned doesn’t have to be a serialized value but can also be raw data, like strings, binary data or streams. You can use the HttpResponseMessage.Content object to set a number of common Content classes. Listing 5 shows how to return a binary image using the ByteArrayContent class from a Controller method. [HttpGet] public HttpResponseMessage AlbumArt(string title) { var album = AlbumData.Current.FirstOrDefault(abl => abl.AlbumName.StartsWith(title)); if (album == null) { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found")); return resp; } // kinda silly - we would normally serve this directly // but hey - it's a demo. var http = new WebClient(); var imageData = http.DownloadData(album.AlbumImageUrl); // create response and return var result = new HttpResponseMessage(HttpStatusCode.OK); result.Content = new ByteArrayContent(imageData); result.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg"); return result; } The image retrieval from Amazon is contrived, but it shows how to return binary data using ByteArrayContent. It also demonstrates that you can easily return multiple types of content from a single controller method, which is actually quite common. If an error occurs - such as a resource can’t be found or a validation error – you can return an error response to the client that’s very specific to the error. In GetAlbumArt(), if the album can’t be found, we want to return a 404 Not Found status (and realistically no error, as it’s an image). Note that if you are not using HTTP Verb-based routing or not accessing a method that starts with Get/Post etc., you have to specify one or more HTTP Verb attributes on the method explicitly. Here, I used the [HttpGet] attribute to serve the image. Another option to handle the error could be to return a fixed placeholder image if no album could be matched or the album doesn’t have an image. When returning an error code, you can also return a strongly typed response to the client. For example, you can set the 404 status code and also return a custom error object (ApiMessageError is a class I defined) like this:return Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found") );   If the album can be found, the image will be returned. The image is downloaded into a byte[] array, and then assigned to the result’s Content property. I created a new ByteArrayContent instance and assigned the image’s bytes and the content type so that it displays properly in the browser. There are other content classes available: StringContent, StreamContent, ByteArrayContent, MultipartContent, and ObjectContent are at your disposal to return just about any kind of content. You can create your own Content classes if you frequently return custom types and handle the default formatter assignments that should be used to send the data out . Although HttpResponseMessage results require more code than returning a plain .NET value from a method, it allows much more control over the actual HTTP processing than automatic processing. It also makes it much easier to test your controller methods as you get a response object that you can check for specific status codes and output messages rather than just a result value. Routing Again Ok, let’s get back to the image example. Using the original routing we have setup using HTTP Verb routing there's no good way to serve the image. In order to return my album art image I’d like to use a URL like this: http://localhost/aspnetWebApi/albums/Dirty%20Deeds/image In order to create a URL like this, I have to create a new Controller because my earlier routes pointed to the AlbumApiController using HTTP Verb routing. HTTP Verb based routing is great for representing a single set of resources such as albums. You can map operations like add, delete, update and read easily using HTTP Verbs. But you cannot mix action based routing into a an HTTP Verb routing controller - you can only map HTTP Verbs and each method has to be unique based on parameter signature. You can't have multiple GET operations to methods with the same signature. So GetImage(string id) and GetAlbum(string title) are in conflict in an HTTP GET routing scenario. In fact, I was unable to make the above Image URL work with any combination of HTTP Verb plus Custom routing using the single Albums controller. There are number of ways around this, but all involve additional controllers.  Personally, I think it’s easier to use explicit Action routing and then add custom routes if you need to simplify your URLs further. So in order to accommodate some of the other examples, I created another controller – AlbumRpcApiController – to handle all requests that are explicitly routed via actions (/albums/rpc/AlbumArt) or are custom routed with explicit routes defined in the HttpConfiguration. I added the AlbumArt() method to this new AlbumRpcApiController class. For the image URL to work with the new AlbumRpcApiController, you need a custom route placed before the default route from Listing 1.RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); Now I can use either of the following URLs to access the image: Custom route: (/albums/rpc/{title}/image)http://localhost/aspnetWebApi/albums/PowerAge/image Action route: (/albums/rpc/action/{title})http://localhost/aspnetWebAPI/albums/rpc/albumart/PowerAge Sending Data to the Server To send data to the server and add a new album, you can use an HTTP POST operation. Since I’m using HTTP Verb-based routing in the original AlbumApiController, I can implement a method called PostAlbum()to accept a new album from the client. Listing 6 shows the Web API code to add a new album.public HttpResponseMessage PostAlbum(Album album) { if (!this.ModelState.IsValid) { // my custom error class var error = new ApiMessageError() { message = "Model is invalid" }; // add errors into our client error model for client foreach (var prop in ModelState.Values) { var modelError = prop.Errors.FirstOrDefault(); if (!string.IsNullOrEmpty(modelError.ErrorMessage)) error.errors.Add(modelError.ErrorMessage); else error.errors.Add(modelError.Exception.Message); } return Request.CreateResponse<ApiMessageError>(HttpStatusCode.Conflict, error); } // update song id which isn't provided foreach (var song in album.Songs) song.AlbumId = album.Id; // see if album exists already var matchedAlbum = AlbumData.Current .SingleOrDefault(alb => alb.Id == album.Id || alb.AlbumName == album.AlbumName); if (matchedAlbum == null) AlbumData.Current.Add(album); else matchedAlbum = album; // return a string to show that the value got here var resp = Request.CreateResponse(HttpStatusCode.OK, string.Empty); resp.Content = new StringContent(album.AlbumName + " " + album.Entered.ToString(), Encoding.UTF8, "text/plain"); return resp; } The PostAlbum() method receives an album parameter, which is automatically deserialized from the POST buffer the client sent. The data passed from the client can be either XML or JSON. Web API automatically figures out what format it needs to deserialize based on the content type and binds the content to the album object. Web API uses model binding to bind the request content to the parameter(s) of controller methods. Like MVC you can check the model by looking at ModelState.IsValid. If it’s not valid, you can run through the ModelState.Values collection and check each binding for errors. Here I collect the error messages into a string array that gets passed back to the client via the result ApiErrorMessage object. When a binding error occurs, you’ll want to return an HTTP error response and it’s best to do that with an HttpResponseMessage result. In Listing 6, I used a custom error class that holds a message and an array of detailed error messages for each binding error. I used this object as the content to return to the client along with my Conflict HTTP Status Code response. If binding succeeds, the example returns a string with the name and date entered to demonstrate that you captured the data. Normally, a method like this should return a Boolean or no response at all (HttpStatusCode.NoConent). The sample uses a simple static list to hold albums, so once you’ve added the album using the Post operation, you can hit the /albums/ URL to see that the new album was added. The client jQuery code to call the POST operation from the client with jQuery is shown in Listing 7. var id = new Date().getTime().toString(); var album = { "Id": id, "AlbumName": "Power Age", "Artist": "AC/DC", "YearReleased": 1977, "Entered": "2002-03-11T18:24:43.5580794-10:00", "AlbumImageUrl": http://ecx.images-amazon.com/images/…, "AmazonUrl": http://www.amazon.com/…, "Songs": [ { "SongName": "Rock 'n Roll Damnation", "SongLength": 3.12}, { "SongName": "Downpayment Blues", "SongLength": 4.22 }, { "SongName": "Riff Raff", "SongLength": 2.42 } ] } $.ajax( { url: "albums/", type: "POST", contentType: "application/json", data: JSON.stringify(album), processData: false, beforeSend: function (xhr) { // not required since JSON is default output xhr.setRequestHeader("Accept", "application/json"); }, success: function (result) { // reload list of albums page.loadAlbums(); }, error: function (xhr, status, p3, p4) { var err = "Error"; if (xhr.responseText && xhr.responseText[0] == "{") err = JSON.parse(xhr.responseText).message; alert(err); } }); The code in Listing 7 creates an album object in JavaScript to match the structure of the .NET Album class. This object is passed to the $.ajax() function to send to the server as POST. The data is turned into JSON and the content type set to application/json so that the server knows what to convert when deserializing in the Album instance. The jQuery code hooks up success and failure events. Success returns the result data, which is a string that’s echoed back with an alert box. If an error occurs, jQuery returns the XHR instance and status code. You can check the XHR to see if a JSON object is embedded and if it is, you can extract it by de-serializing it and accessing the .message property. REST standards suggest that updates to existing resources should use PUT operations. REST standards aside, I’m not a big fan of separating out inserts and updates so I tend to have a single method that handles both. But if you want to follow REST suggestions, you can create a PUT method that handles updates by forwarding the PUT operation to the POST method:public HttpResponseMessage PutAlbum(Album album) { return PostAlbum(album); } To make the corresponding $.ajax() call, all you have to change from Listing 7 is the type: from POST to PUT. Model Binding with UrlEncoded POST Variables In the example in Listing 7 I used JSON objects to post a serialized object to a server method that accepted an strongly typed object with the same structure, which is a common way to send data to the server. However, Web API supports a number of different ways that data can be received by server methods. For example, another common way is to use plain UrlEncoded POST  values to send to the server. Web API supports Model Binding that works similar (but not the same) as MVC's model binding where POST variables are mapped to properties of object parameters of the target method. This is actually quite common for AJAX calls that want to avoid serialization and the potential requirement of a JSON parser on older browsers. For example, using jQUery you might use the $.post() method to send a new album to the server (albeit one without songs) using code like the following:$.post("albums/",{AlbumName: "Dirty Deeds", YearReleased: 1976 … },albumPostCallback); Although the code looks very similar to the client code we used before passing JSON, here the data passed is URL encoded values (AlbumName=Dirty+Deeds&YearReleased=1976 etc.). Web API then takes this POST data and maps each of the POST values to the properties of the Album object in the method's parameter. Although the client code is different the server can both handle the JSON object, or the UrlEncoded POST values. Dynamic Access to POST Data There are also a few options available to dynamically access POST data, if you know what type of data you're dealing with. If you have POST UrlEncoded values, you can dynamically using a FormsDataCollection:[HttpPost] public string PostAlbum(FormDataCollection form) { return string.Format("{0} - released {1}", form.Get("AlbumName"),form.Get("RearReleased")); } The FormDataCollection is a very simple object, that essentially provides the same functionality as Request.Form[] in ASP.NET. Request.Form[] still works if you're running hosted in an ASP.NET application. However as a general rule, while ASP.NET's functionality is always available when running Web API hosted inside of an  ASP.NET application, using the built in classes specific to Web API makes it possible to run Web API applications in a self hosted environment outside of ASP.NET. If your client is sending JSON to your server, and you don't want to map the JSON to a strongly typed object because you only want to retrieve a few simple values, you can also accept a JObject parameter in your API methods:[HttpPost] public string PostAlbum(JObject jsonData) { dynamic json = jsonData; JObject jalbum = json.Album; JObject juser = json.User; string token = json.UserToken; var album = jalbum.ToObject<Album>(); var user = juser.ToObject<User>(); return String.Format("{0} {1} {2}", album.AlbumName, user.Name, token); } There quite a few options available to you to receive data with Web API, which gives you more choices for the right tool for the job. Unfortunately one shortcoming of Web API is that POST data is always mapped to a single parameter. This means you can't pass multiple POST parameters to methods that receive POST data. It's possible to accept multiple parameters, but only one can map to the POST content - the others have to come from the query string or route values. I have a couple of Blog POSTs that explain what works and what doesn't here: Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API   Handling Delete Operations Finally, to round out the server API code of the album example we've been discussin, here’s the DELETE verb controller method that allows removal of an album by its title:public HttpResponseMessage DeleteAlbum(string title) { var matchedAlbum = AlbumData.Current.Where(alb => alb.AlbumName == title) .SingleOrDefault(); if (matchedAlbum == null) return new HttpResponseMessage(HttpStatusCode.NotFound); AlbumData.Current.Remove(matchedAlbum); return new HttpResponseMessage(HttpStatusCode.NoContent); } To call this action method using jQuery, you can use:$(".removeimage").live("click", function () { var $el = $(this).parent(".album"); var txt = $el.find("a").text(); $.ajax({ url: "albums/" + encodeURIComponent(txt), type: "Delete", success: function (result) { $el.fadeOut().remove(); }, error: jqError }); }   Note the use of the DELETE verb in the $.ajax() call, which routes to DeleteAlbum on the server. DELETE is a non-content operation, so you supply a resource ID (the title) via route value or the querystring. Routing Conflicts In all requests with the exception of the AlbumArt image example shown so far, I used HTTP Verb routing that I set up in Listing 1. HTTP Verb Routing is a recommendation that is in line with typical REST access to HTTP resources. However, it takes quite a bit of effort to create REST-compliant API implementations based only on HTTP Verb routing only. You saw one example that didn’t really fit – the return of an image where I created a custom route albums/{title}/image that required creation of a second controller and a custom route to work. HTTP Verb routing to a controller does not mix with custom or action routing to the same controller because of the limited mapping of HTTP verbs imposed by HTTP Verb routing. To understand some of the problems with verb routing, let’s look at another example. Let’s say you create a GetSortableAlbums() method like this and add it to the original AlbumApiController accessed via HTTP Verb routing:[HttpGet] public IQueryable<Album> SortableAlbums() { var albums = AlbumData.Current; // generally should be done only on actual queryable results (EF etc.) // Done here because we're running with a static list but otherwise might be slow return albums.AsQueryable(); } If you compile this code and try to now access the /albums/ link, you get an error: Multiple Actions were found that match the request. HTTP Verb routing only allows access to one GET operation per parameter/route value match. If more than one method exists with the same parameter signature, it doesn’t work. As I mentioned earlier for the image display, the only solution to get this method to work is to throw it into another controller. Because I already set up the AlbumRpcApiController I can add the method there. First, I should rename the method to SortableAlbums() so I’m not using a Get prefix for the method. This also makes the action parameter look cleaner in the URL - it looks less like a method and more like a noun. I can then create a new route that handles direct-action mapping:RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); As I am explicitly adding a route segment – rpc – into the route template, I can now reference explicit methods in the Web API controller using URLs like this: http://localhost/AspNetWebApi/rpc/SortableAlbums Error Handling I’ve already done some minimal error handling in the examples. For example in Listing 6, I detected some known-error scenarios like model validation failing or a resource not being found and returning an appropriate HttpResponseMessage result. But what happens if your code just blows up or causes an exception? If you have a controller method, like this:[HttpGet] public void ThrowException() { throw new UnauthorizedAccessException("Unauthorized Access Sucka"); } You can call it with this: http://localhost/AspNetWebApi/albums/rpc/ThrowException The default exception handling displays a 500-status response with the serialized exception on the local computer only. When you connect from a remote computer, Web API throws back a 500  HTTP Error with no data returned (IIS then adds its HTML error page). The behavior is configurable in the GlobalConfiguration:GlobalConfiguration .Configuration .IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Never; If you want more control over your error responses sent from code, you can throw explicit error responses yourself using HttpResponseException. When you throw an HttpResponseException the response parameter is used to generate the output for the Controller action. [HttpGet] public void ThrowError() { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.BadRequest, new ApiMessageError("Your code stinks!")); throw new HttpResponseException(resp); } Throwing an HttpResponseException stops the processing of the controller method and immediately returns the response you passed to the exception. Unlike other Exceptions fired inside of WebAPI, HttpResponseException bypasses the Exception Filters installed and instead just outputs the response you provide. In this case, the serialized ApiMessageError result string is returned in the default serialization format – XML or JSON. You can pass any content to HttpResponseMessage, which includes creating your own exception objects and consistently returning error messages to the client. Here’s a small helper method on the controller that you might use to send exception info back to the client consistently:private void ThrowSafeException(string message, HttpStatusCode statusCode = HttpStatusCode.BadRequest) { var errResponse = Request.CreateResponse<ApiMessageError>(statusCode, new ApiMessageError() { message = message }); throw new HttpResponseException(errResponse); } You can then use it to output any captured errors from code:[HttpGet] public void ThrowErrorSafe() { try { List<string> list = null; list.Add("Rick"); } catch (Exception ex) { ThrowSafeException(ex.Message); } }   Exception Filters Another more global solution is to create an Exception Filter. Filters in Web API provide the ability to pre- and post-process controller method operations. An exception filter looks at all exceptions fired and then optionally creates an HttpResponseMessage result. Listing 8 shows an example of a basic Exception filter implementation.public class UnhandledExceptionFilter : ExceptionFilterAttribute { public override void OnException(HttpActionExecutedContext context) { HttpStatusCode status = HttpStatusCode.InternalServerError; var exType = context.Exception.GetType(); if (exType == typeof(UnauthorizedAccessException)) status = HttpStatusCode.Unauthorized; else if (exType == typeof(ArgumentException)) status = HttpStatusCode.NotFound; var apiError = new ApiMessageError() { message = context.Exception.Message }; // create a new response and attach our ApiError object // which now gets returned on ANY exception result var errorResponse = context.Request.CreateResponse<ApiMessageError>(status, apiError); context.Response = errorResponse; base.OnException(context); } } Exception Filter Attributes can be assigned to an ApiController class like this:[UnhandledExceptionFilter] public class AlbumRpcApiController : ApiController or you can globally assign it to all controllers by adding it to the HTTP Configuration's Filters collection:GlobalConfiguration.Configuration.Filters.Add(new UnhandledExceptionFilter()); The latter is a great way to get global error trapping so that all errors (short of hard IIS errors and explicit HttpResponseException errors) return a valid error response that includes error information in the form of a known-error object. Using a filter like this allows you to throw an exception as you normally would and have your filter create a response in the appropriate output format that the client expects. For example, an AJAX application can on failure expect to see a JSON error result that corresponds to the real error that occurred rather than a 500 error along with HTML error page that IIS throws up. You can even create some custom exceptions so you can differentiate your own exceptions from unhandled system exceptions - you often don't want to display error information from 'unknown' exceptions as they may contain sensitive system information or info that's not generally useful to users of your application/site. This is just one example of how ASP.NET Web API is configurable and extensible. Exception filters are just one example of how you can plug-in into the Web API request flow to modify output. Many more hooks exist and I’ll take a closer look at extensibility in Part 2 of this article in the future. Summary Web API is a big improvement over previous Microsoft REST and AJAX toolkits. The key features to its usefulness are its ease of use with simple controller based logic, familiar MVC-style routing, low configuration impact, extensibility at all levels and tight attention to exposing and making HTTP semantics easily discoverable and easy to use. Although none of the concepts used in Web API are new or radical, Web API combines the best of previous platforms into a single framework that’s highly functional, easy to work with, and extensible to boot. I think that Microsoft has hit a home run with Web API. Related Resources Where does ASP.NET Web API fit? Sample Source Code on GitHub Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API Creating a JSONP Formatter for ASP.NET Web API Removing the XML Formatter from ASP.NET Web API Applications© Rick Strahl, West Wind Technologies, 2005-2012Posted in Web Api   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • C#/.NET Little Wonders: The Useful But Overlooked Sets

    - by James Michael Hare
    Once again we consider some of the lesser known classes and keywords of C#.  Today we will be looking at two set implementations in the System.Collections.Generic namespace: HashSet<T> and SortedSet<T>.  Even though most people think of sets as mathematical constructs, they are actually very useful classes that can be used to help make your application more performant if used appropriately. A Background From Math In mathematical terms, a set is an unordered collection of unique items.  In other words, the set {2,3,5} is identical to the set {3,5,2}.  In addition, the set {2, 2, 4, 1} would be invalid because it would have a duplicate item (2).  In addition, you can perform set arithmetic on sets such as: Intersections: The intersection of two sets is the collection of elements common to both.  Example: The intersection of {1,2,5} and {2,4,9} is the set {2}. Unions: The union of two sets is the collection of unique items present in either or both set.  Example: The union of {1,2,5} and {2,4,9} is {1,2,4,5,9}. Differences: The difference of two sets is the removal of all items from the first set that are common between the sets.  Example: The difference of {1,2,5} and {2,4,9} is {1,5}. Supersets: One set is a superset of a second set if it contains all elements that are in the second set. Example: The set {1,2,5} is a superset of {1,5}. Subsets: One set is a subset of a second set if all the elements of that set are contained in the first set. Example: The set {1,5} is a subset of {1,2,5}. If We’re Not Doing Math, Why Do We Care? Now, you may be thinking: why bother with the set classes in C# if you have no need for mathematical set manipulation?  The answer is simple: they are extremely efficient ways to determine ownership in a collection. For example, let’s say you are designing an order system that tracks the price of a particular equity, and once it reaches a certain point will trigger an order.  Now, since there’s tens of thousands of equities on the markets, you don’t want to track market data for every ticker as that would be a waste of time and processing power for symbols you don’t have orders for.  Thus, we just want to subscribe to the stock symbol for an equity order only if it is a symbol we are not already subscribed to. Every time a new order comes in, we will check the list of subscriptions to see if the new order’s stock symbol is in that list.  If it is, great, we already have that market data feed!  If not, then and only then should we subscribe to the feed for that symbol. So far so good, we have a collection of symbols and we want to see if a symbol is present in that collection and if not, add it.  This really is the essence of set processing, but for the sake of comparison, let’s say you do a list instead: 1: // class that handles are order processing service 2: public sealed class OrderProcessor 3: { 4: // contains list of all symbols we are currently subscribed to 5: private readonly List<string> _subscriptions = new List<string>(); 6:  7: ... 8: } Now whenever you are adding a new order, it would look something like: 1: public PlaceOrderResponse PlaceOrder(Order newOrder) 2: { 3: // do some validation, of course... 4:  5: // check to see if already subscribed, if not add a subscription 6: if (!_subscriptions.Contains(newOrder.Symbol)) 7: { 8: // add the symbol to the list 9: _subscriptions.Add(newOrder.Symbol); 10: 11: // do whatever magic is needed to start a subscription for the symbol 12: } 13:  14: // place the order logic! 15: } What’s wrong with this?  In short: performance!  Finding an item inside a List<T> is a linear - O(n) – operation, which is not a very performant way to find if an item exists in a collection. (I used to teach algorithms and data structures in my spare time at a local university, and when you began talking about big-O notation you could immediately begin to see eyes glossing over as if it was pure, useless theory that would not apply in the real world, but I did and still do believe it is something worth understanding well to make the best choices in computer science). Let’s think about this: a linear operation means that as the number of items increases, the time that it takes to perform the operation tends to increase in a linear fashion.  Put crudely, this means if you double the collection size, you might expect the operation to take something like the order of twice as long.  Linear operations tend to be bad for performance because they mean that to perform some operation on a collection, you must potentially “visit” every item in the collection.  Consider finding an item in a List<T>: if you want to see if the list has an item, you must potentially check every item in the list before you find it or determine it’s not found. Now, we could of course sort our list and then perform a binary search on it, but sorting is typically a linear-logarithmic complexity – O(n * log n) - and could involve temporary storage.  So performing a sort after each add would probably add more time.  As an alternative, we could use a SortedList<TKey, TValue> which sorts the list on every Add(), but this has a similar level of complexity to move the items and also requires a key and value, and in our case the key is the value. This is why sets tend to be the best choice for this type of processing: they don’t rely on separate keys and values for ordering – so they save space – and they typically don’t care about ordering – so they tend to be extremely performant.  The .NET BCL (Base Class Library) has had the HashSet<T> since .NET 3.5, but at that time it did not implement the ISet<T> interface.  As of .NET 4.0, HashSet<T> implements ISet<T> and a new set, the SortedSet<T> was added that gives you a set with ordering. HashSet<T> – For Unordered Storage of Sets When used right, HashSet<T> is a beautiful collection, you can think of it as a simplified Dictionary<T,T>.  That is, a Dictionary where the TKey and TValue refer to the same object.  This is really an oversimplification, but logically it makes sense.  I’ve actually seen people code a Dictionary<T,T> where they store the same thing in the key and the value, and that’s just inefficient because of the extra storage to hold both the key and the value. As it’s name implies, the HashSet<T> uses a hashing algorithm to find the items in the set, which means it does take up some additional space, but it has lightning fast lookups!  Compare the times below between HashSet<T> and List<T>: Operation HashSet<T> List<T> Add() O(1) O(1) at end O(n) in middle Remove() O(1) O(n) Contains() O(1) O(n)   Now, these times are amortized and represent the typical case.  In the very worst case, the operations could be linear if they involve a resizing of the collection – but this is true for both the List and HashSet so that’s a less of an issue when comparing the two. The key thing to note is that in the general case, HashSet is constant time for adds, removes, and contains!  This means that no matter how large the collection is, it takes roughly the exact same amount of time to find an item or determine if it’s not in the collection.  Compare this to the List where almost any add or remove must rearrange potentially all the elements!  And to find an item in the list (if unsorted) you must search every item in the List. So as you can see, if you want to create an unordered collection and have very fast lookup and manipulation, the HashSet is a great collection. And since HashSet<T> implements ICollection<T> and IEnumerable<T>, it supports nearly all the same basic operations as the List<T> and can use the System.Linq extension methods as well. All we have to do to switch from a List<T> to a HashSet<T>  is change our declaration.  Since List and HashSet support many of the same members, chances are we won’t need to change much else. 1: public sealed class OrderProcessor 2: { 3: private readonly HashSet<string> _subscriptions = new HashSet<string>(); 4:  5: // ... 6:  7: public PlaceOrderResponse PlaceOrder(Order newOrder) 8: { 9: // do some validation, of course... 10: 11: // check to see if already subscribed, if not add a subscription 12: if (!_subscriptions.Contains(newOrder.Symbol)) 13: { 14: // add the symbol to the list 15: _subscriptions.Add(newOrder.Symbol); 16: 17: // do whatever magic is needed to start a subscription for the symbol 18: } 19: 20: // place the order logic! 21: } 22:  23: // ... 24: } 25: Notice, we didn’t change any code other than the declaration for _subscriptions to be a HashSet<T>.  Thus, we can pick up the performance improvements in this case with minimal code changes. SortedSet<T> – Ordered Storage of Sets Just like HashSet<T> is logically similar to Dictionary<T,T>, the SortedSet<T> is logically similar to the SortedDictionary<T,T>. The SortedSet can be used when you want to do set operations on a collection, but you want to maintain that collection in sorted order.  Now, this is not necessarily mathematically relevant, but if your collection needs do include order, this is the set to use. So the SortedSet seems to be implemented as a binary tree (possibly a red-black tree) internally.  Since binary trees are dynamic structures and non-contiguous (unlike List and SortedList) this means that inserts and deletes do not involve rearranging elements, or changing the linking of the nodes.  There is some overhead in keeping the nodes in order, but it is much smaller than a contiguous storage collection like a List<T>.  Let’s compare the three: Operation HashSet<T> SortedSet<T> List<T> Add() O(1) O(log n) O(1) at end O(n) in middle Remove() O(1) O(log n) O(n) Contains() O(1) O(log n) O(n)   The MSDN documentation seems to indicate that operations on SortedSet are O(1), but this seems to be inconsistent with its implementation and seems to be a documentation error.  There’s actually a separate MSDN document (here) on SortedSet that indicates that it is, in fact, logarithmic in complexity.  Let’s put it in layman’s terms: logarithmic means you can double the collection size and typically you only add a single extra “visit” to an item in the collection.  Take that in contrast to List<T>’s linear operation where if you double the size of the collection you double the “visits” to items in the collection.  This is very good performance!  It’s still not as performant as HashSet<T> where it always just visits one item (amortized), but for the addition of sorting this is a good thing. Consider the following table, now this is just illustrative data of the relative complexities, but it’s enough to get the point: Collection Size O(1) Visits O(log n) Visits O(n) Visits 1 1 1 1 10 1 4 10 100 1 7 100 1000 1 10 1000   Notice that the logarithmic – O(log n) – visit count goes up very slowly compare to the linear – O(n) – visit count.  This is because since the list is sorted, it can do one check in the middle of the list, determine which half of the collection the data is in, and discard the other half (binary search).  So, if you need your set to be sorted, you can use the SortedSet<T> just like the HashSet<T> and gain sorting for a small performance hit, but it’s still faster than a List<T>. Unique Set Operations Now, if you do want to perform more set-like operations, both implementations of ISet<T> support the following, which play back towards the mathematical set operations described before: IntersectWith() – Performs the set intersection of two sets.  Modifies the current set so that it only contains elements also in the second set. UnionWith() – Performs a set union of two sets.  Modifies the current set so it contains all elements present both in the current set and the second set. ExceptWith() – Performs a set difference of two sets.  Modifies the current set so that it removes all elements present in the second set. IsSupersetOf() – Checks if the current set is a superset of the second set. IsSubsetOf() – Checks if the current set is a subset of the second set. For more information on the set operations themselves, see the MSDN description of ISet<T> (here). What Sets Don’t Do Don’t get me wrong, sets are not silver bullets.  You don’t really want to use a set when you want separate key to value lookups, that’s what the IDictionary implementations are best for. Also sets don’t store temporal add-order.  That is, if you are adding items to the end of a list all the time, your list is ordered in terms of when items were added to it.  This is something the sets don’t do naturally (though you could use a SortedSet with an IComparer with a DateTime but that’s overkill) but List<T> can. Also, List<T> allows indexing which is a blazingly fast way to iterate through items in the collection.  Iterating over all the items in a List<T> is generally much, much faster than iterating over a set. Summary Sets are an excellent tool for maintaining a lookup table where the item is both the key and the value.  In addition, if you have need for the mathematical set operations, the C# sets support those as well.  The HashSet<T> is the set of choice if you want the fastest possible lookups but don’t care about order.  In contrast the SortedSet<T> will give you a sorted collection at a slight reduction in performance.   Technorati Tags: C#,.Net,Little Wonders,BlackRabbitCoder,ISet,HashSet,SortedSet

    Read the article

  • ????ASMM

    - by Liu Maclean(???)
    ???Oracle??????????????SGA/PGA???,????10g????????????ASMM????,????????ASMM?????????Oracle??????????,?ASMM??????DBA????????????;????????ASMM???????????????DBA???:????????????DB,?????????????DBA?????????????????????????????????,ASMM??????????,???????????,??????????,??????????????????;?10g release 1?10.2??????ASMM?????????????,???????ASMM????????ASMM?????startup???????????ASMM??AMM??,????????DBA????SGA/PGA?????????”??”??”???”???,???????????DBA????chemist(???????1??2??????????????)? ?????????????????ASMM?????,?????????????…… Oracle?SGA???????9i???????????,????: Buffer Cache ????????????,??????????????? Default Pool                  ??????,???DB_CACHE_SIZE?? Keep Pool                     ??????,???DB_KEEP_CACHE_SIZE?? Non standard pool         ???????,???DB_nK_cache_size?? Recycle pool                 ???,???db_recycle_cache_size?? Shared Pool ???,???shared_pool_size?? Library cache   ?????? Row cache      ???,?????? Java Pool         java?,???Java_pool_size?? Large Pool       ??,???Large_pool_size?? Fixed SGA       ???SGA??,???Oracle???????,?????????granule? ?9i?????ASMM,???????????SGA,??????MSMM??9i???buffer cache??????????,?????????????????????????,???9i?????????????,?????????????????????????? ????SGA?????: ?????shared pool?default buffer pool????????,??????????? ?9i???????????(advisor),?????????? ??????????????? ?????????,?????? ?????,?????ORA-04031?????????? ASMM?????: ?????????? ???????????????? ???????sga_target?? ???????????,??????????? ??MSMM???????: ???? ???? ?????? ???? ??????????,??????????? ??????????????????,??????????ORA-04031??? ASMM???????????:1.??????sga_target???????2.???????,???:????(memory component),????(memory broker)???????(memory mechanism)3.????(memory advisor) ASMM????????????(Automatically set),??????:shared_pool_size?db_cache_size?java_pool_size?large_pool _size?streams_pool_size;?????????????????,???:db_keep_cache_size?db_recycle_cache_size?db_nk_cache_size?log_buffer????SGA?????,????????????????,??log_buffer?fixed sga??????????????? ??ASMM?????????sga_target??,???????ASMM??????????????????db_cache_size?java_pool_size???,?????????????????????,????????????????????(???)????????,Oracle?????????(granule,?SGA<1GB?granule???4M,?SGA>1GB?granule???16M)???????,??????????????buffer cache,??????????????????(granule)??????????????????????sga_target??,???????????????????(dism,???????)???ASMM?????????????statistics_level?????typical?ALL,?????BASIC??MMON????(Memory Monitor is a background process that gathers memory statistics (snapshots) stores this information in the AWR (automatic workload repository). MMON is also responsible for issuing alerts for metrics that exceed their thresholds)?????????????????????ASMM?????,???????????sga_target?????statistics_level?BASIC: SQL> show parameter sga NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ lock_sga boolean FALSE pre_page_sga boolean FALSE sga_max_size big integer 2000M sga_target big integer 2000M SQL> show parameter sga_target NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ sga_target big integer 2000M SQL> alter system set statistics_level=BASIC; alter system set statistics_level=BASIC * ERROR at line 1: ORA-02097: parameter cannot be modified because specified value is invalid ORA-00830: cannot set statistics_level to BASIC with auto-tune SGA enabled ?????server parameter file?spfile??,ASMM????shutdown??????????????(Oracle???????,????????)???spfile?,?????strings?????spfile????????????????????,?: G10R2.__db_cache_size=973078528 G10R2.__java_pool_size=16777216 G10R2.__large_pool_size=16777216 G10R2.__shared_pool_size=1006632960 G10R2.__streams_pool_size=67108864 ???spfile?????????????????,???????????”???”?????,??????????”??”?? ?ASMM?????????????? ?????(tunable):????????????????????????????buffer cache?????????,cache????????????????,?????????? IO????????????????????????????Library cache????? subheap????,?????????????????????????????????(open cursors)?????????client??????????????buffer cache???????,???????????pin??buffer???(???????) ?????(Un-tunable):???????????????????,?????????????????,?????????????????????????large pool?????? ??????(Fixed Size):???????????,??????????????????????????????????????? ????????????????(memory resize request)?????????,?????: ??????(Immediate Request):???????????ASMM????????????????????????(chunk)?,??????OUT-OF-MEMORY(ORA-04031)???,????????????????????(granule)????????????????????granule,????????????,?????????????????????????????,????granule??????????????? ??????(Deferred Request):???????????????????????????,??????????????granule???????????????MMON??????????delta. ??????(Manual Request):????????????alter system?????????????????????????????????????????????????granule,??????grow?????ORA-4033??,?????shrink?????ORA-4034??? ?ASMM????,????(Memory Broker)????????????????????????????(Deferred)??????????????????????(auto-tunable component)???????????????,???????????????MMON??????????????????????????????????,????????????????;MMON????Memory Broker?????????????????????????MMON????????????????????????????????????????(resize request system queue)?MMAN????(Memory Manager is a background process that manages the dynamic resizing of SGA memory areas as the workload increases or decreases)??????????????????? ?10gR1?Shared Pool?shrink??????????,?????????????Buffer Cache???????????granule,????Buffer Cache?granule????granule header?Metadata(???buffer header??RAC??Lock Elements)????,?????????????????????shared pool????????duration(?????)?chunk??????granule?,????????????granule??10gR2????Buffer Cache Granule????????granule header?buffer?Metadata(buffer header?LE)????,??shared pool???duration?chunk????????granule,??????buffer cache?shared pool??????????????10gr2?streams pool?????????(???????streams pool duration????) ??????????(Donor,???trace????)???,?????????granule???buffer cache,????granule????????????: ????granule???????granule header ?????chunk????granule?????????buffer header ???,???chunk??????????????????????metadata? ???2-4??,???granule???? ??????????????????,??buffer cache??granule???shared pool?,???????: MMAN??????????buffer cache???granule MMAN????granule??quiesce???(Moving 1 granule from inuse to quiesce list of DEFAULT buffer cache for an immediate req) DBWR???????quiesced???granule????buffer(dirty buffer) MMAN??shared pool????????(consume callback),granule?free?chunk???shared pool??(consume)?,????????????????????granule????shared granule??????,???????????granule???????????,??????pin??buffer??Metadata(???buffer header?LE)?????buffer cache??? ???granule???????shared pool,???granule?????shared??? ?????ASMM???????????,??????????: _enabled_shared_pool_duration:?????????10g????shared pool duration??,?????sga_target?0?????false;???10.2.0.5??cursor_space_for_time???true??????false,???10.2.0.5??cursor_space_for_time????? _memory_broker_shrink_heaps:???????0??Oracle?????shared pool?java pool,??????0,??shrink request??????????????????? _memory_management_tracing: ???????MMON?MMAN??????????(advisor)?????(Memory Broker)?????trace???;??ORA-04031????????36,???8?????????????trace,???23????Memory Broker decision???,???32???cache resize???;??????????: Level Contents 0×01 Enables statistics tracing 0×02 Enables policy tracing 0×04 Enables transfer of granules tracing 0×08 Enables startup tracing 0×10 Enables tuning tracing 0×20 Enables cache tracing ?????????_memory_management_tracing?????DUMP_TRANSFER_OPS????????????????,?????????????????trace?????????mman_trace?transfer_ops_dump? SQL> alter system set "_memory_management_tracing"=63; System altered Operation make shared pool grow and buffer cache shrink!!!.............. ???????granule?????,????default buffer pool?resize??: AUTO SGA: Request 0xdc9c2628 after pre-processing, ret=0 /* ???0xdc9c2628??????addr */ AUTO SGA: IMMEDIATE, FG request 0xdc9c2628 /* ???????????Immediate???? */ AUTO SGA: Receiver of memory is shared pool, size=16, state=3, flg=0 /* ?????????shared pool,???,????16?granule,??grow?? */ AUTO SGA: Donor of memory is DEFAULT buffer cache, size=106, state=4, flg=0 /* ???????Default buffer cache,????,????106?granule,??shrink?? */ AUTO SGA: Memory requested=3896, remaining=3896 /* ??immeidate request???????3896 bytes */ AUTO SGA: Memory received=0, minreq=3896, gransz=16777216 /* ????free?granule,??received?0,gransz?granule??? */ AUTO SGA: Request 0xdc9c2628 status is INACTIVE /* ??????????,??????inactive?? */ AUTO SGA: Init bef rsz for request 0xdc9c2628 /* ????????before-process???? */ AUTO SGA: Set rq dc9c2628 status to PENDING /* ?request??pending?? */ AUTO SGA: 0xca000000 rem=3896, rcvd=16777216, 105, 16777216, 17 /* ???????0xca000000?16M??granule */ AUTO SGA: Returning 4 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 4, 1, a AUTO SGA: Resize done for pool DEFAULT, 8192 /* ???default pool?resize */ AUTO SGA: Init aft rsz for request 0xdc9c2628 AUTO SGA: Request 0xdc9c2628 after processing AUTO SGA: IMMEDIATE, FG request 0x7fff917964a0 AUTO SGA: Receiver of memory is shared pool, size=17, state=0, flg=0 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=105, state=0, flg=0 AUTO SGA: Memory requested=3896, remaining=0 AUTO SGA: Memory received=16777216, minreq=3896, gransz=16777216 AUTO SGA: Request 0x7fff917964a0 status is COMPLETE /* shared pool????16M?granule */ AUTO SGA: activated granule 0xca000000 of shared pool ?????partial granule????????????trace: AUTO SGA: Request 0xdc9c2628 after pre-processing, ret=0 AUTO SGA: IMMEDIATE, FG request 0xdc9c2628 AUTO SGA: Receiver of memory is shared pool, size=82, state=3, flg=1 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=36, state=4, flg=1 /* ????????shared pool,?????default buffer cache */ AUTO SGA: Memory requested=4120, remaining=4120 AUTO SGA: Memory received=0, minreq=4120, gransz=16777216 AUTO SGA: Request 0xdc9c2628 status is INACTIVE AUTO SGA: Init bef rsz for request 0xdc9c2628 AUTO SGA: Set rq dc9c2628 status to PENDING AUTO SGA: Moving granule 0x93000000 of DEFAULT buffer cache to activate list AUTO SGA: Moving 1 granule 0x8c000000 from inuse to quiesce list of DEFAULT buffer cache for an immediate req /* ???buffer cache??????0x8c000000?granule??????inuse list, ???????quiesce list? */ AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: activated granule 0x93000000 of DEFAULT buffer cache AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 / * ??dbwr??0x8c000000 granule????dirty buffer */ AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 AUTO SGA: Returning 0 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 0, 1, 20a AUTO SGA: NOT_FREE for imm req for gran 0x8c000000 ......................................... AUTO SGA: Rcv shared pool consuming 8192 from 0x8c000000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 90112 from 0x8c002000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 24576 from 0x8c01a000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 65536 from 0x8c022000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 131072 from 0x8c034000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 286720 from 0x8c056000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 98304 from 0x8c09e000 in granule 0x8c000000; owner is DEFAULT buffer cache AUTO SGA: Rcv shared pool consuming 106496 from 0x8c0b8000 in granule 0x8c000000; owner is DEFAULT buffer cache ..................... /* ??shared pool????0x8c000000 granule??chunk, ??granule?owner????default buffer cache */ AUTO SGA: Imm xfer 0x8c000000 from quiesce list of DEFAULT buffer cache to partial inuse list of shared pool /* ???0x8c000000 granule?default buffer cache????????shared pool????inuse list */ AUTO SGA: Returning 4 from kmgs_process for request dc9c2628 AUTO SGA: Process req dc9c2628 ret 4, 1, 20a AUTO SGA: Init aft rsz for request 0xdc9c2628 AUTO SGA: Request 0xdc9c2628 after processing AUTO SGA: IMMEDIATE, FG request 0x7fffe9bcd0e0 AUTO SGA: Receiver of memory is shared pool, size=83, state=0, flg=1 AUTO SGA: Donor of memory is DEFAULT buffer cache, size=35, state=0, flg=1 AUTO SGA: Memory requested=4120, remaining=0 AUTO SGA: Memory received=14934016, minreq=4120, gransz=16777216 AUTO SGA: Request 0x7fffe9bcd0e0 status is COMPLETE /* ????partial transfer?? */ ?????partial transfer??????DUMP_TRANSFER_OPS????0x8c000000 partial granule???????,?: SQL> oradebug setmypid; Statement processed. SQL> oradebug dump DUMP_TRANSFER_OPS 1; Statement processed. SQL> oradebug tracefile_name; /s01/admin/G10R2/udump/g10r2_ora_21482.trc =======================trace content============================== GRANULE SIZE is 16777216 COMPONENT NAME : shared pool Number of granules in partially inuse list (listid 4) is 23 Granule addr is 0x8c000000 Granule owner is DEFAULT buffer cache /* ?0x8c000000 granule?shared pool?partially inuse list, ?????owner??default buffer cache */ Granule 0x8c000000 dump from owner perspective gptr = 0x8c000000, num buf hdrs = 1989, num buffers = 156, ghdr = 0x8cffe000 / * ?????granule?granule header????0x8cffe000, ????156?buffer block,1989?buffer header */ /* ??granule??????,??????buffer cache??shared pool chunk */ BH:0x8cf76018 BA:(nil) st:11 flg:20000 BH:0x8cf76128 BA:(nil) st:11 flg:20000 BH:0x8cf76238 BA:(nil) st:11 flg:20000 BH:0x8cf76348 BA:(nil) st:11 flg:20000 BH:0x8cf76458 BA:(nil) st:11 flg:20000 BH:0x8cf76568 BA:(nil) st:11 flg:20000 BH:0x8cf76678 BA:(nil) st:11 flg:20000 BH:0x8cf76788 BA:(nil) st:11 flg:20000 BH:0x8cf76898 BA:(nil) st:11 flg:20000 BH:0x8cf769a8 BA:(nil) st:11 flg:20000 BH:0x8cf76ab8 BA:(nil) st:11 flg:20000 BH:0x8cf76bc8 BA:(nil) st:11 flg:20000 BH:0x8cf76cd8 BA:0x8c018000 st:1 flg:622202 ............... Address 0x8cf30000 to 0x8cf74000 not in cache Address 0x8cf74000 to 0x8d000000 in cache Granule 0x8c000000 dump from receivers perspective Dumping layout Address 0x8c000000 to 0x8c018000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c018000 to 0x8c01a000 not in this pool Address 0x8c01a000 to 0x8c020000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c020000 to 0x8c022000 not in this pool Address 0x8c022000 to 0x8c032000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c032000 to 0x8c034000 not in this pool Address 0x8c034000 to 0x8c054000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c054000 to 0x8c056000 not in this pool Address 0x8c056000 to 0x8c09c000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c09c000 to 0x8c09e000 not in this pool Address 0x8c09e000 to 0x8c0b6000 in sga heap(1,3) (idx=1, dur=4) Address 0x8c0b6000 to 0x8c0b8000 not in this pool Address 0x8c0b8000 to 0x8c0d2000 in sga heap(1,3) (idx=1, dur=4) ???????granule?????shared granule??????,?????????buffer block,????1?shared subpool??????durtaion?4?chunk,duration=4?execution duration;??duration?chunk???????????,??extent???quiesce list??????????????free?execution duration?????????????,??????duration???extent(??????extent????granule)??????? ?????????????ASMM?????????,????: V$SGAINFODisplays summary information about the system global area (SGA). V$SGADisplays size information about the SGA, including the sizes of different SGA components, the granule size, and free memory. V$SGASTATDisplays detailed information about the SGA. V$SGA_DYNAMIC_COMPONENTSDisplays information about the dynamic SGA components. This view summarizes information based on all completed SGA resize operations since instance startup. V$SGA_DYNAMIC_FREE_MEMORYDisplays information about the amount of SGA memory available for future dynamic SGA resize operations. V$SGA_RESIZE_OPSDisplays information about the last 400 completed SGA resize operations. V$SGA_CURRENT_RESIZE_OPSDisplays information about SGA resize operations that are currently in progress. A resize operation is an enlargement or reduction of a dynamic SGA component. V$SGA_TARGET_ADVICEDisplays information that helps you tune SGA_TARGET. ?????????shared pool duration???,?????????

    Read the article

  • Is there a Windows equivalent for eventfd?

    - by Leaurus
    I am writing a cross-platform library which emulates sockets behaviour, having additional functionality in the between (App-mylib-sockets). I want it to be the most transparent possible for the programmer, so primitives like select and poll must work with accordingly with this lib. The problem is when data becomes available (for instance) in the real socket, it will have to go through a lot of processing, so if select points to the real socket fd, app will be blocked a lot of time. I want the select/poll to unblock only when data is ready to be consumed (after my lib has done all the processing). So I came across this eventfd which allows me to do exactly what I want, i.e. to manipule select/poll behaviour on a given fd. Since I am much more familiarized with Linux environment, I don't know what is the windows equivalent of eventfd. Tried to search but got no luck. Anyone can help me, please?

    Read the article

< Previous Page | 126 127 128 129 130 131 132 133 134 135 136 137  | Next Page >