Search Results

Search found 8328 results on 334 pages for 'dour high arch'.

Page 67/334 | < Previous Page | 63 64 65 66 67 68 69 70 71 72 73 74  | Next Page >

  • When spliting MP4s with ffmpeg how do I include metadata?

    - by Josh
    I have a few MP4s that i want to upload to my flickr account but they have a maximum size of 500mb as mine is only about 550 i was planing to simply split them in half then upload them, but i want to make sure all the meta data is included but it does not seem to be. I have tried each of the following with no luck, (at the end of this post i have the original and the new ffprobe outputs): ffmpeg -ss 00:00:00.00 -t 00:04:19.35 -i SANY0069.MP4 -acodec copy -vcodec copy -map_metadata 0:0 SANY0069A.MP4 ffmpeg -ss 00:00:00.00 -t 00:04:19.35 -i SANY0069.MP4 -acodec copy -vcodec copy -map_meta_data SANY0069.MP4:SANY0069A.MP4 SANY0069A.MP4 with the this one I manually produced the individual meta tags that i took from this command ffmpeg -i SANY0069A.MP4 -f ffmetadata meta.txt ffmpeg -ss 00:00:00.00 -t 00:04:19.35 -i SANY0069.MP4 -acodec copy -vcodec copy -metadata major_brand="mp42" -metadata minor_version="1" -metadata compatible_brands="mp42avc1" -metadata creation_time="2012-09-29 09:05:50" -metadata comment="SANYO DIGITAL CAMERA CA9" -metadata comment-eng="SANYO DIGITAL CAMERA CA9" SANY0069A.MP4 using the output of the former command i also tried this: ffmpeg -ss 00:00:00.00 -t 00:04:19.35 -i SANY0069.MP4 -acodec copy -vcodec copy -f ffmetadata -i meta.txt SANY0069A.MP4 Output: sample output from my first command: ffmpeg -ss 00:00:00.00 -t 00:04:19.35 -i SANY0069.MP4 -acodec copy -vcodec copy -map_metadata 0:0 SANY0069A.MP4 ffmpeg version 0.8.12, Copyright (c) 2000-2011 the FFmpeg developers built on Jun 13 2012 09:57:38 with gcc 4.6.3 20120306 (Red Hat 4.6.3-2) configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' --enable-bzlib --enable-libcelt --enable-libdc1394 --enable-libdirac --enable-libfreetype --enable-libgsm --enable-libmp3lame --enable-libopenjpeg --enable-librtmp --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxvid --enable-x11grab --enable-avfilter --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-runtime-cpudetect libavutil 51. 9. 1 / 51. 9. 1 libavcodec 53. 8. 0 / 53. 8. 0 libavformat 53. 5. 0 / 53. 5. 0 libavdevice 53. 1. 1 / 53. 1. 1 libavfilter 2. 23. 0 / 2. 23. 0 libswscale 2. 0. 0 / 2. 0. 0 libpostproc 51. 2. 0 / 51. 2. 0 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'SANY0069.MP4': Metadata: major_brand : mp42 minor_version : 1 compatible_brands: mp42avc1 creation_time : 2012-09-29 09:05:50 comment : SANYO DIGITAL CAMERA CA9 comment-eng : SANYO DIGITAL CAMERA CA9 Duration: 00:08:38.71, start: 0.000000, bitrate: 9142 kb/s Stream #0.0(eng): Video: h264 (Constrained Baseline), yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 9007 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc Metadata: creation_time : 2012-09-29 09:05:50 Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, s16, 127 kb/s Metadata: creation_time : 2012-09-29 09:05:50 File 'SANY0069A.MP4' already exists. Overwrite ? [y/N] y Output #0, mp4, to 'SANY0069A.MP4': Metadata: major_brand : mp42 minor_version : 1 compatible_brands: mp42avc1 creation_time : 2012-09-29 09:05:50 comment : SANYO DIGITAL CAMERA CA9 comment-eng : SANYO DIGITAL CAMERA CA9 encoder : Lavf53.5.0 Stream #0.0(eng): Video: libx264, yuv420p, 1280x720 [PAR 1:1 DAR 16:9], q=2-31, 9007 kb/s, 30k tbn, 29.97 tbc Metadata: creation_time : 2012-09-29 09:05:50 Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, 127 kb/s Metadata: creation_time : 2012-09-29 09:05:50 Stream mapping: Stream #0.0 -> #0.0 Stream #0.1 -> #0.1 Press [q] to stop, [?] for help frame= 7773 fps=4644 q=-1.0 Lsize= 289607kB time=00:04:19.35 bitrate=9147.4kbits/s video:285416kB audio:4033kB global headers:0kB muxing overhead 0.054571% and finaly, when i compare the ffprobe of the original and the first split part i get the 2 following outputs: original ffprobe version 0.8.12, Copyright (c) 2007-2011 the FFmpeg developers built on Jun 13 2012 09:57:38 with gcc 4.6.3 20120306 (Red Hat 4.6.3-2) configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' --enable-bzlib --enable-libcelt --enable-libdc1394 --enable-libdirac --enable-libfreetype --enable-libgsm --enable-libmp3lame --enable-libopenjpeg --enable-librtmp --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxvid --enable-x11grab --enable-avfilter --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-runtime-cpudetect libavutil 51. 9. 1 / 51. 9. 1 libavcodec 53. 8. 0 / 53. 8. 0 libavformat 53. 5. 0 / 53. 5. 0 libavdevice 53. 1. 1 / 53. 1. 1 libavfilter 2. 23. 0 / 2. 23. 0 libswscale 2. 0. 0 / 2. 0. 0 libpostproc 51. 2. 0 / 51. 2. 0 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'SANY0069.MP4': Metadata: major_brand : mp42 minor_version : 1 compatible_brands: mp42avc1 creation_time : 2012-09-29 09:05:50 comment : SANYO DIGITAL CAMERA CA9 comment-eng : SANYO DIGITAL CAMERA CA9 Duration: 00:08:38.71, start: 0.000000, bitrate: 9142 kb/s Stream #0.0(eng): Video: h264 (Constrained Baseline), yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 9007 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc Metadata: creation_time : 2012-09-29 09:05:50 Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, s16, 127 kb/s Metadata: creation_time : 2012-09-29 09:05:50 Split ffprobe version 0.8.12, Copyright (c) 2007-2011 the FFmpeg developers built on Jun 13 2012 09:57:38 with gcc 4.6.3 20120306 (Red Hat 4.6.3-2) configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' --enable-bzlib --enable-libcelt --enable-libdc1394 --enable-libdirac --enable-libfreetype --enable-libgsm --enable-libmp3lame --enable-libopenjpeg --enable-librtmp --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxvid --enable-x11grab --enable-avfilter --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-runtime-cpudetect libavutil 51. 9. 1 / 51. 9. 1 libavcodec 53. 8. 0 / 53. 8. 0 libavformat 53. 5. 0 / 53. 5. 0 libavdevice 53. 1. 1 / 53. 1. 1 libavfilter 2. 23. 0 / 2. 23. 0 libswscale 2. 0. 0 / 2. 0. 0 libpostproc 51. 2. 0 / 51. 2. 0 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'SANY0069A.MP4': Metadata: major_brand : isom minor_version : 512 compatible_brands: isomiso2avc1mp41 creation_time : 1970-01-01 00:00:00 encoder : Lavf53.5.0 comment : SANYO DIGITAL CAMERA CA9 Duration: 00:04:19.37, start: 0.000000, bitrate: 9146 kb/s Stream #0.0(eng): Video: h264 (Constrained Baseline), yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 9015 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc Metadata: creation_time : 1970-01-01 00:00:00 Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, s16, 127 kb/s Metadata: creation_time : 1970-01-01 00:00:00 I know this is incredibly long but its actually a quite simple question. I thought it would be best to provide as much detail as possible. any advice here would be great, Thanks

    Read the article

  • GUI tool to extract iso file contents on Linux (KDE)

    - by extropy
    I have an ISO CD image file and want to extract it's contents to a folder. I know there are ways to mount the image and stuff, but it's complicated. I'm looking for a GUI tool to open up the contets and extract needed files. On windows I would use WinRar to do this. K3B only allows me to burn the stuff, Arch does not work with ISO files :( Is there a similar tool on Linux, preferably from KDE world?

    Read the article

  • Duplication of Architecture State means physically extra?

    - by Doopy Doo
    Hyper-Threading Technology makes a single physical processor appear as two logical processors; the physical execution resources are shared and the architecture state is duplicated for the two logical processors. So, this means that there are two sets of basic registers such as Next Instruction Pointer, processor registers like AX, BX, CX etc physically embedded in the micro-processor chip, OR they(arch. state) are made to look two sets by some low level duplication by software/OS.

    Read the article

  • Why do I get different openssl versions?

    - by CoCoMonk
    I'm trying to check if I have the latest OpenSSL, my main concern in the heartbleed bug. I tried 2 commands: openssl version yum info openssl openssl version output OpenSSL 1.0.1e-fips 11 Feb 2013 yum info openssl output Installed Packages Name : openssl Arch : x86_64 Version : 1.0.1e Release : 16.el6_5.14 ... I have a couple of questions: Why do I get different versions from these 2 commands? How do I check the heartbleed vulnerability without having the 443 port open?

    Read the article

  • xmonad + urxvt issue: text disappears after resizing

    - by user1212010
    I'm using Arch Linux + xmonad + urxvt bundle and trying to resolve the conflict between xmonad and urxvt. Better to explain with shots: Firstly, open the terminal and get some full-length output. Secondly, create another window, which squeezes the first. And, finally, close it to find out that half of the output disappeared. Sometimes it behaves correctly, sometimes not. Tried to find out why, but failed. Many thx in advance!

    Read the article

  • How to disable Skype from using system tray (notification area) on Linux?

    - by Ivan
    I use AWN dock on Linux (which behaves much like Windows 7 panes, using one icon for launching an application and managing its windows). It loses Skype (as well as any other application) when it goes minimized to system tray (notification area). Can I disable Skype from minimizing to system tray favouring Win7-like behaviour? Skype version I use is 2.1 beta, but I would not mind reinstalling it in favour of another version. I use Arch Linux with all the latest kernel, x.org and xfce.

    Read the article

  • Google chrome proxy authentication dialogue timeout

    - by Nihar Sarangi
    I am on a network that uses LDAP proxy for authentication based on a username and password. Whenever I start Google Chrome, it pops up with a proxy authentication dialogue, but the dialogue disappears automatically after variable amount of time (sometimes it stays for 5 seconds some times less than 1 second). I have found the same issue with Chromium also. Is there any configuration I can set to control this timeout, or say, auto-authenticate with my authentication details from the shell or DE (Gnome3 on Arch)?

    Read the article

  • SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

    - by pinaldave
    Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • MySQL – Scalability on Amazon RDS: Scale out to multiple RDS instances

    - by Pinal Dave
    Today, I’d like to discuss getting better MySQL scalability on Amazon RDS. The question of the day: “What can you do when a MySQL database needs to scale write-intensive workloads beyond the capabilities of the largest available machine on Amazon RDS?” Let’s take a look. In a typical EC2/RDS set-up, users connect to app servers from their mobile devices and tablets, computers, browsers, etc.  Then app servers connect to an RDS instance (web/cloud services) and in some cases they might leverage some read-only replicas.   Figure 1. A typical RDS instance is a single-instance database, with read replicas.  This is not very good at handling high write-based throughput. As your application becomes more popular you can expect an increasing number of users, more transactions, and more accumulated data.  User interactions can become more challenging as the application adds more sophisticated capabilities. The result of all this positive activity: your MySQL database will inevitably begin to experience scalability pressures. What can you do? Broadly speaking, there are four options available to improve MySQL scalability on RDS. 1. Larger RDS Instances – If you’re not already using the maximum available RDS instance, you can always scale up – to larger hardware.  Bigger CPUs, more compute power, more memory et cetera. But the largest available RDS instance is still limited.  And they get expensive. “High-Memory Quadruple Extra Large DB Instance”: 68 GB of memory 26 ECUs (8 virtual cores with 3.25 ECUs each) 64-bit platform High I/O Capacity Provisioned IOPS Optimized: 1000Mbps 2. Provisioned IOPs – You can get provisioned IOPs and higher throughput on the I/O level. However, there is a hard limit with a maximum instance size and maximum number of provisioned IOPs you can buy from Amazon and you simply cannot scale beyond these hardware specifications. 3. Leverage Read Replicas – If your application permits, you can leverage read replicas to offload some reads from the master databases. But there are a limited number of replicas you can utilize and Amazon generally requires some modifications to your existing application. And read-replicas don’t help with write-intensive applications. 4. Multiple Database Instances – Amazon offers a fourth option: “You can implement partitioning,thereby spreading your data across multiple database Instances” (Link) However, Amazon does not offer any guidance or facilities to help you with this. “Multiple database instances” is not an RDS feature.  And Amazon doesn’t explain how to implement this idea. In fact, when asked, this is the response on an Amazon forum: Q: Is there any documents that describe the partition DB across multiple RDS? I need to use DB with more 1TB but exist a limitation during the create process, but I read in the any FAQ that you need to partition database, but I don’t find any documents that describe it. A: “DB partitioning/sharding is not an official feature of Amazon RDS or MySQL, but a technique to scale out database by using multiple database instances. The appropriate way to split data depends on the characteristics of the application or data set. Therefore, there is no concrete and specific guidance.” So now what? The answer is to scale out with ScaleBase. Amazon RDS with ScaleBase: What you get – MySQL Scalability! ScaleBase is specifically designed to scale out a single MySQL RDS instance into multiple MySQL instances. Critically, this is accomplished with no changes to your application code.  Your application continues to “see” one database.   ScaleBase does all the work of managing and enforcing an optimized data distribution policy to create multiple MySQL instances. With ScaleBase, data distribution, transactions, concurrency control, and two-phase commit are all 100% transparent and 100% ACID-compliant, so applications, services and tooling continue to interact with your distributed RDS as if it were a single MySQL instance. The result: now you can cost-effectively leverage multiple MySQL RDS instance to scale out write-intensive workloads to an unlimited number of users, transactions, and data. Amazon RDS with ScaleBase: What you keep – Everything! And how does this change your Amazon environment? 1. Keep your application, unchanged – There is no change your application development life-cycle at all.  You still use your existing development tools, frameworks and libraries.  Application quality assurance and testing cycles stay the same. And, critically, you stay with an ACID-compliant MySQL environment. 2. Keep your RDS value-added services – The value-added services that you rely on are all still available. Amazon will continue to handle database maintenance and updates for you. You can still leverage High Availability via Multi A-Z.  And, if it benefits youra application throughput, you can still use read replicas. 3. Keep your RDS administration – Finally the RDS monitoring and provisioning tools you rely on still work as they did before. With your one large MySQL instance, now split into multiple instances, you can actually use less expensive, smallersmaller available RDS hardware and continue to see better database performance. Conclusion Amazon RDS is a tremendous service, but it doesn’t offer solutions to scale beyond a single MySQL instance. Larger RDS instances get more expensive.  And when you max-out on the available hardware, you’re stuck.  Amazon recommends scaling out your single instance into multiple instances for transaction-intensive apps, but offers no services or guidance to help you. This is where ScaleBase comes in to save the day. It gives you a simple and effective way to create multiple MySQL RDS instances, while removing all the complexities typically caused by “DIY” sharding andwith no changes to your applications . With ScaleBase you continue to leverage the AWS/RDS ecosystem: commodity hardware and value added services like read replicas, multi A-Z, maintenance/updates and administration with monitoring tools and provisioning. SCALEBASE ON AMAZON If you’re curious to try ScaleBase on Amazon, it can be found here – Download NOW. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • General monitoring for SQL Server Analysis Services using Performance Monitor

    - by Testas
    A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs.     File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor,  a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used……  Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total  The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above  25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file 

    Read the article

  • How to Achieve OC4J RMI Load Balancing

    - by fip
    This is an old, Oracle SOA and OC4J 10G topic. In fact this is not even a SOA topic per se. Questions of RMI load balancing arise when you developed custom web applications accessing human tasks running off a remote SOA 10G cluster. Having returned from a customer who faced challenges with OC4J RMI load balancing, I felt there is still some confusions in the field how OC4J RMI load balancing work. Hence I decide to dust off an old tech note that I wrote a few years back and share it with the general public. Here is the tech note: Overview A typical use case in Oracle SOA is that you are building web based, custom human tasks UI that will interact with the task services housed in a remote BPEL 10G cluster. Or, in a more generic way, you are just building a web based application in Java that needs to interact with the EJBs in a remote OC4J cluster. In either case, you are talking to an OC4J cluster as RMI client. Then immediately you must ask yourself the following questions: 1. How do I make sure that the web application, as an RMI client, even distribute its load against all the nodes in the remote OC4J cluster? 2. How do I make sure that the web application, as an RMI client, is resilient to the node failures in the remote OC4J cluster, so that in the unlikely case when one of the remote OC4J nodes fail, my web application will continue to function? That is the topic of how to achieve load balancing with OC4J RMI client. Solutions You need to configure and code RMI load balancing in two places: 1. Provider URL can be specified with a comma separated list of URLs, so that the initial lookup will land to one of the available URLs. 2. Choose a proper value for the oracle.j2ee.rmi.loadBalance property, which, along side with the PROVIDER_URL property, is one of the JNDI properties passed to the JNDI lookup.(http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI) More details below: About the PROVIDER_URL The JNDI property java.name.provider.url's job is, when the client looks up for a new context at the very first time in the client session, to provide a list of RMI context The value of the JNDI property java.name.provider.url goes by the format of a single URL, or a comma separate list of URLs. A single URL. For example: opmn:ormi://host1:6003:oc4j_instance1/appName1 A comma separated list of multiple URLs. For examples:  opmn:ormi://host1:6003:oc4j_instanc1/appName, opmn:ormi://host2:6003:oc4j_instance1/appName, opmn:ormi://host3:6003:oc4j_instance1/appName When the client looks up for a new Context the very first time in the client session, it sends a query against the OPMN referenced by the provider URL. The OPMN host and port specifies the destination of such query, and the OC4J instance name and appName are actually the “where clause” of the query. When the PROVIDER URL reference a single OPMN server Let's consider the case when the provider url only reference a single OPMN server of the destination cluster. In this case, that single OPMN server receives the query and returns a list of the qualified Contexts from all OC4Js within the cluster, even though there is a single OPMN server in the provider URL. A context represent a particular starting point at a particular server for subsequent object lookup. For example, if the URL is opmn:ormi://host1:6003:oc4j_instance1/appName, then, OPMN will return the following contexts: appName on oc4j_instance1 on host1 appName on oc4j_instance1 on host2, appName on oc4j_instance1 on host3,  (provided that host1, host2, host3 are all in the same cluster) Please note that One OPMN will be sufficient to find the list of all contexts from the entire cluster that satisfy the JNDI lookup query. You can do an experiment by shutting down appName on host1, and observe that OPMN on host1 will still be able to return you appname on host2 and appName on host3. When the PROVIDER URL reference a comma separated list of multiple OPMN servers When the JNDI propery java.naming.provider.url references a comma separated list of multiple URLs, the lookup will return the exact same things as with the single OPMN server: a list of qualified Contexts from the cluster. The purpose of having multiple OPMN servers is to provide high availability in the initial context creation, such that if OPMN at host1 is unavailable, client will try the lookup via OPMN on host2, and so on. After the initial lookup returns and cache a list of contexts, the JNDI URL(s) are no longer used in the same client session. That explains why removing the 3rd URL from the list of JNDI URLs will not stop the client from getting the EJB on the 3rd server. About the oracle.j2ee.rmi.loadBalance Property After the client acquires the list of contexts, it will cache it at the client side as “list of available RMI contexts”.  This list includes all the servers in the destination cluster. This list will stay in the cache until the client session (JVM) ends. The RMI load balancing against the destination cluster is happening at the client side, as the client is switching between the members of the list. Whether and how often the client will fresh the Context from the list of Context is based on the value of the  oracle.j2ee.rmi.loadBalance. The documentation at http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI list all the available values for the oracle.j2ee.rmi.loadBalance. Value Description client If specified, the client interacts with the OC4J process that was initially chosen at the first lookup for the entire conversation. context Used for a Web client (servlet or JSP) that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be returned each time InitialContext() is invoked. lookup Used for a standalone client that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be created each time the client calls Context.lookup(). Please note the regardless of the setting of oracle.j2ee.rmi.loadBalance property, the “refresh” only occurs at the client. The client can only choose from the "list of available context" that was returned and cached from the very first lookup. That is, the client will merely get a new Context object from the “list of available RMI contexts” from the cache at the client side. The client will NOT go to the OPMN server again to get the list. That also implies that if you are adding a node to the server cluster AFTER the client’s initial lookup, the client would not know it because neither the server nor the client will initiate a refresh of the “list of available servers” to reflect the new node. About High Availability (i.e. Resilience Against Node Failure of Remote OC4J Cluster) What we have discussed above is about load balancing. Let's also discuss high availability. This is how the High Availability works in RMI: when the client use the context but get an exception such as socket is closed, it knows that the server referenced by that Context is problematic and will try to get another unused Context from the “list of available contexts”. Again, this list is the list that was returned and cached at the very first lookup in the entire client session.

    Read the article

  • People, Process & Engagement: WebCenter Partner Keste

    - by Michael Snow
    v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Within the WebCenter group here at Oracle, discussions about people, process and engagement cross over many vertical industries and products. Amidst our growing partner ecosystem, the community provides us insight into great customer use cases every day. Such is the case with our partner, Keste, who provides us a guest post on our blog today with an overview of their innovative solution for a customer in the transportation industry. Keste is an Oracle software solutions and development company headquartered in Dallas, Texas. As a Platinum member of the Oracle® PartnerNetwork, Keste designs, develops and deploys custom solutions that automate complex business processes. Seamless Customer Self-Service Experience in the Trucking Industry with Oracle WebCenter Portal  Keste, Oracle Platinum Partner Customer Overview Omnitracs, Inc., a Qualcomm company provides mobility solutions for trucking fleets to companies in the transportation industry. Omnitracs’ mobility services include basic communications such as text as well as advanced monitoring services such as GPS tracking, temperature tracking of perishable goods, load tracking and weighting distribution, and many others. Customer Business Needs Already the leading provider of mobility solutions for large trucking fleets, they chose to target smaller trucking fleets as new customers. However their existing high-touch customer support method would not be a cost effective or scalable method to manage and service these smaller customers. Omnitracs needed to provide several self-service features to make customer support more scalable while keeping customer satisfaction levels high and the costs manageable. The solution also had to be very intuitive and easy to use. The systems that Omnitracs sells to these trucking customers require professional installation and smaller customers need to track and schedule the installation. Information captured in Oracle eBusiness Suite needed to be readily available for new customers to track these purchases and delivery details. Omnitracs wanted a high impact User Interface to significantly improve customer experience with the ability to integrate with EBS, provisioning systems as well as CRM systems that were already implemented. Omnitracs also wanted to build an architecture platform that could potentially be extended to other Portals. Omnitracs’ stated goal was to deliver an “eBay-like” or “Amazon-like” experience for all of their customers so that they could reach a much broader market beyond their large company customer base. Solution Overview In order to manage the increased complexity, the growing support needs of global customers and improve overall product time-to-market in a cost-effective manner, IT began to deliver a self-service model. This self service model not only transformed numerous business processes but is also allowing the business to keep up with the growing demands of the (internal and external) customers. This solution was a customer service Portal that provided self service capabilities for large and small customers alike for Activation of mobility products, managing add-on applications for the devices (much like the Apple App Store), transferring services when trucks are sold to other companies as well as deactivation all without the involvement of a call service agent or sending multiple emails to different Omnitracs contacts. This is a conceptual view of the Customer Portal showing the details of the components that make up the solution. 12.00 The portal application for transactions was entirely built using ADF 11g R2. Omnitracs’ business had a pressing requirement to have a portal available 24/7 for its customers. Since there were interactions with EBS in the back-end, the downtimes on the EBS would negate this availability. Omnitracs devised a decoupling strategy at the database side for the EBS data. The decoupling of the database was done using Oracle Data Guard and completely insulated the solution from any eBusiness Suite down time. The customer has no knowledge whether eBS is running or not. Here are two sample screenshots of the portal application built in Oracle ADF. Customer Benefits The Customer Portal not only provided the scalability to grow the business but also provided the seamless integration with other disparate applications. Some of the key benefits are: Improved Customer Experience: With a modern look and feel and a Portal that has the aspects of an App Store, the customer experience was significantly improved. Page response times went from several seconds to sub-second for all of the pages. Enabled new product launches: After successfully dominating the large fleet market, Omnitracs now has a scalable solution to sell and manage smaller fleet customers giving them a huge advantage over their nearest competitors. Dozens of new customers have been acquired via this portal through an onboarding process that now takes minutes Seamless Integrations Improves Customer Support: ADF 11gR2 allowed Omnitracs to bring a diverse list of applications into one integrated solution. This provided a seamless experience for customers to route them from Marketing focused application to a customer-oriented portal. Internally, it also allowed Sales Representatives to have an integrated flow for taking a prospect through the various steps to onboard them as a customer. Key integrations included: Unity Core Salesforce.com Merchant e-Solution for credit card Custom Omnitracs Applications like CUPS and AUTO Security utilizing OID and OVD Back end integration with EBS (Data Guard) and iQ Database Business Impact Significant business impacts were realized through the launch of customer portal. It not only allows the business to push through in underserved segments, but also reduces the time it needs to spend on customer support—allowing the business to focus more on sales and identifying the market for new products. Some of the Immediate Benefits are The entire onboarding process is now completely automated and now completes in minutes. This represents an 85% productivity improvement over their previous processes. And it was 160 times faster! With the success of this self-service solution, the business is now targeting about 3X customer growth in the next five years. This represents a tripling of their overall customer base and significant downstream revenue for the ongoing services. 90%+ improvement of customer onboarding and management process by utilizing, single sign on integration using OID/OAM solution, performance improvements and new self-service functionality Unified login for all Customers, Partners and Internal Users enables login to a common portal and seamless access to all other integrated applications targeted at the respective audience Significantly improved customer experience with a better look and feel with a more user experience focused Portal screens. Helped sales of the new product by having an easy way of ordering and activating the product. Data Guard helped increase availability of the Portal to 99%+ and make it independent of EBS downtime. This gave customers the feel of high availability of the portal application. Some of the anticipated longer term Benefits are: Platform that can be leveraged to launch any new product introduction and enable all product teams to reach new customers and new markets Easy integration with content management to allow business owners more control of the product catalog Overall reduced TCO with standardization of the Oracle platform Managed IT support cost savings through optimization of technology skills needed to support and modify this solution ------------------------------------------------------------ 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 -"/ /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Times New Roman","serif";}

    Read the article

  • Best Practices - which domain types should be used to run applications

    - by jsavit
    This post is one of a series of "best practices" notes for Oracle VM Server for SPARC (formerly named Logical Domains) One question that frequently comes up is "which types of domain should I use to run applications?" There used to be a simple answer in most cases: "only run applications in guest domains", but enhancements to T-series servers, Oracle VM Server for SPARC and the advent of SPARC SuperCluster have made this question more interesting and worth qualifying differently. This article reviews the relevant concepts and provides suggestions on where to deploy applications in a logical domains environment. Review: division of labor and types of domain Oracle VM Server for SPARC offloads many functions from the hypervisor to domains (also called virtual machines). This is a modern alternative to using a "thick" hypervisor that provides all virtualization functions, as in traditional VM designs, This permits a simpler hypervisor design, which enhances reliability, and security. It also reduces single points of failure by assigning responsibilities to multiple system components, which further improves reliability and security. In this architecture, management and I/O functionality are provided within domains. Oracle VM Server for SPARC does this by defining the following types of domain, each with their own roles: Control domain - management control point for the server, used to configure domains and manage resources. It is the first domain to boot on a power-up, is an I/O domain, and is usually a service domain as well. I/O domain - has been assigned physical I/O devices: a PCIe root complex, a PCI device, or a SR-IOV (single-root I/O Virtualization) function. It has native performance and functionality for the devices it owns, unmediated by any virtualization layer. Service domain - provides virtual network and disk devices to guest domains. Guest domain - a domain whose devices are all virtual rather than physical: virtual network and disk devices provided by one or more service domains. In common practice, this is where applications are run. Typical deployment A service domain is generally also an I/O domain: otherwise it wouldn't have access to physical device "backends" to offer to its clients. Similarly, an I/O domain is also typically a service domain in order to leverage the available PCI busses. Control domains must be I/O domains, because they boot up first on the server and require physical I/O. It's typical for the control domain to also be a service domain too so it doesn't "waste" the I/O resources it uses. A simple configuration consists of a control domain, which is also the one I/O and service domain, and some number of guest domains using virtual I/O. In production, customers typically use multiple domains with I/O and service roles to eliminate single points of failure: guest domains have virtual disk and virtual devices provisioned from more than one service domain, so failure of a service domain or I/O path or device doesn't result in an application outage. This is also used for "rolling upgrades" in which service domains are upgraded one at a time while their guests continue to operate without disruption. (It should be noted that resiliency to I/O device failures can also be provided by the single control domain, using multi-path I/O) In this type of deployment, control, I/O, and service domains are used for virtualization infrastructure, while applications run in guest domains. Changing application deployment patterns The above model has been widely and successfully used, but more configuration options are available now. Servers got bigger than the original T2000 class machines with 2 I/O busses, so there is more I/O capacity that can be used for applications. Increased T-series server capacity made it attractive to run more vertical applications, such as databases, with higher resource requirements than the "light" applications originally seen. This made it attractive to run applications in I/O domains so they could get bare-metal native I/O performance. This is leveraged by the SPARC SuperCluster engineered system, announced a year ago at Oracle OpenWorld. In SPARC SuperCluster, I/O domains are used for high performance applications, with native I/O performance for disk and network and optimized access to the Infiniband fabric. Another technical enhancement is the introduction of Direct I/O (DIO) and Single Root I/O Virtualization (SR-IOV), which make it possible to give domains direct connections and native I/O performance for selected I/O devices. A domain with either a DIO or SR-IOV device is an I/O domain. In summary: not all I/O domains own PCI complexes, and there are increasingly more I/O domains that are not service domains. They use their I/O connectivity for performance for their own applications. However, there are some limitations and considerations: at this time, a domain using physical I/O cannot be live-migrated to another server. There is also a need to plan for security and introducing unneeded dependencies: if an I/O domain is also a service domain providing virtual I/O go guests, it has the ability to affect the correct operation of its client guest domains. This is even more relevant for the control domain. where the ldm has to be protected from unauthorized (or even mistaken) use that would affect other domains. As a general rule, running applications in the service domain or the control domain should be avoided. To recap: Guest domains with virtual I/O still provide the greatest operational flexibility, including features like live migration. I/O domains can be used for applications with high performance requirements. This is used to great effect in SPARC SuperCluster and in general T4 deployments. Direct I/O (DIO) and Single Root I/O Virtualization (SR-IOV) make this more attractive by giving direct I/O access to more domains. Service domains should in general not be used for applications, because compromised security in the domain, or an outage, can affect other domains that depend on it. This concern can be mitigated by providing guests' their virtual I/O from more than one service domain, so an interruption of service in the service domain does not cause an application outage. The control domain should in general not be used to run applications, for the same reason. SPARC SuperCluster use the control domain for applications, but it is an exception: it's not a general purpose environment; it's an engineered system with specifically configured applications and optimization for optimal performance. These are recommended "best practices" based on conversations with a number of Oracle architects. Keep in mind that "one size does not fit all", so you should evaluate these practices in the context of your own requirements. Summary Higher capacity T-series servers have made it more attractive to use them for applications with high resource requirements. New deployment models permit native I/O performance for demanding applications by running them in I/O domains with direct access to their devices. This is leveraged in SPARC SuperCluster, and can be leveraged in T-series servers to provision high-performance applications running in domains. Carefully planned, this can be used to provide higher performance for critical applications.

    Read the article

  • How to read oom-killer syslog messages?

    - by Grant
    I have a Ubuntu 12.04 server which sometimes dies completely - no SSH, no ping, nothing until it is physically rebooted. After the reboot, I see in syslog that the oom-killer killed, well, pretty much everything. There's a lot of detailed memory usage information in them. How do I read these logs to see what caused the OOM issue? The server has far more memory than it needs, so it shouldn't be running out of memory. Oct 25 07:28:04 nldedip4k031 kernel: [87946.529511] oom_kill_process: 9 callbacks suppressed Oct 25 07:28:04 nldedip4k031 kernel: [87946.529514] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529516] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529518] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:04 nldedip4k031 kernel: [87946.529519] Call Trace: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529525] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529528] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529530] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529532] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529535] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529537] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529541] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529543] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529546] [] vfs_read+0x8c/0x160 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529548] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529550] [] sys_read+0x3d/0x70 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529554] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529555] Mem-Info: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529556] DMA per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529557] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529558] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529560] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529561] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529562] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529563] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529564] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529565] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529566] Normal per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529567] CPU 0: hi: 186, btch: 31 usd: 179 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529568] CPU 1: hi: 186, btch: 31 usd: 182 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529569] CPU 2: hi: 186, btch: 31 usd: 132 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529570] CPU 3: hi: 186, btch: 31 usd: 175 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529571] CPU 4: hi: 186, btch: 31 usd: 91 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529572] CPU 5: hi: 186, btch: 31 usd: 173 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529573] CPU 6: hi: 186, btch: 31 usd: 159 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529574] CPU 7: hi: 186, btch: 31 usd: 164 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529575] HighMem per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529576] CPU 0: hi: 186, btch: 31 usd: 165 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529577] CPU 1: hi: 186, btch: 31 usd: 183 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529578] CPU 2: hi: 186, btch: 31 usd: 185 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529579] CPU 3: hi: 186, btch: 31 usd: 138 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529580] CPU 4: hi: 186, btch: 31 usd: 155 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529581] CPU 5: hi: 186, btch: 31 usd: 104 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529582] CPU 6: hi: 186, btch: 31 usd: 133 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529583] CPU 7: hi: 186, btch: 31 usd: 170 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529586] active_anon:5523 inactive_anon:354 isolated_anon:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529586] active_file:2815 inactive_file:6849119 isolated_file:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529587] unevictable:0 dirty:449 writeback:10 unstable:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529587] free:1304125 slab_reclaimable:104672 slab_unreclaimable:3419 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529588] mapped:2661 shmem:138 pagetables:313 bounce:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529591] DMA free:4252kB min:780kB low:972kB high:1168kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11564kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1 all_unreclaimable? yes Oct 25 07:28:04 nldedip4k031 kernel: [87946.529594] lowmem_reserve[]: 0 869 32460 32460 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529599] Normal free:44052kB min:44216kB low:55268kB high:66324kB active_anon:0kB inactive_anon:0kB active_file:616kB inactive_file:568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:890008kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:407124kB slab_unreclaimable:13672kB kernel_stack:992kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2083 all_unreclaimable? yes Oct 25 07:28:04 nldedip4k031 kernel: [87946.529602] lowmem_reserve[]: 0 0 252733 252733 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529606] HighMem free:5168196kB min:512kB low:402312kB high:804112kB active_anon:22092kB inactive_anon:1416kB active_file:10640kB inactive_file:27395920kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32349872kB mlocked:0kB dirty:1796kB writeback:40kB mapped:10640kB shmem:552kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1252kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Oct 25 07:28:04 nldedip4k031 kernel: [87946.529609] lowmem_reserve[]: 0 0 0 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529611] DMA: 6*4kB 6*8kB 6*16kB 5*32kB 5*64kB 4*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 4232kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529616] Normal: 297*4kB 180*8kB 119*16kB 73*32kB 67*64kB 47*128kB 35*256kB 13*512kB 5*1024kB 1*2048kB 1*4096kB = 44052kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529622] HighMem: 1*4kB 6*8kB 27*16kB 11*32kB 2*64kB 1*128kB 0*256kB 0*512kB 4*1024kB 1*2048kB 1260*4096kB = 5168196kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529627] 6852076 total pagecache pages Oct 25 07:28:04 nldedip4k031 kernel: [87946.529628] 0 pages in swap cache Oct 25 07:28:04 nldedip4k031 kernel: [87946.529629] Swap cache stats: add 0, delete 0, find 0/0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529630] Free swap = 3998716kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529631] Total swap = 3998716kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.571914] 8437743 pages RAM Oct 25 07:28:04 nldedip4k031 kernel: [87946.571916] 8209409 pages HighMem Oct 25 07:28:04 nldedip4k031 kernel: [87946.571917] 159556 pages reserved Oct 25 07:28:04 nldedip4k031 kernel: [87946.571917] 6862034 pages shared Oct 25 07:28:04 nldedip4k031 kernel: [87946.571918] 123540 pages non-shared Oct 25 07:28:04 nldedip4k031 kernel: [87946.571919] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name Oct 25 07:28:04 nldedip4k031 kernel: [87946.571927] [ 421] 0 421 709 152 3 0 0 upstart-udev-br Oct 25 07:28:04 nldedip4k031 kernel: [87946.571929] [ 429] 0 429 773 326 5 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571931] [ 567] 0 567 772 224 4 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571932] [ 568] 0 568 772 231 7 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571934] [ 764] 0 764 712 103 1 0 0 upstart-socket- Oct 25 07:28:04 nldedip4k031 kernel: [87946.571936] [ 772] 103 772 815 164 5 0 0 dbus-daemon Oct 25 07:28:04 nldedip4k031 kernel: [87946.571938] [ 785] 0 785 1671 600 1 -17 -1000 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571940] [ 809] 101 809 7766 380 1 0 0 rsyslogd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571942] [ 869] 0 869 1158 213 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571943] [ 873] 0 873 1158 214 6 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571945] [ 911] 0 911 1158 215 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571947] [ 912] 0 912 1158 214 2 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571949] [ 914] 0 914 1158 213 1 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571950] [ 916] 0 916 618 86 1 0 0 atd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571952] [ 917] 0 917 655 226 3 0 0 cron Oct 25 07:28:04 nldedip4k031 kernel: [87946.571954] [ 948] 0 948 902 159 3 0 0 irqbalance Oct 25 07:28:04 nldedip4k031 kernel: [87946.571956] [ 993] 0 993 1145 363 3 0 0 master Oct 25 07:28:04 nldedip4k031 kernel: [87946.571957] [ 1002] 104 1002 1162 333 1 0 0 qmgr Oct 25 07:28:04 nldedip4k031 kernel: [87946.571959] [ 1016] 0 1016 730 149 2 0 0 mdadm Oct 25 07:28:04 nldedip4k031 kernel: [87946.571961] [ 1057] 0 1057 6066 2160 3 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571963] [ 1086] 0 1086 1158 213 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571965] [ 1088] 33 1088 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571967] [ 1089] 33 1089 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571969] [ 1090] 33 1090 6175 1451 3 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571971] [ 1091] 33 1091 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571972] [ 1092] 33 1092 6191 1451 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571974] [ 1109] 33 1109 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571976] [ 1151] 33 1151 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571978] [ 1201] 104 1201 1803 652 1 0 0 tlsmgr Oct 25 07:28:04 nldedip4k031 kernel: [87946.571980] [ 2475] 0 2475 2435 812 0 0 0 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571982] [ 2494] 0 2494 1745 839 1 0 0 bash Oct 25 07:28:04 nldedip4k031 kernel: [87946.571984] [ 2573] 0 2573 3394 1689 0 0 0 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571986] [ 2589] 0 2589 5014 457 3 0 0 rsync Oct 25 07:28:04 nldedip4k031 kernel: [87946.571988] [ 2590] 0 2590 7970 522 1 0 0 rsync Oct 25 07:28:04 nldedip4k031 kernel: [87946.571990] [ 2652] 104 2652 1150 326 5 0 0 pickup Oct 25 07:28:04 nldedip4k031 kernel: [87946.571992] Out of memory: Kill process 421 (upstart-udev-br) score 1 or sacrifice child Oct 25 07:28:04 nldedip4k031 kernel: [87946.572407] Killed process 421 (upstart-udev-br) total-vm:2836kB, anon-rss:156kB, file-rss:452kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.573107] init: upstart-udev-bridge main process (421) killed by KILL signal Oct 25 07:28:04 nldedip4k031 kernel: [87946.573126] init: upstart-udev-bridge main process ended, respawning Oct 25 07:28:34 nldedip4k031 kernel: [87976.461570] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461573] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461576] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:34 nldedip4k031 kernel: [87976.461578] Call Trace: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461585] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461626] Mem-Info: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461628] DMA per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461629] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461631] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461633] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461634] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461636] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461638] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461639] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461641] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461642] Normal per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461644] CPU 0: hi: 186, btch: 31 usd: 61 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461646] CPU 1: hi: 186, btch: 31 usd: 49 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461647] CPU 2: hi: 186, btch: 31 usd: 8 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461649] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461651] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461652] CPU 5: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461654] CPU 6: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461656] CPU 7: hi: 186, btch: 31 usd: 30 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461657] HighMem per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461658] CPU 0: hi: 186, btch: 31 usd: 4 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461660] CPU 1: hi: 186, btch: 31 usd: 204 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461662] CPU 2: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461663] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461665] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461667] CPU 5: hi: 186, btch: 31 usd: 31 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461668] CPU 6: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461670] CPU 7: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461674] active_anon:5441 inactive_anon:412 isolated_anon:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461674] active_file:2668 inactive_file:6922842 isolated_file:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461675] unevictable:0 dirty:836 writeback:0 unstable:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461676] free:1231664 slab_reclaimable:105781 slab_unreclaimable:3399 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461677] mapped:2649 shmem:138 pagetables:313 bounce:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461682] DMA free:4248kB min:780kB low:972kB high:1168kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11560kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:5687 all_unreclaimable? yes Oct 25 07:28:34 nldedip4k031 kernel: [87976.461686] lowmem_reserve[]: 0 869 32460 32460 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461693] Normal free:44184kB min:44216kB low:55268kB high:66324kB active_anon:0kB inactive_anon:0kB active_file:20kB inactive_file:1096kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:890008kB mlocked:0kB dirty:4kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:411564kB slab_unreclaimable:13592kB kernel_stack:992kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1816 all_unreclaimable? yes Oct 25 07:28:34 nldedip4k031 kernel: [87976.461697] lowmem_reserve[]: 0 0 252733 252733 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461703] HighMem free:4878224kB min:512kB low:402312kB high:804112kB active_anon:21764kB inactive_anon:1648kB active_file:10652kB inactive_file:27690268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32349872kB mlocked:0kB dirty:3340kB writeback:0kB mapped:10592kB shmem:552kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1252kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Oct 25 07:28:34 nldedip4k031 kernel: [87976.461708] lowmem_reserve[]: 0 0 0 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461711] DMA: 8*4kB 7*8kB 6*16kB 5*32kB 5*64kB 4*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 4248kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461719] Normal: 272*4kB 178*8kB 76*16kB 52*32kB 42*64kB 36*128kB 23*256kB 20*512kB 7*1024kB 2*2048kB 1*4096kB = 44176kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461727] HighMem: 1*4kB 45*8kB 31*16kB 24*32kB 5*64kB 3*128kB 1*256kB 2*512kB 4*1024kB 2*2048kB 1188*4096kB = 4877852kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461736] 6925679 total pagecache pages Oct 25 07:28:34 nldedip4k031 kernel: [87976.461737] 0 pages in swap cache Oct 25 07:28:34 nldedip4k031 kernel: [87976.461739] Swap cache stats: add 0, delete 0, find 0/0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461740] Free swap = 3998716kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461741] Total swap = 3998716kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.524951] 8437743 pages RAM Oct 25 07:28:34 nldedip4k031 kernel: [87976.524953] 8209409 pages HighMem Oct 25 07:28:34 nldedip4k031 kernel: [87976.524954] 159556 pages reserved Oct 25 07:28:34 nldedip4k031 kernel: [87976.524955] 6936141 pages shared Oct 25 07:28:34 nldedip4k031 kernel: [87976.524956] 124602 pages non-shared Oct 25 07:28:34 nldedip4k031 kernel: [87976.524957] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name Oct 25 07:28:34 nldedip4k031 kernel: [87976.524966] [ 429] 0 429 773 326 5 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524968] [ 567] 0 567 772 224 4 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524971] [ 568] 0 568 772 231 7 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524973] [ 764] 0 764 712 103 3 0 0 upstart-socket- Oct 25 07:28:34 nldedip4k031 kernel: [87976.524976] [ 772] 103 772 815 164 2 0 0 dbus-daemon Oct 25 07:28:34 nldedip4k031 kernel: [87976.524979] [ 785] 0 785 1671 600 1 -17 -1000 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524981] [ 809] 101 809 7766 380 1 0 0 rsyslogd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524983] [ 869] 0 869 1158 213 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524986] [ 873] 0 873 1158 214 6 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524988] [ 911] 0 911 1158 215 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524990] [ 912] 0 912 1158 214 2 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524992] [ 914] 0 914 1158 213 1 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524995] [ 916] 0 916 618 86 1 0 0 atd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524997] [ 917] 0 917 655 226 3 0 0 cron Oct 25 07:28:34 nldedip4k031 kernel: [87976.524999] [ 948] 0 948 902 159 5 0 0 irqbalance Oct 25 07:28:34 nldedip4k031 kernel: [87976.525002] [ 993] 0 993 1145 363 3 0 0 master Oct 25 07:28:34 nldedip4k031 kernel: [87976.525004] [ 1002] 104 1002 1162 333 1 0 0 qmgr Oct 25 07:28:34 nldedip4k031 kernel: [87976.525007] [ 1016] 0 1016 730 149 2 0 0 mdadm Oct 25 07:28:34 nldedip4k031 kernel: [87976.525009] [ 1057] 0 1057 6066 2160 3 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525012] [ 1086] 0 1086 1158 213 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.525014] [ 1088] 33 1088 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525017] [ 1089] 33 1089 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525019] [ 1090] 33 1090 6175 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525021] [ 1091] 33 1091 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525024] [ 1092] 33 1092 6191 1451 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525026] [ 1109] 33 1109 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525029] [ 1151] 33 1151 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525031] [ 1201] 104 1201 1803 652 1 0 0 tlsmgr Oct 25 07:28:34 nldedip4k031 kernel: [87976.525033] [ 2475] 0 2475 2435 812 0 0 0 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.525036] [ 2494] 0 2494 1745 839 1 0 0 bash Oct 25 07:28:34 nldedip4k031 kernel: [87976.525038] [ 2573] 0 2573 3394 1689 3 0 0 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.525040] [ 2589] 0 2589 5014 457 3 0 0 rsync Oct 25 07:28:34 nldedip4k031 kernel: [87976.525043] [ 2590] 0 2590 7970 522 1 0 0 rsync Oct 25 07:28:34 nldedip4k031 kernel: [87976.525045] [ 2652] 104 2652 1150 326 5 0 0 pickup Oct 25 07:28:34 nldedip4k031 kernel: [87976.525048] [ 2847] 0 2847 709 89 0 0 0 upstart-udev-br Oct 25 07:28:34 nldedip4k031 kernel: [87976.525050] Out of memory: Kill process 764 (upstart-socket-) score 1 or sacrifice child Oct 25 07:28:34 nldedip4k031 kernel: [87976.525484] Killed process 764 (upstart-socket-) total-vm:2848kB, anon-rss:204kB, file-rss:208kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.526161] init: upstart-socket-bridge main process (764) killed by KILL signal Oct 25 07:28:34 nldedip4k031 kernel: [87976.526180] init: upstart-socket-bridge main process ended, respawning Oct 25 07:28:44 nldedip4k031 kernel: [87986.439671] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439674] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439676] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:44 nldedip4k031 kernel: [87986.439678] Call Trace: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439684] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439686] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439688] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439691] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439694] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439696] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439699] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439702] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439704] [] vfs_read+0x8c/0x160 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439707] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439709] [] sys_read+0x3d/0x70 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439712] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439714] Mem-Info: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439714] DMA per-cpu: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439716] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439717] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439718] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439719] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439720] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439721] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439722] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439723] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439724] Normal per-cpu: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439725] CPU 0: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439726] CPU 1: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439727] CPU 2: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439728] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439729] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:33:48 nldedip4k031 kernel: imklog 5.8.6, log source = /proc/kmsg started. Oct 25 07:33:48 nldedip4k031 rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="2880" x-info="http://www.rsyslog.com"] start Oct 25 07:33:48 nldedip4k031 rsyslogd: rsyslogd's groupid changed to 103 Oct 25 07:33:48 nldedip4k031 rsyslogd: rsyslogd's userid changed to 101 Oct 25 07:33:48 nldedip4k031 rsyslogd-2039: Could not open output pipe '/dev/xconsole' [try http://www.rsyslog.com/e/2039 ]

    Read the article

  • GPGPU

    WhatGPU obviously stands for Graphics Processing Unit (the silicon powering the display you are using to read this blog post). The extra GP in front of that stands for General Purpose computing.So, altogether GPGPU refers to computing we can perform on GPU for purposes beyond just drawing on the screen. In effect, we can use a GPGPU a bit like we already use a CPU: to perform some calculation (that doesn’t have to have any visual element to it). The attraction is that a GPGPU can be orders of magnitude faster than a CPU.WhyWhen I was at the SuperComputing conference in Portland last November, GPGPUs were all the rage. A quick online search reveals many articles introducing the GPGPU topic. I'll just share 3 here: pcper (ignoring all pages except the first, it is a good consumer perspective), gizmodo (nice take using mostly layman terms) and vizworld (answering the question on "what's the big deal").The GPGPU programming paradigm (from a high level) is simple: in your CPU program you define functions (aka kernels) that take some input, can perform the costly operation and return the output. The kernels are the things that execute on the GPGPU leveraging its power (and hence execute faster than what they could on the CPU) while the host CPU program waits for the results or asynchronously performs other tasks.However, GPGPUs have different characteristics to CPUs which means they are suitable only for certain classes of problem (i.e. data parallel algorithms) and not for others (e.g. algorithms with branching or recursion or other complex flow control). You also pay a high cost for transferring the input data from the CPU to the GPU (and vice versa the results back to the CPU), so the computation itself has to be long enough to justify the overhead transfer costs. If your problem space fits the criteria then you probably want to check out this technology.HowSo where can you get a graphics card to start playing with all this? At the time of writing, the two main vendors ATI (owned by AMD) and NVIDIA are the obvious players in this industry. You can read about GPGPU on this AMD page and also on this NVIDIA page. NVIDIA's website also has a free chapter on the topic from the "GPU Gems" book: A Toolkit for Computation on GPUs.If you followed the links above, then you've already come across some of the choices of programming models that are available today. Essentially, AMD is offering their ATI Stream technology accessible via a language they call Brook+; NVIDIA offers their CUDA platform which is accessible from CUDA C. Choosing either of those locks you into the GPU vendor and hence your code cannot run on systems with cards from the other vendor (e.g. imagine if your CPU code would run on Intel chips but not AMD chips). Having said that, both vendors plan to support a new emerging standard called OpenCL, which theoretically means your kernels can execute on any GPU that supports it. To learn more about all of these there is a website: gpgpu.org. The caveat about that site is that (currently) it completely ignores the Microsoft approach, which I touch on next.On Windows, there is already a cross-GPU-vendor way of programming GPUs and that is the DirectX API. Specifically, on Windows Vista and Windows 7, the DirectX 11 API offers a dedicated subset of the API for GPGPU programming: DirectCompute. You use this API on the CPU side, to set up and execute the kernels that run on the GPU. The kernels are written in a language called HLSL (High Level Shader Language). You can use DirectCompute with HLSL to write a "compute shader", which is the term DirectX uses for what I've been referring to in this post as a "kernel". For a comprehensive collection of links about this (including tutorials, videos and samples) please see my blog post: DirectCompute.Note that there are many efforts to build even higher level languages on top of DirectX that aim to expose GPGPU programming to a wider audience by making it as easy as today's mainstream programming models. I'll mention here just two of those efforts: Accelerator from MSR and Brahma by Ananth. Comments about this post welcome at the original blog.

    Read the article

  • Using NServiceBus behind a custom web service

    - by Michael Stephenson
    In this post I'd like to talk about an architecture scenario we had recently and how we were able to utilise NServiceBus to help us address this problem. Scenario Cognos is a reporting system used by one of my clients. A while back we developed a web service façade to allow line of business applications to be able to access reports from Cognos to support their various functions. The service was intended to provide access to reports which were quick running reports or pre-generated reports which could be accessed real-time on demand. One of the key aims of the web service was to provide a simple generic interface to allow applications to get any report without needing to worry about the complex .net SDK for Cognos. The web service also supported multi-hop kerberos delegation so that report data could be accesses under the context of the end user. This service was working well for a period of time. The Problem The problem we encountered was that reports were now also required to be available to batch processes. The original design was optimised for low latency so users would enjoy a positive experience, however when the batch processes started to request 250+ concurrent reports over an extended period of time you can begin to imagine the sorts of problems that come into play. The key problems this new scenario caused are: Users may be affected and the latency of on demand reports was significantly slower The Cognos infrastructure was not scaled sufficiently to be able to cope with these long peaks of load From a cost perspective it just isn't feasible to scale the Cognos infrastructure to be able to handle the load when it is only for a couple of hour window each night. We really needed to introduce a second pattern for accessing this service which would support high through-put scenarios. We also had little control over the batch process in terms of being able to throttle its load. We could however make some changes to the way it accessed the reports. The Approach My idea was to introduce a throttling mechanism between the Web Service Façade and Cognos. This would allow the batch processes to push reports requests hard at the web service which we were confident the web service can handle. The web service would then queue these requests and process them behind the scenes and make a call back to the batch application to provide the report once it had been accessed. In terms of technology we had some limitations because we were not able to use WCF or IIS7 where the MSMQ-Activated WCF services could have helped, but we did have MSMQ as an option and I thought NServiceBus could do just the job to help us here. The flow of how this would work was as follows: The batch applications would send a request for a report to the web service The web service uses NServiceBus to send the message to a Queue The NServiceBus Generic Host is running as a windows service with a message handler which subscribes to these messages The message handler gets the message, accesses the report from Cognos The message handler calls back to the original batch application, this is decoupled because the calling application provides a call back url The report gets into the batch application and is processed as normal This approach looks something like the below diagram: The key points are an application wanting to take advantage of the batch driven reports needs to do the following: Implement our call back contract Make a call to the service providing a call back url Provide a correlation ID so it knows how to tie each response back to its request What does NServiceBus offer in this solution So this scenario is not the typical messaging service bus type of solution people implement with NServiceBus, but it did offer the following: Simplified interaction with MSMQ Offered the ability to configure the number of processes working through the queue so we could find a balance between load on Cognos versus the applications end to end processing time NServiceBus offers retries and a way to manage failed messages NServiceBus offers a high availability setup The simple thing is that NServiceBus gave us the platform to build the solution on. We just implemented a message handler which functionally processed a message and we could rely on NServiceBus to do all of the hard work around managing the queues and all of the lower level things that would have took ages to write to any kind of robust level. Conclusion With this approach we were able to deal with a fairly significant performance issue with out too much rework. Hopefully this write up gives people some insight into ideas on how to leverage the excellent NServiceBus framework to help solve integration and high through-put scenarios.

    Read the article

  • Java Spotlight Episode 103: 2012 Duke Choice Award Winners

    - by Roger Brinkley
    Our annual interview with the 2012 Duke Choice Award Winners recorded live at the JavaOne 2012. Right-click or Control-click to download this MP3 file. You can also subscribe to the Java Spotlight Podcast Feed to get the latest podcast automatically. If you use iTunes you can open iTunes and subscribe with this link:  Java Spotlight Podcast in iTunes. Show Notes Events Oct 13, Devoxx 4 Kids Nederlands Oct 15-17, JAX London Oct 20, Devoxx 4 Kids Français Oct 22-23, Freescale Technology Forum - Japan, Tokyo Oct 30-Nov 1, Arm TechCon, Santa Clara Oct 31, JFall, Netherlands Nov 2-3, JMagreb, Morocco Nov 13-17, Devoxx, Belgium Feature Interview Duke Choice Award Winners 2012 - Show Presentation London Java CommunityThe second user group receiving a Duke’s Choice Award this year, the London Java Community (LJC) and its users have been active in the OpenJDK, the Java Community Process (JCP) and other efforts within the global Java community. Student Nokia Developer GroupThis year’s student winner, Ram Kashyap, is the founder and president of the Nokia Student Network, and was profiled in the “The New Java Developers” feature in the March/April 2012 issue of Java Magazine. Since then, Ram has maintained a hectic pace, graduating from the People’s Education Society Institute of Technology in Bangalore, India, while working on a Java mobile startup and training students on Java ME. Jelastic, Inc.Moving existing Java applications to the cloud can be a daunting task, but startup Jelastic, Inc. offers the first all-Java platform-as-a-service (PaaS) that enables existing Java applications to be deployed in the cloud without code changes or lock-in. NATOThe first-ever Community Choice Award goes to the MASE Integrated Console Environment (MICE) in use at NATO. Built in Java on the NetBeans platform, MICE provides a high-performance visualization environment for conducting air defense and battle-space operations. DuchessRather than focus on a specific geographic area like most Java User Groups (JUGs), Duchess fosters the participation of women in the Java community worldwide. The group has more than 500 members in 60 countries, and provides a platform through which women can connect with each other and get involved in all aspects of the Java community. AgroSense ProjectImproving farming methods to feed a hungry world is the goal of AgroSense, an open source farm information management system built in Java and the NetBeans platform. AgroSense enables farmers, agribusinesses, suppliers and others to develop modular applications that will easily exchange information through a common underlying NetBeans framework. Apache Software Foundation Hadoop ProjectThe Apache Software Foundation’s Hadoop project, written in Java, provides a framework for distributed processing of big data sets across clusters of computers, ranging from a few servers to thousands of machines. This harnessing of large data pools allows organizations to better understand and improve their business. Parleys.comE-learning specialist Parleys.com, based in Brussels, Belgium, uses Java technologies to bring online classes and full IT conferences to desktops, laptops, tablets and mobile devices. Parleys.com has hosted more than 1,700 conferences—including Devoxx and JavaOne—for more than 800,000 unique visitors. Winners not presenting at JavaOne 2012 Duke Choice Awards BOF Liquid RoboticsRobotics – Liquid Robotics is an ocean data services provider whose Wave Glider technology collects information from the world’s oceans for application in government, science and commercial applications. The organization features the “father of Java” James Gosling as its chief software architect.United Nations High Commissioner for RefugeesThe United Nations High Commissioner for Refugees (UNHCR) is on the front lines of crises around the world, from civil wars to natural disasters. To help facilitate its mission of humanitarian relief, the UNHCR has developed a light-client Java application on the NetBeans platform. The Level One registration tool enables the UNHCR to collect information on the number of refugees and their water, food, housing, health, and other needs in the field, and combines that with geocoding information from various sources. This enables the UNHCR to deliver the appropriate kind and amount of assistance where it is needed.

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Developing an Implementation Plan with Iterations by Russ Pitts

    - by user535886
    Developing an Implementation Plan with Iterations by Russ Pitts  Ok, so you have come to grips with understanding that applying the iterative concept, as defined by OUM is simply breaking up the project effort you have estimated for each phase into one or more six week calendar duration blocks of work. Idea being the business user(s) or key recipient(s) of work product(s) being developed never go longer than six weeks without having some sort of review or prototyping of the work results for an iteration…”think-a-little”, “do-a-little”, and “show-a-little” in a six week or less timeframe…ideally the business user(s) or key recipients(s) are involved throughout. You also understand the OUM concept that you only plan for that which you have knowledge of. The concept further defined, a project plan initially is developed at a high-level, and becomes more detailed as project knowledge grows. Agreeing to this concept means you also have to admit to the fallacy that one can plan with precision beyond six weeks into a project…Anything beyond six weeks is a best guess in most cases when dealing with software implementation projects. Project planning, as defined by OUM begins with the Implementation Plan view, which is a very high-level perspective of the effort estimated for each of the five OUM phases, as well as the number of iterations within each phase. You might wonder how can you predict the number of iterations for each phase at this early point in the project. Remember project planning is not an exact science, and initially is high-level and abstract in nature, and then becomes more detailed and precise as the project proceeds. So where do you start in defining iterations for each phase for a project? The following are three easy steps to initially define the number of iterations for each phase: Step 1 => Start with identifying the known factors… …Prior to starting a project you should know: · The agreed upon time-period for an iteration (e.g 6 weeks, or 4 weeks, or…) within a phase (recommend keeping iteration time-period consistent within a phase, if not for the entire project) · The number of resources available for the project · The number of total number of man-day (effort) you have estimated for each of the five OUM phases of the project · The number of work days for a week Step 2 => Calculate the man-days of effort required for an iteration within a phase… Lets assume for the sake of this example there are 10 project resources, and you have estimated 2,536 man-days of work effort which will need to occur for the elaboration phase of the project. Let’s also assume a week for this project is defined as 5 business days, and that each iteration in the elaboration phase will last a calendar duration of 6 weeks. A simple calculation is performed to calculate the daily burn rate for a single iteration, which produces a result of… ((Number of resources * days per week) * duration of iteration) = Number of days required per iteration ((10 resources * 5 days/week) * 6 weeks) = 300 man days of effort required per iteration Step 3 => Calculate the number of iterations that can occur within a phase Next calculate the number of iterations that can occur for the amount of man-days of effort estimated for the phase being considered… (number of man-days of effort estimated / number of man-days required per iteration) = # of iterations for phase (2,536 man-days of estimated effort for phase / 300 man days of effort required per iteration) = 8.45 iterations, which should be rounded to a whole number such as 9 iterations* *Note - It is important to note this is an approximate calculation, not an exact science. This particular example is a simple one, which assumes all resources are utilized throughout the phase, including tech resources, etc. (rounding down or up to a whole number based on project factor considerations). It is also best in many cases to round up to higher number, as this provides some calendar scheduling contingency.

    Read the article

  • Microsoft hosting free Hyper-V training for VMware Pros

    - by Ryan Roussel
    Microsoft will be hosting free training for virtualization professionals focused on Hyper-V, System Center, and virtualization architecture.  Details are below:   Just one week after Microsoft Management Summit 2011 (MMS), Microsoft Learning will be hosting an exclusive three-day Jump Start class specially tailored for VMware and Microsoft virtualization technology pros.  Registration for “Microsoft Virtualization for VMware Professionals” is open now and will be delivered as an online class on March 29-31, 2010 from 10:00am-4:00pm PDT.    The course is COMPLETELY FREE and OPEN TO ANYONE!  Please share with your customers, blog, Tweet, etc. – help us get the word out to strengthen support for Microsoft’s virtualization offerings. What’s the high-level overview? This cutting edge course will feature expert instruction and real-world demonstrations of Hyper-V and brand new releases from System Center Virtual Machine Manager 2012 Beta (many of which will be announced just one week earlier at MMS).  Register Now!   Day 1 will focus on “Platform” (Hyper-V, virtualization architecture, high availability & clustering) 10:00am – 10:30pm PDT:  Virtualization 360 Overview 10:30am – 12:00pm:  Microsoft Hyper-V Deployment Options & Architecture 1:00pm – 2:00pm:  Differentiating Microsoft and VMware (terminology, etc.) 2:00pm – 4:00pm:  High Availability & Clustering Day 2 will focus on “Management” (System Center Suite, SCVMM 2012 Beta, Opalis, Private Cloud solutions) 10:00am – 11:00pm PDT:  System Center Suite Overview w/ focus on DPM 11:00am – 12:00pm:  Virtual Machine Manager 2012 | Part 1 1:00pm –   1:30pm:  Virtual Machine Manager 2012 | Part 2 1:30pm – 2:30pm:  Automation with System Center Opalis & PowerShell 2:30pm – 4:00pm:  Private Cloud Solutions, Architecture & VMM SSP 2.0 Day 3 will focus on “VDI” (VDI Infrastructure/architecture, v-Alliance, application delivery via VDI) 10:00am – 11:00pm PDT:  Virtual Desktop Infrastructure (VDI) Architecture | Part 1 11:00am – 12:00pm:  Virtual Desktop Infrastructure (VDI) Architecture | Part 2 1:00pm – 2:30pm:  v-Alliance Solution Overview 2:30pm – 4:00pm:  Application Delivery for VDI     Every section will be team-taught by two of the most respected authorities on virtualization technologies: Microsoft Technical Evangelist Symon Perriman and leading Hyper-V, VMware, and XEN infrastructure consultant, Corey Hynes Who is the target audience for this training? Suggested prerequisite skills include real-world experience with Windows Server 2008 R2, virtualization and datacenter management. The course is tailored to these types of roles: · IT Professional · IT Decision Maker · Network Administrators & Architects · Storage/Infrastructure Administrators & Architects How do I to register and learn more about this great training opportunity? · Register: Visit the Registration Page and sign up for all three sessions · Blog: Learn more from the Microsoft Learning Blog · Twitter: Here are a few posts you can retweet: o Mar. 29-31 "Microsoft #Virtualization for VMware Pros" @SymonPerriman Corey Hynes http://bit.ly/JS-Hyper-V @MSLearning #Hyper-V o @SysCtrOpalis Mar. 29-31 "Microsoft #Virtualization for VMware Pros" @SymonPerriman Corey Hynes http://bit.ly/JS-Hyper-V #Hyper-V o Learn all the cool new features in Hyper-V & System Center 2012! SCVMM, Self-Service Portal 2.0, http://bit.ly/JS-Hyper-V #Hyper-V #Opalis What is a “Jump Start” course? A “Jump Start” course is “team-taught” by two expert instructors in an engaging radio talk show style format. The idea is to deliver readiness training on strategic and emerging technologies that drive awareness at scale before Microsoft Learning develops mainstream Microsoft Official Courses (MOC) that map to certifications.  All sessions are professionally recorded and distributed through MS Showcase, Channel 9, Zune Marketplace and iTunes for broader reach.

    Read the article

  • Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • F1 Pit Pragmatics

    - by mikef
    "I hate computers. No, really, I hate them. I love the communications they facilitate, I love the conveniences they provide to my life. but I actually hate the computers themselves." - Scott Merrill, 'I hate computers: confessions of a Sysadmin' If Scott's goal was to polarize opinion and trigger raging arguments over the 'real reasons why computers suck', then he certainly succeeded. Impassioned vitriol sits side-by-side with rational debate. Yet Scott's fundamental point is absolutely on the money - Computers are a means to an end. The IT industry is finally starting to put weight behind the notion that good User Experience is an absolutely crucial goal, a cause championed by the likes of Microsoft's Bill Buxton, and which Apple's increasingly ubiquitous touch screen interface exemplifies. However, that doesn't change the fact that, occasionally, you just have to man up and deal with complex systems. In fact, sometimes you just need to sacrifice everything else in the name of performance. You'll find a perfect example of this Faustian bargain in Trevor Clarke's fascinating look into the (diabolical) IT infrastructure of modern F1 racing - high performance, high availability. high everything. To paraphrase, each car has up to 100 sensors, transmitting around 30Gb of data over the course of a race (70% in real-time). This data is then processed by no less than 3 servers (per car) so that the engineers in the pit have access to telemetry, strategy information, timing feeds, a connection back to the operations room in the team's home base - the list goes on. All of this while the servers are exposed "to carbon dust, oil, vibration, rain, heat, [and] variable power". Now, this is admittedly an extreme context where there's no real choice but to use complex systems where ease-of-use is, at best, a secondary concern. The flip-side is seen in small-scale personal computing such as that seen in Apple's iDevices, which are incredibly intuitive but limited in their scope. In terms of what kinds of systems they prefer to use, I suspect that most SysAdmins find themselves somewhere along this axis of Power vs. Usability, and which end of this axis you resonate with also hints at where you think the IT industry should focus its energy. Do you see yourself in the F1 pit, making split-second decisions, wrestling with information flows and reticent hardware to bend them to your will? If so, I imagine you feel that computers are subtle tools which need to be tuned and honed, using the advanced knowledge possessed only by responsible SysAdmins (If you have an iPhone, I suspect it's jail-broken). If the machines throw enigmatic errors, it's the price of flexibility and raw power. Alternatively, would you prefer to have your role more accessible, with users empowered by knowledge, spreading the load of managing IT environments? In that case, then you want hardware and software to have User Experience as their primary focus, and are of the "means to an end" school of thought (you're probably also fed up with users not listening to you when you try and help). At its heart, the dichotomy is between raw power (which might be difficult to use) and ease-of-use (which might have some limitations, but you can be up and running immediately). Of course, the ultimate goal is a fusion of flexibility, power and usability all in one system. It's achievable in specific software environments, and Red Gate considers it a target worth aiming for, but in other cases it's a goal right up there with cold fusion. I think it'll be a long time before we see it become ubiquitous. In the meantime, are you Power-Hungry or a Champion of Usability? Cheers, Michael Francis Simple Talk SysAdmin Editor

    Read the article

  • ALC889 - GA-P55-UD4 -no sound - Ubuntu 11.10

    - by george
    I have computer with a Gigabyte P55A-UD4 motherboard. I have on-board audio - Realtek ALC889. I am using Ubuntu 11.10 and have no sound. please please heeeelp :). i have tryed to install high definition audio codecs from realtek but it doesn't work. in bios the azalia codec is turned on. ps : sorry for my english. 00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 12) 00:01.0 PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 12) 00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06) 00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 06) 00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 06) 00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 06) 00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 06) 00:1c.6 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 7 (rev 06) 00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.3 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a6) 00:1f.0 ISA bridge: Intel Corporation 5 Series Chipset LPC Interface Controller (rev 06) 00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06) 00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 06) 01:00.0 VGA compatible controller: nVidia Corporation GT216 [GeForce GT 220] (rev a2) 01:00.1 Audio device: nVidia Corporation High Definition Audio Controller (rev a1) 03:00.0 SATA controller: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03) 03:00.1 IDE interface: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03) 04:00.0 IDE interface: Marvell Technology Group Ltd. Device 91a3 (rev 11) 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06) 06:03.0 IDE interface: Integrated Technology Express, Inc. IT8213 IDE Controller 06:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link) 3f:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 02) 3f:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 02) 3f:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 02) 3f:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 02) 3f:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 02) 3f:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 02) aplay -l karta 0: Intel [HDA Intel], urzadzenie 0: ALC889 Analog [ALC889 Analog] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 0: Intel [HDA Intel], urzadzenie 1: ALC889 Digital [ALC889 Digital] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 3: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 7: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 8: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 9: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0

    Read the article

  • IPgallery banks on Solaris SPARC

    - by Frederic Pariente
    IPgallery is a global supplier of converged legacy and Next Generation Networks (NGN) products and solutions, including: core network components and cloud-based Value Added Services (VAS) for voice, video and data sessions. IPgallery enables network operators and service providers to offer advanced converged voice, chat, video/content services and rich unified social communications in a combined legacy (fixed/mobile), Over-the-Top (OTT) and Social Community (SC) environments for home and business customers. Technically speaking, this offer is a scalable and robust telco solution enabling operators to offer new services while controlling operating expenses (OPEX). In its solutions, IPgallery leverages the following Oracle components: Oracle Solaris, Netra T4 and SPARC T4 in order to provide a competitive and scalable solution without the price tag often associated with high-end systems. Oracle Solaris Binary Application Guarantee A unique feature of Oracle Solaris is the guaranteed binary compatibility between releases of the Solaris OS. That means, if a binary application runs on Solaris 2.6 or later, it will run on the latest release of Oracle Solaris.  IPgallery developed their application on Solaris 9 and Solaris 10 then runs it on Solaris 11, without any code modification or rebuild. The Solaris Binary Application Guarantee helps IPgallery protect their long-term investment in the development, training and maintenance of their applications. Oracle Solaris Image Packaging System (IPS) IPS is a new repository-based package management system that comes with Oracle Solaris 11. It provides a framework for complete software life-cycle management such as installation, upgrade and removal of software packages. IPgallery leverages this new packaging system in order to speed up and simplify software installation for the R&D and production environments. Notably, they use IPS to deliver Solaris Studio 12.3 packages as part of the rapid installation process of R&D environments, and during the production software deployment phase, they ensure software package integrity using the built-in verification feature. Solaris IPS thus improves IPgallery's time-to-market with a faster, more reliable software installation and deployment in production environments. Extreme Network Performance IPgallery saw a huge improvement in application performance both in CPU and I/O, when running on SPARC T4 architecture in compared to UltraSPARC T2 servers.  The same application (with the same activation environment) running on T2 consumes 40%-50% CPU, while it consumes only 10% of the CPU on T4. The testing environment comprised of: Softswitch (Call management), TappS (Telecom Application Server) and Billing Server running on same machine and initiating various services in capacity of 1000 CAPS (Call Attempts Per Second). In addition, tests showed a huge improvement in the performance of the TCP/IP stack, which reduces network layer processing and in the end Call Attempts latency. Finally, there is a huge improvement within the file system and disk I/O operations; they ran all tests with maximum logging capability and it didn't influence any benchmark values. "Due to the huge improvements in performance and capacity using the T4-1 architecture, IPgallery has engineered the solution with less hardware.  This means instead of deploying the solution on six T2-based machines, we will deploy on 2 redundant machines while utilizing Oracle Solaris Zones and Oracle VM for higher availability and virtualization" Shimon Lichter, VP R&D, IPgallery In conclusion, using the unique combination of Oracle Solaris and SPARC technologies, IPgallery is able to offer solutions with much lower TCO, while providing a higher level of service capacity, scalability and resiliency. This low-OPEX solution enables the operator, the end-customer, to deliver a high quality service while maintaining high profitability.

    Read the article

  • Duke's Choice Award Ceremony

    - by Tori Wieldt
    The 2012 Duke's Choice Awards winners and their creative, Java-based technologies and Java community contributions were honored after the Sunday night JavaOne keynotes. Sharat Chander, Group Director for Java Technology Outreach, presented the awards. "Having the community participate directly in both submission and selection truly shows how we are driving exposure of the innovation happening in the Java community," he said. Apache Software Foundation Hadoop Project The Apache Software Foundation’s Hadoop project, written in Java, provides a framework for distributed processing of big data sets across clusters of computers, ranging from a few servers to thousands of machines. This harnessing of large data pools allows organizations to better understand and improve their business. AgroSense Project Improving farming methods to feed a hungry world is the goal of AgroSense, an open source farm information management system built in Java and the NetBeans platform. AgroSense enables farmers, agribusinesses, suppliers and others to develop modular applications that will easily exchange information through a common underlying NetBeans framework. JDuchess Rather than focus on a specific geographic area like most Java User Groups (JUGs), JDuchess fosters the participation of women in the Java community worldwide. The group has more than 500 members in 60 countries, and provides a platform through which women can connect with each other and get involved in all aspects of the Java community. Jelastic, Inc. Moving existing Java applications to the cloud can be a daunting task, but startup Jelastic, Inc. offers the first all-Java platform-as-a-service (PaaS) that enables existing Java applications to be deployed in the cloud without code changes or lock-in. Liquid Robotics Robotics – Liquid Robotics is an ocean data services provider whose Wave Glider technology collects information from the world’s oceans for application in government, science and commercial applications. The organization features the “father of Java” James Gosling as its chief software architect. London Java Community The second user group receiving a Duke’s Choice Award this year, the London Java Community (LJC) and its users have been active in the OpenJDK, the Java Community Process (JCP) and other efforts within the global Java community. NATO The first-ever Community Choice Award goes to the MASE Integrated Console Environment (MICE) in use at NATO. Built in Java on the NetBeans platform, MICE provides a high-performance visualization environment for conducting air defense and battle-space operations. Parleys.com E-learning specialist Parleys.com, based in Brussels, Belgium, uses Java technologies to bring online classes and full IT conferences to desktops, laptops, tablets and mobile devices. Parleys.com has hosted more than 1,700 conferences—including Devoxx and JavaOne—for more than 800,000 unique visitors. Student Nokia Developer Group This year’s student winner, Ram Kashyap, is the founder and president of the Nokia Student Network, and was profiled in the “The New Java Developers” feature in the March/April 2012 issue of Java Magazine. Since then, Ram has maintained a hectic pace, graduating from the People’s Education Society Institute of Technology in Bangalore, India, while working on a Java mobile startup and training students on Java ME. United Nations High Commissioner for Refugees The United Nations High Commissioner for Refugees (UNHCR) is on the front lines of crises around the world, from civil wars to natural disasters. To help facilitate its mission of humanitarian relief, the UNHCR has developed a light-client Java application on the NetBeans platform. The Level One registration tool enables the UNHCR to collect information on the number of refugees and their water, food, housing, health, and other needs in the field, and combines that with geocoding information from various sources. This enables the UNHCR to deliver the appropriate kind and amount of assistance where it is needed. You can read more about the winners in the current issue of Java Magazine.

    Read the article

< Previous Page | 63 64 65 66 67 68 69 70 71 72 73 74  | Next Page >