Search Results

Search found 24943 results on 998 pages for 'scientific linux'.

Page 173/998 | < Previous Page | 169 170 171 172 173 174 175 176 177 178 179 180  | Next Page >

  • Take Two: Comparing JVMs on ARM/Linux

    - by user12608080
    Although the intent of the previous article, entitled Comparing JVMs on ARM/Linux, was to introduce and highlight the availability of the HotSpot server compiler (referred to as c2) for Java SE-Embedded ARM v7,  it seems, based on feedback, that everyone was more interested in the OpenJDK comparisons to Java SE-E.  In fact there were two main concerns: The fact that the previous article compared Java SE-E 7 against OpenJDK 6 might be construed as an unlevel playing field because version 7 is newer and therefore potentially more optimized. That the generic compiler settings chosen to build the OpenJDK implementations did not put those versions in a particularly favorable light. With those considerations in mind, we'll institute the following changes to this version of the benchmarking: In order to help alleviate an additional concern that there is some sort of benchmark bias, we'll use a different suite, called DaCapo.  Funded and supported by many prestigious organizations, DaCapo's aim is to benchmark real world applications.  Further information about DaCapo can be found at http://dacapobench.org. At the suggestion of Xerxes Ranby, who has been a great help through this entire exercise, a newer Linux distribution will be used to assure that the OpenJDK implementations were built with more optimal compiler settings.  The Linux distribution in this instance is Ubuntu 11.10 Oneiric Ocelot. Having experienced difficulties getting Ubuntu 11.10 to run on the original D2Plug ARMv7 platform, for these benchmarks, we'll switch to an embedded system that has a supported Ubuntu 11.10 release.  That platform is the Freescale i.MX53 Quick Start Board.  It has an ARMv7 Coretex-A8 processor running at 1GHz with 1GB RAM. We'll limit comparisons to 4 JVM implementations: Java SE-E 7 Update 2 c1 compiler (default) Java SE-E 6 Update 30 (c1 compiler is the only option) OpenJDK 6 IcedTea6 1.11pre 6b23~pre11-0ubuntu1.11.10.2 CACAO build 1.1.0pre2 OpenJDK 6 IcedTea6 1.11pre 6b23~pre11-0ubuntu1.11.10.2 JamVM build-1.6.0-devel Certain OpenJDK implementations were eliminated from this round of testing for the simple reason that their performance was not competitive.  The Java SE 7u2 c2 compiler was also removed because although quite respectable, it did not perform as well as the c1 compilers.  Recall that c2 works optimally in long-lived situations.  Many of these benchmarks completed in a relatively short period of time.  To get a feel for where c2 shines, take a look at the first chart in this blog. The first chart that follows includes performance of all benchmark runs on all platforms.  Later on we'll look more at individual tests.  In all runs, smaller means faster.  The DaCapo aficionado may notice that only 10 of the 14 DaCapo tests for this version were executed.  The reason for this is that these 10 tests represent the only ones successfully completed by all 4 JVMs.  Only the Java SE-E 6u30 could successfully run all of the tests.  Both OpenJDK instances not only failed to complete certain tests, but also experienced VM aborts too. One of the first observations that can be made between Java SE-E 6 and 7 is that, for all intents and purposes, they are on par with regards to performance.  While it is a fact that successive Java SE releases add additional optimizations, it is also true that Java SE 7 introduces additional complexity to the Java platform thus balancing out any potential performance gains at this point.  We are still early into Java SE 7.  We would expect further performance enhancements for Java SE-E 7 in future updates. In comparing Java SE-E to OpenJDK performance, among both OpenJDK VMs, Cacao results are respectable in 4 of the 10 tests.  The charts that follow show the individual results of those four tests.  Both Java SE-E versions do win every test and outperform Cacao in the range of 9% to 55%. For the remaining 6 tests, Java SE-E significantly outperforms Cacao in the range of 114% to 311% So it looks like OpenJDK results are mixed for this round of benchmarks.  In some cases, performance looks to have improved.  But in a majority of instances, OpenJDK still lags behind Java SE-Embedded considerably. Time to put on my asbestos suit.  Let the flames begin...

    Read the article

  • Why this code is not working on linux server ?

    - by user488001
    Hello Experts, I am new in Zend Framework, and this code is use for downloading contents. This code is working in localhost but when i tried to execute in linux server it shows error file not found. public function downloadAnnouncementsAction() { $file= $this-_getParam('file'); $file = str_replace("%2F","/",$this-_getParam('file')); // Allow direct file download (hotlinking)? // Empty - allow hotlinking // If set to nonempty value (Example: example.com) will only allow downloads when referrer contains this text define('ALLOWED_REFERRER', ''); // Download folder, i.e. folder where you keep all files for download. // MUST end with slash (i.e. "/" ) define('BASE_DIR','file_upload'); // log downloads? true/false define('LOG_DOWNLOADS',true); // log file name define('LOG_FILE','downloads.log'); // Allowed extensions list in format 'extension' => 'mime type' // If myme type is set to empty string then script will try to detect mime type // itself, which would only work if you have Mimetype or Fileinfo extensions // installed on server. $allowed_ext = array ( // audio 'mp3' => 'audio/mpeg', 'wav' => 'audio/x-wav', // video 'mpeg' => 'video/mpeg', 'mpg' => 'video/mpeg', 'mpe' => 'video/mpeg', 'mov' => 'video/quicktime', 'avi' => 'video/x-msvideo' ); // If hotlinking not allowed then make hackers think there are some server problems if (ALLOWED_REFERRER !== '' && (!isset($_SERVER['HTTP_REFERER']) || strpos(strtoupper($_SERVER['HTTP_REFERER']),strtoupper(ALLOWED_REFERRER)) === false) ) { die("Internal server error. Please contact system administrator."); } // Make sure program execution doesn't time out // Set maximum script execution time in seconds (0 means no limit) set_time_limit(0); if (!isset($file) || empty($file)) { die("Please specify file name for download."); } // Nullbyte hack fix if (strpos($file, "\0") !== FALSE) die(''); // Get real file name. // Remove any path info to avoid hacking by adding relative path, etc. $fname = basename($file); // Check if the file exists // Check in subfolders too function find_file ($dirname, $fname, &$file_path) { $dir = opendir($dirname); while ($file = readdir($dir)) { if (empty($file_path) && $file != '.' && $file != '..') { if (is_dir($dirname.'/'.$file)) { find_file($dirname.'/'.$file, $fname, $file_path); } else { if (file_exists($dirname.'/'.$fname)) { $file_path = $dirname.'/'.$fname; return; } } } } } // find_file // get full file path (including subfolders) $file_path = ''; find_file(BASE_DIR, $fname, $file_path); if (!is_file($file_path)) { die("File does not exist. Make sure you specified correct file name."); } // file size in bytes $fsize = filesize($file_path); // file extension $fext = strtolower(substr(strrchr($fname,"."),1)); // check if allowed extension if (!array_key_exists($fext, $allowed_ext)) { die("Not allowed file type."); } // get mime type if ($allowed_ext[$fext] == '') { $mtype = ''; // mime type is not set, get from server settings if (function_exists('mime_content_type')) { $mtype = mime_content_type($file_path); } else if (function_exists('finfo_file')) { $finfo = finfo_open(FILEINFO_MIME); // return mime type $mtype = finfo_file($finfo, $file_path); finfo_close($finfo); } if ($mtype == '') { $mtype = "application/force-download"; } } else { // get mime type defined by admin $mtype = $allowed_ext[$fext]; } // Browser will try to save file with this filename, regardless original filename. // You can override it if needed. if (!isset($_GET['fc']) || empty($_GET['fc'])) { $asfname = $fname; } else { // remove some bad chars $asfname = str_replace(array('"',"'",'\\','/'), '', $_GET['fc']); if ($asfname === '') $asfname = 'NoName'; } // set headers header("Pragma: public"); header("Expires: 0"); header("Cache-Control: must-revalidate, post-check=0, pre-check=0"); header("Cache-Control: public"); header("Content-Description: File Transfer"); header("Content-Type: $mtype"); header("Content-Disposition: attachment; filename=\"$asfname\""); header("Content-Transfer-Encoding: binary"); header("Content-Length: " . $fsize); // download // @readfile($file_path); $file = @fopen($file_path,"rb"); if ($file) { while(!feof($file)) { print(fread($file, 1024*8)); flush(); if (connection_status()!=0) { @fclose($file); die(); } } @fclose($file); } // log downloads if (!LOG_DOWNLOADS) die(); $f = @fopen(LOG_FILE, 'a+'); if ($f) { @fputs($f, date("m.d.Y g:ia")." ".$_SERVER['REMOTE_ADDR']." ".$fname."\n"); @fclose($f); } } please Help...

    Read the article

  • Super-Charge GIMP’s Image Editing Capabilities with G’MIC [Cross-Platform]

    - by Asian Angel
    Recently we showed you how to enhance GIMP’s image editing power and today we help you super-charge GIMP even more. G’MIC (GREYC’s Magic Image Converter) will add an impressive array of filters and effects to your GIMP installation for image editing goodness. Note: We applied the Contrast Swiss Mask filter to the image shown in the screenshot above to create a nice, warm sunset effect. To add the new PPA open the Ubuntu Software Center, go to the Edit Menu, and select Software Sources. Access the Other Software Tab in the Software Sources Window and add the first of the PPAs shown below (outlined in red). The second PPA will be automatically added to your system. Once you have the new PPAs set up, go back to the Ubuntu Software Center and do a search for “G’MIC”. You will find two listings available and can select either one to add G’MIC to your system (both work equally well). Click on More Info for the listing that you choose and scroll down to where Add-ons are listed. Make sure to select the Add-on listed, click Apply Changes when it appears, and then click Install. We have both shown here for your convenience… When you get ready to use G’MIC to enhance an image, go to the Filters Menu and select G’MIC. A new window will appear where you can select from an impressive array of filters available for your use. Have fun! Command Line Installation For those of you who prefer using the command line for installation use the following commands: sudo add-apt-repository ppa:ferramroberto/gimp sudo apt-get update sudo apt-get install gmic gimp-gmic Links Note: G’MIC is available for Linux, Windows, and Mac. G’MIC PPA at Launchpad [via Web Upd8] G’MIC Homepage at Sourceforge *Downloads for all three platforms available here. Bonus The anime wallpaper shown in the screenshots above can be found here: anime sport [DesktopNexus] Latest Features How-To Geek ETC Learn To Adjust Contrast Like a Pro in Photoshop, GIMP, and Paint.NET Have You Ever Wondered How Your Operating System Got Its Name? Should You Delete Windows 7 Service Pack Backup Files to Save Space? What Can Super Mario Teach Us About Graphics Technology? Windows 7 Service Pack 1 is Released: But Should You Install It? How To Make Hundreds of Complex Photo Edits in Seconds With Photoshop Actions Access and Manage Your Ubuntu One Account in Chrome and Iron Mouse Over YouTube Previews YouTube Videos in Chrome Watch a Machine Get Upgraded from MS-DOS to Windows 7 [Video] Bring the Whole Ubuntu Gang Home to Your Desktop with this Mascots Wallpaper Hack Apart a Highlighter to Create UV-Reactive Flowers [Science] Add a “Textmate Style” Lightweight Text Editor with Dropbox Syncing to Chrome and Iron

    Read the article

  • Access Windows Home Server from an Ubuntu Computer on your Network

    - by Mysticgeek
    If you’re a Windows Home Server user, there may be times when you need to access it from an Ubuntu machine on your network. Today we take a look at the process of accessing files on your home server from Ubuntu. Note: In this example we’re using Windows Home Server with PowerPack 3, and Ubuntu 10.04 running on a home network. Access WHS from Ubuntu To access files on your home server from Ubuntu, click on Places then select Network. You should now see your home server listed in the Network folder as well as other Windows machines…double-click the server to access it. If you don’t see your server listed, you might need to go into Windows Network \ Workgroup and find it there. You’ll be prompted to enter in the correct credentials for WHS just as you would when accessing it from a Windows machine. It’s your choice if you want to have the password remembered or not…make your selection and click Connect. Now you will see the available folders on your home server. In this example we signed in with Administrator credentials, so we have access to everything. Double-click on the folder share you want to access content from…here we see MS Office documents on the server. Or, here we take a look at a music folder with various MP3 files which you can make Ubuntu play. You can access the files directly from the server, provided there is a Linux app that can handle the file type. In this example we opened a Word document in OpenOffice. Here we’re playing an MKV movie file from the server in Totem Movie Player.   You can easily search for files on the server as well… If you want to store your Ubuntu files on WHS it’s just a matter of dragging them to the correct WHS folder you want them in. If you’re using an Ubuntu computer on your home network and need to access files from Windows Home Server, luckily it’s a straight-forward process. You’ll often have to find the correct software to use Windows files, but even that’s getting much easier with version 10.04. Similar Articles Productive Geek Tips Share Ubuntu Home Directories using SambaCreate a Samba User on UbuntuGMedia Blog: Setting Up a Windows Home ServerRestore Files from Backups on Windows Home ServerInstall Samba Server on Ubuntu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips HippoRemote Pro 2.2 Xobni Plus for Outlook All My Movies 5.9 CloudBerry Online Backup 1.5 for Windows Home Server Speed Up Windows With ReadyBoost Awesome World Cup Soccer Calendar Nice Websites To Watch TV Shows Online 24 Million Sites Windows Media Player Glass Icons (icons we like) How to Forecast Weather, without Gadgets

    Read the article

  • How to Format a USB Drive in Ubuntu Using GParted

    - by Trevor Bekolay
    If a USB hard drive or flash drive is not properly formatted, then it will not show up in the Ubuntu Places menu, making it hard to interact with. We’ll show you how to format a USB drive using the tool GParted. Note: Formatting a USB drive will destroy any data currently stored on it. If you think that your USB drive is already properly formatted, but Ubuntu just isn’t picking it up, try unplugging it and plugging it back in to a different USB slot, or restarting your machine with the device plugged in on start-up. Open a terminal by clicking on Applications in the top-left of the screen, then Accessories > Terminal. GParted should be installed by default, but we’ll make sure it’s installed by entering the following command in the terminal: sudo apt-get install gparted To open GParted, enter the following command in the terminal: sudo gparted Find your USB drive in the drop-down box at the top right of the GParted window. The drive should be unallocated – if it has a valid partition on it, then you may be looking at the wrong drive. Note: Make sure you’re on the correct drive, as making changes on the wrong hard drive with GParted can delete all data on a hard drive! Assuming you’re on the right drive, right-click on the unallocated grey block and click New. In the window that pops up, change the File System to fat32 for USB Flash Drives, NTFS for USB Hard Drives that will be used in Windows, or ext3/ext4 for USB Hard Drives that will be used exclusively in Linux. Add a label if you’d like, and then click Add. Click the green checkmark and then the Apply button to apply the changes. GParted will now format your drive. If you’re formatting a large USB Hard Drive, this can take some time. Once the process is done, you can close GParted, and the drive will now show up in the Places menu. Clicking on the drive will mount it and open it in a File Browser window. It will also add a shortcut to the drive on the Desktop by default. Your USB drive is now ready to store your files! Similar Articles Productive Geek Tips Using GParted to Resize Your Windows Vista PartitionInstall an RPM Package on Ubuntu LinuxCreate a Persistent Bootable Ubuntu USB Flash DriveShare Ubuntu Home Directories using SambaCreate a Samba User on Ubuntu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics How to Add Exceptions to the Windows Firewall Office 2010 reviewed in depth by Ed Bott

    Read the article

  • Small hiccup with VMware Player after upgrading to Ubuntu 12.04

    The upgrade process Finally, it was time to upgrade to a new LTS version of Ubuntu - 12.04 aka Precise Pangolin. I scheduled the weekend for this task and despite the nickname of Mauritius (Cyber Island) it took roughly 6 hours to download nearly 2.400 packages. No problem in general, as I have spare machines to work on, and it was weekend anyway. All went very smooth and only a few packages required manual attention due to local modifications in the configuration. With the new kernel 3.2.0-24 it was necessary to reboot the system and compared to the last upgrade, I got my graphical login as expected. Compilation of VMware Player 4.x fails A quick test on the installed applications, Firefox, Thunderbird, Chromium, Skype, CrossOver, etc. reveils that everything is fine in general. Firing up VMware Player displays the known kernel mod dialog that requires to compile the modules for the newly booted kernel. Usually, this isn't a big issue but this time I was confronted with the situation that vmnet didn't compile as expected ("Failed to compile module vmnet"). Luckily, this issue is already well-known, even though with "Failed to compile module vmmon" as general reason but nevertheless it was very easy and quick to find the solution to this problem. In VMware Communities there are several forum threads related to this topic and VMware provides the necessary patch file for Workstation 8.0.2 and Player 4.0.2. In case that you are still on Workstation 7.x or Player 3.x there is another patch file available. After download extract the file like so: tar -xzvf vmware802fixlinux320.tar.gz and run the patch script as super-user: sudo ./patch-modules_3.2.0.sh This will alter the existing installation and source files of VMware Player on your machine. As last step, which isn't described in many other resources, you have to restart the vmware service, or for the heart-fainted, just reboot your system: sudo service vmware restart This will load the newly created kernel modules into your userspace, and after that VMware Player will start as usual. Summary Upgrading any derivate of Ubuntu, in my case Xubuntu, is quick and easy done but it might hold some surprises from time to time. Nonetheless, it is absolutely worthy to go for it. Currently, this patch for VMware is the only obstacle I had to face so far and my system feels and looks better than before. Happy upgrade! Resources I used the following links based on Google search results: http://communities.vmware.com/message/1902218#1902218http://weltall.heliohost.org/wordpress/2012/01/26/vmware-workstation-8-0-2-player-4-0-2-fix-for-linux-kernel-3-2-and-3-3/ Update on VMware Player 4.0.3 Please continue to read on my follow-up article in case that you upgraded either VMware Workstation 8.0.3 or VMware Player 4.0.3.

    Read the article

  • Linux: multiple network connections - 3G/4G / Wifi / LAN / etc; how can i set a preferred network connection to use?

    - by Alex
    I've been looking at how I can setup a laptop that has multiple network interfaces, but a problem exists if all the connections are active, i.e. 3G, WiFi and LAN are all connected, I would like it to default to LAN. I would like to set "weights" or "priority" to each connection, so that if the LAN is unplugged, it'll default to WiFi - if its in range and working, otherwise, it'll switch and use the 3G dongle; I've been looking around and I can see that the "metric" counter for route isn't being used for recent kernels. I thought that would be able to set the preferred gateway / connections - but according to the man page: man route: OUTPUT Metric The 'distance' to the target (usually counted in hops). It is not used by recent kernels, but may be needed by routing daemons. So I'm confused, are there any scripts / apps / anything that can detect active network connections, and by way of configuration, send my default gateway network traffic through that interface if its active / alive?

    Read the article

  • Ubuntu Control Center Makes Using Ubuntu Easier

    - by Vivek
    Users who are new to Ubuntu might find it somewhat difficult to configure. Today we take a look at using Ubuntu Control Center which makes managing different aspects of the system easier. About Ubuntu Control Center A lot of utilities and software has been written to work with Ubuntu. Ubuntu Control Center is one such cool utility which makes it easy for configuring Ubuntu. The following is a brief description of Ubuntu Control Center: Ubuntu Control Center or UCC is an application inspired by Mandriva Control Center and aims to centralize and organize in a simple and intuitive form the main configuration tools for Ubuntu distribution. UCC uses all the native applications already bundled with Ubuntu, but it also utilize some third-party apps like “Hardinfo”, “Boot-up Manager”, “GuFW” and “Font-Manager”. Ubuntu Control Center Here we look at installation and use of Ubuntu Control Center in Ubuntu 10.04. First we have to satisfy some dependencies. You will need to install Font-Manager and jstest-gtk (link below)…before installing Ubuntu Control Center (UCC). Click the Install Package button. You’ll be prompted to enter in your admin password for each installation package. Installation is successful…close out of the screen. Download and install Font-Manager…again you’ll need to enter in your password to complete installation.   Once you have installed the two dependencies, you are all set to install Ubuntu Control Center (link below), double click the downloaded Ubuntu Control Center deb file to install it. Once installed you can find it under Applications \ System Tools \ UCC. Once you launch it you can start managing your system, software, hardware, and more.   You can easily control various aspects of your Ubuntu System using Ubuntu Control Center. Here we look at configuring the firewall under Network and Internet.     UCC allows easy access for configuring several aspects of your system. Once you install UCC you’ll see how easy it is to configure your Ubuntu system through an intuitive clean graphical interface. If you’re new to Ubuntu, using UCC can help you in setting up your system how you like in a user friendly way. Home Page of UCC http://code.google.com/p/ucc/ Links Download Font-Manager ManagerDownload jstest-gtkUbuntu Control Center (UCC) Similar Articles Productive Geek Tips Adding extra Repositories on UbuntuAllow Remote Control To Your Desktop On UbuntuAssign a Hotkey to Open a Terminal Window in UbuntuInstall VMware Tools on Ubuntu Edgy EftInstall Monodevelop on Ubuntu Linux TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Xobni Plus for Outlook All My Movies 5.9 CloudBerry Online Backup 1.5 for Windows Home Server Snagit 10 How to Forecast Weather, without Gadgets Outlook Tools, one stop tweaking for any Outlook version Zoofs, find the most popular tweeted YouTube videos Video preview of new Windows Live Essentials 21 Cursor Packs for XP, Vista & 7 Map the Stars with Stellarium

    Read the article

  • Use Any Folder For Your Ubuntu Desktop (Even a Dropbox Folder)

    - by Trevor Bekolay
    By default, Ubuntu creates a folder called Desktop in your home directory that gets displayed on your desktop. What if you want to use something else, like your Dropbox folder? Here we look at how to use any folder for your desktop. Not only can you change your desktop folder, you can change the location of any other folder Ubuntu creates for you in your home folder, like Documents or Music – and this works in any Linux distribution using the Gnome desktop manager. In this example, we’re going to change desktop to show our Dropbox folder. Open your home folder in a File Browser by clicking on Places > Home Folder. In the Home Folder, open the .config folder. By default, .config is hidden, so you may have to show hidden folders (temporarily) by clicking on View > Show Hidden Files. Then open the .config folder by double-clicking on it. Now open the user-dirs.dirs file… If double-clicking on it does not open it in a text editor, right-click on it and choose Open with Other Application… and find a text editor like Gedit. Change the entry associated with XDG_DESKTOP_DIR to the folder you want to be shown as your desktop. In our case, this is $HOME/Dropbox. Note: The “~” shortcut for the home directory won’t work in this file (use $HOME for that), but an absolute path (i.e. a path starting with “/”) will work. Feel free to change the locations of the other folders as well. Save and close user-dirs.dirs. At this point you can either log off and then log back on to get your desktop back, or open a terminal window Applications > Accessories > Terminal and enter: killall nautilus Nautilus (the file manager in Gnome) will restart itself and display your newly chosen folder as the desktop! This is a cool trick to use any folder for your Ubuntu desktop. What did you use as your desktop folder? Let us know in the comments! Similar Articles Productive Geek Tips Sync Your Pidgin Profile Across Multiple PCs with DropboxAdd "My Dropbox" to Your Windows 7 Start MenuCreate a Keyboard Shortcut to Access Hidden Desktop Icons and FilesAdd "My Computer" to Your Windows 7 / Vista TaskbarCheck your Disk Usage on Ubuntu with Disk Usage Analyzer TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips VMware Workstation 7 Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Use Flixtime To Create Video Slideshows Creating a Password Reset Disk in Windows Bypass Waiting Time On Customer Service Calls With Lucyphone MELTUP – "The Beginning Of US Currency Crisis And Hyperinflation" Enable or Disable the Task Manager Using TaskMgrED Explorer++ is a Worthy Windows Explorer Alternative

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Change the User Interface Language in Ubuntu

    - by Matthew Guay
    Would you like to use your Ubuntu computer in another language?  Here’s how you can easily change your interface language in Ubuntu. Ubuntu’s default install only includes a couple languages, but it makes it easy to find and add a new interface language to your computer.  To get started, open the System menu, select Administration, and then click Language Support. Ubuntu may ask if you want to update or add components to your current default language when you first open the dialog.  Click Install to go ahead and install the additional components, or you can click Remind Me Later to wait as these will be installed automatically when you add a new language. Now we’re ready to find and add an interface language to Ubuntu.  Click Install / Remove Languages to add the language you want. Find the language you want in the list, and click the check box to install it.  Ubuntu will show you all the components it will install for the language; this often includes spellchecking files for OpenOffice as well.  Once you’ve made your selection, click Apply Changes to install your new language.  Make sure you’re connected to the internet, as Ubuntu will have to download the additional components you’ve selected. Enter your system password when prompted, and then Ubuntu will download the needed languages files and install them.   Back in the main Language & Text dialog, we’re now ready to set our new language as default.  Find your new language in the list, and then click and drag it to the top of the list. Notice that Thai is the first language listed, and English is the second.  This will make Thai the default language for menus and windows in this account.  The tooltip reminds us that this setting does not effect system settings like currency or date formats. To change these, select the Text Tab and pick your new language from the drop-down menu.  You can preview the changes in the bottom Example box. The changes we just made will only affect this user account; the login screen and startup will not be affected.  If you wish to change the language in the startup and login screens also, click Apply System-Wide in both dialogs.  Other user accounts will still retain their original language settings; if you wish to change them, you must do it from those accounts. Once you have your new language settings all set, you’ll need to log out of your account and log back in to see your new interface language.  When you re-login, Ubuntu may ask you if you want to update your user folders’ names to your new language.  For example, here Ubuntu is asking if we want to change our folders to their Thai equivalents.  If you wish to do so, click Update or its equivalents in your language. Now your interface will be almost completely translated into your new language.  As you can see here, applications with generic names are translated to Thai but ones with specific names like Shutter keep their original name. Even the help dialogs are translated, which makes it easy for users around to world to get started with Ubuntu.  Once again, you may notice some things that are still in English, but almost everything is translated. Adding a new interface language doesn’t add the new language to your keyboard, so you’ll still need to set that up.  Check out our article on adding languages to your keyboard to get this setup. If you wish to revert to your original language or switch to another new language, simply repeat the above steps, this time dragging your original or new language to the top instead of the one you chose previously. Conclusion Ubuntu has a large number of supported interface languages to make it user-friendly to people around the globe.  And since you can set the language for each user account, it’s easy for multi-lingual individuals to share the same computer. Or, if you’re using Windows, check out our article on how you can Change the User Interface Language in Vista or Windows 7, too! Similar Articles Productive Geek Tips Restart the Ubuntu Gnome User Interface QuicklyChange the User Interface Language in Vista or Windows 7Create a Samba User on UbuntuInstall Samba Server on UbuntuSee Which Groups Your Linux User Belongs To TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips VMware Workstation 7 Acronis Online Backup DVDFab 6 Revo Uninstaller Pro FetchMp3 Can Download Videos & Convert Them to Mp3 Use Flixtime To Create Video Slideshows Creating a Password Reset Disk in Windows Bypass Waiting Time On Customer Service Calls With Lucyphone MELTUP – "The Beginning Of US Currency Crisis And Hyperinflation" Enable or Disable the Task Manager Using TaskMgrED

    Read the article

  • Diagnose PC Hardware Problems with an Ubuntu Live CD

    - by Trevor Bekolay
    So your PC randomly shuts down or gives you the blue screen of death, but you can’t figure out what’s wrong. The problem could be bad memory or hardware related, and thankfully the Ubuntu Live CD has some tools to help you figure it out. Test your RAM with memtest86+ RAM problems are difficult to diagnose—they can range from annoying program crashes, or crippling reboot loops. Even if you’re not having problems, when you install new RAM it’s a good idea to thoroughly test it. The Ubuntu Live CD includes a tool called Memtest86+ that will do just that—test your computer’s RAM! Unlike many of the Live CD tools that we’ve looked at so far, Memtest86+ has to be run outside of a graphical Ubuntu session. Fortunately, it only takes a few keystrokes. Note: If you used UNetbootin to create an Ubuntu flash drive, then memtest86+ will not be available. We recommend using the Universal USB Installer from Pendrivelinux instead (persistence is possible with Universal USB Installer, but not mandatory). Boot up your computer with a Ubuntu Live CD or USB drive. You will be greeted with this screen: Use the down arrow key to select the Test memory option and hit Enter. Memtest86+ will immediately start testing your RAM. If you suspect that a certain part of memory is the problem, you can select certain portions of memory by pressing “c” and changing that option. You can also select specific tests to run. However, the default settings of Memtest86+ will exhaustively test your memory, so we recommend leaving the settings alone. Memtest86+ will run a variety of tests that can take some time to complete, so start it running before you go to bed to give it adequate time. Test your CPU with cpuburn Random shutdowns – especially when doing computationally intensive tasks – can be a sign of a faulty CPU, power supply, or cooling system. A utility called cpuburn can help you determine if one of these pieces of hardware is the problem. Note: cpuburn is designed to stress test your computer – it will run it fast and cause the CPU to heat up, which may exacerbate small problems that otherwise would be minor. It is a powerful diagnostic tool, but should be used with caution. Boot up your computer with a Ubuntu Live CD or USB drive, and choose to run Ubuntu from the CD or USB drive. When the desktop environment loads up, open the Synaptic Package Manager by clicking on the System menu in the top-left of the screen, then selecting Administration, and then Synaptic Package Manager. Cpuburn is in the universe repository. To enable the universe repository, click on Settings in the menu at the top, and then Repositories. Add a checkmark in the box labeled “Community-maintained Open Source software (universe)”. Click close. In the main Synaptic window, click the Reload button. After the package list has reloaded and the search index has been rebuilt, enter “cpuburn” in the Quick search text box. Click the checkbox in the left column, and select Mark for Installation. Click the Apply button near the top of the window. As cpuburn installs, it will caution you about the possible dangers of its use. Assuming you wish to take the risk (and if your computer is randomly restarting constantly, it’s probably worth it), open a terminal window by clicking on the Applications menu in the top-left of the screen and then selection Applications > Terminal. Cpuburn includes a number of tools to test different types of CPUs. If your CPU is more than six years old, see the full list; for modern AMD CPUs, use the terminal command burnK7 and for modern Intel processors, use the terminal command burnP6 Our processor is an Intel, so we ran burnP6. Once it started up, it immediately pushed the CPU up to 99.7% total usage, according to the Linux utility “top”. If your computer is having a CPU, power supply, or cooling problem, then your computer is likely to shutdown within ten or fifteen minutes. Because of the strain this program puts on your computer, we don’t recommend leaving it running overnight – if there’s a problem, it should crop up relatively quickly. Cpuburn’s tools, including burnP6, have no interface; once they start running, they will start driving your CPU until you stop them. To stop a program like burnP6, press Ctrl+C in the terminal window that is running the program. Conclusion The Ubuntu Live CD provides two great testing tools to diagnose a tricky computer problem, or to stress test a new computer. While they are advanced tools that should be used with caution, they’re extremely useful and easy enough that anyone can use them. Similar Articles Productive Geek Tips Reset Your Ubuntu Password Easily from the Live CDCreate a Persistent Bootable Ubuntu USB Flash DriveAdding extra Repositories on UbuntuHow to Share folders with your Ubuntu Virtual Machine (guest)Building a New Computer – Part 3: Setting it Up TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 Have Fun Editing Photo Editing with Citrify Outlook Connector Upgrade Error Gadfly is a cool Twitter/Silverlight app Enable DreamScene in Windows 7 Microsoft’s “How Do I ?” Videos Home Networks – How do they look like & the problems they cause

    Read the article

  • Linux Live USB Media

    <b>Jamie's Random Musings:</b> "It is pretty common these days for laptops, and even desktops, to be able to boot from a USB flash memory drive. So you can save a little time and a little money by converting various Linux distributions ISO images to bootable USB devices, rather than burning them to CD/DVD."

    Read the article

  • Speed-start your Linux App: Using DB2 and the DB2 Control Center

    This article guides you through setting up and using IBM DB2 7.2 with the Command Line Processor. You'll also learn to use the graphical Control Center, which helps you explore and control your databases, and the graphical Command Center, which helps you generate SQL queries. Other topics covered include Java runtime environment setup, useful Linux utility functions, and bash profile customization.

    Read the article

< Previous Page | 169 170 171 172 173 174 175 176 177 178 179 180  | Next Page >