Search Results

Search found 107 results on 5 pages for 'ocean'.

Page 5/5 | < Previous Page | 1 2 3 4 5 

  • Project Management Helps AmeriCares Deliver International Aid

    - by Sylvie MacKenzie, PMP
    Excerpt from PROFIT - ORACLE - by Alison Weiss Handle with Care Sound project management helps AmeriCares bring international aid to those in need. The stakes are always high for AmeriCares. On a mission to restore health and save lives during times of disaster, the nonprofit international relief and humanitarian aid organization delivers donated medicines, medical supplies, and humanitarian aid to people in the U.S. and around the globe. Founded in 1982 with the express mission of responding as quickly and efficiently as possible to help people in need, the Stamford, Connecticut-based AmeriCares has delivered more than US$10.5 billion in aid to 147 countries over the past three decades. Launch the Slideshow “It’s critically important to us that we steward all the donations and that the medical supplies and medicines get to people as quickly as possible with no loss,” says Kate Sears, senior vice president for finance and technology at AmeriCares. “Whether we’re shipping IV solutions to victims of cholera in Haiti or antibiotics to Somali famine victims, we need to get the medicines there sooner because it means more people will be helped and lives improved or even saved.” Ten years ago, the tracking systems used by AmeriCares associates were paper-based. In recent years, staff started using spreadsheets, but the tracking processes were not standardized between teams. “Every team was tracking completely different information,” says Megan McDermott, senior associate, Sub-Saharan Africa partnerships, at AmeriCares. “It was just a few key things. For example, we tracked the date a shipment was supposed to arrive and the date we got reports from our partner that a hospital received aid on their end.” While the data was accurate, much detail was being lost in the process. AmeriCares management knew it could do a better job of tracking this enterprise data and in 2011 took a significant step by implementing Oracle’s Primavera P6 Professional Project Management. “It’s a comprehensive solution that has helped us improve the monitoring and controlling processes. It has allowed us to do our distribution better,” says Sears. In addition, the implementation effort has been a change agent, helping AmeriCares leadership rethink project management across the entire organization. Initially, much of the focus was on standardizing processes, but staff members also learned the importance of thinking proactively to prevent possible problems and evaluating results to determine if goals and objectives are truly being met. Such data about process efficiency and overall results is critical not only to AmeriCares staff but also to the donors supporting the organization’s life-saving missions. Efficiency Saves Lives One of AmeriCares’ core operations is to gather product donations from the private sector, establish where the most-urgent needs are, and solicit monetary support to send the aid via ocean cargo or airlift to welfare- and health-oriented nongovernmental organizations, hospitals, health networks, and government ministries based in areas in need. In 2011 alone, AmeriCares sent more than 3,500 shipments to 95 countries in response to both ongoing humanitarian needs and more than two dozen emergencies, including deadly tornadoes and storms in the U.S. and the devastating tsunami in Japan. When it comes to nonprofits in general, donors want to know that the charitable organizations they support are using funds wisely. Typically, nonprofits are evaluated by donors in terms of efficiency, an area where AmeriCares has an excellent reputation: 98 percent of expenses go directly to supporting programs and less than 2 percent represent administrative and fundraising costs. Donors, however, should look at more than simple efficiency, says Peter York, senior partner and chief research and learning officer at TCC Group, a nonprofit consultancy headquartered in New York, New York. They should also look at whether organizations have the systems in place to sustain their missions and continue to thrive. An expert on nonprofit organizational management, York has spent years studying sustainable charitable organizations. He defines them as nonprofits that are able to achieve the ongoing financial support to stay relevant and continue doing core mission work. In his analysis of well over 2,500 larger nonprofits, York has found that many are not sustaining, and are actually scaling back in size. “One of the biggest challenges of nonprofit sustainability is the general public’s perception that every dollar donated has to go only to the delivery of service,” says York. “What our data shows is that there are some fundamental capacities that have to be there in order for organizations to sustain and grow.” York’s research highlights the importance of data-driven leadership at successful nonprofits. “You’ve got to have the tools, the systems, and the technologies to get objective information on what you do, the people you serve, and the results you’re achieving,” says York. “If leaders don’t have the knowledge and the data, they can’t make the strategic decisions about programs to take organizations to the next level.” Historically, AmeriCares associates have used time-tested and cost-effective strategies to ship and then track supplies from donation to delivery to their destinations in designated time frames. When disaster strikes, AmeriCares ships by air and generally pulls out all the stops to deliver the most urgently needed aid within the first few days and weeks. Then, as situations stabilize, AmeriCares turns to delivering sea containers for the postemergency and ongoing aid so often needed over the long term. According to McDermott, getting a shipment out the door is fairly complicated, requiring as many as five different AmeriCares teams collaborating together. The entire process can take months—from when products are received in the warehouse and deciding which recipients to allocate supplies to, to getting customs and governmental approvals in place, actually shipping products, and finally ensuring that the products are received in-country. Delivering that aid is no small affair. “Our volume exceeds half a billion dollars a year worth of donated medicines and medical supplies, so it’s a sizable logistical operation to bring these products in and get them out to the right place quickly to have the most impact,” says Sears. “We really pride ourselves on our controls and efficiencies.” Adding to that complexity is the fact that the longer it takes to deliver aid, the more dire the human need can be. Any time AmeriCares associates can shave off the complicated aid delivery process can translate into lives saved. “It’s really being able to track information consistently that will help us to see where are the bottlenecks and where can we work on improving our processes,” says McDermott. Setting a Standard Productivity and information management improvements were key objectives for AmeriCares when staff began the process of implementing Oracle’s Primavera solution. But before configuring the software, the staff needed to take the time to analyze the systems already in place. According to Greg Loop, manager of database systems at AmeriCares, the organization received guidance from several consultants, including Rich D’Addario, consulting project manager in the Primavera Global Business Unit at Oracle, who was instrumental in shepherding the critical requirements-gathering phase. D’Addario encouraged staff to begin documenting shipping processes by considering the order in which activities occur and which ones are dependent on others to get accomplished. This exercise helped everyone realize that to be more efficient, they needed to keep track of shipments in a more standard way. “The staff didn’t recognize formal project management methodology,” says D’Addario. “But they did understand what the most important things are and that if they go wrong, an entire project can go off course.” Before, if a boatload of supplies was being sent to Haiti and there was a problem somewhere, a lot of time was taken up finding out where the problem was—because staff was not tracking things in a standard way. As a result, even more time was needed to find possible solutions to the problem and alert recipients that the aid might be delayed. “For everyone to put on the project manager hat and standardize the way every single thing is done means that now the whole organization is on the same page as to what needs to occur from the time a hurricane hits Haiti and when a boat pulls in to unload supplies,” says D’Addario. With so much care taken to put a process foundation firmly in place, configuring the Primavera solution was actually quite simple. Specific templates were set up for different types of shipments, and dashboards were implemented to provide executives with clear overviews of every project in the system. AmeriCares’ Loop reports that system planning, refining, and testing, followed by writing up documentation and training, took approximately four months. The system went live in spring 2011 at AmeriCares’ Connecticut headquarters. While the nonprofit has an international presence, with warehouses in Europe and offices in Haiti, India, Japan, and Sri Lanka, most donated medicines come from U.S. entities and are shipped from the U.S. out to the rest of the world. In addition, all shipments are tracked from the U.S. office. AmeriCares doesn’t expect the Primavera system to take months off the shipping time, especially for sea containers. However, any time saved is still important because it will allow aid to be delivered to people more quickly at a lower overall cost. “If we can trim a day or two here or there, that can translate into lives that we’re saving, especially in emergency situations,” says Sears. A Cultural Change Beyond the measurable benefits that come with IT-driven process improvement, AmeriCares management is seeing a change in culture as a result of the Primavera project. One change has been treating every shipment of aid as a project, and everyone involved with facilitating shipments as a project manager. “This is a revolutionary concept for us,” says McDermott. “Before, we were used to thinking we were doing logistics—getting a container from point A to point B without looking at it as one project and really understanding what it meant to manage it.” AmeriCares staff is also happy to report that collaboration within the organization is much more efficient. When someone creates a shipment in the Primavera system, the same shared template is used, which means anyone can log in to the system to see the status of a shipment. Knowledgeable staff can access a shipment project to help troubleshoot a problem. Management can easily check the status of projects across the organization. “Dashboards are really useful,” says McDermott. “Instead of going into the details of each project, you can just see the high-level real-time information at a glance.” The new system is helping team members focus on proactively managing shipments rather than simply reacting when problems occur. For example, when a container is shipped, documents must be included for customs clearance. Now, the shipping template has built-in reminders to prompt team members to ask for copies of these documents from freight forwarders and to follow up with partners to discover if a shipment is on time. In the past, staff may not have worked on securing these documents until they’d been notified a shipment had arrived in-country. Another benefit of capturing and adopting best practices within the Primavera system is that staff training is easier. “Capturing the processes in documented steps and milestones allows us to teach new staff members how to do their jobs faster,” says Sears. “It provides them with the knowledge of their predecessors so they don’t have to keep reinventing the wheel.” With the Primavera system already generating positive results, management is eager to take advantage of advanced capabilities. Loop is working on integrating the company’s proprietary inventory management system with the Primavera system so that when logistics or warehousing operators input data, the information will automatically go into the Primavera system. In the past, this information had to be manually keyed into spreadsheets, often leading to errors. Mining Historical Data Another feature on the horizon for AmeriCares is utilizing Primavera P6 Professional Project Management reporting capabilities. As the system begins to include more historical data, management soon will be able to draw on this information to conduct analysis that has not been possible before and create customized reports. For example, at the beginning of the shipment process, staff will be able to use historical data to more accurately estimate how long the approval process should take for a particular country. This could help ensure that food and medicine with limited shelf lives do not get stuck in customs or used beyond their expiration dates. The historical data in the Primavera system will also help AmeriCares with better planning year to year. The nonprofit’s staff has always put together a plan at the beginning of the year, but this has been very challenging simply because it is impossible to predict disasters. Now, management will be able to look at historical data and see trends and statistics as they set current objectives and prepare for future need. In addition, this historical data will provide AmeriCares management with the ability to review year-end data and compare actual project results with goals set at the beginning of the year—to see if desired outcomes were achieved and if there are areas that need improvement. It’s this type of information that is so valuable to donors. And, according to York, project management software can play a critical role in generating the data to help nonprofits sustain and grow. “It is important to invest in systems to help replicate, expand, and deliver services,” says York. “Project management software can help because it encourages nonprofits to examine program or service changes and how to manage moving forward.” Sears believes that AmeriCares donors will support the return on investment the organization will achieve with the Primavera solution. “It won’t be financial returns, but rather how many more people we can help for a given dollar or how much more quickly we can respond to a need,” says Sears. “I think donors are receptive to such arguments.” And for AmeriCares, it is all about the future and increasing results. The project management environment currently may be quite simple, but IT staff plans to expand the complexity and functionality as the organization grows in its knowledge of project management and the goals it wants to achieve. “As we use the system over time, we’ll continue to refine our best practices and accumulate more data,” says Sears. “It will advance our ability to make better data-driven decisions.”

    Read the article

  • CodePlex Daily Summary for Saturday, November 03, 2012

    CodePlex Daily Summary for Saturday, November 03, 2012Popular ReleasesZXMAK2: Version 2.6.8.2: fix save to SZX snapshot for +3 model fix +3 ULA timingsLaunchbar: Launchbar 4.2.2.0: This release is the first step in cleaning up the code and using all the latest features of .NET 4.5 Changes 4.2.2 (2012-11-02) Improved handling of left clicks 4.1.0 (2012-10-17) Removed tray icon Assembly renamed and signed with strong name Note When you upgrade, Launchbar will start with the default settings. You can import your previous settings by following these steps: Run Launchbar and just save the settings without configuring anything Shutdown Launchbar Go to the folder %LOCA...CommonLibrary.NET: CommonLibrary.NET 0.9.8.8: Releases notes for FluentScript located at http://fluentscript.codeplex.com/wikipage?title=Release%20Notes&referringTitle=Documentation Fluentscript - 0.9.8.8 - Final ReleaseApplication: FluentScript Version: 0.9.8.8 Build: 0.9.8.8 Changeset: 77368 ( CommonLibrary.NET ) Release date: November 2nd, 2012 Binaries: CommonLibrary.dll Namespace: ComLib.Lang Project site: http://fluentscript.codeplex.com/ Download: http://commonlibrarynet.codeplex.com/releases/view/90426 Source code: http://common...Mouse Jiggler: MouseJiggle-1.3: This adds the much-requested minimize-to-tray feature to Mouse Jiggler.Umbraco CMS: Umbraco 4.10.0 Release Candidate: This is a Release Candidate, which means that if we do not find any major issues in the next week, we will release this version as the final release of 4.10.0 on November 9th, 2012. The documentation for the MVC bits still lives in the Github version of the docs for now and will be updated on our.umbraco.org with the final release of 4.10.0. Browse the documentation here: https://github.com/umbraco/Umbraco4Docs/tree/4.8.0/Documentation/Reference/Mvc If you want to do only MVC then make sur...Skype Auto Recorder: SkypeAutoRecorder 1.3.4: New icon and images. Reworked settings window. Implemented high-quality sound encoding. Implemented a possibility to produce stereo records. Added buttons with system-wide hot keys for manual starting and canceling of recording. Added buttons for opening folder with records. Added Help button. Fixed an issue when recording is continuing after call end. Fixed an issue when recording doesn't start. Fixed several bugs and improved stability. Major refactoring and optimization...Access 2010 Application Platform - Build Your Own Database: Application Platform - 0.0.2: Release 0.0.2 Created two new users. One belongs to the Administrators group and the other to the Public group. User: admin Pass: admin User: guest Pass: guest Initial Release This is the first version of the database. At the moment is all contained in one file to make development easier, but the obvious idea would be to split it into Front and Back End for a production version of the tool. The features it contains at the moment are the "Core" features.Python Tools for Visual Studio: Python Tools for Visual Studio 1.5: We’re pleased to announce the release of Python Tools for Visual Studio 1.5 RTM. Python Tools for Visual Studio (PTVS) is an open-source plug-in for Visual Studio which supports programming with the Python language. PTVS supports a broad range of features including CPython/IronPython, Edit/Intellisense/Debug/Profile, Cloud, HPC, IPython, etc. support. For a quick overview of the general IDE experience, please watch this video There are a number of exciting improvement in this release comp...BCF.Net: BCF.Net: BCF.Net-20121024 source codeAssaultCube Reloaded: 2.5.5: Linux has Ubuntu 11.10 32-bit precompiled binaries and Ubuntu 10.10 64-bit precompiled binaries, but you can compile your own as it also contains the source. If you are using Mac or other operating systems, please wait while we try to package for those OSes. Try to compile it. If it fails, download a virtual machine. The server pack is ready for both Windows and Linux, but you might need to compile your own for Linux (source included) Changelog: Fixed potential bot bugs: Map change, OpenAL...Edi: Edi 1.0 with DarkExpression: Added DarkExpression theme (dialogs and message boxes are not completely themed, yet)DirectX Tool Kit: October 30, 2012 (add WP8 support): October 30, 2012 Added project files for Windows Phone 8MCEBuddy 2.x: MCEBuddy 2.3.6: Changelog for 2.3.6 (32bit and 64bit) 1. Fixed a bug in multichannel audio conversion failure. AAC does not support 6 channel audio, MCEBuddy now checks for it and force the output to 2 channel if AAC codec is specified 2. Fixed a bug in Original Broadcast Date and Time. Original Broadcast Date and Time is reported in UTC timezone in WTV metadata. TVDB and MovieDB dates are reported in network timezone. It is assumed the video is recorded and converted on the same machine, i.e. local timezone...MVVM Light Toolkit: MVVM Light Toolkit V4.1 for Visual Studio 2012: This version only supports Visual Studio 2012 (and all Express editions too). If you use Visual Studio 2010, please stay tuned, we will publish an update in a few days with support for VS10. V4.1 supports: Windows Phone 8 Windows 8 (Windows RT) Silverlight 5 Silverlight 4 WPF 4.5 WPF 4 WPF 3.5 And the following development environments: Visual Studio 2012 (Pro, Premium, Ultimate) Visual Studio 2012 Express for Windows 8 Visual Studio 2012 Express for Windows Phone 8 Visual...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.73: Fix issue in Discussion #401101 (unreferenced var in a for-in statement was getting removed). add the grouping operator to the parsed output so that unminified parsed code is closer to the original. Will still strip unneeded parens later, if minifying. more cleaning of references as they are minified out of the code.RiP-Ripper & PG-Ripper: PG-Ripper 1.4.03: changes NEW: Added Support for the phun.org forum FIXED: Kitty-Kats new Forum UrlLiberty: v3.4.0.1 Release 28th October 2012: Change Log -Fixed -H4 Fixed the save verification screen showing incorrect mission and difficulty information for some saves -H4 Hopefully fixed the issue where progress did not save between missions and saves would not revert correctly -H3 Fixed crashes that occurred when trying to load player information -Proper exception dialogs will now show in place of crashesPlayer Framework by Microsoft: Player Framework for Windows 8 (Preview 7): This release is compatible with the version of the Smooth Streaming SDK released today (10/26). Release 1 of the player framework is expected to be available next week. IMPROVEMENTS & FIXESIMPORTANT: List of breaking changes from preview 6 Support for the latest smooth streaming SDK. Xaml only: Support for moving any of the UI elements outside the MediaPlayer (e.g. into the appbar). Note: Equivelent changes to the JS version due in coming week. Support for localizing all text used in t...Send multiple SMS via Way2SMS C#: SMS 1.1: Added support for 160by2Quick Launch: Quick Launch 1.0: A Lightweight and Fast Way to Manage and Launch Thousands of Tools and ApplicationsPress Win+Q and start to search and run. http://www.codeplex.com/Download?ProjectName=quicklaunch&DownloadId=523536New Projectsappnitiren: This project's main objective is to deepen my knowledge on developing apps for windows 8 and windows phone and Buddhist philosophy of Nichiren Daishonin. Aspect - TypeScript AOP Framework: A simple AOP framework for TypeScript.AWF's PowerPoint Tab: A collection of little powerpoint utilities, including "resize all shapes".Base64 Encoder-Decoder: (Base64 Encoder/Decoder using C#)BombaJob-WF: Official Windows desktop client for BombaJob.bgBungie Test: !CCAudio: A project to create an audio recording/mixing and processing framework in C#/C++ with a look to creating a Digital Audio Workstation. EvaluationSyarem: for yuanguangexporttfschangesets: Small utility that runs agains tftp.exe(tfs power tools) and exports a given range of changesets.fjfdszjtfs: fjfdszjtfsKladionica: Ovo je mala web aplikacija koja je specijalizovana za f1 kladionicu na forumu KrstaricaListViewItem Float Test: WPF testing ground for development of a "floating" listviewitem (preeettty)Macconnell: This is my test projectNWebsec Demo site: This is the NWebsec demo website project.Ocean Flattened: Summary of this projectQuagmire: QUAGMIRE aims to make distributed data processing and visualisation simple and flexible.Scheduled SQL Jobs for DotNetNuke by IowaComputerGurus Inc.: Keeping a DotNetNuke site database clean can be a real nightmare for those hosting sites on shared hosting providers.Secretary Tool: Tool for secretaries of congregations of Jehovah's Witnesses.Secure My Install for DotNetNuke by IowaComputerGurus: This module is a single-use per DotNetNuke installation utility module that will make a number of security enhancing modifications to your "out of the box" DotNServStop: ServStop is a .NET application that makes it easy to stop several system services at once. Now you don't have to change startup types or stop them one at a time. It has a simple list-based interface with the ability to save and load lists of user services to stop. Written in C#.Simple File List for DotNetNuke by IowaComputerGurus Inc.: The Simple File List module is a Free DotNetNuke module that will list all files within a specific folder under the specific portal folder within DNN for users SOLUS3 Config Discovery: An open-source community developed tool for generating a simple text file detailing your local SIMS server settings to simplify new SOLUS3 configurations.Sort Project: Project Name: Sort Project Description: This project holds a class that extends objects type of IEnumerable by enabling them using sorting algorithms other than the buil in one within the microsoft staff SQL Azure Data Protector: This tool leverages the currently available options that Azure platform has to provide an integrated and automated solution to back up the SQL Azure databases.Throttling Suite for ASP.NET Applications: The Throttling Suite provides throttling control capabilities to the ASP.NET applications. It is highly customizable; including "log-only" mode.TidyBackups: Allows for automation of deletion of Microsoft SQL backup files based on file type (.bak) creation age as well as compression using standard ZIP technology.TrainingData: TrainingData is a set of libraries for reading and modifying training data files for Garmin and Polar heart rate monitors.Virtual eCommerce: ??? ? ?????? ????????? ??????. ??????: ??????????? ?????????, ?????????? ?????????? ?? ????????????? ?? Microsoft - ASP Web Pages with Razor Engine.WebNext: My personal web page / blog with cms on asp mvc 3Windows 8 Mouse Unsticky: Makes Windows 8 top right/left hot corners unsticky.WPForms: WPForms is simple framework for Form-driven Silverlight Windows Phone applications. Using the framework allows developers to easily create and display forms.

    Read the article

  • Selling Federal Enterprise Architecture (EA)

    - by TedMcLaughlan
    Selling Federal Enterprise Architecture A taxonomy of subject areas, from which to develop a prioritized marketing and communications plan to evangelize EA activities within and among US Federal Government organizations and constituents. Any and all feedback is appreciated, particularly in developing and extending this discussion as a tool for use – more information and details are also available. "Selling" the discipline of Enterprise Architecture (EA) in the Federal Government (particularly in non-DoD agencies) is difficult, notwithstanding the general availability and use of the Federal Enterprise Architecture Framework (FEAF) for some time now, and the relatively mature use of the reference models in the OMB Capital Planning and Investment (CPIC) cycles. EA in the Federal Government also tends to be a very esoteric and hard to decipher conversation – early apologies to those who agree to continue reading this somewhat lengthy article. Alignment to the FEAF and OMB compliance mandates is long underway across the Federal Departments and Agencies (and visible via tools like PortfolioStat and ITDashboard.gov – but there is still a gap between the top-down compliance directives and enablement programs, and the bottom-up awareness and effective use of EA for either IT investment management or actual mission effectiveness. "EA isn't getting deep enough penetration into programs, components, sub-agencies, etc.", verified a panelist at the most recent EA Government Conference in DC. Newer guidance from OMB may be especially difficult to handle, where bottom-up input can't be accurately aligned, analyzed and reported via standardized EA discipline at the Agency level – for example in addressing the new (for FY13) Exhibit 53D "Agency IT Reductions and Reinvestments" and the information required for "Cloud Computing Alternatives Evaluation" (supporting the new Exhibit 53C, "Agency Cloud Computing Portfolio"). Therefore, EA must be "sold" directly to the communities that matter, from a coordinated, proactive messaging perspective that takes BOTH the Program-level value drivers AND the broader Agency mission and IT maturity context into consideration. Selling EA means persuading others to take additional time and possibly assign additional resources, for a mix of direct and indirect benefits – many of which aren't likely to be realized in the short-term. This means there's probably little current, allocated budget to work with; ergo the challenge of trying to sell an "unfunded mandate". Also, the concept of "Enterprise" in large Departments like Homeland Security tends to cross all kinds of organizational boundaries – as Richard Spires recently indicated by commenting that "...organizational boundaries still trump functional similarities. Most people understand what we're trying to do internally, and at a high level they get it. The problem, of course, is when you get down to them and their system and the fact that you're going to be touching them...there's always that fear factor," Spires said. It is quite clear to the Federal IT Investment community that for EA to meet its objective, understandable, relevant value must be measured and reported using a repeatable method – as described by GAO's recent report "Enterprise Architecture Value Needs To Be Measured and Reported". What's not clear is the method or guidance to sell this value. In fact, the current GAO "Framework for Assessing and Improving Enterprise Architecture Management (Version 2.0)", a.k.a. the "EAMMF", does not include words like "sell", "persuade", "market", etc., except in reference ("within Core Element 19: Organization business owner and CXO representatives are actively engaged in architecture development") to a brief section in the CIO Council's 2001 "Practical Guide to Federal Enterprise Architecture", entitled "3.3.1. Develop an EA Marketing Strategy and Communications Plan." Furthermore, Core Element 19 of the EAMMF is advised to be applied in "Stage 3: Developing Initial EA Versions". This kind of EA sales campaign truly should start much earlier in the maturity progress, i.e. in Stages 0 or 1. So, what are the understandable, relevant benefits (or value) to sell, that can find an agreeable, participatory audience, and can pave the way towards success of a longer-term, funded set of EA mechanisms that can be methodically measured and reported? Pragmatic benefits from a useful EA that can help overcome the fear of change? And how should they be sold? Following is a brief taxonomy (it's a taxonomy, to help organize SME support) of benefit-related subjects that might make the most sense, in creating the messages and organizing an initial "engagement plan" for evangelizing EA "from within". An EA "Sales Taxonomy" of sorts. We're not boiling the ocean here; the subjects that are included are ones that currently appear to be urgently relevant to the current Federal IT Investment landscape. Note that successful dialogue in these topics is directly usable as input or guidance for actually developing early-stage, "Fit-for-Purpose" (a DoDAF term) Enterprise Architecture artifacts, as prescribed by common methods found in most EA methodologies, including FEAF, TOGAF, DoDAF and our own Oracle Enterprise Architecture Framework (OEAF). The taxonomy below is organized by (1) Target Community, (2) Benefit or Value, and (3) EA Program Facet - as in: "Let's talk to (1: Community Member) about how and why (3: EA Facet) the EA program can help with (2: Benefit/Value)". Once the initial discussion targets and subjects are approved (that can be measured and reported), a "marketing and communications plan" can be created. A working example follows the Taxonomy. Enterprise Architecture Sales Taxonomy Draft, Summary Version 1. Community 1.1. Budgeted Programs or Portfolios Communities of Purpose (CoPR) 1.1.1. Program/System Owners (Senior Execs) Creating or Executing Acquisition Plans 1.1.2. Program/System Owners Facing Strategic Change 1.1.2.1. Mandated 1.1.2.2. Expected/Anticipated 1.1.3. Program Managers - Creating Employee Performance Plans 1.1.4. CO/COTRs – Creating Contractor Performance Plans, or evaluating Value Engineering Change Proposals (VECP) 1.2. Governance & Communications Communities of Practice (CoP) 1.2.1. Policy Owners 1.2.1.1. OCFO 1.2.1.1.1. Budget/Procurement Office 1.2.1.1.2. Strategic Planning 1.2.1.2. OCIO 1.2.1.2.1. IT Management 1.2.1.2.2. IT Operations 1.2.1.2.3. Information Assurance (Cyber Security) 1.2.1.2.4. IT Innovation 1.2.1.3. Information-Sharing/ Process Collaboration (i.e. policies and procedures regarding Partners, Agreements) 1.2.2. Governing IT Council/SME Peers (i.e. an "Architects Council") 1.2.2.1. Enterprise Architects (assumes others exist; also assumes EA participants aren't buried solely within the CIO shop) 1.2.2.2. Domain, Enclave, Segment Architects – i.e. the right affinity group for a "shared services" EA structure (per the EAMMF), which may be classified as Federated, Segmented, Service-Oriented, or Extended 1.2.2.3. External Oversight/Constraints 1.2.2.3.1. GAO/OIG & Legal 1.2.2.3.2. Industry Standards 1.2.2.3.3. Official public notification, response 1.2.3. Mission Constituents Participant & Analyst Community of Interest (CoI) 1.2.3.1. Mission Operators/Users 1.2.3.2. Public Constituents 1.2.3.3. Industry Advisory Groups, Stakeholders 1.2.3.4. Media 2. Benefit/Value (Note the actual benefits may not be discretely attributable to EA alone; EA is a very collaborative, cross-cutting discipline.) 2.1. Program Costs – EA enables sound decisions regarding... 2.1.1. Cost Avoidance – a TCO theme 2.1.2. Sequencing – alignment of capability delivery 2.1.3. Budget Instability – a Federal reality 2.2. Investment Capital – EA illuminates new investment resources via... 2.2.1. Value Engineering – contractor-driven cost savings on existing budgets, direct or collateral 2.2.2. Reuse – reuse of investments between programs can result in savings, chargeback models; avoiding duplication 2.2.3. License Refactoring – IT license & support models may not reflect actual or intended usage 2.3. Contextual Knowledge – EA enables informed decisions by revealing... 2.3.1. Common Operating Picture (COP) – i.e. cross-program impacts and synergy, relative to context 2.3.2. Expertise & Skill – who truly should be involved in architectural decisions, both business and IT 2.3.3. Influence – the impact of politics and relationships can be examined 2.3.4. Disruptive Technologies – new technologies may reduce costs or mitigate risk in unanticipated ways 2.3.5. What-If Scenarios – can become much more refined, current, verifiable; basis for Target Architectures 2.4. Mission Performance – EA enables beneficial decision results regarding... 2.4.1. IT Performance and Optimization – towards 100% effective, available resource utilization 2.4.2. IT Stability – towards 100%, real-time uptime 2.4.3. Agility – responding to rapid changes in mission 2.4.4. Outcomes –measures of mission success, KPIs – vs. only "Outputs" 2.4.5. Constraints – appropriate response to constraints 2.4.6. Personnel Performance – better line-of-sight through performance plans to mission outcome 2.5. Mission Risk Mitigation – EA mitigates decision risks in terms of... 2.5.1. Compliance – all the right boxes are checked 2.5.2. Dependencies –cross-agency, segment, government 2.5.3. Transparency – risks, impact and resource utilization are illuminated quickly, comprehensively 2.5.4. Threats and Vulnerabilities – current, realistic awareness and profiles 2.5.5. Consequences – realization of risk can be mapped as a series of consequences, from earlier decisions or new decisions required for current issues 2.5.5.1. Unanticipated – illuminating signals of future or non-symmetric risk; helping to "future-proof" 2.5.5.2. Anticipated – discovering the level of impact that matters 3. EA Program Facet (What parts of the EA can and should be communicated, using business or mission terms?) 3.1. Architecture Models – the visual tools to be created and used 3.1.1. Operating Architecture – the Business Operating Model/Architecture elements of the EA truly drive all other elements, plus expose communication channels 3.1.2. Use Of – how can the EA models be used, and how are they populated, from a reasonable, pragmatic yet compliant perspective? What are the core/minimal models required? What's the relationship of these models, with existing system models? 3.1.3. Scope – what level of granularity within the models, and what level of abstraction across the models, is likely to be most effective and useful? 3.2. Traceability – the maturity, status, completeness of the tools 3.2.1. Status – what in fact is the degree of maturity across the integrated EA model and other relevant governance models, and who may already be benefiting from it? 3.2.2. Visibility – how does the EA visibly and effectively prove IT investment performance goals are being reached, with positive mission outcome? 3.3. Governance – what's the interaction, participation method; how are the tools used? 3.3.1. Contributions – how is the EA program informed, accept submissions, collect data? Who are the experts? 3.3.2. Review – how is the EA validated, against what criteria?  Taxonomy Usage Example:   1. To speak with: a. ...a particular set of System Owners Facing Strategic Change, via mandate (like the "Cloud First" mandate); about... b. ...how the EA program's visible and easily accessible Infrastructure Reference Model (i.e. "IRM" or "TRM"), if updated more completely with current system data, can... c. ...help shed light on ways to mitigate risks and avoid future costs associated with NOT leveraging potentially-available shared services across the enterprise... 2. ....the following Marketing & Communications (Sales) Plan can be constructed: a. Create an easy-to-read "Consequence Model" that illustrates how adoption of a cloud capability (like elastic operational storage) can enable rapid and durable compliance with the mandate – using EA traceability. Traceability might be from the IRM to the ARM (that identifies reusable services invoking the elastic storage), and then to the PRM with performance measures (such as % utilization of purchased storage allocation) included in the OMB Exhibits; and b. Schedule a meeting with the Program Owners, timed during their Acquisition Strategy meetings in response to the mandate, to use the "Consequence Model" for advising them to organize a rapid and relevant RFI solicitation for this cloud capability (regarding alternatives for sourcing elastic operational storage); and c. Schedule a series of short "Discovery" meetings with the system architecture leads (as agreed by the Program Owners), to further populate/validate the "As-Is" models and frame the "To Be" models (via scenarios), to better inform the RFI, obtain the best feedback from the vendor community, and provide potential value for and avoid impact to all other programs and systems. --end example -- Note that communications with the intended audience should take a page out of the standard "Search Engine Optimization" (SEO) playbook, using keywords and phrases relating to "value" and "outcome" vs. "compliance" and "output". Searches in email boxes, internal and external search engines for phrases like "cost avoidance strategies", "mission performance metrics" and "innovation funding" should yield messages and content from the EA team. This targeted, informed, practical sales approach should result in additional buy-in and participation, additional EA information contribution and model validation, development of more SMEs and quick "proof points" (with real-life testing) to bolster the case for EA. The proof point here is a successful, timely procurement that satisfies not only the external mandate and external oversight review, but also meets internal EA compliance/conformance goals and therefore is more transparently useful across the community. In short, if sold effectively, the EA will perform and be recognized. EA won’t therefore be used only for compliance, but also (according to a validated, stated purpose) to directly influence decisions and outcomes. The opinions, views and analysis expressed in this document are those of the author and do not necessarily reflect the views of Oracle.

    Read the article

  • Map wont show rigth in Joomla

    - by user1653126
    I have the following code of a map using api google, I have tested the code in several html editor and its work perfectly, but when i upload in my web page doesn’t work. The map appears all zoomed in some random point in the ocean. I create an article in Joomla 1.5.20, paste the code. Its shows right in the preview but not in the web page. I disable filtering and use none editor and still won’t work. Thanks for the help. <!DOCTYPE html> <html> <head> <meta name="viewport" content="initial-scale=1.0, user-scalable=no" /> <style type="text/css"> html { height: 100% } body { height: 100%; margin: 0; padding: 0 } #map_canvas { height: 100% } </style> <script type="text/javascript" src="http://maps.googleapis.com/maps/api/js?key=AIzaSyBInlv7FuwtKGhzBP0oISDoB2Iu79HNrPU&sensor=false"> </script> <script type="text/javascript"> var map; // lets define some vars to make things easier later var kml = { a: { name: "Productor", url: "https://maps.google.hn/maps/ms?authuser=0&vps=2&hl=es&ie=UTF8&msa=0&output=kml&msid=200984447026903306654.0004c934a224eca7c3ad4" }, b: { name: "A&S", url: "https://maps.google.hn/maps/ms?ie=UTF8&authuser=0&msa=0&output=kml&msid=200984447026903306654.0004c94bac74cf2304c71" } // keep adding more if ye like }; // initialize our goo function initializeMap() { var options = { center: new google.maps.LatLng(13.324182,-87.080071), zoom: 9, mapTypeId: google.maps.MapTypeId.TERRAIN } map = new google.maps.Map(document.getElementById("map_canvas"), options); var ctaLayer = new google.maps.KmlLayer('https://maps.google.hn/maps/ms?authuser=0&vps=5&hl=es&ie=UTF8&oe=UTF8&msa=0&output=kml&msid=200984447026903306654.0004c94bc3bce6f638aa1'); ctaLayer.setMap(map); var ctaLayer = new google.maps.KmlLayer('https://maps.google.hn/maps/ms?authuser=0&vps=2&ie=UTF8&msa=0&output=kml&msid=200984447026903306654.0004c94ec7e838242b67d'); ctaLayer.setMap(map); createTogglers(); }; google.maps.event.addDomListener(window, 'load', initializeMap); // the important function... kml[id].xxxxx refers back to the top function toggleKML(checked, id) { if (checked) { var layer = new google.maps.KmlLayer(kml[id].url, { preserveViewport: true, suppressInfoWindows: true }); google.maps.event.addListener(layer, 'click', function(kmlEvent) { var text = kmlEvent.featureData.description; showInContentWindow(text); }); function showInContentWindow(text) { var sidediv = document.getElementById('content_window'); sidediv.innerHTML = text; } // store kml as obj kml[id].obj = layer; kml[id].obj.setMap(map); } else { kml[id].obj.setMap(null); delete kml[id].obj; } }; // create the controls dynamically because it's easier, really function createTogglers() { var html = "<form><ul>"; for (var prop in kml) { html += "<li id=\"selector-" + prop + "\"><input type='checkbox' id='" + prop + "'" + " onclick='highlight(this,\"selector-" + prop + "\"); toggleKML(this.checked, this.id)' \/>" + kml[prop].name + "<\/li>"; } html += "<li class='control'><a href='#' onclick='removeAll();return false;'>" + "Limpiar el Mapa<\/a><\/li>" + "<\/ul><\/form>"; document.getElementById("toggle_box").innerHTML = html; }; // easy way to remove all objects function removeAll() { for (var prop in kml) { if (kml[prop].obj) { kml[prop].obj.setMap(null); delete kml[prop].obj; } } }; // Append Class on Select function highlight(box, listitem) { var selected = 'selected'; var normal = 'normal'; document.getElementById(listitem).className = (box.checked ? selected: normal); }; </script> <style type="text/css"> .selected { font-weight: bold; } </style> </head> <body> <div id="map_canvas" style="width: 80%; height: 400px; float:left"></div> <div id="toggle_box" style="position: absolute; top: 100px; right: 640px; padding: 10px; background: #fff; z-index: 5; "></div> <div id="content_window" style="width:10%; height:10%; float:left"></div> </body> </html>

    Read the article

  • Much Ado About Nothing: Stub Objects

    - by user9154181
    The Solaris 11 link-editor (ld) contains support for a new type of object that we call a stub object. A stub object is a shared object, built entirely from mapfiles, that supplies the same linking interface as the real object, while containing no code or data. Stub objects cannot be executed — the runtime linker will kill any process that attempts to load one. However, you can link to a stub object as a dependency, allowing the stub to act as a proxy for the real version of the object. You may well wonder if there is a point to producing an object that contains nothing but linking interface. As it turns out, stub objects are very useful for building large bodies of code such as Solaris. In the last year, we've had considerable success in applying them to one of our oldest and thorniest build problems. In this discussion, I will describe how we came to invent these objects, and how we apply them to building Solaris. This posting explains where the idea for stub objects came from, and details our long and twisty journey from hallway idea to standard link-editor feature. I expect that these details are mainly of interest to those who work on Solaris and its makefiles, those who have done so in the past, and those who work with other similar bodies of code. A subsequent posting will omit the history and background details, and instead discuss how to build and use stub objects. If you are mainly interested in what stub objects are, and don't care about the underlying software war stories, I encourage you to skip ahead. The Long Road To Stubs This all started for me with an email discussion in May of 2008, regarding a change request that was filed in 2002, entitled: 4631488 lib/Makefile is too patient: .WAITs should be reduced This CR encapsulates a number of cronic issues with Solaris builds: We build Solaris with a parallel make (dmake) that tries to build as much of the code base in parallel as possible. There is a lot of code to build, and we've long made use of parallelized builds to get the job done quicker. This is even more important in today's world of massively multicore hardware. Solaris contains a large number of executables and shared objects. Executables depend on shared objects, and shared objects can depend on each other. Before you can build an object, you need to ensure that the objects it needs have been built. This implies a need for serialization, which is in direct opposition to the desire to build everying in parallel. To accurately build objects in the right order requires an accurate set of make rules defining the things that depend on each other. This sounds simple, but the reality is quite complex. In practice, having programmers explicitly specify these dependencies is a losing strategy: It's really hard to get right. It's really easy to get it wrong and never know it because things build anyway. Even if you get it right, it won't stay that way, because dependencies between objects can change over time, and make cannot help you detect such drifing. You won't know that you got it wrong until the builds break. That can be a long time after the change that triggered the breakage happened, making it hard to connect the cause and the effect. Usually this happens just before a release, when the pressure is on, its hard to think calmly, and there is no time for deep fixes. As a poor compromise, the libraries in core Solaris were built using a set of grossly incomplete hand written rules, supplemented with a number of dmake .WAIT directives used to group the libraries into sets of non-interacting groups that can be built in parallel because we think they don't depend on each other. From time to time, someone will suggest that we could analyze the built objects themselves to determine their dependencies and then generate make rules based on those relationships. This is possible, but but there are complications that limit the usefulness of that approach: To analyze an object, you have to build it first. This is a classic chicken and egg scenario. You could analyze the results of a previous build, but then you're not necessarily going to get accurate rules for the current code. It should be possible to build the code without having a built workspace available. The analysis will take time, and remember that we're constantly trying to make builds faster, not slower. By definition, such an approach will always be approximate, and therefore only incremantally more accurate than the hand written rules described above. The hand written rules are fast and cheap, while this idea is slow and complex, so we stayed with the hand written approach. Solaris was built that way, essentially forever, because these are genuinely difficult problems that had no easy answer. The makefiles were full of build races in which the right outcomes happened reliably for years until a new machine or a change in build server workload upset the accidental balance of things. After figuring out what had happened, you'd mutter "How did that ever work?", add another incomplete and soon to be inaccurate make dependency rule to the system, and move on. This was not a satisfying solution, as we tend to be perfectionists in the Solaris group, but we didn't have a better answer. It worked well enough, approximately. And so it went for years. We needed a different approach — a new idea to cut the Gordian Knot. In that discussion from May 2008, my fellow linker-alien Rod Evans had the initial spark that lead us to a game changing series of realizations: The link-editor is used to link objects together, but it only uses the ELF metadata in the object, consisting of symbol tables, ELF versioning sections, and similar data. Notably, it does not look at, or understand, the machine code that makes an object useful at runtime. If you had an object that only contained the ELF metadata for a dependency, but not the code or data, the link-editor would find it equally useful for linking, and would never know the difference. Call it a stub object. In the core Solaris OS, we require all objects to be built with a link-editor mapfile that describes all of its publically available functions and data. Could we build a stub object using the mapfile for the real object? It ought to be very fast to build stub objects, as there are no input objects to process. Unlike the real object, stub objects would not actually require any dependencies, and so, all of the stubs for the entire system could be built in parallel. When building the real objects, one could link against the stub objects instead of the real dependencies. This means that all the real objects can be built built in parallel too, without any serialization. We could replace a system that requires perfect makefile rules with a system that requires no ordering rules whatsoever. The results would be considerably more robust. We immediately realized that this idea had potential, but also that there were many details to sort out, lots of work to do, and that perhaps it wouldn't really pan out. As is often the case, it would be necessary to do the work and see how it turned out. Following that conversation, I set about trying to build a stub object. We determined that a faithful stub has to do the following: Present the same set of global symbols, with the same ELF versioning, as the real object. Functions are simple — it suffices to have a symbol of the right type, possibly, but not necessarily, referencing a null function in its text segment. Copy relocations make data more complicated to stub. The possibility of a copy relocation means that when you create a stub, the data symbols must have the actual size of the real data. Any error in this will go uncaught at link time, and will cause tragic failures at runtime that are very hard to diagnose. For reasons too obscure to go into here, involving tentative symbols, it is also important that the data reside in bss, or not, matching its placement in the real object. If the real object has more than one symbol pointing at the same data item, we call these aliased symbols. All data symbols in the stub object must exhibit the same aliasing as the real object. We imagined the stub library feature working as follows: A command line option to ld tells it to produce a stub rather than a real object. In this mode, only mapfiles are examined, and any object or shared libraries on the command line are are ignored. The extra information needed (function or data, size, and bss details) would be added to the mapfile. When building the real object instead of the stub, the extra information for building stubs would be validated against the resulting object to ensure that they match. In exploring these ideas, I immediately run headfirst into the reality of the original mapfile syntax, a subject that I would later write about as The Problem(s) With Solaris SVR4 Link-Editor Mapfiles. The idea of extending that poor language was a non-starter. Until a better mapfile syntax became available, which seemed unlikely in 2008, the solution could not involve extentions to the mapfile syntax. Instead, we cooked up the idea (hack) of augmenting mapfiles with stylized comments that would carry the necessary information. A typical definition might look like: # DATA(i386) __iob 0x3c0 # DATA(amd64,sparcv9) __iob 0xa00 # DATA(sparc) __iob 0x140 iob; A further problem then became clear: If we can't extend the mapfile syntax, then there's no good way to extend ld with an option to produce stub objects, and to validate them against the real objects. The idea of having ld read comments in a mapfile and parse them for content is an unacceptable hack. The entire point of comments is that they are strictly for the human reader, and explicitly ignored by the tool. Taking all of these speed bumps into account, I made a new plan: A perl script reads the mapfiles, generates some small C glue code to produce empty functions and data definitions, compiles and links the stub object from the generated glue code, and then deletes the generated glue code. Another perl script used after both objects have been built, to compare the real and stub objects, using data from elfdump, and validate that they present the same linking interface. By June 2008, I had written the above, and generated a stub object for libc. It was a useful prototype process to go through, and it allowed me to explore the ideas at a deep level. Ultimately though, the result was unsatisfactory as a basis for real product. There were so many issues: The use of stylized comments were fine for a prototype, but not close to professional enough for shipping product. The idea of having to document and support it was a large concern. The ideal solution for stub objects really does involve having the link-editor accept the same arguments used to build the real object, augmented with a single extra command line option. Any other solution, such as our prototype script, will require makefiles to be modified in deeper ways to support building stubs, and so, will raise barriers to converting existing code. A validation script that rederives what the linker knew when it built an object will always be at a disadvantage relative to the actual linker that did the work. A stub object should be identifyable as such. In the prototype, there was no tag or other metadata that would let you know that they weren't real objects. Being able to identify a stub object in this way means that the file command can tell you what it is, and that the runtime linker can refuse to try and run a program that loads one. At that point, we needed to apply this prototype to building Solaris. As you might imagine, the task of modifying all the makefiles in the core Solaris code base in order to do this is a massive task, and not something you'd enter into lightly. The quality of the prototype just wasn't good enough to justify that sort of time commitment, so I tabled the project, putting it on my list of long term things to think about, and moved on to other work. It would sit there for a couple of years. Semi-coincidentally, one of the projects I tacked after that was to create a new mapfile syntax for the Solaris link-editor. We had wanted to do something about the old mapfile syntax for many years. Others before me had done some paper designs, and a great deal of thought had already gone into the features it should, and should not have, but for various reasons things had never moved beyond the idea stage. When I joined Sun in late 2005, I got involved in reviewing those things and thinking about the problem. Now in 2008, fresh from relearning for the Nth time why the old mapfile syntax was a huge impediment to linker progress, it seemed like the right time to tackle the mapfile issue. Paving the way for proper stub object support was not the driving force behind that effort, but I certainly had them in mind as I moved forward. The new mapfile syntax, which we call version 2, integrated into Nevada build snv_135 in in February 2010: 6916788 ld version 2 mapfile syntax PSARC/2009/688 Human readable and extensible ld mapfile syntax In order to prove that the new mapfile syntax was adequate for general purpose use, I had also done an overhaul of the ON consolidation to convert all mapfiles to use the new syntax, and put checks in place that would ensure that no use of the old syntax would creep back in. That work went back into snv_144 in June 2010: 6916796 OSnet mapfiles should use version 2 link-editor syntax That was a big putback, modifying 517 files, adding 18 new files, and removing 110 old ones. I would have done this putback anyway, as the work was already done, and the benefits of human readable syntax are obvious. However, among the justifications listed in CR 6916796 was this We anticipate adding additional features to the new mapfile language that will be applicable to ON, and which will require all sharable object mapfiles to use the new syntax. I never explained what those additional features were, and no one asked. It was premature to say so, but this was a reference to stub objects. By that point, I had already put together a working prototype link-editor with the necessary support for stub objects. I was pleased to find that building stubs was indeed very fast. On my desktop system (Ultra 24), an amd64 stub for libc can can be built in a fraction of a second: % ptime ld -64 -z stub -o stubs/libc.so.1 -G -hlibc.so.1 \ -ztext -zdefs -Bdirect ... real 0.019708910 user 0.010101680 sys 0.008528431 In order to go from prototype to integrated link-editor feature, I knew that I would need to prove that stub objects were valuable. And to do that, I knew that I'd have to switch the Solaris ON consolidation to use stub objects and evaluate the outcome. And in order to do that experiment, ON would first need to be converted to version 2 mapfiles. Sub-mission accomplished. Normally when you design a new feature, you can devise reasonably small tests to show it works, and then deploy it incrementally, letting it prove its value as it goes. The entire point of stub objects however was to demonstrate that they could be successfully applied to an extremely large and complex code base, and specifically to solve the Solaris build issues detailed above. There was no way to finesse the matter — in order to move ahead, I would have to successfully use stub objects to build the entire ON consolidation and demonstrate their value. In software, the need to boil the ocean can often be a warning sign that things are trending in the wrong direction. Conversely, sometimes progress demands that you build something large and new all at once. A big win, or a big loss — sometimes all you can do is try it and see what happens. And so, I spent some time staring at ON makefiles trying to get a handle on how things work, and how they'd have to change. It's a big and messy world, full of complex interactions, unspecified dependencies, special cases, and knowledge of arcane makefile features... ...and so, I backed away, put it down for a few months and did other work... ...until the fall, when I felt like it was time to stop thinking and pondering (some would say stalling) and get on with it. Without stubs, the following gives a simplified high level view of how Solaris is built: An initially empty directory known as the proto, and referenced via the ROOT makefile macro is established to receive the files that make up the Solaris distribution. A top level setup rule creates the proto area, and performs operations needed to initialize the workspace so that the main build operations can be launched, such as copying needed header files into the proto area. Parallel builds are launched to build the kernel (usr/src/uts), libraries (usr/src/lib), and commands. The install makefile target builds each item and delivers a copy to the proto area. All libraries and executables link against the objects previously installed in the proto, implying the need to synchronize the order in which things are built. Subsequent passes run lint, and do packaging. Given this structure, the additions to use stub objects are: A new second proto area is established, known as the stub proto and referenced via the STUBROOT makefile macro. The stub proto has the same structure as the real proto, but is used to hold stub objects. All files in the real proto are delivered as part of the Solaris product. In contrast, the stub proto is used to build the product, and then thrown away. A new target is added to library Makefiles called stub. This rule builds the stub objects. The ld command is designed so that you can build a stub object using the same ld command line you'd use to build the real object, with the addition of a single -z stub option. This means that the makefile rules for building the stub objects are very similar to those used to build the real objects, and many existing makefile definitions can be shared between them. A new target is added to the Makefiles called stubinstall which delivers the stub objects built by the stub rule into the stub proto. These rules reuse much of existing plumbing used by the existing install rule. The setup rule runs stubinstall over the entire lib subtree as part of its initialization. All libraries and executables link against the objects in the stub proto rather than the main proto, and can therefore be built in parallel without any synchronization. There was no small way to try this that would yield meaningful results. I would have to take a leap of faith and edit approximately 1850 makefiles and 300 mapfiles first, trusting that it would all work out. Once the editing was done, I'd type make and see what happened. This took about 6 weeks to do, and there were many dark days when I'd question the entire project, or struggle to understand some of the many twisted and complex situations I'd uncover in the makefiles. I even found a couple of new issues that required changes to the new stub object related code I'd added to ld. With a substantial amount of encouragement and help from some key people in the Solaris group, I eventually got the editing done and stub objects for the entire workspace built. I found that my desktop system could build all the stub objects in the workspace in roughly a minute. This was great news, as it meant that use of the feature is effectively free — no one was likely to notice or care about the cost of building them. After another week of typing make, fixing whatever failed, and doing it again, I succeeded in getting a complete build! The next step was to remove all of the make rules and .WAIT statements dedicated to controlling the order in which libraries under usr/src/lib are built. This came together pretty quickly, and after a few more speed bumps, I had a workspace that built cleanly and looked like something you might actually be able to integrate someday. This was a significant milestone, but there was still much left to do. I turned to doing full nightly builds. Every type of build (open, closed, OpenSolaris, export, domestic) had to be tried. Each type failed in a new and unique way, requiring some thinking and rework. As things came together, I became aware of things that could have been done better, simpler, or cleaner, and those things also required some rethinking, the seeking of wisdom from others, and some rework. After another couple of weeks, it was in close to final form. My focus turned towards the end game and integration. This was a huge workspace, and needed to go back soon, before changes in the gate would made merging increasingly difficult. At this point, I knew that the stub objects had greatly simplified the makefile logic and uncovered a number of race conditions, some of which had been there for years. I assumed that the builds were faster too, so I did some builds intended to quantify the speedup in build time that resulted from this approach. It had never occurred to me that there might not be one. And so, I was very surprised to find that the wall clock build times for a stock ON workspace were essentially identical to the times for my stub library enabled version! This is why it is important to always measure, and not just to assume. One can tell from first principles, based on all those removed dependency rules in the library makefile, that the stub object version of ON gives dmake considerably more opportunities to overlap library construction. Some hypothesis were proposed, and shot down: Could we have disabled dmakes parallel feature? No, a quick check showed things being build in parallel. It was suggested that we might be I/O bound, and so, the threads would be mostly idle. That's a plausible explanation, but system stats didn't really support it. Plus, the timing between the stub and non-stub cases were just too suspiciously identical. Are our machines already handling as much parallelism as they are capable of, and unable to exploit these additional opportunities? Once again, we didn't see the evidence to back this up. Eventually, a more plausible and obvious reason emerged: We build the libraries and commands (usr/src/lib, usr/src/cmd) in parallel with the kernel (usr/src/uts). The kernel is the long leg in that race, and so, wall clock measurements of build time are essentially showing how long it takes to build uts. Although it would have been nice to post a huge speedup immediately, we can take solace in knowing that stub objects simplify the makefiles and reduce the possibility of race conditions. The next step in reducing build time should be to find ways to reduce or overlap the uts part of the builds. When that leg of the build becomes shorter, then the increased parallelism in the libs and commands will pay additional dividends. Until then, we'll just have to settle for simpler and more robust. And so, I integrated the link-editor support for creating stub objects into snv_153 (November 2010) with 6993877 ld should produce stub objects PSARC/2010/397 ELF Stub Objects followed by the work to convert the ON consolidation in snv_161 (February 2011) with 7009826 OSnet should use stub objects 4631488 lib/Makefile is too patient: .WAITs should be reduced This was a huge putback, with 2108 modified files, 8 new files, and 2 removed files. Due to the size, I was allowed a window after snv_160 closed in which to do the putback. It went pretty smoothly for something this big, a few more preexisting race conditions would be discovered and addressed over the next few weeks, and things have been quiet since then. Conclusions and Looking Forward Solaris has been built with stub objects since February. The fact that developers no longer specify the order in which libraries are built has been a big success, and we've eliminated an entire class of build error. That's not to say that there are no build races left in the ON makefiles, but we've taken a substantial bite out of the problem while generally simplifying and improving things. The introduction of a stub proto area has also opened some interesting new possibilities for other build improvements. As this article has become quite long, and as those uses do not involve stub objects, I will defer that discussion to a future article.

    Read the article

  • Partner Blog Series: PwC Perspectives - The Gotchas, The Do's and Don'ts for IDM Implementations

    - by Tanu Sood
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:12.0pt; mso-para-margin-left:0in; line-height:12.0pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Arial","sans-serif"; mso-ascii-font-family:Arial; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Arial; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} table.MsoTableMediumList1Accent6 {mso-style-name:"Medium List 1 - Accent 6"; mso-tstyle-rowband-size:1; mso-tstyle-colband-size:1; mso-style-priority:65; mso-style-unhide:no; border-top:solid #E0301E 1.0pt; mso-border-top-themecolor:accent6; border-left:none; border-bottom:solid #E0301E 1.0pt; mso-border-bottom-themecolor:accent6; border-right:none; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Georgia","serif"; color:black; mso-themecolor:text1; mso-ansi-language:EN-GB;} table.MsoTableMediumList1Accent6FirstRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:first-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:cell-none; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; font-family:"Verdana","sans-serif"; mso-ascii-font-family:Georgia; mso-ascii-theme-font:major-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:major-fareast; mso-hansi-font-family:Georgia; mso-hansi-theme-font:major-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:major-bidi;} table.MsoTableMediumList1Accent6LastRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:last-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:1.0pt solid #E0301E; mso-tstyle-border-top-themecolor:accent6; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; color:#968C6D; mso-themecolor:text2; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6FirstCol {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:first-column; mso-style-priority:65; mso-style-unhide:no; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6LastCol {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:last-column; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:1.0pt solid #E0301E; mso-tstyle-border-top-themecolor:accent6; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6OddColumn {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:odd-column; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-shading:#F7CBC7; mso-tstyle-shading-themecolor:accent6; mso-tstyle-shading-themetint:63;} table.MsoTableMediumList1Accent6OddRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:odd-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-shading:#F7CBC7; mso-tstyle-shading-themecolor:accent6; mso-tstyle-shading-themetint:63;} Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:12.0pt; mso-para-margin-left:0in; line-height:12.0pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Arial","sans-serif"; mso-ascii-font-family:Arial; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Arial; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} table.MsoTableMediumList1Accent6 {mso-style-name:"Medium List 1 - Accent 6"; mso-tstyle-rowband-size:1; mso-tstyle-colband-size:1; mso-style-priority:65; mso-style-unhide:no; border-top:solid #E0301E 1.0pt; mso-border-top-themecolor:accent6; border-left:none; border-bottom:solid #E0301E 1.0pt; mso-border-bottom-themecolor:accent6; border-right:none; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Georgia","serif"; color:black; mso-themecolor:text1; mso-ansi-language:EN-GB;} table.MsoTableMediumList1Accent6FirstRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:first-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:cell-none; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; font-family:"Arial Narrow","sans-serif"; mso-ascii-font-family:Georgia; mso-ascii-theme-font:major-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:major-fareast; mso-hansi-font-family:Georgia; mso-hansi-theme-font:major-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:major-bidi;} table.MsoTableMediumList1Accent6LastRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:last-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:1.0pt solid #E0301E; mso-tstyle-border-top-themecolor:accent6; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; color:#968C6D; mso-themecolor:text2; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6FirstCol {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:first-column; mso-style-priority:65; mso-style-unhide:no; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6LastCol {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:last-column; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-border-top:1.0pt solid #E0301E; mso-tstyle-border-top-themecolor:accent6; mso-tstyle-border-bottom:1.0pt solid #E0301E; mso-tstyle-border-bottom-themecolor:accent6; mso-ansi-font-weight:bold; mso-bidi-font-weight:bold;} table.MsoTableMediumList1Accent6OddColumn {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:odd-column; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-shading:#F7CBC7; mso-tstyle-shading-themecolor:accent6; mso-tstyle-shading-themetint:63;} table.MsoTableMediumList1Accent6OddRow {mso-style-name:"Medium List 1 - Accent 6"; mso-table-condition:odd-row; mso-style-priority:65; mso-style-unhide:no; mso-tstyle-shading:#F7CBC7; mso-tstyle-shading-themecolor:accent6; mso-tstyle-shading-themetint:63;} It is generally accepted among business communities that technology by itself is not a silver bullet to all problems, but when it is combined with leading practices, strategy, careful planning and execution, it can create a recipe for success. This post attempts to highlight some of the best practices along with dos & don’ts that our practice has accumulated over the years in the identity & access management space in general, and also in the context of R2, in particular. Best Practices The following section illustrates the leading practices in “How” to plan, implement and sustain a successful OIM deployment, based on our collective experience. Planning is critical, but often overlooked A common approach to planning an IAM program that we identify with our clients is the three step process involving a current state assessment, a future state roadmap and an executable strategy to get there. It is extremely beneficial for clients to assess their current IAM state, perform gap analysis, document the recommended controls to address the gaps, align future state roadmap to business initiatives and get buy in from all stakeholders involved to improve the chances of success. When designing an enterprise-wide solution, the scalability of the technology must accommodate the future growth of the enterprise and the projected identity transactions over several years. Aligning the implementation schedule of OIM to related information technology projects increases the chances of success. As a baseline, it is recommended to match hardware specifications to the sizing guide for R2 published by Oracle. Adherence to this will help ensure that the hardware used to support OIM will not become a bottleneck as the adoption of new services increases. If your Organization has numerous connected applications that rely on reconciliation to synchronize the access data into OIM, consider hosting dedicated instances to handle reconciliation. Finally, ensure the use of clustered environment for development and have at least three total environments to help facilitate a controlled migration to production. If your Organization is planning to implement role based access control, we recommend performing a role mining exercise and consolidate your enterprise roles to keep them manageable. In addition, many Organizations have multiple approval flows to control access to critical roles, applications and entitlements. If your Organization falls into this category, we highly recommend that you limit the number of approval workflows to a small set. Most Organizations have operations managed across data centers with backend database synchronization, if your Organization falls into this category, ensure that the overall latency between the datacenters when replicating the databases is less than ten milliseconds to ensure that there are no front office performance impacts. Ingredients for a successful implementation During the development phase of your project, there are a number of guidelines that can be followed to help increase the chances for success. Most implementations cannot be completed without the use of customizations. If your implementation requires this, it’s a good practice to perform code reviews to help ensure quality and reduce code bottlenecks related to performance. We have observed at our clients that the development process works best when team members adhere to coding leading practices. Plan for time to correct coding defects and ensure developers are empowered to report their own bugs for maximum transparency. Many organizations struggle with defining a consistent approach to managing logs. This is particularly important due to the amount of information that can be logged by OIM. We recommend Oracle Diagnostics Logging (ODL) as an alternative to be used for logging. ODL allows log files to be formatted in XML for easy parsing and does not require a server restart when the log levels are changed during troubleshooting. Testing is a vital part of any large project, and an OIM R2 implementation is no exception. We suggest that at least one lower environment should use production-like data and connectors. Configurations should match as closely as possible. For example, use secure channels between OIM and target platforms in pre-production environments to test the configurations, the migration processes of certificates, and the additional overhead that encryption could impose. Finally, we ask our clients to perform database backups regularly and before any major change event, such as a patch or migration between environments. In the lowest environments, we recommend to have at least a weekly backup in order to prevent significant loss of time and effort. Similarly, if your organization is using virtual machines for one or more of the environments, it is recommended to take frequent snapshots so that rollbacks can occur in the event of improper configuration. Operate & sustain the solution to derive maximum benefits When migrating OIM R2 to production, it is important to perform certain activities that will help achieve a smoother transition. At our clients, we have seen that splitting the OIM tables into their own tablespaces by categories (physical tables, indexes, etc.) can help manage database growth effectively. If we notice that a client hasn’t enabled the Oracle-recommended indexing in the applicable database, we strongly suggest doing so to improve performance. Additionally, we work with our clients to make sure that the audit level is set to fit the organization’s auditing needs and sometimes even allocate UPA tables and indexes into their own table-space for better maintenance. Finally, many of our clients have set up schedules for reconciliation tables to be archived at regular intervals in order to keep the size of the database(s) reasonable and result in optimal database performance. For our clients that anticipate availability issues with target applications, we strongly encourage the use of the offline provisioning capabilities of OIM R2. This reduces the provisioning process for a given target application dependency on target availability and help avoid broken workflows. To account for this and other abnormalities, we also advocate that OIM’s monitoring controls be configured to alert administrators on any abnormal situations. Within OIM R2, we have begun advising our clients to utilize the ‘profile’ feature to encapsulate multiple commonly requested accounts, roles, and/or entitlements into a single item. By setting up a number of profiles that can be searched for and used, users will spend less time performing the same exact steps for common tasks. We advise our clients to follow the Oracle recommended guides for database and application server tuning which provides a good baseline configuration. It offers guidance on database connection pools, connection timeouts, user interface threads and proper handling of adapters/plug-ins. All of these can be important configurations that will allow faster provisioning and web page response times. Many of our clients have begun to recognize the value of data mining and a remediation process during the initial phases of an implementation (to help ensure high quality data gets loaded) and beyond (to support ongoing maintenance and business-as-usual processes). A successful program always begins with identifying the data elements and assigning a classification level based on criticality, risk, and availability. It should finish by following through with a remediation process. Dos & Don’ts Here are the most common dos and don'ts that we socialize with our clients, derived from our experience implementing the solution. Dos Don’ts Scope the project into phases with realistic goals. Look for quick wins to show success and value to the stake holders. Avoid “boiling the ocean” and trying to integrate all enterprise applications in the first phase. Establish an enterprise ID (universal unique ID across the enterprise) earlier in the program. Avoid major UI customizations that require code changes. Have a plan in place to patch during the project, which helps alleviate any major issues or roadblocks (product and database). Avoid publishing all the target entitlements if you don't anticipate their usage during access request. Assess your current state and prepare a roadmap to address your operations, tactical and strategic goals, align it with your business priorities. Avoid integrating non-production environments with your production target systems. Defer complex integrations to the later phases and take advantage of lessons learned from previous phases Avoid creating multiple accounts for the same user on the same system, if there is an opportunity to do so. Have an identity and access data quality initiative built into your plan to identify and remediate data related issues early on. Avoid creating complex approval workflows that would negative impact productivity and SLAs. Identify the owner of the identity systems with fair IdM knowledge and empower them with authority to make product related decisions. This will help ensure overcome any design hurdles. Avoid creating complex designs that are not sustainable long term and would need major overhaul during upgrades. Shadow your internal or external consulting resources during the implementation to build the necessary product skills needed to operate and sustain the solution. Avoid treating IAM as a point solution and have appropriate level of communication and training plan for the IT and business users alike. Conclusion In our experience, Identity programs will struggle with scope, proper resourcing, and more. We suggest that companies consider the suggestions discussed in this post and leverage them to help enable their identity and access program. This concludes PwC blog series on R2 for the month and we sincerely hope that the information we have shared thus far has been beneficial. For more information or if you have questions, you can reach out to Rex Thexton, Senior Managing Director, PwC and or Dharma Padala, Director, PwC. We look forward to hearing from you. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:12.0pt; mso-para-margin-left:0in; line-height:12.0pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Arial","sans-serif"; mso-ascii-font-family:Arial; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Arial; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Meet the Writers: Dharma Padala is a Director in the Advisory Security practice within PwC.  He has been implementing medium to large scale Identity Management solutions across multiple industries including utility, health care, entertainment, retail and financial sectors.   Dharma has 14 years of experience in delivering IT solutions out of which he has been implementing Identity Management solutions for the past 8 years. Praveen Krishna is a Manager in the Advisory Security practice within PwC.  Over the last decade Praveen has helped clients plan, architect and implement Oracle identity solutions across diverse industries.  His experience includes delivering security across diverse topics like network, infrastructure, application and data where he brings a holistic point of view to problem solving. Scott MacDonald is a Director in the Advisory Security practice within PwC.  He has consulted for several clients across multiple industries including financial services, health care, automotive and retail.   Scott has 10 years of experience in delivering Identity Management solutions. John Misczak is a member of the Advisory Security practice within PwC.  He has experience implementing multiple Identity and Access Management solutions, specializing in Oracle Identity Manager and Business Process Engineering Language (BPEL).

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

< Previous Page | 1 2 3 4 5