Search Results

Search found 1481 results on 60 pages for 'paper'.

Page 13/60 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20 | Next Page >

SQLAuthority News – Microsoft Whitepaper – AlwaysOn Solution Guide: Offloading Read-Only Workloads to Secondary Replicas

- by pinaldave

SQL Server 2012 has many interesting features but the most talked feature is AlwaysOn. Performance tuning is always a hot topic. I see lots of need of the same and lots of business around it. However, many times when people talk about performance tuning they think of it as a either query tuning, performance tuning, or server tuning. All are valid points, but performance tuning expert usually understands the business workload and business logic before making suggestions. For example, if performance tuning expert analysis workload and realize that there are plenty of reports as well read only queries on the server they can for sure consider alternate options for the same. If read only data is not required real time or it can accept the data which is delayed a bit it makes sense to divide the workload. A secondary replica of the original data which can serve all the read only queries and report is a good idea in most of the cases where there is plenty of workload which is not dependent on the real time data. SQL Server 2012 has introduced the feature of AlwaysOn which can very well fit in this scenario and provide a solution in Read-Only Workloads. Microsoft has recently announced a white paper which is based on absolutely the same subject. I recommend it to read for every SQL Enthusiast who is are going to implement a solution to offload read-only workloads to secondary replicas. Download white paper AlwaysOn Solution Guide: Offloading Read-Only Workloads to Secondary Replicas Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: AlwaysOn

Read the article
Oracle SPARC SuperCluster and US DoD Security guidelines

- by user12611852

I've worked in the past to help our government customers understand how best to secure Solaris. For my customer base that means complying with Security Technical Implementation Guides (STIGs) from the Defense Information Systems Agency (DISA). I recently worked with a team to apply both the Solaris and Oracle 11gR2 database STIGs to a SPARC SuperCluster. The results have been published in an Oracle White paper. The SPARC SuperCluster is a highly available, high performance platform that incorporates: SPARC T4-4 servers Exadata Storage Servers and software ZFS Storage appliance InfiniBand interconnect Flash Cache Oracle Solaris 11 Oracle VM for SPARC Oracle Database 11gR2 It is targeted towards large, mission critical database, middleware and general purpose workloads. Using the Oracle Solution Center we configured a SSC applied DoD security guidance and confirmed functionality and performance of the system. The white paper reviews our findings and includes a number of security recommendations. In addition, customers can contact me for the itemized spreadsheets with our detailed STIG reports. Some notes: There is no DISA STIG documentation for Solaris 11. Oracle is working to help DISA create one using their new process. As a result, our report follows the Solaris 10 STIG document and applies it to Solaris 11 where applicable. In my conversations over the years with DISA Field Security Office they have repeatedly told me, "The absence of a DISA written STIG should not prevent a product from being used. Customer may apply vendor or industry security recommendations to receive accreditation." Thanks to the core team: Kevin Rohan, Gary Jensen and Rich Qualls as well as the staff of the Oracle Solution Center and Glenn Brunette for their help in creating the document.

Read the article
Additional new content SOA & BPM Partner Community

- by JuergenKress

Oracle Fusion Middleware 12c (12.1.2.0.0) Released - Download (OTN, eDelivery) Whitepaper: Next Generation Service Integration Platform - PDF SOA Maturity This article in the Industrial SOA series offers exploration of the fundamentals of applying a factory approach to modern service-oriented software development. Read the article. Enterprise Service Bus The fifth article in the Industrial SOA series answers to some of the most important questions about the use of an enterprise service bus, using concrete examples to clarify areas of application that can be deemed correct for ESBs. Read the article. DevOps, Cloud, and Role Creep DevOps and cloud computing are changing the IT industry - and changing IT roles. An panel of community members discusses what’s happening and how it might affect your job. Listen to the podcast. Industrial SOA - Now chapters 1 to 5 available | Torsten Winterberg White Paper: Cloud Integration - A Comprehensive Solution White Paper: Next Generation Service Integration Platform : SOA Suite on Exalogic IT Briefcase Interview: An Integrated Approach to Mobile, Cloud, and API Management Technologies with Oracle Fusion Middleware Webcast: Oracle Cloud Integration – Information Week Webcast eBook: Oracle SOA Suite – In the Customers’ Words Podcast: Cloud Integration Transitioning from TIBCO to Oracle SOA Suite – Part 1 Events: Oracle Simplifying Integration of Cloud and On-Premise New B2B Book Published for Oracle SOA B2B 11g Get Fast-Data Accelerator in Your Hands Today: Mobile Data Offloading for Telecom Fast Data Accelerator - Blog New Oracle Process Accelerators in Financial Services & Teleco Detect, Analyze, Act Fast with BPM Improving the Quality of Healthcare with BPM Engineers Australia Improves and Automates Business Processes and Completes Engineer Enrollments up to 90% Faster with Middleware Platform - Case Study | PPT Specialized Partner Ataway on BPM Practice - Video eProseed Delivers Processes Skillfully with Oracle BPM Suite - Video Yarra Valley Water Uses SOA and BPM for Orchestration, Re-use and Visibility - Video Victoria University Discusses Oracle SOA & Oracle BPM - Video SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Facebook Wiki Mix Forum Technorati Tags: SOA Community,Oracle SOA,Oracle BPM,Community,OPN,Jürgen Kress

Read the article
Delivering Oracle DBaaS: Journey to Enterprise Cloud with Oracle Database 12c

- by B R Clouse

The release of Oracle Database 12c is accompanied by extensive supporting collateral that details the new features and options of this major release. But with so much to read and investigate, where to start? If you don't have the time to pore through everything, then you may wish organize your reading in terms of the use case you're most interested in. If your interest is Database as a Service in private database clouds then may I suggest that you start here: Accelerate the Journey to Enterprise Cloud with Oracle Database 12c Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} This paper describes the phases of the journey to enterprise cloud, and enumerates the new features and options in Oracle Database 12c that support each phase. Oracle Multitenant figures prominently, but it's not the only cloud-enabling topic: Oracle Database Quality of Service Management, Application Continuity, Automatic Data Optimization, Global Data Services and Active Data Guard Far Sync all deliver key benefits for delivering database as a service. Further reading and research is suggested by the references included in the paper. Happy clouding!

Read the article
Architectural Composition Languages

- by C. Lawrence Wenham

Recently stumbled upon this paper (PDF) talking about ACLs, or Architectural Composition Languages. They're a fusion of two earlier lines of research: Architectural Definition Languages (such as UML) and Object Composition Languages (such as XAML, WWF, or scripting languages). The goal of an ACL is to have a high-level description of a program's architecture which can also be compiled into a runnable program. The high-level description assists automated analysis, while the 'executability' means changes can be tested immediately. You would still author the components of the program in a conventional programming language (C, Java, Python, etc), but they would be composed into a complete program by the ACL. One of the expected benefits is that a program can be ported to a different platform by swapping in "similar but different" components. I've been hankering for something like this for a long time (see this answer I gave on a StackOverflow question a few years ago). The paper mentions that the researchers were working on a language called ACL/1 that initially targeted Java, but would be ported to support .Net as well. However, I can't find any more mention of ACL/1 anywhere. Has there been any more work done on this? Are there any other implementations of the ACL concept that are available for use or experimentation?

Read the article
How to get hp printer to select correct trays on samba shared printer

- by dgermann

My first post here. Clean install of 12.04.1. Previously 10.04. Have been running a p2035n printer without problem. Now, no matter what paper tray I select it prints to the top "by-pass" feeder--not the lower main tray. And when I select to print to manual feed, it does not wait for me to put in paper (like it used to), but prints immediately. The printer is on a samba share and is physically plugged into a WinXP machine. Its uri is smb:///computername/p2035n. I have tried several of the offered drivers for this hp machine, and even tried generic-laser. Same result on all. Here are some other things I have tried (losing 2-3 hours of my time!): hp-check -t reports “error: User needs to be member of group 'lp' to enable print, scan & fax. ” and “warning: Printer is not HPLIP installed. Printers must use the hp: or hpfax: CUPS backend to function in HPLIP.” needed to download HPLIP 3.12.11 followed http://hplipopensource.com/hplip-web/install/install/index.html sh hplip-3.12.11-run says that before setup logout and then on login run hp-setup hp-setup could not find the printer ran hp-check -t again, found I was not in group lp ran hp-check --fix and it fixed that problem, asks for reboot rebooted after running update manager (new kernel) and hp-check -t shows no errors this time all the tray selections end up on the bypass tray hp-setup still cannot find the printer So is there a way to make this work? Thanks! :- Doug.

Read the article
Why does my Canon printer print document pages at ~25% size?

- by Erlend Alvestad

I'm using a Canon PIXMA MP250, and I'm running 12.04 LTS. The printer's been working fine for the couple of months I've been a Linux user. That is, until today. I just printed a 1-page ODT document from LibreOffice. Instead of filling the sheet, the document occupies only a little less than 25% of the paper, in the top left corner, and the text has also shrunk to something like 5pt. I looked at the paper format settings for the document and printer, which were set to "letter". I changed these to "A4", hoping that would solve the issue. There was no change, however. I tried printing a different document in LibreOffice and got the same result. I tried exporting the original document to PDF and printing it through Document Viewer. Same result. I then printed a web page from Google Chrome. No formatting problems there. In all cases the print preview looks fine.

Read the article
How to rotate html5 canvas as page background?

- by Sebastian P.R. Gingter

Hi, I want to achieve the following: Image a white sheet of paper on a black desk. Then rotate the paper a little bit to the left (like, 25 degrees). Now you still have the black desk, and a rotated white box on it. In this rotated white box I want to place non-rotated normal html content like text, tables, div's etc. I already have a problem at the very first step: rotating a rectangle. This is my code so far: <html> <head> <script> function draw() { var canvas=document.getElementById("myCanvas"); var c=canvas.getContext("2d"); c.fillStyle = '#00'; c.fillRect(100, 100, 100, 100); c.rotate(20); c.fillStyle = '#ff0000'; c.fillRect(150, 150, 10, 10); } </script> </head> <body onload="draw()"> <canvas id="myCanvas" width="500" height="500"></canvas> </body> </html> With this, I see only a normal black box. Nothing else. I assume there should be a red, rotated box too, but there's nothing. What is the best approach to reach this and to have it as a (scaling) background for my web page?

Read the article
Early Adopters of Oracle Enterprise Manager 12c Report Agility and Productivity Benefits

- by Anand Akela

Earlier this month at the Oracle Open World 2012, we celebrated the first anniversary of Oracle Enterprise Manager 12c . Early adopters of Oracle Enterprise manager 12c have benefited from its federated self-service access to complete application stacks, automated provisioning, elastic scalability, metering, and charge-back capabilities. Crimson Consulting Group recently interviewed multiple early adopters of Oracle Enterprise Manager 12c and captured their finding in a white Paper "Real-World Benefits of Private Cloud: Early Adopters of Oracle Enterprise Manager 12c Report Agility and Productivity Gains". Here is summary of the finding :- On October 25th at 10 AM pacific time, Kirk Bangstad from the Crimson Consulting group will join us in a live webcast and share what learnt from the early adopters of Oracle Enterprise Manager 12c. Don't miss this chance to hear how private clouds could impact your business and ask questions from our experts. Webcast: Real-World Benefits of Private Cloud Early Adopters of Oracle Enterprise Manager 12c Report Agility and Productivity Benefits Date: Thursday, October 25, 2012 Time: 10:00 AM PDT | 1:00 PM EDT Register Today All attendees will receive the White Paper: Real-World Benefits of Private Cloud: Early Adopters of Oracle Enterprise Manager 12c Report Agility and Productivity Gains. Stay Connected Twitter | Face book | You Tube | Linked in | Newsletter

Read the article
SOA Community Newsletter March 2012

- by JuergenKress

Dear SOA partner community member Thank you for your excellent feedback on the brand new Patch Set 5! PS5 is a combined release of all Fusion Middleware components. We recommend you to use this version for all your upcoming projects. We are very keen to know your feedback on PS5. We request you to please send it across to us, especially if your fist customers are in production. Edwin, Roland and Demed from product management published a joint paper Start Small, Grow Fast. Please let us know if you are interested to write a joint paper. Till we meet again! Till we meet again! To read the newsletter please visit http://tinyurl.com/soanewsMarch2012 (OPN Account required) To become a member of the SOA Partner Community please register at http://www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Mix Forum Technorati Tags: SOA Community newsletter,SOA Community,Oracle,OPN,Jürgen Kress,WebLogic 12c,SOA Implementation Assessment,BPM Implementation Assessment,SOA Certification,SOA Specialist,BPM Certification,BPM Specialist,SOA Suite for Healthcare Integration,SOA Community Forum,SOA Specialization

Read the article
When can you call yourself good at language X?

- by SoulBeaver

This goes back to a conversation I've had with my girlfriend. I tried to tell her that I simply don't feel adequate enough in my programming language (C++) to call myself good. She then asked me, "Well, when do you consider yourself good enough?" That's an interesting question. I didn't know what to tell her. So I'm asking you. For any programming language, framework or the like, when do you reach a point were you sit back, look at what you've done and say, "Hey, I'm actually pretty good at this."? How do you define "good" so that you can tell others, honestly, "Yeah, I'm good at X". Additionally, do you reach these conclusions by comparing what others can do? Additional Info I have read the canonical paper on how it takes ten-thousand hours before you are an expert on the field. (Props to anybody that knows what this paper is called again) I have also read various articles from Coding Horror about interviewing people. Some people, it was said, "Cannot function outside of a framework." So they may be "good" for that framework, but not otherwise in the language. Is this true?

Read the article
Introducing the PeopleSoft Interaction Hub

- by Matthew Haavisto

The PeopleSoft Applications Portal has just been re-branded as the PeopleSoft Interaction Hub. It's not just a name change, however. As part of our ongoing efforts to deliver a richer user experience to PeopleSoft customers, Oracle/PeopleSoft is now offering an enhanced restricted use license (login required) of the PeopleSoft Interaction Hub free with PeopleTools. This change extends the existing restricted use license to include the following additional capabilities: Dynamic Unified Navigation. Enables customers to easily provide seamless, unified navigation among their entire PeopleSoft application portfolio. Site-wide branding. Makes it easier to brand your ecosystem and provide a vivid, contemporary appearance for your applications. These additions augment the capabilities provided in the previous restricted use license, which remain available: creation and use of collaborative workspaces, and pre-built collaborative services for use in related content. (See the license notes (login required)for a complete list of everything that is granted with the PeopleTools license.)PeopleSoft is moving to deliver a contemporary user experience for your applications users, and the this license change supports that direction. In addition, the name change reflects our positioning of the PeopleSoft Interaction Hub as a primary means for unifying your PeopleSoft ecosystem, and providing a richer, web-site-based user experience rather than a pillared, application-based experience.See this white paper to get an idea of some of the capabilities that you can employ with the PeopleSoft Interaction Hub to enhance the PeopleSoft user experience. In addition, this red paper provides valuable 'how-to' guidance. In the near future we will be producing a best practices guide for deployment. In the mean time, the most recent release/feature pack of the PIH automates the setup of unified navigation with a Workcenter specifically supporting Unified Navigation. This Workcenter guides administrators through the setup process, and streamlines things.So what should customers do if they still want to use the PeopleSoft Interaction Hub for traditional portal purposes? Customers can employ the PIH's full capabilities such as multiple site deployment/management and content management, by buying the full, unrestricted license. We are continuing to enhance the product, and it remains part of Applications Unlimited, and we have some exciting features planned.

Read the article
Service Catalogs for Database as a Service

- by B R Clouse

At the end of last month, I had the opportunity to present a speaking session at Oracle OpenWorld: Database as a Service: Creating a Database Cloud Service Catalog. The session was well-attended which would have surprised me several months ago when I started researching this topic. At that time, I thought of service catalogs as something trivial which could be explained in a few simple slides. But while looking at all the different options and approaches available, I came to learn that designing a succinct and effective catalog is not a trivial task, and mistakes can lead to confusion and unintended side effects. And when the room filled up, my new point of view was confirmed. In case you missed the session, or were able to attend but would like more details, I've posted a white paper that covers the topics from the session, and more. We start with an overview of the components of a service catalog: And then look at several customer case studies of service catalogs for DBaaS. Synthesizing those examples, we summarize the main options for defining the service categories and their levels. We end with a template for defining Bronze | Silver | Gold service tiers for Oracle Database Services. The paper is now available here - watch for updates as we work to expand some sections and incorporate readers' feedback (hint - that includes your feedback). Visit our OTN page for additional Database Cloud collateral.

Read the article
So You Want To Build a SPARC Cloud

- by user12601629

Did you ever wish you could get the industrial strength power of UNIX/RISC with the flexibility of cloud computing? Well, now you can! With recent advances from Oracle it's possible to build an incredibly high-performance, flexible, available virtualized infrastructure based on Solaris and SPARC. Here's the recipe! Authored in collaboration across the Oracle "Systems Group" team, we now have a complete best practice guide for you. Click below to download it: Best Practices for Building a Virtualized SPARC Computing Environment Inside you'll find recommendations for how and when to leverage technologies like: SPARC T4 OVM for SPARC hypervisor (version 2.2 and newer) Solaris 11 Ops Center 12c ZFS Storage Appliance Oracle network switches Based on following these best practices, you'll be able to construct a dynamic, virtualized infrastructure that allows for: Easy, GUI-based provisioning on new VMs Automated HA failover in the event of physical server failures Automatic load balancing across a cluster of VM hosts Complete end-to-end monitoring You should download this paper and check it out. Even if you aren't planning on buying all new hardware, and instead want to transform some existing gear into a dynamic virtualized environment then this paper will give you concrete info on what to do and the trade-offs you'll make. Have fun getting started on your journey to build a SPARC cloud!

Read the article
The Windows Store... why did I sign up with this mess again?

- by FransBouma

Yesterday, Microsoft revealed that the Windows Store is now open to all developers in a wide range of countries and locations. For the people who think "wtf is the 'Windows Store'?", it's the central place where Windows 8 users will be able to find, download and purchase applications (or as we now have to say to not look like a computer illiterate: <accent style="Kentucky">aaaaappss</accent>) for Windows 8. As this is the store which is integrated into Windows 8, it's an interesting place for ISVs, as potential customers might very well look there first. This of course isn't true for all kinds of software, and developer tools in general aren't the kind of applications most users will download from the Windows store, but a presence there can't hurt. Now, this Windows Store hosts two kinds of applications: 'Metro-style' applications and 'Desktop' applications. The 'Metro-style' applications are applications created for the new 'Metro' UI which is present on Windows 8 desktop and Windows RT (the single color/big font fingerpaint-oriented UI). 'Desktop' applications are the applications we all run and use on Windows today. Our software are desktop applications. The Windows Store hosts all Metro-style applications locally in the store and handles the payment for these applications. This means you upload your application (sorry, 'app') to the store, jump through a lot of hoops, Microsoft verifies that your application is not violating a tremendous long list of rules and after everything is OK, it's published and hopefully you get customers and thus earn money. Money which Microsoft will pay you on a regular basis after customers buy your application. Desktop applications are not following this path however. Desktop applications aren't hosted by the Windows Store. Instead, the Windows Store more or less hosts a page with the application's information and where to get the goods. I.o.w.: it's nothing more than a product's Facebook page. Microsoft will simply redirect a visitor of the Windows Store to your website and the visitor will then use your site's system to purchase and download the application. This last bit of information is very important. So, this morning I started with fresh energy to register our company 'Solutions Design bv' at the Windows Store and our two applications, LLBLGen Pro and ORM Profiler. First I went to the Windows Store dashboard page. If you don't have an account, you have to log in or sign up if you don't have a live account. I signed in with my live account. After that, it greeted me with a page where I had to fill in a code which was mailed to me. My local mail server polls every several minutes for email so I had to kick it to get it immediately. I grabbed the code from the email and I was presented with a multi-step process to register myself as a company or as an individual. In red I was warned that this choice was permanent and not changeable. I chuckled: Microsoft apparently stores its data on paper, not in digital form. I chose 'company' and was presented with a lengthy form to fill out. On the form there were two strange remarks: Per company there can just be 1 (one, uno, not zero, not two or more) registered developer, and only that developer is able to upload stuff to the store. I have no idea how this works with large companies, oh the overhead nightmares... "Sorry, but John, our registered developer with the Windows Store is on holiday for 3 months, backpacking through Australia, no, he's not reachable at this point. M'yeah, sorry bud. Hey, did you fill in those TPS reports yesterday?" A separate Approver has to be specified, which has to be a different person than the registered developer. Apparently to Microsoft a company with just 1 person is not a company. Luckily we're with two people! *pfew*, dodged that one, otherwise I would be stuck forever: the choice I already made was not reversible! After I had filled out the form and it was all well and good and accepted by the Microsoft lackey who had to write it all down in some paper notebook ("Hey, be warned! It's a permanent choice! Written down in ink, can't be changed!"), I was presented with the question how I wanted to pay for all this. "Pay for what?" I wondered. Must be the paper they were scribbling the information on, I concluded. After all, there's a financial crisis going on! How could I forget! Silly me. "Ok fair enough". The price was 75 Euros, not the end of the world. I could only pay by credit card, so it was accepted quickly. Or so I thought. You see, Microsoft has a different idea about CC payments. In the normal world, you type in your CC number, some date, a name and a security code and that's it. But Microsoft wants to verify this even more. They want to make a verification purchase of a very small amount and are doing that with a special code in the description. You then have to type in that code in a special form in the Windows Store dashboard and after that you're verified. Of course they'll refund the small amount they pull from your card. Sounds simple, right? Well... no. The problem starts with the fact that I can't see the CC activity on some website: I have a bank issued CC card. I get the CC activity once a month on a piece of paper sent to me. The bank's online website doesn't show them. So it's possible I have to wait for this code till October 12th. One month. "So what, I'm not going to use it anyway, Desktop applications don't use the payment system", I thought. "Haha, you're so naive, dear developer!" Microsoft won't allow you to publish any applications till this verification is done. So no application publishing for a month. Wouldn't it be nice if things were, you know, digital, so things got done instantly? But of course, that lackey who scribbled everything in the Big Windows Store Registration Book isn't that quick. Can't blame him though. He's just doing his job. Now, after the payment was done, I was presented with a page which tells me Microsoft is going to use a third party company called 'Symantec', which will verify my identity again. The page explains to me that this could be done through email or phone and that they'll contact the Approver to verify my identity. "Phone?", I thought... that's a little drastic for a developer account to publish a single page of information about an external hosted software product, isn't it? On Facebook I just added a page, done. And paying you, Microsoft, took less information: you were happy to take my money before my identity was even 'verified' by this 3rd party's minions! "Double standards!", I roared. No-one cared. But it's the thought of getting it off your chest, you know. Luckily for me, everyone at Symantec was asleep when I was registering so they went for the fallback option in case phone calls were not possible: my Approver received an email. Imagine you have to explain the idiot web of security theater I was caught in to someone else who then has to reply a random person over the internet that I indeed was who I said I was. As she's a true sweetheart, she gave me the benefit of the doubt and assured that for now, I was who I said I was. Remember, this is for a desktop application, which is only a link to a website, some pictures and a piece of text. No file hosting, no payment processing, nothing, just a single page. Yeah, I also thought I was crazy. But we're not at the end of this quest yet. I clicked around in the confusing menus of the Windows Store dashboard and found the 'Desktop' section. I get a helpful screen with a warning in red that it can't find any certified 'apps'. True, I'm just getting started, buddy. I see a link: "Check the Windows apps you submitted for certification". Well, I haven't submitted anything, but let's see where it brings me. Oh the thrill of adventure! I click the link and I end up on this site: the hardware/desktop dashboard account registration. "Erm... but I just registered...", I mumbled to no-one in particular. Apparently for desktop registration / verification I have to register again, it tells me. But not only that, the desktop application has to be signed with a certificate. And not just some random el-cheapo certificate you can get at any mall's discount store. No, this certificate is special. It's precious. This certificate, the 'Microsoft Authenticode' Digital Certificate, is the only certificate that's acceptable, and jolly, it can be purchased from VeriSign for the price of only ... $99.-, but be quick, because this is a limited time offer! After that it's, I kid you not, $499.-. 500 dollars for a certificate to sign an executable. But, I do feel special, I got a special price. Only for me! I'm glowing. Not for long though. Here I started to wonder, what the benefit of it all was. I now again had to pay money for a shiny certificate which will add 'Solutions Design bv' to our installer as the publisher instead of 'unknown', while our customers download the file from our website. Not only that, but this was all about a Desktop application, which wasn't hosted by Microsoft. They only link to it. And make no mistake. These prices aren't single payments. Every year these have to be renewed. Like a membership of an exclusive club: you're special and privileged, but only if you cough up the dough. To give you an example how silly this all is: I added LLBLGen Pro and ORM Profiler to the Visual Studio Gallery some time ago. It's the same thing: it's a central place where one can find software which adds to / extends / works with Visual Studio. I could simply create the pages, add the information and they show up inside Visual Studio. No files are hosted at Microsoft, they're downloaded from our website. Exactly the same system. As I have to wait for the CC transcripts to arrive anyway, I can't proceed with publishing in this new shiny store. After the verification is complete I have to wait for verification of my software by Microsoft. Even Desktop applications need to be verified using a long list of rules which are mainly focused on Metro-style applications. Even while they're not hosted by Microsoft. I wonder what they'll find. "Your application wasn't approved. It violates rule 14 X sub D: it provides more value than our own competing framework". While I was writing this post, I tried to check something in the Windows Store Dashboard, to see whether I remembered it correctly. I was presented again with the question, after logging in with my live account, to enter the code that was just mailed to me. Not the previous code, a brand new one. Again I had to kick my mail server to pull the email to proceed. This was it. This 'experience' is so beyond miserable, I'm afraid I have to say goodbye for now to the 'Windows Store'. It's simply not worth my time. Now, about live accounts. You might know this: live accounts are tied to everything you do with Microsoft. So if you have an MSDN subscription, e.g. the one which costs over $5000.-, it's tied to this same live account. But the fun thing is, you can login with your live account to the MSDN subscriptions with just the account id and password. No additional code is mailed to you. While it gives you access to all Microsoft software available, including your licenses. Why the draconian security theater with this Windows Store, while all I want is to publish some desktop applications while on other Microsoft sites it's OK to simply sign in with your live account: no codes needed, no verification and no certificates? Microsoft, one thing you need with this store and that's: apps. Apps, apps, apps, apps, aaaaaaaaapps. Sorry, my bad, got carried away. I just can't stand the word 'app'. This store's shelves have to be filled to the brim with goods. But instead of being welcomed into the store with open arms, I have to fight an uphill battle with an endless list of rules and bullshit to earn the privilege to publish in this shiny store. As if I have to be thrilled to be one of the exclusive club called 'Windows Store Publishers'. As if Microsoft doesn't want it to succeed. Craig Stuntz sent me a link to an old blog post of his regarding code signing and uploading to Microsoft's old mobile store from back in the WinMo5 days: http://blogs.teamb.com/craigstuntz/2006/10/11/28357/. Good read and good background info about how little things changed over the years. I hope this helps Microsoft make things more clearer and smoother and also helps ISVs with their decision whether to go with the Windows Store scheme or ignore it. For now, I don't see the advantage of publishing there, especially not with the nonsense rules Microsoft cooked up. Perhaps it changes in the future, who knows.

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Software Engineering undergraduate project ideas

- by Nasser Hajloo

There was a similar post at << Computer science undergraduate project ideas << Ideas for Software Engineering Thesis Project << Senior computer engineering project ideas ? << Final Year Project(Software Engineering) Idea So I read all of them and my answer wasn't fit to those. Actually I'm looking for some ideas which 1 - Help me extend a functionality of Open source software (like creating a usefull add-in 2 - Let me Create a Scientific Paper (ideas to publish a scientific paper) 3 - Or Create a Unique an usefull application from the scratch , (like performance tool, profiler, analyzers and other similar tools) I know C# - Asp.net and sql So with all these conditions what do you think is better to do? let me know your ideas whatever those are. any idea appriciated.

Read the article
What is the number of the masked code ee@?

- by Jack

This question was asked in the interview of amdocs in the Unix section of the question paper.

Read the article
Prevent dragging in Joint JS

- by user3607705

I am working on JointJS API. However I want to prevent the elements from being movable from their original positions. Can you suggest me some feature of JointJS or any feature of CSS in general, which I could use to make my object immovable. I can't use interactive: false option on the paper or paper.$el.css('pointer-events', 'none'); because I need to have highlighting features when mouse hovers over the element. Please suggest a way that disables movement of elements while allowing other features.

Read the article
GOTO still considered harmful?

- by Kyle Cronin

Everyone is aware of Dijkstra's Letters to the editor: go to statement considered harmful (also here .html transcript and here .pdf) and there has been a formidable push since that time to eschew the goto statement whenever possible. While it's possible to use goto to produce unmaintainable, sprawling code, it nevertheless remains in modern programming languages. Even the advanced continuation control structure in Scheme can be described as a sophisticated goto. What circumstances warrant the use of goto? When is it best to avoid? As a followup question: C provides a pair of functions, setjmp and longjmp, that provide the ability to goto not just within the current stack frame but within any of the calling frames. Should these be considered as dangerous as goto? More dangerous? Dijkstra himself regretted that title, of which he was not responsible for. At the end of EWD1308 (also here .pdf) he wrote: Finally a short story for the record. In 1968, the Communications of the ACM published a text of mine under the title "The goto statement considered harmful", which in later years would be most frequently referenced, regrettably, however, often by authors who had seen no more of it than its title, which became a cornerstone of my fame by becoming a template: we would see all sorts of articles under the title "X considered harmful" for almost any X, including one titled "Dijkstra considered harmful". But what had happened? I had submitted a paper under the title "A case against the goto statement", which, in order to speed up its publication, the editor had changed into a "letter to the Editor", and in the process he had given it a new title of his own invention! The editor was Niklaus Wirth. A well thought out classic paper about this topic, to be matched to that of Dijkstra, is Structured Programming with go to Statements (also here .pdf), by Donald E. Knuth. Reading both helps to reestablish context and a non-dogmatic understanding of the subject. In this paper, Dijkstra's opinion on this case is reported and is even more strong: Donald E. Knuth: I believe that by presenting such a view I am not in fact disagreeing sharply with Dijkstra's ideas, since he recently wrote the following: "Please don't fall into the trap of believing that I am terribly dogmatical about [the go to statement]. I have the uncomfortable feeling that others are making a religion out of it, as if the conceptual problems of programming could be solved by a single trick, by a simple form of coding discipline!"

Read the article
Removing New line character in Fields PHP

- by Aruna

Hi, i am trying to upload an excel file and to store its contents in the Mysql database. i am having a problem in saving the contents.. like My csv file is in the form of "1","aruna","IEEE paper" "2","nisha","JOurnal magazine" actually i am having 2 records and i am using the code <?php $string = file_get_contents( $_FILES["file"]["tmp_name"] ); //echo $string; foreach ( explode( "\n", $string ) as $userString ) { echo $userString; } ? since in the Csv record there is a new line inserted in between IEEE and paper it is dispaying me as 3 records.. How to remove this new line code wise and to modify the code so that only the new line between the records 1 and 2 is considered... Pls help me....

Read the article
Fluent Nhibernate Automap convention for not-null field

- by user215015

Hi, Could some one help, how would I instruct automap to have not-null for a cloumn? public class Paper : Entity { public Paper() { } [DomainSignature] [NotNull, NotEmpty] public virtual string ReferenceNumber { get; set; } [NotNull] public virtual Int32 SessionWeek { get; set; } } But I am getting the following: <column name="SessionWeek"/> I know it can be done using fluent-map. but i would like to know it in auto-mapping way. Many thanks. Regards Robie

Read the article
HMM for perspective estimation in document image, can't understand the algorithm

- by maximus

Hello! Here is a paper, it is about estimating the perspective of binary image containing text and some noise or non text objects. PDF document The algorithm uses the Hidden Markov Model: actually two conditions T - text B - backgrouond (i.e. noise) It is hard to understand the algorithm itself. The question is that I've read about Hidden Markov Models and I know that it uses probabilities that must be known. But in this algorithm I can't understand, if they use HMM, how do they get those probabilities (probability of changing the state from S1 to another state for example S2)? I didn't find anything about training there also in that paper. So, if somebody understands it, please tell me. Also is it possible to use HMM without knowing the state change probabilities?

Read the article
What are logical and path queries

- by NomeN

I'm reading a paper which mentions that a language for refactoring has three specific requirements. functional features (like ML) logical queries (like Datalog) path queries (like Datalog) I know what they mean by functional features, but I'm not totally clear on the latter two and can't find a clear explanation either. Although I have a good idea after what I could find on the subjects, I need to be sure so here goes: Could the SO-community please clearly explain to me what logical queries and path queries are? Or at the very least what the people from the paper meant?

Read the article
DataGridRow Cells property

- by Michal Krawiec

I would like to get to DataGridRow Cells property. It's a table of cells in a current DataGrid. But I cannot get access direct from code nor by Reflection: var x = dataGridRow.GetType().GetProperty("Cells") //returns null Is there any way to get this table? And related question - in Watch window (VS2008) regular properties have an icon of a hand pointing on a sheet of paper. But DataGridRow.Cells has an icon of a hand pointing on a sheet of paper with a little yellow envelope in a left bottom corner - what does it mean? Thanks for replies.

Read the article

< Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20 | Next Page >