ux matters - Page 73 - Developer IT

jQuery not refreshing tabs content in IE

- by iddimu

Hi all! I have a page that is using jQuery tabs. Within one of my tabs I have a div that contains a form (initially hidden) that I want to use to add content to the tab. What I have works perfectly in Chrome, Firefox, and Safari. But, in IE 7 the tab will not refresh. The post works and the data gets added to the database, but it simply will not show the new content after submitting it. I don't think it matters - but, just for information I am using the Codeigniter PHP framework as well. Here is my javascript: <script type="text/javascript"> $(document).ready(function(){ // initialize the addChild form as hidden until user requests it open $('#addChild').hide(); // open the form $('#openDialog').click( function(){ $('#addChild').slideToggle(); return false; }); // close the form $('#closeDialog').click( function(){ $('#addChild').slideToggle(); return false; }); // submit the form $('#frmAddChild').submit( function(){ $('#addChild').slideToggle(); $.ajax({ url: '/children/add', type: 'POST', data: $('#frmAddChild').serialize() //cache: false }); //reload the children tab $('#tabs').tabs('load',3); return false; }); }); </script> And, here is my PHP/HTML: <?php // initialize the elements of the form $frmAddChild = array( 'name' => 'frmAddChild', 'id' => 'frmAddChild', 'method' => 'post' ); $child_name = array( 'name' => 'child_name', 'id' => 'child_name', ); $child_dob = array( 'name' => 'child_dob', 'id' => 'child_dob' ); $btnOpenDialog = array( 'name' => 'openDialog', 'id' => 'openDialog', 'value' => 'true', 'content' => 'Add Child' ); $btnCloseDialog = array( 'name' => 'closeDialog', 'id' => 'closeDialog', 'value' => 'true', 'content' => 'Cancel' ); // button that shows the drop down to add echo form_button($btnOpenDialog); ?> <div id="addChild" title="Add Child"> <?php echo form_open('children/add/',$frmAddChild); ?> <table> <tr> <td> <?php echo form_label('Child\'s Name', 'child_name'); ?>: </td> <td> <?php echo form_input($child_name); ?> </td> </tr> <tr> <td> <?php echo form_label('Date of Birth','child_dob'); ?>: </td> <td> <?php echo form_input($child_dob); ?> </td> </tr> <tr> <td colspan="2" align="right"> <?php echo form_submit('submit', 'Add'); ?> <?php echo form_button($btnCloseDialog); ?> </td> </tr> </table> <?php echo form_close(); ?> </div> Does anyone have any ideas how I can get this working correctly in IE? Also, if anyone has any comments about how I have things structured, please let me know. I'm new to Codeigniter and I am by no means a javascript or jQuery expert. Thanks for your help!

Read the article

The broken Promise of the Mobile Web

- by Rick Strahl

High end mobile devices have been with us now for almost 7 years and they have utterly transformed the way we access information. Mobile phones and smartphones that have access to the Internet and host smart applications are in the hands of a large percentage of the population of the world. In many places even very remote, cell phones and even smart phones are a common sight. I’ll never forget when I was in India in 2011 I was up in the Southern Indian mountains riding an elephant out of a tiny local village, with an elephant herder in front riding atop of the elephant in front of us. He was dressed in traditional garb with the loin wrap and head cloth/turban as did quite a few of the locals in this small out of the way and not so touristy village. So we’re slowly trundling along in the forest and he’s lazily using his stick to guide the elephant and… 10 minutes in he pulls out his cell phone from his sash and starts texting. In the middle of texting a huge pig jumps out from the side of the trail and he takes a picture running across our path in the jungle! So yeah, mobile technology is very pervasive and it’s reached into even very buried and unexpected parts of this world. Apps are still King Apps currently rule the roost when it comes to mobile devices and the applications that run on them. If there’s something that you need on your mobile device your first step usually is to look for an app, not use your browser. But native app development remains a pain in the butt, with the requirement to have to support 2 or 3 completely separate platforms. There are solutions that try to bridge that gap. Xamarin is on a tear at the moment, providing their cross-device toolkit to build applications using C#. While Xamarin tools are impressive – and also *very* expensive – they only address part of the development madness that is app development. There are still specific device integration isssues, dealing with the different developer programs, security and certificate setups and all that other noise that surrounds app development. There’s also PhoneGap/Cordova which provides a hybrid solution that involves creating local HTML/CSS/JavaScript based applications, and then packaging them to run in a specialized App container that can run on most mobile device platforms using a WebView interface. This allows for using of HTML technology, but it also still requires all the set up, configuration of APIs, security keys and certification and submission and deployment process just like native applications – you actually lose many of the benefits that Web based apps bring. The big selling point of Cordova is that you get to use HTML have the ability to build your UI once for all platforms and run across all of them – but the rest of the app process remains in place. Apps can be a big pain to create and manage especially when we are talking about specialized or vertical business applications that aren’t geared at the mainstream market and that don’t fit the ‘store’ model. If you’re building a small intra department application you don’t want to deal with multiple device platforms and certification etc. for various public or corporate app stores. That model is simply not a good fit both from the development and deployment perspective. Even for commercial, big ticket apps, HTML as a UI platform offers many advantages over native, from write-once run-anywhere, to remote maintenance, single point of management and failure to having full control over the application as opposed to have the app store overloads censor you. In a lot of ways Web based HTML/CSS/JavaScript applications have so much potential for building better solutions based on existing Web technologies for the very same reasons a lot of content years ago moved off the desktop to the Web. To me the Web as a mobile platform makes perfect sense, but the reality of today’s Mobile Web unfortunately looks a little different… Where’s the Love for the Mobile Web? Yet here we are in the middle of 2014, nearly 7 years after the first iPhone was released and brought the promise of rich interactive information at your fingertips, and yet we still don’t really have a solid mobile Web platform. I know what you’re thinking: “But we have lots of HTML/JavaScript/CSS features that allows us to build nice mobile interfaces”. I agree to a point – it’s actually quite possible to build nice looking, rich and capable Web UI today. We have media queries to deal with varied display sizes, CSS transforms for smooth animations and transitions, tons of CSS improvements in CSS 3 that facilitate rich layout, a host of APIs geared towards mobile device features and lately even a number of JavaScript framework choices that facilitate development of multi-screen apps in a consistent manner. Personally I’ve been working a lot with AngularJs and heavily modified Bootstrap themes to build mobile first UIs and that’s been working very well to provide highly usable and attractive UI for typical mobile business applications. From the pure UI perspective things actually look very good. Not just about the UI But it’s not just about the UI - it’s also about integration with the mobile device. When it comes to putting all those pieces together into what amounts to a consolidated platform to build mobile Web applications, I think we still have a ways to go… there are a lot of missing pieces to make it all work together and integrate with the device more smoothly, and more importantly to make it work uniformly across the majority of devices. I think there are a number of reasons for this. Slow Standards Adoption HTML standards implementations and ratification has been dreadfully slow, and browser vendors all seem to pick and choose different pieces of the technology they implement. The end result is that we have a capable UI platform that’s missing some of the infrastructure pieces to make it whole on mobile devices. There’s lots of potential but what is lacking that final 10% to build truly compelling mobile applications that can compete favorably with native applications. Some of it is the fragmentation of browsers and the slow evolution of the mobile specific HTML APIs. A host of mobile standards exist but many of the standards are in the early review stage and they have been there stuck for long periods of time and seem to move at a glacial pace. Browser vendors seem even slower to implement them, and for good reason – non-ratified standards mean that implementations may change and vendor implementations tend to be experimental and likely have to be changed later. Neither Vendors or developers are not keen on changing standards. This is the typical chicken and egg scenario, but without some forward momentum from some party we end up stuck in the mud. It seems that either the standards bodies or the vendors need to carry the torch forward and that doesn’t seem to be happening quickly enough. Mobile Device Integration just isn’t good enough Current standards are not far reaching enough to address a number of the use case scenarios necessary for many mobile applications. While not every application needs to have access to all mobile device features, almost every mobile application could benefit from some integration with other parts of the mobile device platform. Integration with GPS, phone, media, messaging, notifications, linking and contacts system are benefits that are unique to mobile applications and could be widely used, but are mostly (with the exception of GPS) inaccessible for Web based applications today. Unfortunately trying to do most of this today only with a mobile Web browser is a losing battle. Aside from PhoneGap/Cordova’s app centric model with its own custom API accessing mobile device features and the token exception of the GeoLocation API, most device integration features are not widely supported by the current crop of mobile browsers. For example there’s no usable messaging API that allows access to SMS or contacts from HTML. Even obvious components like the Media Capture API are only implemented partially by mobile devices. There are alternatives and workarounds for some of these interfaces by using browser specific code, but that’s might ugly and something that I thought we were trying to leave behind with newer browser standards. But it’s not quite working out that way. It’s utterly perplexing to me that mobile standards like Media Capture and Streams, Media Gallery Access, Responsive Images, Messaging API, Contacts Manager API have only minimal or no traction at all today. Keep in mind we’ve had mobile browsers for nearly 7 years now, and yet we still have to think about how to get access to an image from the image gallery or the camera on some devices? Heck Windows Phone IE Mobile just gained the ability to upload images recently in the Windows 8.1 Update – that’s feature that HTML has had for 20 years! These are simple concepts and common problems that should have been solved a long time ago. It’s extremely frustrating to see build 90% of a mobile Web app with relative ease and then hit a brick wall for the remaining 10%, which often can be show stoppers. The remaining 10% have to do with platform integration, browser differences and working around the limitations that browsers and ‘pinned’ applications impose on HTML applications. The maddening part is that these limitations seem arbitrary as they could easily work on all mobile platforms. For example, SMS has a URL Moniker interface that sort of works on Android, works badly with iOS (only works if the address is already in the contact list) and not at all on Windows Phone. There’s no reason this shouldn’t work universally using the same interface – after all all phones have supported SMS since before the year 2000! But, it doesn’t have to be this way Change can happen very quickly. Take the GeoLocation API for example. Geolocation has taken off at the very beginning of the mobile device era and today it works well, provides the necessary security (a big concern for many mobile APIs), and is supported by just about all major mobile and even desktop browsers today. It handles security concerns via prompts to avoid unwanted access which is a model that would work for most other device APIs in a similar fashion. One time approval and occasional re-approval if code changes or caches expire. Simple and only slightly intrusive. It all works well, even though GeoLocation actually has some physical limitations, such as representing the current location when no GPS device is present. Yet this is a solved problem, where other APIs that are conceptually much simpler to implement have failed to gain any traction at all. Technically none of these APIs should be a problem to implement, but it appears that the momentum is just not there. Inadequate Web Application Linking and Activation Another important piece of the puzzle missing is the integration of HTML based Web applications. Today HTML based applications are not first class citizens on mobile operating systems. When talking about HTML based content there’s a big difference between content and applications. Content is great for search engine discovery and plain browser usage. Content is usually accessed intermittently and permanent linking is not so critical for this type of content. But applications have different needs. Applications need to be started up quickly and must be easily switchable to support a multi-tasking user workflow. Therefore, it’s pretty crucial that mobile Web apps are integrated into the underlying mobile OS and work with the standard task management features. Unfortunately this integration is not as smooth as it should be. It starts with actually trying to find mobile Web applications, to ‘installing’ them onto a phone in an easily accessible manner in a prominent position. The experience of discovering a Mobile Web ‘App’ and making it sticky is by no means as easy or satisfying. Today the way you’d go about this is: Open the browser Search for a Web Site in the browser with your search engine of choice Hope that you find the right site Hope that you actually find a site that works for your mobile device Click on the link and run the app in a fully chrome’d browser instance (read tiny surface area) Pin the app to the home screen (with all the limitations outline above) Hope you pointed at the right URL when you pinned Even for you and me as developers, there are a few steps in there that are painful and annoying, but think about the average user. First figuring out how to search for a specific site or URL? And then pinning the app and hopefully from the right location? You’ve probably lost more than half of your audience at that point. This experience sucks. For developers too this process is painful since app developers can’t control the shortcut creation directly. This problem often gets solved by crazy coding schemes, with annoying pop-ups that try to get people to create shortcuts via fancy animations that are both annoying and add overhead to each and every application that implements this sort of thing differently. And that’s not the end of it - getting the link onto the home screen with an application icon varies quite a bit between browsers. Apple’s non-standard meta tags are prominent and they work with iOS and Android (only more recent versions), but not on Windows Phone. Windows Phone instead requires you to create an actual screen or rather a partial screen be captured for a shortcut in the tile manager. Who had that brilliant idea I wonder? Surprisingly Chrome on recent Android versions seems to actually get it right – icons use pngs, pinning is easy and pinned applications properly behave like standalone apps and retain the browser’s active page state and content. Each of the platforms has a different way to specify icons (WP doesn’t allow you to use an icon image at all), and the most widely used interface in use today is a bunch of Apple specific meta tags that other browsers choose to support. The question is: Why is there no standard implementation for installing shortcuts across mobile platforms using an official format rather than a proprietary one? Then there’s iOS and the crazy way it treats home screen linked URLs using a crazy hybrid format that is neither as capable as a Web app running in Safari nor a WebView hosted application. Moving off the Web ‘app’ link when switching to another app actually causes the browser and preview it to ‘blank out’ the Web application in the Task View (see screenshot on the right). Then, when the ‘app’ is reactivated it ends up completely restarting the browser with the original link. This is crazy behavior that you can’t easily work around. In some situations you might be able to store the application state and restore it using LocalStorage, but for many scenarios that involve complex data sources (like say Google Maps) that’s not a possibility. The only reason for this screwed up behavior I can think of is that it is deliberate to make Web apps a pain in the butt to use and forcing users trough the App Store/PhoneGap/Cordova route. App linking and management is a very basic problem – something that we essentially have solved in every desktop browser – yet on mobile devices where it arguably matters a lot more to have easy access to web content we have to jump through hoops to have even a remotely decent linking/activation experience across browsers. Where’s the Money? It’s not surprising that device home screen integration and Mobile Web support in general is in such dismal shape – the mobile OS vendors benefit financially from App store sales and have little to gain from Web based applications that bypass the App store and the cash cow that it presents. On top of that, platform specific vendor lock-in of both end users and developers who have invested in hardware, apps and consumables is something that mobile platform vendors actually aspire to. Web based interfaces that are cross-platform are the anti-thesis of that and so again it’s no surprise that the mobile Web is on a struggling path. But – that may be changing. More and more we’re seeing operations shifting to services that are subscription based or otherwise collect money for usage, and that may drive more progress into the Web direction in the end . Nothing like the almighty dollar to drive innovation forward. Do we need a Mobile Web App Store? As much as I dislike moderated experiences in today’s massive App Stores, they do at least provide one single place to look for apps for your device. I think we could really use some sort of registry, that could provide something akin to an app store for mobile Web apps, to make it easier to actually find mobile applications. This could take the form of a specialized search engine, or maybe a more formal store/registry like structure. Something like apt-get/chocolatey for Web apps. It could be curated and provide at least some feedback and reviews that might help with the integrity of applications. Coupled to that could be a native application on each platform that would allow searching and browsing of the registry and then also handle installation in the form of providing the home screen linking, plus maybe an initial security configuration that determines what features are allowed access to for the app. I’m not holding my breath. In order for this sort of thing to take off and gain widespread appeal, a lot of coordination would be required. And in order to get enough traction it would have to come from a well known entity – a mobile Web app store from a no name source is unlikely to gain high enough usage numbers to make a difference. In a way this would eliminate some of the freedom of the Web, but of course this would also be an optional search path in addition to the standard open Web search mechanisms to find and access content today. Security Security is a big deal, and one of the perceived reasons why so many IT professionals appear to be willing to go back to the walled garden of deployed apps is that Apps are perceived as safe due to the official review and curation of the App stores. Curated stores are supposed to protect you from malware, illegal and misleading content. It doesn’t always work out that way and all the major vendors have had issues with security and the review process at some time or another. Security is critical, but I also think that Web applications in general pose less of a security threat than native applications, by nature of the sandboxed browser and JavaScript environments. Web applications run externally completely and in the HTML and JavaScript sandboxes, with only a very few controlled APIs allowing access to device specific features. And as discussed earlier – security for any device interaction can be granted the same for mobile applications through a Web browser, as they can for native applications either via explicit policies loaded from the Web, or via prompting as GeoLocation does today. Security is important, but it’s certainly solvable problem for Web applications even those that need to access device hardware. Security shouldn’t be a reason for Web apps to be an equal player in mobile applications. Apps are winning, but haven’t we been here before? So now we’re finding ourselves back in an era of installed app, rather than Web based and managed apps. Only it’s even worse today than with Desktop applications, in that the apps are going through a gatekeeper that charges a toll and censors what you can and can’t do in your apps. Frankly it’s a mystery to me why anybody would buy into this model and why it’s lasted this long when we’ve already been through this process. It’s crazy… It’s really a shame that this regression is happening. We have the technology to make mobile Web apps much more prominent, but yet we’re basically held back by what seems little more than bureaucracy, partisan bickering and self interest of the major parties involved. Back in the day of the desktop it was Internet Explorer’s 98+% market shareholding back the Web from improvements for many years – now it’s the combined mobile OS market in control of the mobile browsers. If mobile Web apps were allowed to be treated the same as native apps with simple ways to install and run them consistently and persistently, that would go a long way to making mobile applications much more usable and seriously viable alternatives to native apps. But as it is mobile apps have a severe disadvantage in placement and operation. There are a few bright spots in all of this. Mozilla’s FireFoxOs is embracing the Web for it’s mobile OS by essentially building every app out of HTML and JavaScript based content. It supports both packaged and certified package modes (that can be put into the app store), and Open Web apps that are loaded and run completely off the Web and can also cache locally for offline operation using a manifest. Open Web apps are treated as full class citizens in FireFoxOS and run using the same mechanism as installed apps. Unfortunately FireFoxOs is getting a slow start with minimal device support and specifically targeting the low end market. We can hope that this approach will change and catch on with other vendors, but that’s also an uphill battle given the conflict of interest with platform lock in that it represents. Recent versions of Android also seem to be working reasonably well with mobile application integration onto the desktop and activation out of the box. Although it still uses the Apple meta tags to find icons and behavior settings, everything at least works as you would expect – icons to the desktop on pinning, WebView based full screen activation, and reliable application persistence as the browser/app is treated like a real application. Hopefully iOS will at some point provide this same level of rudimentary Web app support. What’s also interesting to me is that Microsoft hasn’t picked up on the obvious need for a solid Web App platform. Being a distant third in the mobile OS war, Microsoft certainly has nothing to lose and everything to gain by using fresh ideas and expanding into areas that the other major vendors are neglecting. But instead Microsoft is trying to beat the market leaders at their own game, fighting on their adversary’s terms instead of taking a new tack. Providing a kick ass mobile Web platform that takes the lead on some of the proposed mobile APIs would be something positive that Microsoft could do to improve its miserable position in the mobile device market. Where are we at with Mobile Web? It sure sounds like I’m really down on the Mobile Web, right? I’ve built a number of mobile apps in the last year and while overall result and response has been very positive to what we were able to accomplish in terms of UI, getting that final 10% that required device integration dialed was an absolute nightmare on every single one of them. Big compromises had to be made and some features were left out or had to be modified for some devices. In two cases we opted to go the Cordova route in order to get the integration we needed, along with the extra pain involved in that process. Unless you’re not integrating with device features and you don’t care deeply about a smooth integration with the mobile desktop, mobile Web development is fraught with frustration. So, yes I’m frustrated! But it’s not for lack of wanting the mobile Web to succeed. I am still a firm believer that we will eventually arrive a much more functional mobile Web platform that allows access to the most common device features in a sensible way. It wouldn't be difficult for device platform vendors to make Web based applications first class citizens on mobile devices. But unfortunately it looks like it will still be some time before this happens. So, what’s your experience building mobile Web apps? Are you finding similar issues? Just giving up on raw Web applications and building PhoneGap apps instead? Completely skipping the Web and going native? Leave a comment for discussion. Resources Rick Strahl on DotNet Rocks talking about Mobile Web© Rick Strahl, West Wind Technologies, 2005-2014Posted in HTML5 Mobile Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article

Configuring Oracle iPlanet WebServer / Oracle Traffic Director to use crypto accelerators on T4-1 servers

- by mv

Configuring Oracle iPlanet Web Server / Oracle Traffic Director to use crypto accelerators on T4-1 servers Jyri had written a technical article on Configuring Solaris Cryptographic Framework and Sun Java System Web Server 7 on Systems With UltraSPARC T1 Processors. I tried to find out what has changed since then in T4. I have used a T4-1 SPARC system with Solaris 10. Results slightly vary for Solaris 11. For Solaris 11, the T4 optimization was implemented in libsoftcrypto.so while it was in pkcs11_softtoken_extra.so for Solaris 10. Overview of T4 processors is here in this blog. Many thanx to Chi-Chang Lin and Julien for their help. 1. Install Oracle iPlanet Web Server / Oracle Traffic Director. Go to instance/config directory. # cd /opt/oracle/webserver7/https-hostname.fqdn/config 2. List default PKCS#11 Modules # ../../bin/modutil -dbdir . -listListing of PKCS #11 Modules-----------------------------------------------------------1. NSS Internal PKCS #11 Moduleslots: 2 slots attachedstatus: loadedslot: NSS Internal Cryptographic Servicestoken: NSS Generic Crypto Servicesslot: NSS User Private Key and Certificate Servicestoken: NSS Certificate DB2. Root Certslibrary name: libnssckbi.soslots: 1 slot attachedstatus: loadedslot: NSS Builtin Objectstoken: Builtin Object Token----------------------------------------------------------- 3. Initialize the soft token data store in the $HOME/.sunw/pkcs11_softtoken/ directory # pktool setpin keystore=pkcs11Enter token passphrase: olderpasswordCreate new passphrase: passwordRe-enter new passphrase: passwordPassphrase changed. 4. Offload crypto operations to Solaris Crypto Framework on T4 $ ../../bin/modutil -dbdir . -nocertdb -add SCF -libfile /usr/lib/libpkcs11.so -mechanisms RSA:AES:SHA1:MD5 Module "SCF" added to database. Note that -nocertdb means modutil won't try to open the NSS softoken key database. It doesn't even have to be present. PKCS#11 library used is /usr/lib/libpkcs11.so. If the server is running in 64 bit mode, we have to use /usr/lib/64/libpkcs11.so Unlike T1 and T2, in T4 we do not have to disable mechanisms in softtoken provider using cryptoadm. 5. List again to check that a new module SCF is added # ../../bin/modutil -dbdir . -list Listing of PKCS #11 Modules-----------------------------------------------------------1. NSS Internal PKCS #11 Moduleslots: 2 slots attachedstatus: loadedslot: NSS Internal Cryptographic Servicestoken: NSS Generic Crypto Servicesslot: NSS User Private Key and Certificate Servicestoken: NSS Certificate DB2. SCFlibrary name: /usr/lib/libpkcs11.soslots: 2 slots attachedstatus: loadedslot: Sun Metaslottoken: Sun Metaslotslot: n2rng/0 SUNW_N2_Random_Number_Generator token: n2rng/0 SUNW_N2_RNG 3. Root Certs library name: libnssckbi.so slots: 1 slot attached status: loaded slot: NSS Builtin Objects token: Builtin Object Token----------------------------------------------------------- 6. Create certificate in “Sun Metaslot” : I have used certutil, but you must use Admin Server CLI / GUI # ../../bin/certutil -S -x -n "Server-Cert" -t "CT,CT,CT" -s "CN=*.fqdn" -d . -h "Sun Metaslot"Enter Password or Pin for "Sun Metaslot": password 7. Verify that the certificate is created properly in “Sun Metslaot” # ../../bin/certutil -L -d . -h "Sun Metaslot"Certificate Nickname Trust AttributesSSL,S/MIME,JAR/XPIEnter Password or Pin for "Sun Metaslot": passwordSun Metaslot:Server-Cert CTu,Cu,Cu# 8. Associate this newly created certificate to http listener using Admin CLI/GUI. After that server.xml should have <http-listener> ... <ssl> <server-cert-nickname>Sun Metaslot:Server-Cert</server-cert-nicknamer> </ssl> Note the prefix "Sun Metaslot" 9. Disable PKCS#11 bypass To use the accelerated AES algorithm, turn off PKCS#11 bypass, and configure modutil to have the AES mechanism go to the Metaslot. After you disable PKCS#11 bypasss using Admin GUI/CLI, check that server.xml should have <server> .... <pkcs11> <enabled>1</enabled> <allow-bypass>0</allow-bypass> </pkcs11> With PKCS#11 bypass enabled, Oracle iPlanet Web Server will only use the RSA capability of the T4, provided certificate and key are stored in the T4 slot (Metaslot). Actually, the RSA op is never bypassed in NSS, it's always done with PKCS#11 calls. So the bypass settings won't affect the behavior of the probes for RSA at all. The only thing that matters if where the RSA key and certificate live, ie. which PKCS#11 token, and thus which PKCS#11 module gets called to do the work. If your certificate/key are in the NSS certificate/key db, you will see libsoftokn3/libfreebl libraries doing the RSA work. If they are in the Sun Metaslot, it should be the Solaris code. 10. Start the server instance # ../bin/startserv Oracle iPlanet Web Server 7.0.16 B09/14/2012 03:33Please enter the PIN for the "Sun Metaslot" token: password...info: HTTP3072: http-listener-1: https://hostname.fqdn:80 ready to accept requestsinfo: CORE3274: successful server startup 11. Figure out which process to run this DTrace script on # ps -eaf | grep webservd | grep -v dogwebservd 18224 18223 0 13:17:25 ? 0:07 webservd -d /opt/oracle/webserver7/https-hostname.fqdn/config -r /opt/root 18225 18224 0 13:17:25 ? 0:00 webservd -d /opt/oracle/webserver7/https-hostname.fqdn/config -r /opt/ (For Oracle Traffic Director look for process named "trafficd") We see that the child process id is “18225” 12. Clients for testing : You can use any browser. I used NSS tool tstclnt for testing $cat > req.txtGET /index.html HTTP/1.0 For checking both RSA and AES, I used cipher “:0035” which is TLS_RSA_WITH_AES_256_CBC_SHA $./tstclnt -h hostname -p 80 -d . -T -f -o -v -c “:0035” < req.txt 13. How do I make sure that crypto accelerator is being used 13.1 Create DTrace script The following D script should be able to uncover whether T4-specific crypto routine are being called or not. It also displays stats per second. # cat > t4crypto.d#!/usr/sbin/dtrace -spid$target::*rsa*:entry,pid$target::*yf*:entry{ @ops[probemod, probefunc] = count();}tick-1sec{ printa(@ops); trunc(@ops);} Invoke with './t4crypto.d -p <pid> ' 13.2 EXPECTED PROBES FOR Solaris 10 : If offloading to T4 HW are correctly set up, the expected DTrace output would have these probes and libraries library Operations PROBES pkcs11_softtoken_extra.so RSA soft_decrypt_rsa_pkcs_decode, soft_encrypt_rsa_pkcs_encode soft_rsa_crypt_init_common soft_rsa_decrypt, soft_rsa_encrypt soft_rsa_decrypt_common, soft_rsa_encrypt_common AES yf_aes_instructions_present yf_aes_expand256, yf_aes256_cbc_decrypt, yf_aes256_cbc_encrypt, yf_aes256_load_keys_for_decrypt, yf_aes256_load_keys_for_encrypt, Note that these are for 256, same for 128, 192... these are for cbc, same for ecb, ctr, cfb128... DES yf_des_expand, yf_des_instructions_present yf_des_encrypt libmd_psr.so MD5 yf_md5_multiblock, yf_md5_instruction_present SHA1 yf_sha1_instruction_present, yf_sha1_multibloc 13.3 SAMPLE OUTPUT FOR CIPHER TLS_RSA_WITH_AES_256_CBC_SHA (0x0035) ON T4 SPARC SOLARIS 10 WITHOUT PKCS#11 BYPASS # ./t4crypto.d -p 18225 pkcs11_softtoken_extra.so.1 soft_decrypt_rsa_pkcs_decode 1 pkcs11_softtoken_extra.so.1 soft_rsa_crypt_init_common 1 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt 1 pkcs11_softtoken_extra.so.1 big_mp_mul_yf 2 pkcs11_softtoken_extra.so.1 mpm_yf_mpmul 2 pkcs11_softtoken_extra.so.1 mpmul_arr_yf 2 pkcs11_softtoken_extra.so.1 rijndael_key_setup_enc_yf 2 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt_common 2 pkcs11_softtoken_extra.so.1 yf_aes_expand256 2 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_decrypt 3 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_decrypt 3 pkcs11_softtoken_extra.so.1 big_mont_mul_yf 6 pkcs11_softtoken_extra.so.1 mm_yf_montmul 6 pkcs11_softtoken_extra.so.1 yf_des_instructions_present 6 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_encrypt 8 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_encrypt 8 pkcs11_softtoken_extra.so.1 yf_mpmul_present 8 pkcs11_softtoken_extra.so.1 yf_aes_instructions_present 13 pkcs11_softtoken_extra.so.1 yf_des_encrypt 18 libmd_psr.so.1 yf_md5_multiblock 41 libmd_psr.so.1 yf_md5_instruction_present 72 libmd_psr.so.1 yf_sha1_instruction_present 82 libmd_psr.so.1 yf_sha1_multiblock 82 This indicates that both RSA and AES ops are done in Solaris Crypto Framework. 13.4 SAMPLE OUTPUT FOR CIPHER TLS_RSA_WITH_AES_256_CBC_SHA (0x0035) ON T4 SPARC SOLARIS 10 WITH PKCS#11 BYPASS # ./t4crypto.d -p 18225 pkcs11_softtoken_extra.so.1 soft_decrypt_rsa_pkcs_decode 1 pkcs11_softtoken_extra.so.1 soft_rsa_crypt_init_common 1 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt 1 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt_common 1 pkcs11_softtoken_extra.so.1 big_mp_mul_yf 2 pkcs11_softtoken_extra.so.1 mpm_yf_mpmul 2 pkcs11_softtoken_extra.so.1 mpmul_arr_yf 2 pkcs11_softtoken_extra.so.1 big_mont_mul_yf 6 pkcs11_softtoken_extra.so.1 mm_yf_montmul 6 pkcs11_softtoken_extra.so.1 yf_mpmul_present 8 For this cipher, when I enable PKCS#11 bypass, Only RSA probes are being hit AES probes are not being hit. 13.5 ustack() for RSA operations / probefunc == "soft_rsa_decrypt" / Shows that libnss3.so is calling C_* functions of libpkcs11.so which is calling functions of pkcs11_softtoken_extra.so for both cases with and without bypass. When PKCS#11 bypass is disabled (allow-bypass is 0) pkcs11_softtoken_extra.so.1`soft_rsa_decrypt pkcs11_softtoken_extra.so.1`soft_rsa_decrypt_common+0x94 pkcs11_softtoken_extra.so.1`soft_unwrapkey+0x258 pkcs11_softtoken_extra.so.1`C_UnwrapKey+0x1ec libpkcs11.so.1`meta_unwrap_key+0x17c libpkcs11.so.1`meta_UnwrapKey+0xc4 libpkcs11.so.1`C_UnwrapKey+0xfc libnss3.so`pk11_AnyUnwrapKey+0x6b8 libnss3.so`PK11_PubUnwrapSymKey+0x8c libssl3.so`ssl3_HandleRSAClientKeyExchange+0x1a0 libssl3.so`ssl3_HandleClientKeyExchange+0x154 libssl3.so`ssl3_HandleHandshakeMessage+0x440 libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec libssl3.so`ssl_SecureRecv+0x1c8 libssl3.so`ssl_Recv+0x9c libns-httpd40.so`__1cNDaemonSessionDrun6M_v_+0x2dc When PKCS#11 bypass is enabled (allow-bypass is 1) pkcs11_softtoken_extra.so.1`soft_rsa_decrypt pkcs11_softtoken_extra.so.1`soft_rsa_decrypt_common+0x94 pkcs11_softtoken_extra.so.1`C_Decrypt+0x164 libpkcs11.so.1`meta_do_operation+0x27c libpkcs11.so.1`meta_Decrypt+0x4c libpkcs11.so.1`C_Decrypt+0xcc libnss3.so`PK11_PrivDecryptPKCS1+0x1ac libssl3.so`ssl3_HandleRSAClientKeyExchange+0xe4 libssl3.so`ssl3_HandleClientKeyExchange+0x154 libssl3.so`ssl3_HandleHandshakeMessage+0x440 libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec libssl3.so`ssl_SecureRecv+0x1c8 libssl3.so`ssl_Recv+0x9c libns-httpd40.so`__1cNDaemonSessionDrun6M_v_+0x2dc libnsprwrap.so`ThreadMain+0x1c libnspr4.so`_pt_root+0xe8 13.6 ustack() FOR AES operations / probefunc == "yf_aes256_cbc_encrypt" / When PKCS#11 bypass is disabled (allow-bypass is 0) pkcs11_softtoken_extra.so.1`yf_aes256_cbc_encrypt pkcs11_softtoken_extra.so.1`aes_block_process_contiguous_whole_blocks+0xb4 pkcs11_softtoken_extra.so.1`aes_crypt_contiguous_blocks+0x1cc pkcs11_softtoken_extra.so.1`soft_aes_encrypt_common+0x22c pkcs11_softtoken_extra.so.1`C_EncryptUpdate+0x10c libpkcs11.so.1`meta_do_operation+0x1fc libpkcs11.so.1`meta_EncryptUpdate+0x4c libpkcs11.so.1`C_EncryptUpdate+0xcc libnss3.so`PK11_CipherOp+0x1a0 libssl3.so`ssl3_CompressMACEncryptRecord+0x264 libssl3.so`ssl3_SendRecord+0x300 libssl3.so`ssl3_FlushHandshake+0x54 libssl3.so`ssl3_SendFinished+0x1fc libssl3.so`ssl3_HandleFinished+0x314 libssl3.so`ssl3_HandleHandshakeMessage+0x4ac libssl3.so`ssl3_HandleHandshake+0x11c libssl3.so`ssl3_HandleRecord+0x5e8 libssl3.so`ssl3_GatherCompleteHandshake+0x5c libssl3.so`ssl_GatherRecord1stHandshake+0x30 libssl3.so`ssl_Do1stHandshake+0xec Shows that libnss3.so is calling C_* functions of libpkcs11.so which is calling functions of pkcs11_softtoken_extra.so However when PKCS#11 bypass is disabled (allow-bypass is 1) this stack isn't getting called. 14. LIST OF ALL THE PROBES MATCHED BY D SCRIPT FOR REFERENCE # ./t4crypto.d -p 18225 -l ID PROVIDER MODULE FUNCTION NAME ... 55720 pid18225 libmd_psr.so.1 yf_md5_instruction_present entry 55721 pid18225 libmd_psr.so.1 yf_sha256_instruction_present entry 55722 pid18225 libmd_psr.so.1 yf_sha512_instruction_present entry 55723 pid18225 libmd_psr.so.1 yf_sha1_instruction_present entry 55724 pid18225 libmd_psr.so.1 yf_sha256 entry 55725 pid18225 libmd_psr.so.1 yf_sha256_multiblock entry 55726 pid18225 libmd_psr.so.1 yf_sha512 entry 55727 pid18225 libmd_psr.so.1 yf_sha512_multiblock entry 55728 pid18225 libmd_psr.so.1 yf_sha1 entry 55729 pid18225 libmd_psr.so.1 yf_sha1_multiblock entry 55730 pid18225 libmd_psr.so.1 yf_md5 entry 55731 pid18225 libmd_psr.so.1 yf_md5_multiblock entry 55732 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_instructions_present entry 55733 pid18225 pkcs11_softtoken_extra.so.1 rijndael_key_setup_enc_yf entry 55734 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand128 entry 55735 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt128 entry 55736 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt128 entry 55737 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand192 entry 55738 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt192 entry 55739 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt192 entry 55740 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_expand256 entry 55741 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_encrypt256 entry 55742 pid18225 pkcs11_softtoken_extra.so.1 yf_aes_decrypt256 entry 55743 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_load_keys_for_encrypt entry 55744 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_load_keys_for_encrypt entry 55745 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_encrypt entry 55746 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ecb_encrypt entry 55747 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ecb_encrypt entry 55748 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ecb_encrypt entry 55749 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cbc_encrypt entry 55750 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cbc_encrypt entry 55751 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_encrypt entry 55752 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ctr_crypt entry 55753 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ctr_crypt entry 55754 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ctr_crypt entry 55755 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cfb128_encrypt entry 55756 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cfb128_encrypt entry 55757 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cfb128_encrypt entry 55758 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_load_keys_for_decrypt entry 55759 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_load_keys_for_decrypt entry 55760 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_load_keys_for_decrypt entry 55761 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_ecb_decrypt entry 55762 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_ecb_decrypt entry 55763 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_ecb_decrypt entry 55764 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cbc_decrypt entry 55765 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cbc_decrypt entry 55766 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cbc_decrypt entry 55767 pid18225 pkcs11_softtoken_extra.so.1 yf_aes128_cfb128_decrypt entry 55768 pid18225 pkcs11_softtoken_extra.so.1 yf_aes192_cfb128_decrypt entry 55769 pid18225 pkcs11_softtoken_extra.so.1 yf_aes256_cfb128_decrypt entry 55771 pid18225 pkcs11_softtoken_extra.so.1 yf_des_instructions_present entry 55772 pid18225 pkcs11_softtoken_extra.so.1 yf_des_expand entry 55773 pid18225 pkcs11_softtoken_extra.so.1 yf_des_encrypt entry 55774 pid18225 pkcs11_softtoken_extra.so.1 yf_mpmul_present entry 55775 pid18225 pkcs11_softtoken_extra.so.1 yf_montmul_present entry 55776 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_montmul entry 55777 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_montsqr entry 55778 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_restore_func entry 55779 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_ret_from_mont_func entry 55780 pid18225 pkcs11_softtoken_extra.so.1 mm_yf_execute_slp entry 55781 pid18225 pkcs11_softtoken_extra.so.1 big_modexp_ncp_yf entry 55782 pid18225 pkcs11_softtoken_extra.so.1 big_mont_mul_yf entry 55783 pid18225 pkcs11_softtoken_extra.so.1 mpmul_arr_yf entry 55784 pid18225 pkcs11_softtoken_extra.so.1 big_mp_mul_yf entry 55785 pid18225 pkcs11_softtoken_extra.so.1 mpm_yf_mpmul entry 55786 pid18225 libns-httpd40.so nsapi_rsa_set_priv_fn entry ... 55795 pid18225 libnss3.so prepare_rsa_priv_key_export_for_asn1 entry 55796 pid18225 libresolv.so.2 sunw_dst_rsaref_init entry 55797 pid18225 libnssutil3.so NSS_Get_SEC_UniversalStringTemplate entry ... 55813 pid18225 libsoftokn3.so prepare_low_rsa_priv_key_for_asn1 entry 55814 pid18225 libsoftokn3.so rsa_FormatOneBlock entry 55815 pid18225 libsoftokn3.so rsa_FormatBlock entry 55816 pid18225 libnssdbm3.so lg_prepare_low_rsa_priv_key_for_asn1 entry 55817 pid18225 libfreebl_32fpu_3.so rsa_build_from_primes entry 55818 pid18225 libfreebl_32fpu_3.so rsa_is_prime entry 55819 pid18225 libfreebl_32fpu_3.so rsa_get_primes_from_exponents entry 55820 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpNoCRT entry 55821 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpCRTNoCheck entry 55822 pid18225 libfreebl_32fpu_3.so rsa_PrivateKeyOpCRTCheckedPubKey entry 55823 pid18225 pkcs11_kernel.so.1 key_gen_rsa_by_value entry 55824 pid18225 pkcs11_kernel.so.1 get_rsa_private_key entry 55825 pid18225 pkcs11_kernel.so.1 get_rsa_public_key entry 55826 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_encrypt entry 55827 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt entry 55828 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_crypt_init_common entry 55829 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_encrypt_common entry 55830 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_decrypt_common entry 55831 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_sign_verify_init_common entry 55832 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_sign_common entry 55833 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_verify_common entry 55834 pid18225 pkcs11_softtoken_extra.so.1 generate_rsa_key entry 55835 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_genkey_pair entry 55836 pid18225 pkcs11_softtoken_extra.so.1 get_rsa_sha1_prefix entry 55837 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_digest_sign_common entry 55838 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_digest_verify_common entry 55839 pid18225 pkcs11_softtoken_extra.so.1 soft_rsa_verify_recover entry 55840 pid18225 pkcs11_softtoken_extra.so.1 rsa_pri_to_asn1 entry 55841 pid18225 pkcs11_softtoken_extra.so.1 asn1_to_rsa_pri entry 55842 pid18225 pkcs11_softtoken_extra.so.1 soft_encrypt_rsa_pkcs_encode entry 55843 pid18225 pkcs11_softtoken_extra.so.1 soft_decrypt_rsa_pkcs_decode entry 55844 pid18225 pkcs11_softtoken_extra.so.1 soft_sign_rsa_pkcs_encode entry 55845 pid18225 pkcs11_softtoken_extra.so.1 soft_verify_rsa_pkcs_decode entry 55770 profile tick-1sec

Read the article

Wireless access point -> Powerline -> Router -> Internet, should this work?

- by Anthony

My network at home used to be a laptop and desktop connected wirelessly to a single Wireless ADSL router, a Cisco 877W. Wireless reception around the house with this setup was quite unreliable, so I've gone about looking to improve it. I purchased some Belkin Gigabit powerline adapters and I've got these working fine. I can hook a computer up to one of the powerline adapters, and with the other one plugged into the ADSL router the computer has internet access. Additionally I can hook a Netgear DG834G Wireless ADSL router into it with the adsl not plugged in, and after turning off DHCP can RJ45 a computer up to the network. Everything works fine. However, if I setup a wireless network on the Netgear then any computer that connects wirelessly to it cannot access the internet. It gets an IP address very slowly via DHCP which is a good one, but it cannot access the internet. It can however communicate with the RJ45'd computer also connected to the Netgear. I wondered whether this could be a problem with the Netgear so I've borrowed a Cisco Aironet 1200 and got this working fine when it's attached directly to the primary ADSL router. I can connect to it wireless and get onto the internet. However, if I then plug it into the Netgear I can communicate with other devices attached to the Netgear, but can't get any further than the Netgear. All the while though the other devices RJ45'd to the Netgear are communicating with the internet just fine. I'm starting to suspect it's one of two things causing the problem: 1) For some reason the belkin powerline adapters don't like carrying wireless-originating signals. Could this be possible? 2) The primary Cisco ADSL router doesn't want to communicate with other devices on my network more than one hop away from it. I'm making an assumption here that within the Netgear box the wireless and wired sides are handled differently. Could this be true? Has anyone successfully setup something similar to what I'm trying, with a wireless device on the otherside of a pair of powerline connectors? Update 06/07/2010 - Response to irrational John 28 June Thanks for the answer John - and for clearing up some of my questions. The model number of the belkin powerline adapters are F5D4076. Security was apparently enabled by default on them, and I didn't change them from their default setting. The network diagram in your answer shows exactly what I'm trying to setup: I've followed that guide and I'm still not able to get things working properly. The thing that perplexes me is that wired network traffic works just fine - it's only the wireless traffic that doesn't. This is with the same laptop, and the same DHCP or static IPs. "1. What IP addresses did you assign to each router? What subnet masks are you using?" - subnet is 255.255.255.0, the router connected to the adsl is 192.168.153.1 and that has the DHCP server. The access point on the other side of the powerline adapters I've tried both a static IP of 192.168.153.110, same subnet, and a DHCP-assigned IP. The other devices are DHCP, although I also tried manually entering IP settings. "2. Have you correctly enabled DHCP on only one of the routers and disabled it on all the others?" Yes I have - only the internet-connected router has DHCP enabled. The IP range for the DHCP is from 192.168.153.11 - 192.168.153.200. The strange thing is that wired connections work fine on the LAN, plugged into any router, work fine - it's only the wireless connections that aren't working when they're plugged into the non-primary AP. "Since the routers you are using appear to integrate an ADSL modem I'm assuming there is no WAN port on them." There's no NAT within the LAN, and all wired connections are connected to LAN ports. It's something wrong with the wireless - wired works fine throughout the whole LAN. Update 06/07/2010 - Response to irrational John 29 June The diagram you've drawn in your answer shows pretty much exactly what I'm trying to do. I've spent another evening trying different things and made some progress but I'm still scratching my head. I've borrowed a Netgear access point and been trying with this, and the strange thing is that my PC is working now - this is a Windows 7 PC connected to the access point in the position of where the DG834G is in the diagram. Meanwhile, however, I have an old Powerbook G4 12" I use for music, and while that has a DHCP-assigned IP address, it's not getting any network throughput to either LAN or internet addresses. To make matters more strange, my phone appears to be intermittently working when it's on the wifi. The access point is a Netgear WPN802v1, DHCP, NAT both switched off, running firmware 2.0.9.0. Last night I set it up with exactly the same settings, and similar to tonight I could get a couple of devices to work, and a couple not to. By the morning, however, everything had stopped working - nothing could get a DHCP IP address. I rebooted the 877W earlier this evening and I'm wondering whether this is why a few things are working now. "Could it be possible that the issue could be with the 877W?" I didn't configure this - is it possible that the DHCP server only likes assigning devices that are immediately attached to it? Or similar, could a firewall be stopping too many addresses that are coming through one device? (ie. the Access Point) This could explain why devices are working at the start but then not by the end. In reply to your questions, "1. I looked at the Netgear DG834G support page. There are five versions of this router. Which version do you have? Netgear usually lists this on the label on the bottom of the router. What version of the firmware does it have?" It's a DG834Gv3, and the firmware is the last on the netgear site version 4.01.40. "3. Not knowing which version you have, I glanced at the reference manual for the DG834G v3. In the section for Wireless Settings under the subsection Wireless Access Point there is a check box for a Wireless Isolation setting. If you have this setting it should be off/unchecked. If it is checked then any device connected via wireless would not be able to talk to any other device on the LAN. This sounds like your problem so maybe this is the cause?" I've checked this and it's switched off. I've made a change to the IP of the access point to something outside the DHCP range - it's now 192.158.153.5, with DHCP starting at 11 and going up to 254. Thanks for the tip about this - I only have a few devices so wouldn't anticipate the DHCP server assigning up to 110, but better safe than sorry. Finally one more thing I thought I should add, is with the Powerbook G4 that's not working - it's getting a DHCP IP address and it can communicate with the WPN802 as I can visit the administration page. Anything further than this, however, it can't reach; I can't administrate the 192.168.153.1 (877W router). Strangely, however, when I open Finder on the same powerbook it's detecting my NAS which is attached directly via wire to the 877W. If I try to browse it, it says connection failed. RE: "Perhaps the problem with your Powerbook is with DNS?.." The IP settings on the powerbook are identical to that of the PC with the exception of the IP address; the PC is 192.168.153.17 and the powerbook is 192.168.153.12. Subnets are the same, 255.255.255.0 and default gateway is the same, .1, and the DNS servers are the same. I administrate the 877W by going to 192.168.153.1 in the browser. This is what isn't working from the Powerbook, despite the PC working fine when I do the same. Meanwhile, however, I can administrate the AP on 192.168.153.5 from both PC and Powerbook Update 06/07/2010 - FINAL RESOLUTION of sorts: First off, sorry for the length of this question. I need start to practice a more concise writing style, so I'm going to try to keep this bit brief. After much fiddling, and with the hugely-appreciated help of irrational John, I have come to the conclusion that it's something wrong with the powerbook. I believe that this was perhaps the reason I doubted things worked at the very beginning. I now have the original DG834Gv3 running both wirelessly and wired, and both wired devices and wireless devices get internet connectivity. The only anomaly is the powerbook which I've had to keep wired, as no matter what I do it refuses to work wirelessly. I still have suspicions that the 877W isn't quite right; I'm fairly sure that if I RJ45 the powerline adapter into a different LAN port on it then everything will break. I've just about run out of patience to test this further, and I think I need to go into the 877W's config to match the 877w's lan port's settings. I'm accepting irrational John's answer as he's been enormously helpful, way above the call of duty, and for this line he wrote: Beats the heck out of me. which in the midst of great frustration made me chuckle, and for a sentence in one of his comments to the same answer: If it is specific to the Powerbook I would put that issue aside until after you feel you have the rest of your LAN and the additional WAP all working together correctlyt It was this second sentence that made me put the powerbook aside and concentrate on the other devices that ultimately led me to getting things working.

Read the article

Why does C qicksort function implementation works much slower (tape comparations, tape swapping) than bobble sort function?

- by Artur Mustafin

I'm going to implement a toy tape "mainframe" for a students, showing the quickness of "quicksort" class functions (recursive or not, does not really matters, due to the slow hardware, and well known stack reversal techniques) comparatively to the "bubblesort" function class, so, while I'm clear about the hardware implementation ans controllers, i guessed that quicksort function is much faster that other ones in terms of sequence, order and comparation distance (it is much faster to rewind the tape from the middle than from the very end, because of different speed of rewind). Unfortunately, this is not the true, this simple "bubble" code shows great improvements comparatively to the "quicksort" functions in terms of comparison distances, direction and number of comparisons and writes. So I have 3 questions: Does I have mistaken in my implememtation of quicksort function? Does I have mistaken in my implememtation of bubblesoft function? If not, why the "bubblesort" function is works much faster in (comparison and write operations) than "quicksort" function? I already have a "quicksort" function: void quicksort(float *a, long l, long r, const compare_function& compare) { long i=l, j=r, temp, m=(l+r)/2; if (l == r) return; if (l == r-1) { if (compare(a, l, r)) { swap(a, l, r); } return; } if (l < r-1) { while (1) { i = l; j = r; while (i < m && !compare(a, i, m)) i++; while (m < j && !compare(a, m, j)) j--; if (i >= j) { break; } swap(a, i, j); } if (l < m) quicksort(a, l, m, compare); if (m < r) quicksort(a, m, r, compare); return; } } and the kind of my own implememtation of the "bubblesort" function: void bubblesort(float *a, long l, long r, const compare_function& compare) { long i, j, k; if (l == r) { return; } if (l == r-1) { if (compare(a, l, r)) { swap(a, l, r); } return; } if (l < r-1) { while(l < r) { i = l; j = l; while (i < r) { i++; if (!compare(a, j, i)) { continue; } j = i; } if (l < j) { swap(a, l, j); } l++; i = r; k = r; while(l < i) { i--; if (!compare(a, i, k)) { continue; } k = i; } if (k < r) { swap(a, k, r); } r--; } return; } } I have used this sort functions in a test sample code, like this: #include <stdio.h> #include <stdlib.h> #include <math.h> #include <conio.h> long swap_count; long compare_count; typedef long (*compare_function)(float *, long, long ); typedef void (*sort_function)(float *, long , long , const compare_function& ); void init(float *, long ); void print(float *, long ); void sort(float *, long, const sort_function& ); void swap(float *a, long l, long r); long less(float *a, long l, long r); long greater(float *a, long l, long r); void bubblesort(float *, long , long , const compare_function& ); void quicksort(float *, long , long , const compare_function& ); void main() { int n; printf("n="); scanf("%d",&n); printf("\r\n"); long i; float *a = (float *)malloc(n*n*sizeof(float)); sort(a, n, &bubblesort); print(a, n); sort(a, n, &quicksort); print(a, n); free(a); } long less(float *a, long l, long r) { compare_count++; return *(a+l) < *(a+r) ? 1 : 0; } long greater(float *a, long l, long r) { compare_count++; return *(a+l) > *(a+r) ? 1 : 0; } void swap(float *a, long l, long r) { swap_count++; float temp; temp = *(a+l); *(a+l) = *(a+r); *(a+r) = temp; } float tg(float x) { return tan(x); } float ctg(float x) { return 1.0/tan(x); } void init(float *m,long n) { long i,j; for (i = 0; i < n; i++) { for (j=0; j< n; j++) { m[i + j*n] = tg(0.2*(i+1)) + ctg(0.3*(j+1)); } } } void print(float *m, long n) { long i, j; for(i = 0; i < n; i++) { for(j = 0; j < n; j++) { printf(" %5.1f", m[i + j*n]); } printf("\r\n"); } printf("\r\n"); } void sort(float *a, long n, const sort_function& sort) { long i, sort_compare = 0, sort_swap = 0; init(a,n); for(i = 0; i < n*n; i+=n) { if (fmod (i / n, 2) == 0) { compare_count = 0; swap_count = 0; sort(a, i, i+n-1, &less); if (swap_count == 0) { compare_count = 0; sort(a, i, i+n-1, &greater); } sort_compare += compare_count; sort_swap += swap_count; } } printf("compare=%ld\r\n", sort_compare); printf("swap=%ld\r\n", sort_swap); printf("\r\n"); }

Read the article

What’s New in ASP.NET 4.0 Part Two: WebForms and Visual Studio Enhancements

- by Rick Strahl

In the last installment I talked about the core changes in the ASP.NET runtime that I’ve been taking advantage of. In this column, I’ll cover the changes to the Web Forms engine and some of the cool improvements in Visual Studio that make Web and general development easier. WebForms The WebForms engine is the area that has received most significant changes in ASP.NET 4.0. Probably the most widely anticipated features are related to managing page client ids and of ViewState on WebForm pages. Take Control of Your ClientIDs Unique ClientID generation in ASP.NET has been one of the most complained about “features” in ASP.NET. Although there’s a very good technical reason for these unique generated ids - they guarantee unique ids for each and every server control on a page - these unique and generated ids often get in the way of client-side JavaScript development and CSS styling as it’s often inconvenient and fragile to work with the long, generated ClientIDs. In ASP.NET 4.0 you can now specify an explicit client id mode on each control or each naming container parent control to control how client ids are generated. By default, ASP.NET generates mangled client ids for any control contained in a naming container (like a Master Page, or a User Control for example). The key to ClientID management in ASP.NET 4.0 are the new ClientIDMode and ClientIDRowSuffix properties. ClientIDMode supports four different ClientID generation settings shown below. For the following examples, imagine that you have a Textbox control named txtName inside of a master page control container on a WebForms page. <%@Page Language="C#" MasterPageFile="~/Site.Master" CodeBehind="WebForm2.aspx.cs" Inherits="WebApplication1.WebForm2" %> <asp:Content ID="content" ContentPlaceHolderID="content" runat="server" ClientIDMode="Static" > <asp:TextBox runat="server" ID="txtName" /> </asp:Content> The four available ClientIDMode values are: AutoID This is the existing behavior in ASP.NET 1.x-3.x where full naming container munging takes place. <input name="ctl00$content$txtName" type="text" id="ctl00_content_txtName" /> This should be familiar to any ASP.NET developer and results in fairly unpredictable client ids that can easily change if the containership hierarchy changes. For example, removing the master page changes the name in this case, so if you were to move a block of script code that works against the control to a non-Master page, the script code immediately breaks. Static This option is the most deterministic setting that forces the control’s ClientID to use its ID value directly. No naming container naming at all is applied and you end up with clean client ids: <input name="ctl00$content$txtName" type="text" id="txtName" /> Note that the name property which is used for postback variables to the server still is munged, but the ClientID property is displayed simply as the ID value that you have assigned to the control. This option is what most of us want to use, but you have to be clear on that because it can potentially cause conflicts with other controls on the page. If there are several instances of the same naming container (several instances of the same user control for example) there can easily be a client id naming conflict. Note that if you assign Static to a data-bound control, like a list child control in templates, you do not get unique ids either, so for list controls where you rely on unique id for child controls, you’ll probably want to use Predictable rather than Static. I’ll write more on this a little later when I discuss ClientIDRowSuffix. Predictable The previous two values are pretty self-explanatory. Predictable however, requires some explanation. To me at least it’s not in the least bit predictable. MSDN defines this value as follows: This algorithm is used for controls that are in data-bound controls. The ClientID value is generated by concatenating the ClientID value of the parent naming container with the ID value of the control. If the control is a data-bound control that generates multiple rows, the value of the data field specified in the ClientIDRowSuffix property is added at the end. For the GridView control, multiple data fields can be specified. If the ClientIDRowSuffix property is blank, a sequential number is added at the end instead of a data-field value. Each segment is separated by an underscore character (_). The key that makes this value a bit confusing is that it relies on the parent NamingContainer’s ClientID to build its own ClientID value. This effectively means that the value is not predictable at all but rather very tightly coupled to the parent naming container’s ClientIDMode setting. For my simple textbox example, if the ClientIDMode property of the parent naming container (Page in this case) is set to “Predictable” you’ll get this: <input name="ctl00$content$txtName" type="text" id="content_txtName" /> which gives an id that based on walking up to the currently active naming container (the MasterPage content container) and starting the id formatting from there downward. Think of this as a semi unique name that’s guaranteed unique only for the naming container. If, on the other hand, the Page is set to “AutoID” you get the following with Predictable on txtName: <input name="ctl00$content$txtName" type="text" id="ctl00_content_txtName" /> The latter is effectively the same as if you specified AutoID because it inherits the AutoID naming from the Page and Content Master Page control of the page. But again - predictable behavior always depends on the parent naming container and how it generates its id, so the id may not always be exactly the same as the AutoID generated value because somewhere in the NamingContainer chain the ClientIDMode setting may be set to a different value. For example, if you had another naming container in the middle that was set to Static you’d end up effectively with an id that starts with the NamingContainers id rather than the whole ctl000_content munging. The most common use for Predictable is likely to be for data-bound controls, which results in each data bound item getting a unique ClientID. Unfortunately, even here the behavior can be very unpredictable depending on which data-bound control you use - I found significant differences in how template controls in a GridView behave from those that are used in a ListView control. For example, GridView creates clean child ClientIDs, while ListView still has a naming container in the ClientID, presumably because of the template container on which you can’t set ClientIDMode. Predictable is useful, but only if all naming containers down the chain use this setting. Otherwise you’re right back to the munged ids that are pretty unpredictable. Another property, ClientIDRowSuffix, can be used in combination with ClientIDMode of Predictable to force a suffix onto list client controls. For example: <asp:GridView runat="server" ID="gvItems" AutoGenerateColumns="false" ClientIDMode="Static" ClientIDRowSuffix="Id"> <Columns> <asp:TemplateField> <ItemTemplate> <asp:Label runat="server" id="txtName" Text='<%# Eval("Name") %>' ClientIDMode="Predictable"/> </ItemTemplate> </asp:TemplateField> <asp:TemplateField> <ItemTemplate> <asp:Label runat="server" id="txtId" Text='<%# Eval("Id") %>' ClientIDMode="Predictable" /> </ItemTemplate> </asp:TemplateField> </Columns> </asp:GridView> generates client Ids inside of a column in the master page described earlier: <td> <span id="txtName_0">Rick</span> </td> where the value after the underscore is the ClientIDRowSuffix field - in this case “Id” of the item data bound to the control. Note that all of the child controls require ClientIDMode=”Predictable” in order for the ClientIDRowSuffix to be applied, and the parent GridView controls need to be set to Static either explicitly or via Naming Container inheritance to give these simple names. It’s a bummer that ClientIDRowSuffix doesn’t work with Static to produce this automatically. Another real problem is that other controls process the ClientIDMode differently. For example, a ListView control processes the Predictable ClientIDMode differently and produces the following with the Static ListView and Predictable child controls: <span id="ctrl0_txtName_0">Rick</span> I couldn’t even figure out a way using ClientIDMode to get a simple ID that also uses a suffix short of falling back to manually generated ids using <%= %> expressions instead. Given the inconsistencies inside of list controls using <%= %>, ids for the ListView might not be a bad idea anyway. Inherit The final setting is Inherit, which is the default for all controls except Page. This means that controls by default inherit the parent naming container’s ClientIDMode setting. For more detailed information on ClientID behavior and different scenarios you can check out a blog post of mine on this subject: http://www.west-wind.com/weblog/posts/54760.aspx. ClientID Enhancements Summary The ClientIDMode property is a welcome addition to ASP.NET 4.0. To me this is probably the most useful WebForms feature as it allows me to generate clean IDs simply by setting ClientIDMode="Static" on either the page or inside of Web.config (in the Pages section) which applies the setting down to the entire page which is my 95% scenario. For the few cases when it matters - for list controls and inside of multi-use user controls or custom server controls) - I can use Predictable or even AutoID to force controls to unique names. For application-level page development, this is easy to accomplish and provides maximum usability for working with client script code against page controls. ViewStateMode Another area of large criticism for WebForms is ViewState. ViewState is used internally by ASP.NET to persist page-level changes to non-postback properties on controls as pages post back to the server. It’s a useful mechanism that works great for the overall mechanics of WebForms, but it can also cause all sorts of overhead for page operation as ViewState can very quickly get out of control and consume huge amounts of bandwidth in your page content. ViewState can also wreak havoc with client-side scripting applications that modify control properties that are tracked by ViewState, which can produce very unpredictable results on a Postback after client-side updates. Over the years in my own development, I’ve often turned off ViewState on pages to reduce overhead. Yes, you lose some functionality, but you can easily implement most of the common functionality in non-ViewState workarounds. Relying less on heavy ViewState controls and sticking with simpler controls or raw HTML constructs avoids getting around ViewState problems. In ASP.NET 3.x and prior, it wasn’t easy to control ViewState - you could turn it on or off and if you turned it off at the page or web.config level, you couldn’t turn it back on for specific controls. In short, it was an all or nothing approach. With ASP.NET 4.0, the new ViewStateMode property gives you more control. It allows you to disable ViewState globally either on the page or web.config level and then turn it back on for specific controls that might need it. ViewStateMode only works when EnableViewState="true" on the page or web.config level (which is the default). You can then use ViewStateMode of Disabled, Enabled or Inherit to control the ViewState settings on the page. If you’re shooting for minimal ViewState usage, the ideal situation is to set ViewStateMode to disabled on the Page or web.config level and only turn it back on particular controls: <%@Page Language="C#" CodeBehind="WebForm2.aspx.cs" Inherits="Westwind.WebStore.WebForm2" ClientIDMode="Static" ViewStateMode="Disabled" EnableViewState="true" %>  <asp:TextBox runat="server" ID="txtName" ViewStateMode="Enabled" />  <asp:TextBox runat="server" ID="txtAddress" /> Note that the EnableViewState="true" at the Page level isn’t required since it’s the default, but it’s important that the value is true. ViewStateMode has no effect if EnableViewState="false" at the page level. The main benefit of ViewStateMode is that it allows you to more easily turn off ViewState for most of the page and enable only a few key controls that might need it. For me personally, this is a perfect combination as most of my WebForm apps can get away without any ViewState at all. But some controls - especially third party controls - often don’t work well without ViewState enabled, and now it’s much easier to selectively enable controls rather than the old way, which required you to pretty much turn off ViewState for all controls that you didn’t want ViewState on. Inline HTML Encoding HTML encoding is an important feature to prevent cross-site scripting attacks in data entered by users on your site. In order to make it easier to create HTML encoded content, ASP.NET 4.0 introduces a new Expression syntax using <%: %> to encode string values. The encoding expression syntax looks like this: <%: "<script type='text/javascript'>" + "alert('Really?');</script>" %> which produces properly encoded HTML: <script type='text/javascript' >alert('Really?');</script> Effectively this is a shortcut to: <%= HttpUtility.HtmlEncode( "<script type='text/javascript'>" + "alert('Really?');</script>") %> Of course the <%: %> syntax can also evaluate expressions just like <%= %> so the more common scenario applies this expression syntax against data your application is displaying. Here’s an example displaying some data model values: <%: Model.Address.Street %> This snippet shows displaying data from your application’s data store or more importantly, from data entered by users. Anything that makes it easier and less verbose to HtmlEncode text is a welcome addition to avoid potential cross-site scripting attacks. Although I listed Inline HTML Encoding here under WebForms, anything that uses the WebForms rendering engine including ASP.NET MVC, benefits from this feature. ScriptManager Enhancements The ASP.NET ScriptManager control in the past has introduced some nice ways to take programmatic and markup control over script loading, but there were a number of shortcomings in this control. The ASP.NET 4.0 ScriptManager has a number of improvements that make it easier to control script loading and addresses a few of the shortcomings that have often kept me from using the control in favor of manual script loading. The first is the AjaxFrameworkMode property which finally lets you suppress loading the ASP.NET AJAX runtime. Disabled doesn’t load any ASP.NET AJAX libraries, but there’s also an Explicit mode that lets you pick and choose the library pieces individually and reduce the footprint of ASP.NET AJAX script included if you are using the library. There’s also a new EnableCdn property that forces any script that has a new WebResource attribute CdnPath property set to a CDN supplied URL. If the script has this Attribute property set to a non-null/empty value and EnableCdn is enabled on the ScriptManager, that script will be served from the specified CdnPath. [assembly: WebResource( "Westwind.Web.Resources.ww.jquery.js", "application/x-javascript", CdnPath = "http://mysite.com/scripts/ww.jquery.min.js")] Cool, but a little too static for my taste since this value can’t be changed at runtime to point at a debug script as needed, for example. Assembly names for loading scripts from resources can now be simple names rather than fully qualified assembly names, which make it less verbose to reference scripts from assemblies loaded from your bin folder or the assembly reference area in web.config: <asp:ScriptManager runat="server" id="Id" EnableCdn="true" AjaxFrameworkMode="disabled"> <Scripts> <asp:ScriptReference Name="Westwind.Web.Resources.ww.jquery.js" Assembly="Westwind.Web" /> </Scripts> </asp:ScriptManager> The ScriptManager in 4.0 also supports script combining via the CompositeScript tag, which allows you to very easily combine scripts into a single script resource served via ASP.NET. Even nicer: You can specify the URL that the combined script is served with. Check out the following script manager markup that combines several static file scripts and a script resource into a single ASP.NET served resource from a static URL (allscripts.js): <asp:ScriptManager runat="server" id="Id" EnableCdn="true" AjaxFrameworkMode="disabled"> <CompositeScript Path="~/scripts/allscripts.js"> <Scripts> <asp:ScriptReference Path="~/scripts/jquery.js" /> <asp:ScriptReference Path="~/scripts/ww.jquery.js" /> <asp:ScriptReference Name="Westwind.Web.Resources.editors.js" Assembly="Westwind.Web" /> </Scripts> </CompositeScript> </asp:ScriptManager> When you render this into HTML, you’ll see a single script reference in the page: <script src="scripts/allscripts.debug.js" type="text/javascript"></script> All you need to do to make this work is ensure that allscripts.js and allscripts.debug.js exist in the scripts folder of your application - they can be empty but the file has to be there. This is pretty cool, but you want to be real careful that you use unique URLs for each combination of scripts you combine or else browser and server caching will easily screw you up royally. The script manager also allows you to override native ASP.NET AJAX scripts now as any script references defined in the Scripts section of the ScriptManager trump internal references. So if you want custom behavior or you want to fix a possible bug in the core libraries that normally are loaded from resources, you can now do this simply by referencing the script resource name in the Name property and pointing at System.Web for the assembly. Not a common scenario, but when you need it, it can come in real handy. Still, there are a number of shortcomings in this control. For one, the ScriptManager and ClientScript APIs still have no common entry point so control developers are still faced with having to check and support both APIs to load scripts so that controls can work on pages that do or don’t have a ScriptManager on the page. The CdnUrl is static and compiled in, which is very restrictive. And finally, there’s still no control over where scripts get loaded on the page - ScriptManager still injects scripts into the middle of the HTML markup rather than in the header or optionally the footer. This, in turn, means there is little control over script loading order, which can be problematic for control developers. MetaDescription, MetaKeywords Page Properties There are also a number of additional Page properties that correspond to some of the other features discussed in this column: ClientIDMode, ClientTarget and ViewStateMode. Another minor but useful feature is that you can now directly access the MetaDescription and MetaKeywords properties on the Page object to set the corresponding meta tags programmatically. Updating these values programmatically previously required either <%= %> expressions in the page markup or dynamic insertion of literal controls into the page. You can now just set these properties programmatically on the Page object in any Control derived class on the page or the Page itself: Page.MetaKeywords = "ASP.NET,4.0,New Features"; Page.MetaDescription = "This article discusses the new features in ASP.NET 4.0"; Note, that there’s no corresponding ASP.NET tag for the HTML Meta element, so the only way to specify these values in markup and access them is via the @Page tag: <%@Page Language="C#" CodeBehind="WebForm2.aspx.cs" Inherits="Westwind.WebStore.WebForm2" ClientIDMode="Static" MetaDescription="Article that discusses what's new in ASP.NET 4.0" MetaKeywords="ASP.NET,4.0,New Features" %> Nothing earth shattering but quite convenient. Visual Studio 2010 Enhancements for Web Development For Web development there are also a host of editor enhancements in Visual Studio 2010. Some of these are not Web specific but they are useful for Web developers in general. Text Editors Throughout Visual Studio 2010, the text editors have all been updated to a new core engine based on WPF which provides some interesting new features for various code editors including the nice ability to zoom in and out with Ctrl-MouseWheel to quickly change the size of text. There are many more API options to control the editor and although Visual Studio 2010 doesn’t yet use many of these features, we can look forward to enhancements in add-ins and future editor updates from the various language teams that take advantage of the visual richness that WPF provides to editing. On the negative side, I’ve noticed that occasionally the code editor and especially the HTML and JavaScript editors will lose the ability to use various navigation keys like arrows, back and delete keys, which requires closing and reopening the documents at times. This issue seems to be well documented so I suspect this will be addressed soon with a hotfix or within the first service pack. Overall though, the code editors work very well, especially given that they were re-written completely using WPF, which was one of my big worries when I first heard about the complete redesign of the editors. Multi-Targeting Visual Studio now targets all versions of the .NET framework from 2.0 forward. You can use Visual Studio 2010 to work on your ASP.NET 2, 3.0 and 3.5 applications which is a nice way to get your feet wet with the new development environment without having to make changes to existing applications. It’s nice to have one tool to work in for all the different versions. Multi-Monitor Support One cool feature of Visual Studio 2010 is the ability to drag windows out of the Visual Studio environment and out onto the desktop including onto another monitor easily. Since Web development often involves working with a host of designers at the same time - visual designer, HTML markup window, code behind and JavaScript editor - it’s really nice to be able to have a little more screen real estate to work on each of these editors. Microsoft made a welcome change in the environment. IntelliSense Snippets for HTML and JavaScript Editors The HTML and JavaScript editors now finally support IntelliSense scripts to create macro-based template expansions that have been in the core C# and Visual Basic code editors since Visual Studio 2005. Snippets allow you to create short XML-based template definitions that can act as static macros or real templates that can have replaceable values that can be embedded into the expanded text. The XML syntax for these snippets is straight forward and it’s pretty easy to create custom snippets manually. You can easily create snippets using XML and store them in your custom snippets folder (C:\Users\rstrahl\Documents\Visual Studio 2010\Code Snippets\Visual Web Developer\My HTML Snippets and My JScript Snippets), but it helps to use one of the third-party tools that exist to simplify the process for you. I use SnippetEditor, by Bill McCarthy, which makes short work of creating snippets interactively (http://snippeteditor.codeplex.com/). Note: You may have to manually add the Visual Studio 2010 User specific Snippet folders to this tool to see existing ones you’ve created. Code snippets are some of the biggest time savers and HTML editing more than anything deals with lots of repetitive tasks that lend themselves to text expansion. Visual Studio 2010 includes a slew of built-in snippets (that you can also customize!) and you can create your own very easily. If you haven’t done so already, I encourage you to spend a little time examining your coding patterns and find the repetitive code that you write and convert it into snippets. I’ve been using CodeRush for this for years, but now you can do much of the basic expansion natively for HTML and JavaScript snippets. jQuery Integration Is Now Native jQuery is a popular JavaScript library and recently Microsoft has recently stated that it will become the primary client-side scripting technology to drive higher level script functionality in various ASP.NET Web projects that Microsoft provides. In Visual Studio 2010, the default full project template includes jQuery as part of a new project including the support files that provide IntelliSense (-vsdoc files). IntelliSense support for jQuery is now also baked into Visual Studio 2010, so unlike Visual Studio 2008 which required a separate download, no further installs are required for a rich IntelliSense experience with jQuery. Summary ASP.NET 4.0 brings many useful improvements to the platform, but thankfully most of the changes are incremental changes that don’t compromise backwards compatibility and they allow developers to ease into the new features one feature at a time. None of the changes in ASP.NET 4.0 or Visual Studio 2010 are monumental or game changers. The bigger features are language and .NET Framework changes that are also optional. This ASP.NET and tools release feels more like fine tuning and getting some long-standing kinks worked out of the platform. It shows that the ASP.NET team is dedicated to paying attention to community feedback and responding with changes to the platform and development environment based on this feedback. If you haven’t gotten your feet wet with ASP.NET 4.0 and Visual Studio 2010, there’s no reason not to give it a shot now - the ASP.NET 4.0 platform is solid and Visual Studio 2010 works very well for a brand new release. Check it out. © Rick Strahl, West Wind Technologies, 2005-2010Posted in ASP.NET

Read the article

Ping "replies" from same computer with 'Destination host unreachable' (no route to other computer)

- by Srekel

I've got two computers in a LAN behind a wireless router. One has XP with ip 192.168.1.2 This one has W7 with ip 192.168.1.7 If I try to ping the other one from this computer, I get this: C:\Users\Srekel>ping 192.168.1.2 Pinging 192.168.1.2 with 32 bytes of data: Reply from 192.168.1.7: Destination host unreachable. Reply from 192.168.1.7: Destination host unreachable. Reply from 192.168.1.7: Destination host unreachable. Reply from 192.168.1.7: Destination host unreachable. Ping statistics for 192.168.1.2: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Tracert gives the same result: C:\Users\Srekel>tracert 192.168.1.2 Tracing route to 192.168.1.2 over a maximum of 30 hops 1 Kakburken4 [192.168.1.7] reports: Destination host unreachable. Trace complete. Although I can ping and tracert the router without any problems. I have disabled the firewalls on both computers. The router is set to use DHCP (if that matters). Here is the output from "route". C:\Users\Srekel>route print =========================================================================== Interface List 13...00 25 86 df c6 89 ......TP-LINK Wireless N Adapter 12...e0 cb 4e 26 b9 84 ......Realtek PCIe GBE Family Controller #2 11...e0 cb 4e 26 be 94 ......Realtek PCIe GBE Family Controller 1...........................Software Loopback Interface 1 16...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2 14...00 00 00 00 00 00 00 e0 Teredo Tunneling Pseudo-Interface =========================================================================== IPv4 Route Table =========================================================================== Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 192.168.1.1 192.168.1.7 20 127.0.0.0 255.0.0.0 On-link 127.0.0.1 306 127.0.0.1 255.255.255.255 On-link 127.0.0.1 306 127.255.255.255 255.255.255.255 On-link 127.0.0.1 306 192.168.1.0 255.255.255.0 On-link 192.168.1.7 276 192.168.1.7 255.255.255.255 On-link 192.168.1.7 276 192.168.1.255 255.255.255.255 On-link 192.168.1.7 276 224.0.0.0 240.0.0.0 On-link 127.0.0.1 306 224.0.0.0 240.0.0.0 On-link 192.168.1.7 276 255.255.255.255 255.255.255.255 On-link 127.0.0.1 306 255.255.255.255 255.255.255.255 On-link 192.168.1.7 276 =========================================================================== Persistent Routes: None IPv6 Route Table =========================================================================== Active Routes: If Metric Network Destination Gateway 14 58 ::/0 On-link 1 306 ::1/128 On-link 14 58 2001::/32 On-link 14 306 2001:0:5ef5:73ba:881:20c1:3f57:fef8/128 On-link 14 306 fe80::/64 On-link 14 306 fe80::881:20c1:3f57:fef8/128 On-link 1 306 ff00::/8 On-link 14 306 ff00::/8 On-link =========================================================================== Persistent Routes: None I've set up and debugged a few networks in my life but I'm not really an advanced network user, so I'm not sure what might be wrong. Any ideas? Oh, and pinging this computer from the other computer doesn't work either. EDIT: Adding arp output: C:\Users\Srekel>arp -a Interface: 192.168.1.7 --- 0xd Internet Address Physical Address Type 192.168.1.1 00-1f-33-ef-28-01 dynamic 192.168.1.255 ff-ff-ff-ff-ff-ff static 224.0.0.22 01-00-5e-00-00-16 static 224.0.0.252 01-00-5e-00-00-fc static 239.255.255.250 01-00-5e-7f-ff-fa static 255.255.255.255 ff-ff-ff-ff-ff-ff static Adding ipconfig... C:\Users\Srekel>ipconfig /all Windows IP Configuration Host Name . . . . . . . . . . . . : Kakburken4 Primary Dns Suffix . . . . . . . : Node Type . . . . . . . . . . . . : Hybrid IP Routing Enabled. . . . . . . . : No WINS Proxy Enabled. . . . . . . . : No Wireless LAN adapter Wireless Network Connection: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : TP-LINK Wireless N Adapter Physical Address. . . . . . . . . : 00-25-86-DF-C6-89 DHCP Enabled. . . . . . . . . . . : Yes Autoconfiguration Enabled . . . . : Yes IPv4 Address. . . . . . . . . . . : 192.168.1.7(Preferred) Subnet Mask . . . . . . . . . . . : 255.255.255.0 Lease Obtained. . . . . . . . . . : 09 April 2010 23:09:45 Lease Expires . . . . . . . . . . : 10 April 2010 23:09:45 Default Gateway . . . . . . . . . : 192.168.1.1 DHCP Server . . . . . . . . . . . : 192.168.1.1 DNS Servers . . . . . . . . . . . : 192.168.1.1 NetBIOS over Tcpip. . . . . . . . : Enabled Ethernet adapter Local Area Connection 2: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller #2 Physical Address. . . . . . . . . : E0-CB-4E-26-B9-84 DHCP Enabled. . . . . . . . . . . : Yes Autoconfiguration Enabled . . . . : Yes Ethernet adapter Local Area Connection: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller Physical Address. . . . . . . . . : E0-CB-4E-26-BE-94 DHCP Enabled. . . . . . . . . . . : Yes Autoconfiguration Enabled . . . . : Yes Tunnel adapter isatap.{74D5C406-894E-4000-8DE7-6AAEBF7C8382}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Microsoft ISATAP Adapter #2 Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0 DHCP Enabled. . . . . . . . . . . : No Autoconfiguration Enabled . . . . : Yes Tunnel adapter Teredo Tunneling Pseudo-Interface: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0 DHCP Enabled. . . . . . . . . . . : No Autoconfiguration Enabled . . . . : Yes IPv6 Address. . . . . . . . . . . : 2001:0:5ef5:73ba:881:20c1:3f57:fef8(Preferred) Link-local IPv6 Address . . . . . : fe80::881:20c1:3f57:fef8%14(Preferred) Default Gateway . . . . . . . . . : :: NetBIOS over Tcpip. . . . . . . . : Disabled

Read the article

How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article

Segfaulting Java process

- by zenmonkey

I've a java process that is working on some large data set in memory. I've seen it crash with a SIGSEGV signal sometimes, so i was wondering some potential causes and fixes could do. Caues: - JVM bug - Native library bug (e.g pthreads etc) - JNI bug in user code Fixes: - Upgrade to new JVM In my particular case, this is the output form the log file (pruned) A fatal error has been detected by the Java Runtime Environment: # SIGSEGV (0xb) at pc=0x00002aaaaacd1b94, pid=32116, tid=1086544208 # JRE version: 6.0_14-b08 Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 ) Problematic frame: C [libpthread.so.0+0xab94] pthread_cond_timedwait+0x154 # If you would like to submit a bug report, please visit: http://java.sun.com/webapps/bugreport/crash.jsp # --------------- T H R E A D --------------- Current thread (0x00002aacaad41000): WatcherThread [stack: 0x0000000040b35000,0x0000000040c36000] [id=32141] siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x00002aabc40008c0 Registers: RAX=0x0000000000000000, RBX=0x0000000000000000, RCX=0x0000000000000000, RDX=0x0000000000000002 RSP=0x0000000040c34cc0, RBP=0x0000000040c34d80, RSI=0x0000000000000001, RDI=0x00002aabc40008c0 R8 =0x00002aacaad42528, R9 =0x0000000000000000, R10=0x0000000040c34cd8, R11=0x0000000000000202 R12=0x0000000000000001, R13=0x0000000040c34d40, R14=0xffffffffffffff92, R15=0x00002aacaad42550 RIP=0x00002aaaaacd1b94, EFL=0x0000000000010246, CSGSFS=0x000000000000e033, ERR=0x0000000000000006 TRAPNO=0x000000000000000e Top of Stack: (sp=0x0000000040c34cc0) 0x0000000040c34cc0: 0000000000000000 00002aabc40008c0 0x0000000040c34cd0: 00002aacaad42528 0000000000000000 0x0000000040c34ce0: 0000000002fae0e0 0000000000000000 0x0000000040c34cf0: 00002aaaaacd1750 0000000040c34cc0 0x0000000040c34d00: 00002aacaad42528 0000000000000000 0x0000000040c34d10: 00002aacaad42528 00002aacaad42500 0x0000000040c34d20: 0000000000000032 00002aaaabadf876 0x0000000040c34d30: fffffffdaad40e80 0000000040c34d40 0x0000000040c34d40: 000000004bbb7166 0000000015f07098 0x0000000040c34d50: 0000000040c34d80 00138cd32df59cce 0x0000000040c34d60: 431bde82d7b634db 00002aacaad429c0 0x0000000040c34d70: 0000000000000032 00002aacaad429c0 0x0000000040c34d80: 0000000040c34e00 00002aaaabadda6d 0x0000000040c34d90: 0000000040c34da0 00002aacaad42500 0x0000000040c34da0: 00002aacaad429c0 00002aaa00000002 0x0000000040c34db0: 0000000000000001 0000000000000002 0x0000000040c34dc0: 0000000040c34dd0 00002aaaabb6f613 0x0000000040c34dd0: 0000000040c34e00 00002aacaad41000 0x0000000040c34de0: 0000000000000032 00002aacaad429c0 0x0000000040c34df0: 00002aacaad41000 0000000000001000 0x0000000040c34e00: 0000000040c34e60 00002aaaabbc39fb 0x0000000040c34e10: 0000000040c34e40 00002aaaabab868f 0x0000000040c34e20: 00002aacaad41000 00002aacaad42aa0 0x0000000040c34e30: 00002aacaad42aa0 00002aaaabe10630 0x0000000040c34e40: 00002aaaabe10630 00002aacaad42aa0 0x0000000040c34e50: 00002aacaad429c0 00002aacaad41000 0x0000000040c34e60: 0000000040c35130 00002aaaabadff9f 0x0000000040c34e70: 0000000000000000 0000000000000000 0x0000000040c34e80: 0000000000000000 0000000000000000 0x0000000040c34e90: 0000000000000000 0000000000000000 0x0000000040c34ea0: 0000000000000000 0000000000000000 0x0000000040c34eb0: 0000000000000000 0000000000000000 Instructions: (pc=0x00002aaaaacd1b94) 0x00002aaaaacd1b84: 88 22 00 00 48 8b 7c 24 08 be 01 00 00 00 31 c0 0x00002aaaaacd1b94: f0 0f b1 37 0f 85 e8 00 00 00 8b 57 2c 48 8b 47 Stack: [0x0000000040b35000,0x0000000040c36000], sp=0x0000000040c34cc0, free space=1023k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) C [libpthread.so.0+0xab94] pthread_cond_timedwait+0x154 V [libjvm.so+0x594a6d] V [libjvm.so+0x67a9fb] V [libjvm.so+0x596f9f] --------------- P R O C E S S --------------- Java Threads: ( = current thread ) 0x00002aacaad3f000 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=32140, stack(0x0000000040a34000,0x0000000040b35000)] 0x00002aacaad3c000 JavaThread "CompilerThread1" daemon [_thread_blocked, id=32139, stack(0x0000000040933000,0x0000000040a34000)] 0x00002aacaad37800 JavaThread "CompilerThread0" daemon [_thread_blocked, id=32138, stack(0x0000000040832000,0x0000000040933000)] 0x00002aacaad36800 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=32137, stack(0x0000000040731000,0x0000000040832000)] 0x00002aacaab7d800 JavaThread "Finalizer" daemon [_thread_blocked, id=32136, stack(0x0000000040630000,0x0000000040731000)] 0x00002aacaab7b800 JavaThread "Reference Handler" daemon [_thread_blocked, id=32135, stack(0x000000004052f000,0x0000000040630000)] 0x0000000040115800 JavaThread "main" [_thread_blocked, id=32117, stack(0x000000004012b000,0x000000004022c000)] Other Threads: 0x00002aacaab75000 VMThread [stack: 0x000000004042e000,0x000000004052f000] [id=32134] =0x00002aacaad41000 WatcherThread [stack: 0x0000000040b35000,0x0000000040c36000] [id=32141] VM state:at safepoint (normal execution) VM Mutex/Monitor currently owned by a thread: ([mutex/lock_event]) [0x0000000040112e80] Threads_lock - owner thread: 0x00002aacaab75000 [0x0000000040113380] Heap_lock - owner thread: 0x0000000040115800 Heap PSYoungGen total 1854528K, used 1029248K [0x00002aac025a0000, 0x00002aaca8340000, 0x00002aaca9040000) eden space 1029248K, 100% used [0x00002aac025a0000,0x00002aac412c0000,0x00002aac412c0000) from space 825280K, 0% used [0x00002aac412c0000,0x00002aac412c0000,0x00002aac738b0000) to space 812800K, 0% used [0x00002aac76980000,0x00002aac76980000,0x00002aaca8340000) PSOldGen total 4423680K, used 4423651K [0x00002aaab5040000, 0x00002aabc3040000, 0x00002aac025a0000) object space 4423680K, 99% used [0x00002aaab5040000,0x00002aabc3038fe8,0x00002aabc3040000) PSPermGen total 21248K, used 5848K [0x00002aaaafc40000, 0x00002aaab1100000, 0x00002aaab5040000) object space 21248K, 27% used [0x00002aaaafc40000,0x00002aaab01f61f0,0x00002aaab1100000) Dynamic libraries: 40000000-40009000 r-xp 00000000 08:01 313415 /usr/java/jdk1.6.0_14/bin/java 40108000-4010a000 rwxp 00008000 08:01 313415 /usr/java/jdk1.6.0_14/bin/java 4010a000-4012b000 rwxp 4010a000 00:00 0 [heap] 4012b000-4012e000 ---p 4012b000 00:00 0 4012e000-4022c000 rwxp 4012e000 00:00 0 4022c000-4022d000 ---p 4022c000 00:00 0 4022d000-4032d000 rwxp 4022d000 00:00 0 4032d000-4032e000 ---p 4032d000 00:00 0 4032e000-4042e000 rwxp 4032e000 00:00 0 4042e000-4042f000 ---p 4042e000 00:00 0 4042f000-4052f000 rwxp 4042f000 00:00 0 4052f000-40532000 ---p 4052f000 00:00 0 40532000-40630000 rwxp 40532000 00:00 0 40630000-40633000 ---p 40630000 00:00 0 40633000-40731000 rwxp 40633000 00:00 0 40731000-40734000 ---p 40731000 00:00 0 40734000-40832000 rwxp 40734000 00:00 0 40832000-40835000 ---p 40832000 00:00 0 40835000-40933000 rwxp 40835000 00:00 0 40933000-40936000 ---p 40933000 00:00 0 40936000-40a34000 rwxp 40936000 00:00 0 40a34000-40a37000 ---p 40a34000 00:00 0 40a37000-40b35000 rwxp 40a37000 00:00 0 40b35000-40b36000 ---p 40b35000 00:00 0 40b36000-40c36000 rwxp 40b36000 00:00 0 2aaaaaaab000-2aaaaaac6000 r-xp 00000000 08:01 49198 /lib64/ld-2.7.so 2aaaaaac6000-2aaaaaac7000 rwxp 2aaaaaac6000 00:00 0 2aaaaaac7000-2aaaaaad0000 r-xs 0006d000 08:10 29851669 /mnt/home/jatten/workspace/common/build/lib/common.jar 2aaaaaad2000-2aaaaaad3000 rwxp 2aaaaaad2000 00:00 0 2aaaaaad3000-2aaaaaae0000 r-xp 00000000 08:01 315357 /usr/java/jdk1.6.0_14/jre/lib/amd64/libverify.so 2aaaaaae0000-2aaaaabdf000 ---p 0000d000 08:01 315357 /usr/java/jdk1.6.0_14/jre/lib/amd64/libverify.so 2aaaaabdf000-2aaaaabe2000 rwxp 0000c000 08:01 315357 /usr/java/jdk1.6.0_14/jre/lib/amd64/libverify.so 2aaaaabe2000-2aaaaac0a000 rwxp 2aaaaabe2000 00:00 0 2aaaaac0a000-2aaaaac0f000 r-xs 0003a000 08:10 30326840 /mnt/home/jatten/workspace/common_ml20010405/build/lib/common_ml.jar 2aaaaac0f000-2aaaaac12000 r-xs 00020000 08:10 29786222 /mnt/home/jatten/pagescorer.jar 2aaaaacc5000-2aaaaacc6000 r-xp 0001a000 08:01 49198 /lib64/ld-2.7.so 2aaaaacc6000-2aaaaacc7000 rwxp 0001b000 08:01 49198 /lib64/ld-2.7.so 2aaaaacc7000-2aaaaacdd000 r-xp 00000000 08:01 49280 /lib64/libpthread-2.7.so 2aaaaacdd000-2aaaaaedc000 ---p 00016000 08:01 49280 /lib64/libpthread-2.7.so 2aaaaaedc000-2aaaaaedd000 r-xp 00015000 08:01 49280 /lib64/libpthread-2.7.so 2aaaaaedd000-2aaaaaede000 rwxp 00016000 08:01 49280 /lib64/libpthread-2.7.so 2aaaaaede000-2aaaaaee2000 rwxp 2aaaaaede000 00:00 0 2aaaaaee2000-2aaaaaee9000 r-xp 00000000 08:01 315360 /usr/java/jdk1.6.0_14/jre/lib/amd64/jli/libjli.so 2aaaaaee9000-2aaaaafea000 ---p 00007000 08:01 315360 /usr/java/jdk1.6.0_14/jre/lib/amd64/jli/libjli.so 2aaaaafea000-2aaaaafec000 rwxp 00008000 08:01 315360 /usr/java/jdk1.6.0_14/jre/lib/amd64/jli/libjli.so 2aaaaafec000-2aaaaafee000 r-xp 00000000 08:01 49240 /lib64/libdl-2.7.so 2aaaaafee000-2aaaab1ee000 ---p 00002000 08:01 49240 /lib64/libdl-2.7.so 2aaaab1ee000-2aaaab1ef000 r-xp 00002000 08:01 49240 /lib64/libdl-2.7.so 2aaaab1ef000-2aaaab1f0000 rwxp 00003000 08:01 49240 /lib64/libdl-2.7.so 2aaaab1f0000-2aaaab1f1000 rwxp 2aaaab1f0000 00:00 0 2aaaab1f1000-2aaaab33e000 r-xp 00000000 08:01 49219 /lib64/libc-2.7.so 2aaaab33e000-2aaaab53e000 ---p 0014d000 08:01 49219 /lib64/libc-2.7.so 2aaaab53e000-2aaaab542000 r-xp 0014d000 08:01 49219 /lib64/libc-2.7.so 2aaaab542000-2aaaab543000 rwxp 00151000 08:01 49219 /lib64/libc-2.7.so 2aaaab543000-2aaaab549000 rwxp 2aaaab543000 00:00 0 2aaaab549000-2aaaabca7000 r-xp 00000000 08:01 315371 /usr/java/jdk1.6.0_14/jre/lib/amd64/server/libjvm.so 2aaaabca7000-2aaaabda6000 ---p 0075e000 08:01 315371 /usr/java/jdk1.6.0_14/jre/lib/amd64/server/libjvm.so 2aaaabda6000-2aaaabf1e000 rwxp 0075d000 08:01 315371 /usr/java/jdk1.6.0_14/jre/lib/amd64/server/libjvm.so 2aaaabf1e000-2aaaabf5c000 rwxp 2aaaabf1e000 00:00 0 2aaaabf67000-2aaaabfe9000 r-xp 00000000 08:01 49263 /lib64/libm-2.7.so 2aaaabfe9000-2aaaac1e8000 ---p 00082000 08:01 49263 /lib64/libm-2.7.so 2aaaac1e8000-2aaaac1e9000 r-xp 00081000 08:01 49263 /lib64/libm-2.7.so 2aaaac1e9000-2aaaac1ea000 rwxp 00082000 08:01 49263 /lib64/libm-2.7.so 2aaaac1ea000-2aaaac1f2000 r-xp 00000000 08:01 49283 /lib64/librt-2.7.so 2aaaac1f2000-2aaaac3f1000 ---p 00008000 08:01 49283 /lib64/librt-2.7.so 2aaaac3f1000-2aaaac3f2000 r-xp 00007000 08:01 49283 /lib64/librt-2.7.so 2aaaac3f2000-2aaaac3f3000 rwxp 00008000 08:01 49283 /lib64/librt-2.7.so 2aaaac3f3000-2aaaac41c000 r-xp 00000000 08:01 315336 /usr/java/jdk1.6.0_14/jre/lib/amd64/libjava.so 2aaaac41c000-2aaaac51b000 ---p 00029000 08:01 315336 /usr/java/jdk1.6.0_14/jre/lib/amd64/libjava.so 2aaaac51b000-2aaaac522000 rwxp 00028000 08:01 315336 /usr/java/jdk1.6.0_14/jre/lib/amd64/libjava.so 2aaaac522000-2aaaac523000 ---p 2aaaac522000 00:00 0 2aaaac523000-2aaaac524000 rwxp 2aaaac523000 00:00 0 2aaaac52d000-2aaaac542000 r-xp 00000000 08:01 49265 /lib64/libnsl-2.7.so 2aaaac542000-2aaaac741000 ---p 00015000 08:01 49265 /lib64/libnsl-2.7.so 2aaaac741000-2aaaac742000 r-xp 00014000 08:01 49265 /lib64/libnsl-2.7.so 2aaaac742000-2aaaac743000 rwxp 00015000 08:01 49265 /lib64/libnsl-2.7.so 2aaaac743000-2aaaac745000 rwxp 2aaaac743000 00:00 0 2aaaac745000-2aaaac74c000 r-xp 00000000 08:01 315362 /usr/java/jdk1.6.0_14/jre/lib/amd64/native_threads/libhpi.so 2aaaac74c000-2aaaac84d000 ---p 00007000 08:01 315362 /usr/java/jdk1.6.0_14/jre/lib/amd64/native_threads/libhpi.so 2aaaac84d000-2aaaac84f000 rwxp 00008000 08:01 315362 /usr/java/jdk1.6.0_14/jre/lib/amd64/native_threads/libhpi.so 2aaaac84f000-2aaaac850000 rwxp 2aaaac84f000 00:00 0 2aaaac850000-2aaaac858000 rwxs 00000000 08:01 229379 /tmp/hsperfdata_jatten/32116 2aaaac85b000-2aaaac865000 r-xp 00000000 08:01 49269 /lib64/libnss_files-2.7.so 2aaaac865000-2aaaaca64000 ---p 0000a000 08:01 49269 /lib64/libnss_files-2.7.so 2aaaaca64000-2aaaaca65000 r-xp 00009000 08:01 49269 /lib64/libnss_files-2.7.so 2aaaaca65000-2aaaaca66000 rwxp 0000a000 08:01 49269 /lib64/libnss_files-2.7.so 2aaaaca66000-2aaaaca74000 r-xp 00000000 08:01 315358 /usr/java/jdk1.6.0_14/jre/lib/amd64/libzip.so 2aaaaca74000-2aaaacb76000 ---p 0000e000 08:01 315358 /usr/java/jdk1.6.0_14/jre/lib/amd64/libzip.so 2aaaacb76000-2aaaacb79000 rwxp 00010000 08:01 315358 /usr/java/jdk1.6.0_14/jre/lib/amd64/libzip.so 2aaaacb79000-2aaaacdea000 rwxp 2aaaacb79000 00:00 0 2aaaacdea000-2aaaafb7a000 rwxp 2aaaacdea000 00:00 0 2aaaafb7a000-2aaaafb84000 rwxp 2aaaafb7a000 00:00 0 2aaaafb84000-2aaaafc3a000 rwxp 2aaaafb84000 00:00 0 2aaaafc40000-2aaab1100000 rwxp 2aaaafc40000 00:00 0 2aaab1100000-2aaab5040000 rwxp 2aaab1100000 00:00 0 2aaab5040000-2aabc3040000 rwxp 2aaab5040000 00:00 0 2aac025a0000-2aaca8340000 rwxp 2aac025a0000 00:00 0 2aaca8340000-2aaca9040000 rwxp 2aaca8340000 00:00 0 2aaca9040000-2aaca904b000 rwxp 2aaca9040000 00:00 0 2aaca904b000-2aaca906a000 rwxp 2aaca904b000 00:00 0 2aaca906a000-2aaca98da000 rwxp 2aaca906a000 00:00 0 2aaca98da000-2aaca9ad4000 rwxp 2aaca98da000 00:00 0 2aaca9ad4000-2aacaa004000 rwxp 2aaca9ad4000 00:00 0 2aacaa004000-2aacaa00a000 rwxp 2aacaa004000 00:00 0 2aacaa00a000-2aacaa87b000 rwxp 2aacaa00a000 00:00 0 2aacaa87b000-2aacaaa76000 rwxp 2aacaa87b000 00:00 0 2aacaaa76000-2aacaaa81000 rwxp 2aacaaa76000 00:00 0 2aacaaa81000-2aacaaaa0000 rwxp 2aacaaa81000 00:00 0 2aacaaaa0000-2aacaaba0000 rwxp 2aacaaaa0000 00:00 0 2aacaaba0000-2aacaad36000 r-xs 02fb1000 08:01 315318 /usr/java/jdk1.6.0_14/jre/lib/rt.jar 2aacaad36000-2aacaaf36000 rwxp 2aacaad36000 00:00 0 2aacaaf36000-2aacaaf49000 r-xp 00000000 08:01 315349 /usr/java/jdk1.6.0_14/jre/lib/amd64/libnet.so 2aacaaf49000-2aacab04a000 ---p 00013000 08:01 315349 /usr/java/jdk1.6.0_14/jre/lib/amd64/libnet.so 2aacab04a000-2aacab04d000 rwxp 00014000 08:01 315349 /usr/java/jdk1.6.0_14/jre/lib/amd64/libnet.so 2aacab058000-2aacab05c000 r-xp 00000000 08:01 49268 /lib64/libnss_dns-2.7.so 2aacab05c000-2aacab25b000 ---p 00004000 08:01 49268 /lib64/libnss_dns-2.7.so 2aacab25b000-2aacab25c000 r-xp 00003000 08:01 49268 /lib64/libnss_dns-2.7.so 2aacab25c000-2aacab25d000 rwxp 00004000 08:01 49268 /lib64/libnss_dns-2.7.so 2aacab25d000-2aacab26e000 r-xp 00000000 08:01 49282 /lib64/libresolv-2.7.so 2aacab26e000-2aacab46e000 ---p 00011000 08:01 49282 /lib64/libresolv-2.7.so 2aacab46e000-2aacab46f000 r-xp 00011000 08:01 49282 /lib64/libresolv-2.7.so 2aacab46f000-2aacab470000 rwxp 00012000 08:01 49282 /lib64/libresolv-2.7.so 2aacab470000-2aacab572000 rwxp 2aacab470000 00:00 0 2aacab572000-2aacab57e000 r-xs 00081000 08:10 29851828 /mnt/home/jatten/workspace/common/lib/google-collect-1.0.jar 2aacab57e000-2aacab585000 r-xs 000aa000 08:10 29851946 /mnt/home/jatten/workspace/common/lib/mysql-connector-java-5.1.8-bin.jar 2aacab585000-2aacab58d000 r-xs 00028000 08:10 29851949 /mnt/home/jatten/workspace/common/lib/xml-apis.jar 2aacab58d000-2aacab591000 r-xs 0002f000 08:10 29851947 /mnt/home/jatten/workspace/common/lib/commons-beanutils-core-1.8.2.jar 2aacab591000-2aacab59e000 r-xs 0007f000 08:10 29851943 /mnt/home/jatten/workspace/common/lib/commons-collections-3.2.jar 2aacab59e000-2aacab5a3000 r-xs 00026000 08:10 29851942 /mnt/home/jatten/workspace/common/lib/httpcore-4.0.jar 2aacab5a3000-2aacab5a9000 r-xs 00030000 08:10 29851932 /mnt/home/jatten/workspace/common/lib/junit-dep-4.8.1.jar 2aacab5a9000-2aacab5ac000 r-xs 00011000 08:10 29851922 /mnt/home/jatten/workspace/common/lib/servlet.jar 2aacab5ac000-2aacab5ae000 r-xs 00009000 08:10 29851937 /mnt/home/jatten/workspace/common/lib/gsb.jar 2aacab5ae000-2aacab5b5000 r-xs 00059000 08:10 29851930 /mnt/home/jatten/workspace/common/lib/log4j-1.2.15.jar 2aacab5b5000-2aacab6b5000 rwxp 2aacab5b5000 00:00 0 2aacab6b5000-2aacab6b7000 r-xs 00009000 08:10 29851956 /mnt/home/jatten/workspace/common/lib/gsb-src.jar 2aacab6b7000-2aacab7b7000 rwxp 2aacab6b7000 00:00 0 2aacab7b7000-2aacab7cf000 r-xs 00115000 08:10 29851938 /mnt/home/jatten/workspace/common/lib/xercesImpl.jar 2aacab7cf000-2aacab7d1000 r-xs 00009000 08:10 29851957 /mnt/home/jatten/workspace/common/lib/velocity-tools-view-1.0.jar 2aacab7d1000-2aacab7d3000 r-xs 00009000 08:10 29851939 /mnt/home/jatten/workspace/common/lib/commons-cli-1.2.jar 2aacab7d3000-2aacab7d9000 r-xs 00034000 08:10 29851955 /mnt/home/jatten/workspace/common/lib/junit-4.8.1.jar 2aacab7d9000-2aacab7db000 r-xs 0000e000 08:10 29851917 /mnt/home/jatten/workspace/common/lib/jakarta-oro-2.0.8.jar 2aacab7db000-2aacab858000 r-xs 0031d000 08:10 29851916 /mnt/home/jatten/workspace/common/lib/poi-ooxml-schemas-3.6-20091214.jar 2aacab858000-2aacab85c000 r-xs 00028000 08:10 29851936 /mnt/home/jatten/workspace/common/lib/httpcore-nio-4.0.jar 2aacab85c000-2aacab85e000 r-xs 00005000 08:10 29851940 /mnt/home/jatten/workspace/common/lib/commons-beanutils-bean-collections-1.8.2.jar 2aacab85e000-2aacab864000 r-xs 00059000 08:10 29851919 /mnt/home/jatten/workspace/common/lib/mail-1.4.jar 2aacab864000-2aacab866000 r-xs 0000d000 08:10 29851950 /mnt/home/jatten/workspace/common/lib/commons-logging-1.1.1.jar 2aacab866000-2aacab86c000 r-xs 00045000 08:10 29851924 /mnt/home/jatten/workspace/common/lib/commons-httpclient-3.1.jar 2aacab86c000-2aacab877000 r-xs 00074000 08:10 29851931 /mnt/home/jatten/workspace/common/lib/velocity-dep-1.4.jar 2aacab877000-2aacab87f000 r-xs 00051000 08:10 29851954 /mnt/home/jatten/workspace/common/lib/velocity-1.4.jar 2aacab87f000-2aacab884000 r-xs 00034000 08:10 29851958 /mnt/home/jatten/workspace/common/lib/commons-beanutils-1.8.2.jar 2aacab884000-2aacab889000 r-xs 00048000 08:10 29851918 /mnt/home/jatten/workspace/common/lib/dom4j-1.6.1.jar 2aacab889000-2aacab8c6000 r-xs 0024f000 08:10 29851914 /mnt/home/jatten/workspace/common/lib/xmlbeans-2.3.0.jar 2aacab8c6000-2aacab8cb000 r-xs 00033000 08:10 29851929 /mnt/home/jatten/workspace/common/lib/xmemcached-1.2.3.jar 2aacab8cb000-2aacab8cd000 r-xs 00005000 08:10 29851928 /mnt/home/jatten/workspace/common/lib/org.hamcrest.core_1.1.0.v20090501071000.jar 2aacab8cd000-2aacab8d0000 r-xs 0000a000 08:10 29851944 /mnt/home/jatten/workspace/common/lib/persistence-api-1.0.jar 2aacab8d0000-2aacab8d6000 r-xs 0005f000 08:10 29851926 /mnt/home/jatten/workspace/common/lib/poi-ooxml-3.6-20091214.jar 2aacab8d6000-2aacab8d7000 r-xs 0002b000 08:10 29851951 /mnt/home/jatten/workspace/common/lib/maxmind.jar 2aacab8d7000-2aacab8d8000 r-xs 00002000 08:10 29851935 /mnt/home/jatten/workspace/common/lib/jackson-jaxrs-1.2.0.jar 2aacab8d8000-2aacab8d9000 r-xs 00002000 08:10 29851913 /mnt/home/jatten/workspace/common/lib/slf4j-log4j12-1.5.6.jar 2aacab8d9000-2aacab8dd000 r-xs 00025000 08:10 29851945 /mnt/home/jatten/workspace/common/lib/yanf4j-1.1.1.jar 2aacab8dd000-2aacab8df000 r-xs 00003000 08:10 29851952 /mnt/home/jatten/workspace/common/lib/clickstream-1.0.2.jar 2aacab8df000-2aacab8e1000 r-xs 00004000 08:10 29851953 /mnt/home/jatten/workspace/common/lib/slf4j-api-1.5.6.jar 2aacab8e1000-2aacab8e9000 r-xs 0004d000 08:10 29851920 /mnt/home/jatten/workspace/common/lib/jackson-mapper-asl-1.2.0.jar 2aacab8e9000-2aacab8ed000 r-xs 0001f000 08:10 29851925 /mnt/home/jatten/workspace/common/lib/jackson-core-asl-1.2.0.jar 2aacab8ed000-2aacab8f1000 r-xs 0001b000 08:10 29851912 /mnt/home/jatten/workspace/common/lib/oscache-2.3.jar 2aacab8f1000-2aacab90c000 r-xs 0015d000 08:10 29851927 /mnt/home/jatten/workspace/common/lib/poi-3.6-20091214.jar 2aacab90c000-2aacab911000 r-xs 00040000 08:10 29851831 /mnt/home/jatten/workspace/common/lib/commons-lang-2.5.jar 2aacab911000-2aacab914000 r-xs 00012000 08:10 29851923 /mnt/home/jatten/workspace/common/lib/jgooglesafebrowser-0.1a.2.jar 2aacab914000-2aacab918000 r-xs 00023000 08:10 29851933 /mnt/home/jatten/workspace/common/lib/gson-1.3.jar 2aacab918000-2aacabb18000 rwxp 2aacab918000 00:00 0 2aacabb82000-2aacabd82000 rwxp 2aacabb82000 00:00 0 2aacabe05000-2aacaf204000 rwxp 2aacabe05000 00:00 0 7fffaa12a000-7fffaa141000 rwxp 7fffaa12a000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vdso] VM Arguments: jvm_args: -Xmx8000M java_command: com.scorers.ModelImplementingPageScorer -t data/data/golds/adult.all.json -b 18 -s data/models/pagetext.binary. adult.april6.all.model -m com.models.MultiClassUpdateableModel -p 30 --goldsilver -v --cat adult --fakeinput -e /mnt/tmp/xyz.15647.pageo bjects.txt -o Launcher Type: SUN_STANDARD Environment Variables: JAVA_HOME=/usr/java/jdk1.6.0_14 PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/jatten/bin LD_LIBRARY_PATH=/usr/java/jdk1.6.0_14/jre/lib/amd64/server:/usr/java/jdk1.6.0_14/jre/lib/amd64:/usr/java/jdk1.6.0_14/jre/../lib/amd64 SHELL=/bin/bash Signal Handlers: SIGSEGV: [libjvm.so+0x6bd980], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGBUS: [libjvm.so+0x6bd980], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGFPE: [libjvm.so+0x594cc0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGPIPE: [libjvm.so+0x594cc0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGXFSZ: [libjvm.so+0x594cc0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGILL: [libjvm.so+0x594cc0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGUSR1: SIG_DFL, sa_mask[0]=0x00000000, sa_flags=0x00000000 SIGUSR2: [libjvm.so+0x597480], sa_mask[0]=0x00000000, sa_flags=0x10000004 SIGHUP: [libjvm.so+0x5971d0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGINT: [libjvm.so+0x5971d0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGTERM: [libjvm.so+0x5971d0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 SIGQUIT: [libjvm.so+0x5971d0], sa_mask[0]=0x7ffbfeff, sa_flags=0x10000004 --------------- S Y S T E M --------------- OS:Fedora release 8 (Werewolf) uname:Linux 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 libc:glibc 2.7 NPTL 2.7 rlimit: STACK 10240k, CORE 0k, NPROC 61504, NOFILE 1024, AS infinity load average:2.83 2.73 2.78 CPU:total 2 (4 cores per cpu, 1 threads per core) family 6 model 23 stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1 Memory: 4k page, physical 7872040k(14540k free), swap 0k(0k free) vm_info: Java HotSpot(TM) 64-Bit Server VM (14.0-b16) for linux-amd64 JRE (1.6.0_14-b08), built on May 21 2009 01:11:11 by "java_re" with gcc 3.2.2 (SuSE Lin ux) [error occurred during error reporting (printing date and time), id 0xb]

Read the article

Slowdowns when reading from an urlconnection's inputstream (even with byte[] and buffers)

- by user342677

Ok so after spending two days trying to figure out the problem, and reading about dizillion articles, i finally decided to man up and ask to for some advice(my first time here). Now to the issue at hand - I am writing a program which will parse api data from a game, namely battle logs. There will be A LOT of entries in the database(20+ million) and so the parsing speed for each battle log page matters quite a bit. The pages to be parsed look like this: http://api.erepublik.com/v1/feeds/battle_logs/10000/0. (see source code if using chrome, it doesnt display the page right). It has 1000 hit entries, followed by a little battle info(lastpage will have <1000 obviously). On average, a page contains 175000 characters, UTF-8 encoding, xml format(v 1.0). Program will run locally on a good PC, memory is virtually unlimited(so that creating byte[250000] is quite ok). The format never changes, which is quite convenient. Now, I started off as usual: //global vars,class declaration skipped public WebObject(String url_string, int connection_timeout, int read_timeout, boolean redirects_allowed, String user_agent) throws java.net.MalformedURLException, java.io.IOException { // Open a URL connection java.net.URL url = new java.net.URL(url_string); java.net.URLConnection uconn = url.openConnection(); if (!(uconn instanceof java.net.HttpURLConnection)) { throw new java.lang.IllegalArgumentException("URL protocol must be HTTP"); } conn = (java.net.HttpURLConnection) uconn; conn.setConnectTimeout(connection_timeout); conn.setReadTimeout(read_timeout); conn.setInstanceFollowRedirects(redirects_allowed); conn.setRequestProperty("User-agent", user_agent); } public void executeConnection() throws IOException { try { is = conn.getInputStream(); //global var l = conn.getContentLength(); //global var } catch (Exception e) { //handling code skipped } } //getContentStream and getLength methods which just return'is' and 'l' are skipped Here is where the fun part began. I ran some profiling (using System.currentTimeMillis()) to find out what takes long ,and what doesnt. The call to this method takes only 200ms on avg public InputStream getWebPageAsStream(int battle_id, int page) throws Exception { String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page; WebObject wobj = new WebObject(url, 10000, 10000, true, "Mozilla/5.0 " + "(Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 ( .NET CLR 3.5.30729)"); wobj.executeConnection(); l = wobj.getContentLength(); // global variable return wobj.getContentStream(); //returns 'is' stream } 200ms is quite expected from a network operation, and i am fine with it. BUT when i parse the inputStream in any way(read it into string/use java XML parser/read it into another ByteArrayStream) the process takes over 1000ms! for example, this code takes 1000ms IF i pass the stream i got('is') above from getContentStream() directly to this method: public static Document convertToXML(InputStream is) throws ParserConfigurationException, IOException, SAXException { DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); DocumentBuilder db = dbf.newDocumentBuilder(); Document doc = db.parse(is); doc.getDocumentElement().normalize(); return doc; } this code too, takes around 920ms IF the initial InputStream 'is' is passed in(dont read into the code itself - it just extracts the data i need by directly counting the characters, which can be done thanks to the rigid api feed format): public static parsedBattlePage convertBattleToXMLWithoutDOM(InputStream is) throws IOException { // Point A BufferedReader br = new BufferedReader(new InputStreamReader(is)); LinkedList ll = new LinkedList(); String str = br.readLine(); while (str != null) { ll.add(str); str = br.readLine(); } if (((String) ll.get(1)).indexOf("error") != -1) { return new parsedBattlePage(null, null, true, -1); } //Point B Iterator it = ll.iterator(); it.next(); it.next(); it.next(); it.next(); String[][] hits_arr = new String[1000][4]; String t_str = (String) it.next(); String tmp = null; int j = 0; for (int i = 0; t_str.indexOf("time") != -1; i++) { hits_arr[i][0] = t_str.substring(12, t_str.length() - 11); tmp = (String) it.next(); hits_arr[i][1] = tmp.substring(14, tmp.length() - 9); tmp = (String) it.next(); hits_arr[i][2] = tmp.substring(15, tmp.length() - 10); tmp = (String) it.next(); hits_arr[i][3] = tmp.substring(18, tmp.length() - 13); it.next(); it.next(); t_str = (String) it.next(); j++; } String[] b_info_arr = new String[9]; int[] space_nums = {13, 10, 13, 11, 11, 12, 5, 10, 13}; for (int i = 0; i < space_nums.length; i++) { tmp = (String) it.next(); b_info_arr[i] = tmp.substring(space_nums[i] + 4, tmp.length() - space_nums[i] - 1); } //Point C return new parsedBattlePage(hits_arr, b_info_arr, false, j); } I have tried replacing the default BufferedReader with BufferedReader br = new BufferedReader(new InputStreamReader(is), 250000); This didnt change much. My second try was to replace the code between A and B with: Iterator it = IOUtils.lineIterator(is, "UTF-8"); Same result, except this time A-B was 0ms, and B-C was 1000ms, so then every call to it.next() must have been consuming some significant time.(IOUtils is from apache-commons-io library). And here is the culprit - the time taken to parse the stream to string, be it by an iterator or BufferedReader in ALL cases was about 1000ms, while the rest of the code took 0ms(e.g. irrelevant). This means that parsing the stream to LinkedList, or iterating over it, for some reason was eating up a lot of my system resources. question was - why? Is it just the way java is made...no...thats just stupid, so I did another experiment. In my main method I added after the getWebPageAsStream(): //Point A ba = new byte[l]; // 'l' comes from wobj.getContentLength above bytesRead = is.read(ba); //'is' is our URLConnection original InputStream offset = bytesRead; while (bytesRead != -1) { bytesRead = is.read(ba, offset - 1, l - offset); offset += bytesRead; } //Point B InputStream is2 = new ByteArrayInputStream(ba); //Now just working with 'is2' - the "copied" stream The InputStream-byte[] conversion took again 1000ms - this is the way many ppl suggested to read an InputStream, and stil it is slow. And guess what - the 2 parser methods above (convertToXML() and convertBattlePagetoXMLWithoutDOM(), when passed 'is2' instead of 'is' took, in all 4 cases, under 50ms to complete. I read a suggestion that the stream waits for connection to close before unblocking, so i tried using HttpComponentsClient 4.0 (http://hc.apache.org/httpcomponents-client/index.html) instead, but the initial InputStream took just as long to parse. e.g. this code: public InputStream getWebPageAsStream2(int battle_id, int page) throws Exception { String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page; HttpClient httpclient = new DefaultHttpClient(); HttpGet httpget = new HttpGet(url); HttpParams p = new BasicHttpParams(); HttpConnectionParams.setSocketBufferSize(p, 250000); HttpConnectionParams.setStaleCheckingEnabled(p, false); HttpConnectionParams.setConnectionTimeout(p, 5000); httpget.setParams(p); HttpResponse response = httpclient.execute(httpget); HttpEntity entity = response.getEntity(); l = (int) entity.getContentLength(); return entity.getContent(); } took even longer to process(50ms more for just the network) and the stream parsing times remained the same. Obviously it can be instantiated so as to not create HttpClient and properties every time(faster network time), but the stream issue wont be affected by that. So we come to the center problem - why does the initial URLConnection InputStream(or HttpClient InputStream) take so long to process, while any stream of same size and content created locally is orders of magnitude faster? I mean, the initial response is already somewhere in RAM, and I cant see any good reasong why it is processed so slowly compared to when a same stream is just created from a byte[]. Considering I have to parse million of entries and thousands of pages like that, a total processing time of almost 1.5s/page seems WAY WAY too long. Any ideas? P.S. Please ask in any more code is required - the only thing I do after parsing is make a PreparedStatement and put the entries into JavaDB in packs of 1000+, and the perfomance is ok ~ 200ms/1000entries, prb could be optimized with more cache but I didnt look into it much.

Developer IT