Search Results

Search found 908 results on 37 pages for 'the worst shady'.

Page 34/37 | < Previous Page | 30 31 32 33 34 35 36 37 | Next Page >

Video on Architecture and Code Quality using Visual Studio 2012–interview with Marcel de Vries and Terje Sandstrom by Adam Cogan

- by terje

Find the video HERE. Adam Cogan did a great Web TV interview with Marcel de Vries and myself on the topics of architecture and code quality. It was real fun participating in this session. Although we know each other from the MVP ALM community, Marcel, Adam and I haven’t worked together before. It was very interesting to see how we agreed on so many terms, and how alike we where thinking. The basics of ensuring you have a good architecture and how you could document it is one thing. Also, the same agreement on the importance of having a high quality code base, and how we used the Visual Studio 2012 tools, and some others (NDepend for example) to measure and ensure that the code quality was where it should be. As the tools, methods and thinking popped up during the interview it was a lot of “Hey ! I do that too!”. The tools are not only for “after the fact” work, but we use them during the coding. That way the tools becomes an integrated part of our coding work, and helps us to find issues we may have overlooked. The video has a bunch of call outs, pinpointing important things to remember. These are also listed on the corresponding web page. I haven’t seen that touch before, but really liked this way of doing it – it makes it much easier to spot the highlights. Titus Maclaren and Raj Dhatt from SSW have done a terrific job producing this video. And thanks to Lei Xu for doing the camera and recording job. Thanks guys ! Also, if you are at TechEd Amsterdam 2012, go and listen to Adam Cogan in his session on “A modern architecture review: Using the new code review tools” Friday 29th, 10.15-11.30 and Marcel de Vries session on “Intellitrace, what is it and how can I use it to my benefit” Wednesday 27th, 5-6.15 The highlights points out some important practices. I’ll elaborate on a few of them here: Add instructions on how to compile the solution. You do this by adding a text file with instructions to the solution, and keep it under source control. These instructions should contain what is needed on top of a standard install of Visual Studio. I do a lot of code reviews, and more often that not, I am not even able to compile the program, because they have used some tool or library that needs to be installed. The same applies to any new developer who enters into the team, so do this to increase your productivity when the team changes, or a team member switches computer. Don’t forget to document what you have to configure on the computer, the IIS being a common one. The more automatic you can do this, the better. Use NuGet to get down libraries. When the text document gets more than say, half a page, with a bunch of different things to do, convert it into a powershell script instead. The metrics warning levels. These are very conservatively set by Microsoft. You rarely see anything but green, and besides, you should have color scales for each of the metrics. I have a blog post describing a more appropriate set of levels, based on both research work and industry “best practices”. The essential limits are: Cyclomatic complexity and coupling: Higher numbers are worse On method levels: Green : From 0 to 10 Yellow: From 10 to 20 (some say 15). Acceptable, but have a look to see if there is something unneeded here. Red: From 20 to 40: Action required, get these down. Bleeding Red: Above 40 This is the real red alert. Immediate action! (My invention, as people have asked what do I do when I have cyclomatic complexity of 150. The only answer I could think of was: RUN! ) Maintainability index: Lower numbers are worse, scale from 0 to 100. On method levels: Green: 60 to 100 Yellow: 40 – 60. You will always have methods here too, accept the higher ones, take a look at those who are down to the lower limit. Check up against the other metrics.) Red: 20 – 40: Action required, fix these. Bleeding red: Below 20. Immediate action required. When doing metrics analysis, you should leave the generated code out. You do this by adding attributes, unfortunately Microsoft has “forgotten” to add these to all their stuff, so you might have to add them to some of the code. It most cases it can be done so that it is not overwritten by a new round of code generation. Take a look a my blog post here for details on how to do that. Class level metrics might also be useful, at least for coupling and maintenance. But it is much more difficult to set any fixed limits on those. Any metric aggregations on higher level tend to be pretty useless, as the number of methods vary pretty much, and there are little science on what number of methods can be regarded as good or bad. NDepend have a recommendation, but they say it may vary too. And in these days of data binding, the number might be pretty high, as properties counts as methods. However, if you take the worst case situations, classes with more than 20 methods are suspicious, and coupling and cyclomatic complexity go red above 20, so any classes with more than 20x20 = 400 for these measures should be checked over. In the video we mention the SOLID principles, coined by “Uncle Bob” (Richard Martin). One of them, the Dependency Inversion principle we discuss in the video. It is important to note that this principle is NOT on whether you should use a Dependency Inversion Container or not, it is about how you design the interfaces and interactions between your classes. The Dependency Inversion Container is just one technique which is based on this principle, but which main purpose is to isolate things you would like to change at runtime, for example if you implement a plug in architecture. Overuse of a Dependency Inversion Container is however, NOT a good thing. It should be used for a purpose and not as a general DI solution. The general DI solution and thinking however is useful far beyond the DIC. You should always “program to an abstraction”, and not to the concreteness. We also talk a bit about the GRASP patterns, a term coined by Craig Larman in his book Applying UML and design patterns. GRASP patterns stand for General Responsibility Assignment Software Patterns and describe fundamental principles of object design and responsibility assignment. What I find great with these patterns is that they is another way to focus on the responsibility of a class. One of the things I most often found that is broken in software designs, is that the class lack responsibility, and as a result there are a lot of classes mucking around in the internals of the other classes. We also discuss the term “Code Smells”. This term was invented by Kent Beck and Martin Fowler when they worked with Fowler’s “Refactoring” book. A code smell is a set of “bad” coding practices, which are the drivers behind a corresponding set of refactorings. Here is a good list of the smells, and their corresponding refactor patterns. See also this.

Read the article
Managing access to multiple linux system

- by Swartz

A searched for answers but have found nothing on here... Long story short: a non-profit organization is in dire need of modernizing its infrastructure. First thing is to find an alternatives to managing user accounts on a number of Linux hosts. We have 12 servers (both physical and virtual) and about 50 workstations. We have 500 potential users for these systems. The individual who built and maintained the systems over the years has retired. He wrote his own scripts to manage it all. It still works. No complaints there. However, a lot of the stuff is very manual and error-prone. Code is messy and after updates often needs to be tweaked. Worst part is there is little to no docs written. There are just a few ReadMe's and random notes which may or may not be relevant anymore. So maintenance has become a difficult task. Currently accounts are managed via /etc/passwd on each system. Updates are distributed via cron scripts to correct systems as accounts are added on the "main" server. Some users have to have access to all systems (like a sysadmin account), others need access to shared servers, while others may need access to workstations or only a subset of those. Is there a tool that can help us manage accounts that meets the following requirements? Preferably open source (i.e. free as budget is VERY limited) mainstream (i.e. maintained) preferably has LDAP integration or could be made to interface with LDAP or AD service for user authentication (will be needed in the near future to integrate accounts with other offices) user management (adding, expiring, removing, lockout, etc) allows to manage what systems (or group of systems) each user has access to - not all users are allowed on all systems support for user accounts that could have different homedirs and mounts available depending on what system they are logged into. For example sysadmin logged into "main" server has main://home/sysadmin/ as homedir and has all shared mounts sysadmin logged into staff workstations would have nas://user/s/sysadmin as homedir(different from above) and potentially limited set of mounts, a logged in client would have his/her homedir at different location and no shared mounts. If there is an easy management interface that would be awesome. And if this tool is cross-platform (Linux / MacOS / *nix), that will be a miracle! I have searched the web and so have found nothing suitable. We are open to any suggestions. Thank you. EDIT: This question has been incorrectly marked as a duplicate. The linked to answer only talks about having same homedirs on all systems, whereas we need to have different homedirs based on what system user is currently logged into(MULTIPLE homedirs). Also access needs to be granted only to some machinees not the whole lot. Mods, please understand the full extent of the problem instead of merely marking it as duplicate for points...

Read the article
Coda 2 and SCP uploading files with the wrong permission

- by Tom Black

Currently I have a basic Ubuntu server running a website. The website is for a few students learning HTML/PHP and each student has their own account with a symbolic link to the shared website folder. Since the students are working on the website together, each user needs to be able to modify all the files (index.html for example). So I created a Webdev group containing all of the students with the default umask of 0002 set in their .bashrc (This allows newly created files to be 774). The shared folder is owned by the group Webdev with a chmod g+s so that new files/folders also belong to the group Webdev. The problem is that the students are using an IDE (Coda 2) and when they create a new file or folder using the IDE the file has the permissions of 644 on the server (not group writable). However when I make a new file through connecting with Cyberduck (SFTP client) the file permissions are 664 (as they should be). So I don't understand why Coda would be any different. However, after some trial and error I believe that Coda is first creating the file on local disk and then uploading that file to the server. On a mac by default a newly created file is 644. When the client uploads a file that's already 644 it stays 644 on the server side (umask is kind of useless in this situation). I've also tried creating ACL permissions for that folder but an uploaded file from my mac via SCP doesn't get the default ACL permissions. In Coda there is an option to change file permissions on a transfer. However this option seems to apply a chmod to all files being uploaded or saved. When one of students is modifying a file created by someone else when they try to upload the file or save it Coda tries to also do a chmod but fails because that user isn't the owner of the file. My current solution is using bindfs... I mount the shared web folder and bindfs sets permissions and group ownership of newly created files. However, bindfs seems to be a bit slow and I'm sure there is a better solution. Even if the students ditched Coda 2 and used Mac vim with scp the newly created files on the server would behave the same (644) which is default on the mac. Other options... 1) Either I teach the students to use (ssh/chmod) with their IDE to change their own file permissions when uploading. 2) I make all the students' Macs have the default umask of 0002 which would upload files with the right permissions. 3) Write a corn script to fix the file permissions every 5 to 15 minutes... (This option I think is the worst if students are working together at the same time). Is there any way that I could make all files that are uploaded via SCP have the default file permissions of 664 even though the uploaded file has a lower permission? (After hours of searching I don't think this is possible) I guess a corn script is my best option for novice users. How do web developers work together on larger sites? similar to this: http://serverfault.com/questions/283492/how-to-specify-file-permission-when-putting-a-file-using-openssh-sftp-command Also similar: http://serverfault.com/questions/395418/managing-linux-directory-permissions-sftp

Read the article
A faulty Caviar Blue hard drive?

- by Glister

We have a small "homemade" server running fully updated Debian Wheezy (amd64). One hard drive installed: WDC WD6400AAKS. The motherboard is ASUS M4N68T V2. The usual load: CPU: an average of 20% Each week about 50GB of additional space is occupied. About 47GB of uploaded files and 3GB of MySQL data. I'm afraid that the hard drive may be about to fail. I saw Pre-fail on few places when I ran: root@SERVER:/tmp# smartctl -a /dev/sda smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Blue Serial ATA Device Model: WDC WD6400AAKS-XXXXXXX Serial Number: WD-XXXXXXXXXXXXXXXXXXX LU WWN Device Id: 5 0014ee XXXXXXXXXXXXX Firmware Version: 01.03B01 User Capacity: 640,135,028,736 bytes [640 GB] Sector Size: 512 bytes logical/physical Device is: In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Mon Oct 28 18:55:27 2013 UTC SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 247) Self-test routine in progress... 70% of test remaining. Total time to complete Offline data collection: (11580) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 136) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x303f) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 157 146 021 Pre-fail Always - 5108 4 Start_Stop_Count 0x0032 098 098 000 Old_age Always - 2968 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 051 Old_age Always - 0 9 Power_On_Hours 0x0032 079 079 000 Old_age Always - 15445 10 Spin_Retry_Count 0x0032 100 100 051 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 100 051 Old_age Always - 0 12 Power_Cycle_Count 0x0032 098 098 000 Old_age Always - 2950 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 426 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 2968 194 Temperature_Celsius 0x0022 111 095 000 Old_age Always - 36 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 160 000 Old_age Always - 21716 200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 15444 - Error SMART Read Selective Self-Test Log failed: scsi error aborted command Smartctl: SMART Selective Self Test Log Read Failed root@SERVER:/tmp# In one tutorial I read that the pre-fail is a an indication of coming failure, in another tutorial I read that it is not true. Can you guys help me decode the output of smartctl? It would be also nice to share suggestions what should I do if I want to ensure data integrity (about 50GB of new data each week, up to 2TB for the whole period I'm interested in). Maybe I will go with 2x2TB Caviar Black in RAID4?

Read the article
Making sense of S.M.A.R.T

- by James

First of all, I think everyone knows that hard drives fail a lot more than the manufacturers would like to admit. Google did a study that indicates that certain raw data attributes that the S.M.A.R.T status of hard drives reports can have a strong correlation with the future failure of the drive. We find, for example, that after their first scan error, drives are 39 times more likely to fail within 60 days than drives with no such errors. First errors in re- allocations, offline reallocations, and probational counts are also strongly correlated to higher failure probabil- ities. Despite those strong correlations, we find that failure prediction models based on SMART parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no SMART error signals whatsoever. Seagate seems like it is trying to obscure this information about their drives by claiming that only their software can accurately determine the accurate status of their drive and by the way their software will not tell you the raw data values for the S.M.A.R.T attributes. Western digital has made no such claim to my knowledge but their status reporting tool does not appear to report raw data values either. I've been using HDtune and smartctl from smartmontools in order to gather the raw data values for each attribute. I've found that indeed... I am comparing apples to oranges when it comes to certain attributes. I've found for example that most Seagate drives will report that they have many millions of read errors while western digital 99% of the time shows 0 for read errors. I've also found that Seagate will report many millions of seek errors while Western Digital always seems to report 0. Now for my question. How do I normalize this data? Is Seagate producing millions of errors while Western digital is producing none? Wikipedia's article on S.M.A.R.T status says that manufacturers have different ways of reporting this data. Here is my hypothesis: I think I found a way to normalize (is that the right term?) the data. Seagate drives have an additional attribute that Western Digital drives do not have (Hardware ECC Recovered). When you subtract the Read error count from the ECC Recovered count, you'll probably end up with 0. This seems to be equivalent to Western Digitals reported "Read Error" count. This means that Western Digital only reports read errors that it cannot correct while Seagate counts up all read errors and tells you how many of those it was able to fix. I had a Seagate drive where the ECC Recovered count was less than the Read error count and I noticed that many of my files were becoming corrupt. This is how I came up with my hypothesis. The millions of seek errors that Seagate produces are still a mystery to me. Please confirm or correct my hypothesis if you have additional information. Here is the smart status of my western digital drive just so you can see what I'm talking about: james@ubuntu:~$ sudo smartctl -a /dev/sda smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Device Model: WDC WD1001FALS-00E3A0 Serial Number: WD-WCATR0258512 Firmware Version: 05.01D05 User Capacity: 1,000,204,886,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Thu Jun 10 19:52:28 2010 PDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 179 175 021 Pre-fail Always - 4033 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 270 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1468 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 262 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 46 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 223 194 Temperature_Celsius 0x0022 105 102 000 Old_age Always - 42 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

Read the article
smartctl -t long isn't finishing

- by xenoterracide

I been running smartctl -t long on a drive for about 2 days now and it seems to be stalled at 10%. short and conveyance both passed. I have to send 1 of 2 drives purchased back I found badblocks with badblocks (none on this drive and I'ts made over a pass already). I'm just wondering if I should be concerned about this. smartctl 5.39.1 2010-01-28 r3054 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD10EARS-00Y5B1 Serial Number: WD-WMAV51582123 Firmware Version: 80.00A80 User Capacity: 1,000,204,886,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Mon May 10 22:19:52 2010 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 241) Self-test routine in progress... 10% of test remaining. Total time to complete Offline data collection: (20100) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 231) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3031) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 2 3 Spin_Up_Time 0x0027 131 131 021 Pre-fail Always - 6408 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 12 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 148 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 10 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 7 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 174 194 Temperature_Celsius 0x0022 106 102 000 Old_age Always - 41 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Conveyance offline Completed without error 00% 99 - # 2 Extended offline Interrupted (host reset) 10% 30 - # 3 Short offline Completed without error 00% 0 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay.

Read the article
How to prevent delays associated with IPv6 AAAA records?

- by Nic

Our Windows servers are registering IPv6 AAAA records with our Windows DNS servers. However, we don't have IPv6 routing enabled on our network, so this frequently causes stall behaviours. Microsoft RDP is the worst offender. When connecting to a server that has a AAAA record in DNS, the remote desktop client will try IPv6 first, and won't fall back to IPv4 until the connection times out. Power users can work around this by connecting to the IP address directly. Resolving the IPv4 address with ping -4 hostname.foo always works instantly. What can I do to avoid this delay? Disable IPv6 on client? Nope, Microsoft says IPv6 is a mandatory part of the Windows operating system. Too many clients to ensure this is set everywhere consistently. Will cause more problems later when we finally implement IPv6. Disable IPv6 on the server? Nope, Microsoft says IPv6 is a mandatory part of the Windows operating system. Requires an inconvenient registry hack to disable the entire IPv6 stack. Ensuring this is correctly set on all servers is inconvenient. Will cause more problems later when we finally implement IPv6. Mask IPv6 records on the user-facnig DNS recursor? Nope, we're using NLNet Unbound and it doesn't support that. Prevent registration of IPv6 AAAA records on the Microsoft DNS server? I don't think that's even possible. At this point, I'm considering writing a script that purges all AAAA records from our DNS zones. Please, help me find a better way. UPDATE: DNS resolution is not the problem. As @joeqwerty points out in his answer, the DNS records are returned instantly. Both A and AAAA records are immediately available. The problem is that some clients (mstsc.exe) will preferentially attempt a connection over IPv6, and take a while to fall back to IPv4. This seems like a routing problem. The ping command produces a "General failure" error message because the destination address is unroutable. C:\Windows\system32>ping myhost.mydomain Pinging myhost.mydomain [2002:1234:1234::1234:1234] with 32 bytes of data: General failure. General failure. General failure. General failure. Ping statistics for 2002:1234:1234::1234:1234: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), I can't get a packet capture of this behaviour. Running this (failing) ping command does not produce any packets in Microsoft Network Monitor. Similarly, attempting a connection with mstsc.exe to a host with an AAAA record produces no traffic until it does a fallback to IPv4. UPDATE: Our hosts are all using publicly-routable IPv4 addresses. I think this problem might come down to a broken 6to4 configuration. 6to4 behaves differently on hosts with public IP addresses vs RFC1918 addresses. UPDATE: There is definitely something fishy with 6to4 on my network. When I disable 6to4 on the Windows client, connections resolve instantly. netsh int ipv6 6to4 set state disabled But as @joeqwerty says, this only masks the problem. I'm still trying to find out why IPv6 communication on our network is completely non-working.

Read the article
Premature-Optimization and Performance Anxiety

- by James Michael Hare

While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about. After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations. Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks. In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process. At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above. I'm not talking about valid performance decisions. There are decisions one should make when designing and writing an application that are valid performance decisions. Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search). But these in my mind are macro optimizations. The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking. This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it. I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer. I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered. In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance. This is why in many cases such early code-bases were very hard to maintain. I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead. Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications. Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made. But I would strongly advice against such optimization because of it's cost. Many folks that are performing an optimization think it's always a win-win. That they're simply adding speed to the application, what could possibly be wrong with that? What they don't realize is the cost of their choice. For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application. It will become so fragile over time that maintenance will become a nightmare. I've seen such applications in places I have worked. There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance. Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation). To me this was almost a joke. Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety. The only time twice is really that much slower is when once was too slow to begin with. Think about it. 2 minutes is slow as a response time because 1 minute is slow. But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial! The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations). To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can. I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious. But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization. I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true. If you're a beginner, resist the urge to optimize at all costs. And if you are an expert, delay that decision. As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient. Chances are it will be network, database, or disk hits that will be your slow-down, not your code. As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact. Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

Read the article
The Business of Winning Innovation: An Exclusive Blog Series

- by Kerrie Foy

"The Business of Winning Innovation” is a series of articles authored by Oracle Agile PLM experts on what it takes to make innovation a successful and lucrative competitive advantage. Our customers have proven Agile PLM applications to be enormously flexible and comprehensive, so we’ve launched this article series to showcase some of the most fascinating, value-packed use cases. In this article by Keith Colonna, we kick-off the series by taking a look at the science side of innovation within the Consumer Products industry and how PLM can help companies innovate faster, cheaper, smarter. This article will review how innovation has become the lifeline for growth within consumer products companies and how certain companies are “winning” by creating a competitive advantage for themselves by taking a more enterprise-wide,systematic approach to “innovation”. Managing the Science of Innovation within the Consumer Products Industry By: Keith Colonna, Value Chain Solution Manager, Oracle The consumer products (CP) industry is very mature and competitive. Most companies within this industry have saturated North America (NA) with their products thus maximizing their NA growth potential. Future growth is expected to come from either expansion outside of North America and/or by way of new ideas and products. Innovation plays an integral role in both of these strategies, whether you’re innovating business processes or the products themselves, and may cause several challenges for the typical CP company, Becoming more innovative is both an art and a science. Most CP companies are very good at the art of coming up with new innovative ideas, but many struggle with perfecting the science aspect that involves the best practice processes that help companies quickly turn ideas into sellable products and services. Symptoms and Causes of Business Pain Struggles associated with the science of innovation show up in a variety of ways, like: · Establishing and storing innovative product ideas and data · Funneling these ideas to the chosen few · Time to market cycle time and on-time launch rates · Success rates, or how often the best idea gets chosen · Imperfect decision making (i.e. the ability to kill projects that are not projected to be winners) · Achieving financial goals · Return on R&D investment · Communicating internally and externally as more outsource partners are added globally · Knowing your new product pipeline and project status These challenges (and others) can be consolidated into three root causes: A lack of visibility Poor data with limited access The inability to truly collaborate enterprise-wide throughout your extended value chain Choose the Right Remedy Product Lifecycle Management (PLM) solutions are uniquely designed to help companies solve these types challenges and their root causes. However, PLM solutions can vary widely in terms of configurability, functionality, time-to-value, etc. Business leaders should evaluate PLM solution in terms of their own business drivers and long-term vision to determine the right fit. Many of these solutions are point solutions that can help you cure only one or two business pains in the short term. Others have been designed to serve other industries with different needs. Then there are those solutions that demo well but are owned by companies that are either unable or unwilling to continuously improve their solution to stay abreast of the ever changing needs of the CP industry to grow through innovation. What the Right PLM Solution Should Do for You Based on more than twenty years working in the CP industry, I recommend investing in a single solution that can help you solve all of the issues associated with the science of innovation in a totally integrated fashion. By integration I mean the (1) integration of the all of the processes associated with the development, maintenance and delivery of your product data, and (2) the integration, or harmonization of this product data with other downstream sources, like ERP, product catalogues and the GS1 Global Data Synchronization Network (or GDSN, which is now a CP industry requirement for doing business with most retailers). The right PLM solution should help you: Increase Revenue. A best practice PLM solution should help a company grow its revenues by consolidating product development cycle-time and helping companies get new and improved products to market sooner. PLM should also eliminate many of the root causes for a product being returned, refused and/or reclaimed (which takes away from top-line growth) by creating an enterprise-wide, collaborative, workflow-driven environment. Reduce Costs. A strong PLM solution should help shave many unnecessary costs that companies typically take for granted. Rationalizing SKU’s, components (ingredients and packaging) and suppliers is a major opportunity at most companies that PLM should help address. A natural outcome of this rationalization is lower direct material spend and a reduction of inventory. Another cost cutting opportunity comes with PLM when it helps companies avoid certain costs associated with process inefficiencies that lead to scrap, rework, excess and obsolete inventory, poor end of life administration, higher cost of quality and regulatory and increased expediting. Mitigate Risk. Risks are the hardest to quantify but can be the most costly to a company. Food safety, recalls, line shutdowns, customer dissatisfaction and, worst of all, the potential tarnishing of your brands are a few of the debilitating risks that CP companies deal with on a daily basis. These risks are so uniquely severe that they require an enterprise PLM solution specifically designed for the CP industry that safeguards product information and processes while still allowing the art of innovation to flourish. Many CP companies have already created a winning advantage by leveraging a single, best practice PLM solution to establish an enterprise-wide, systematic approach to innovation. Oracle’s Answer for the Consumer Products Industry Oracle is dedicated to solving the growth and innovation challenges facing the CP industry. Oracle’s Agile Product Lifecycle Management for Process solution was originally developed with and for CP companies and is driven by a specialized development staff solely focused on maintaining and continuously improving the solution per the latest industry requirements. Agile PLM for Process helps CP companies handle all of the processes associated with managing the science of the innovation process, including: specification management, new product development/project and portfolio management, formulation optimization, supplier management, and quality and regulatory compliance to name a few. And as I mentioned earlier, integration is absolutely critical. Many Oracle CP customers, both with Oracle ERP systems and non-Oracle ERP systems, report benefits from Oracle’s Agile PLM for Process. In future articles we will explain in greater detail how both existing Oracle customers (like Gallo, Smuckers, Land-O-Lakes and Starbucks) and new Oracle customers (like ConAgra, Tyson, McDonalds and Heinz) have all realized the benefits of Agile PLM for Process and its integration to their ERP systems. More to Come Stay tuned for more articles in our blog series “The Business of Winning Innovation.” While we will also feature articles focused on other industries, look forward to more on how Agile PLM for Process addresses innovation challenges facing the CP industry. Additional topics include: Innovation Data Management (IDM), New Product Development (NPD), Product Quality Management (PQM), Menu Management,Private Label Management, and more! . Watch this video for more info about Agile PLM for Process

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
TV not detected by Windows/VGA - when there is a WHDI device in the signal chain

- by ashwalk

I'm at my wit's end with this one... I had an EVGA GTS 250, and I used to plug it's HDMI out into a WHDI sender, which transmitted to its corresponding WHDI receiver 15ft away, which then connected to a Samsung LN40D LCD TV through another HDMI cable. PC/VGA < [hdmi cable] < WHDI sender <[air] WHDI receiver < [hdmi cable] < TV It was perfect, stable, no perceivable latency. I just plugged everything the first time and it worked instantly. It sent 5.1 audio, and Windows/nVidia Control Center detected the TV by its name. The WHDI device is this one: http://goo.gl/Q8iWI5 Now I bought an EVGA GTX 650, and WHDI doesn't work anymore. Both Windows and nVidia Control Center won't detect the TV, only the monitor that's connected via DVI. The TV screen shows "TX202913 connected. Check video signal." on top of a black screen. Though the device is not the problem itself, just the fact that it's not allowing direct connection between PC and TV. I would bet that if put an AVR in its place I'd also have this issue. The HDMI on this new card works with other monitors. If I put the older card back, WHDI works normally. I have googled this for 5 months on and off. Once I bumped into a page that showed how to force a display device to always-on through registry edit. Once I restarted windows, the Tv (through WHDI) displayed my expanded or duplicated desktop at 1024x768 ONLY, and listed the display as "digital display". I could not change the resolution and it wouldn't playback audio (although the option was available at nVidia Control Center HDMI audio options, but did not work). This proves that there is no conflict between the devices, except that software-wise, Windows cannot, for the life of it, understand that there's a TV there to send video/audio to. Since this won't do (no audio, poor video), I reverted this regedit. It's also not an EDID problem within the TV, since when connected directly it works. The last weird bit of this saga is that today, I reminded of Windows' "Add Device" dialog, gave it a go, and a "Samsung Generic UPNP TV" showed up, which I promptly installed the drives for, rising to a climax of... ...NOTHING HAPPENING. As far as I can tell, it really didn't change anything other than using up a few kb in my main disc. I should also say that I looked a LOT into handshake problems and nothing applied either. Do any of you have an idea of what may be going on? I can't stand the thought of having a us$200 device not working because of the addition of a newer graphics card, when the much older one had no issues. There is absolutely NO REASON for this to happen. There is NO documentation on WHDI online. Apparently no one buys this stuff. For the same reason, no one responded to this same plea for help on NVidia and EVGA forums. Worst case, this can be a warning about this setup for people in the future. Thanx in advance.

Read the article
Ubuntu 12 crashed and took down network

- by Leopd

We recently set up a new Ubuntu 12.04LTS server on our network. It's not fully configured so it's not doing much beyond sshd and a default apache2 install. But this evening it appears to have crashed. It wasn't responding to the network or the keyboard. But the worst part is, it took down the entire network. My knowledge of the network stack below OSI layer 3 is very limited, so the rest confuses me. When this machine was physically connected to the network, no other machine could connect to the outside internet. When things were broken, running arp showed that our gateway's IP address (10.0.1.1) was listed as "invalid." Unplugging the server from the network fixed the problem, and plugging it back in broke it again. So the crashed server was advertising itself as owning the gateway's IP address? There's nothing at all in syslog during the time when it was causing problems. Any ideas about how to figure out what went wrong or what we can do to prevent it from happening again? I'm hesitant to even put the machine back on the network right now. Update ** It crashed again, and I ran tcpdump -penn arp (thanks bahamat!) for several minutes and got this... (timestamps and duplicate lines removed) 00:1e:65:f8:dc:24 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.0.1.1 tell 10.0.2.191, length 46 00:1e:65:f8:dc:24 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.0.1.44 tell 10.0.2.191, length 46 60:d8:19:d4:71:d6 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.0.1.1 tell 10.0.2.125, length 46 d4:9a:20:04:e9:78 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.100, length 28 Update 2 ** When the network is functioning properly, arping -c4 10.0.1.1 returns this: ARPING 10.0.1.1 60 bytes from c0:c1:c0:77:25:8e (10.0.1.1): index=0 time=267.982 usec 60 bytes from c0:c1:c0:77:25:8e (10.0.1.1): index=1 time=422.955 usec 60 bytes from c0:c1:c0:77:25:8e (10.0.1.1): index=2 time=299.215 usec 60 bytes from c0:c1:c0:77:25:8e (10.0.1.1): index=3 time=366.926 usec --- 10.0.1.1 statistics --- 4 packets transmitted, 4 packets received, 0% unanswered (0 extra) When the bad server is plugged in, arping -c4 10.0.1.1 returns: ARPING 10.0.1.1 --- 10.0.1.1 statistics --- 4 packets transmitted, 0 packets received, 100% unanswered (0 extra) Context ** 10.0.x.x is the main subnet. 10.0.1.1 is the main internet gateway 10.0.1.44 is a printer 10.0.2.* devices are all laptops / workstations I have no idea what's using the 192.168.x.x subnet -- your guesses are at least as good as mine. A VM on a workstation? A misconfigured WAP? Somebody re-sharing wifi? A machine that failed to DHCP? The offending ubuntu server's MAC address ends in cd:80 so isn't listed in the dump. It should DHCP to 10.0.3.3 Thanks for any help. This ARP stuff is all voodoo to me. Packets just go to IP addresses, right? ;)

Read the article
Source-control 'wet-work'?

- by Phil Factor

When a design or creative work is flawed beyond remedy, it is often best to destroy it and start again. The other day, I lost the code to a long and intricate SQL batch I was working on. I’d thought it was impossible, but it happened. With all the technology around that is designed to prevent this occurring, this sort of accident has become a rare event. If it weren’t for a deranged laptop, and my distraction, the code wouldn’t have been lost this time. As always, I sighed, had a soothing cup of tea, and typed it all in again. The new code I hastily tapped in was much better: I’d held in my head the essence of how the code should work rather than the details: I now knew for certain the start point, the end, and how it should be achieved. Instantly the detritus of half-baked thoughts fell away and I was able to write logical code that performed better. Because I could work so quickly, I was able to hold the details of all the columns and variables in my head, and the dynamics of the flow of data. It was, in fact, easier and quicker to start from scratch rather than tidy up and refactor the existing code with its inevitable fumbling and half-baked ideas. What a shame that technology is now so good that developers rarely experience the cleansing shock of losing one’s code and having to rewrite it from scratch. If you’ve never accidentally lost your code, then it is worth doing it deliberately once for the experience. Creative people have, until Technology mistakenly prevented it, torn up their drafts or sketches, threw them in the bin, and started again from scratch. Leonardo’s obsessive reworking of the Mona Lisa was renowned because it was so unusual: Most artists have been utterly ruthless in destroying work that didn’t quite make it. Authors are particularly keen on writing afresh, and the results are generally positive. Lawrence of Arabia actually lost the entire 250,000 word manuscript of ‘The Seven Pillars of Wisdom’ by accidentally leaving it on a train at Reading station, before rewriting a much better version. Now, any writer or artist is seduced by technology into altering or refining their work rather than casting it dramatically in the bin or setting a light to it on a bonfire, and rewriting it from the blank page. It is easy to pick away at a flawed work, but the real creative process is far more brutal. Once, many years ago whilst running a software house that supplied commercial software to local businesses, I’d been supervising an accounting system for a farming cooperative. No packaged system met their needs, and it was all hand-cut code. For us, it represented a breakthrough as it was for a government organisation, and success would guarantee more contracts. As you’ve probably guessed, the code got mangled in a disk crash just a week before the deadline for delivery, and the many backups all proved to be entirely corrupted by a faulty tape drive. There were some fragments left on individual machines, but they were all of different versions. The developers were in despair. Strangely, I managed to re-write the bulk of a three-month project in a manic and caffeine-soaked weekend. Sure, that elegant universally-applicable input-form routine was‘nt quite so elegant, but it didn’t really need to be as we knew what forms it needed to support. Yes, the code lacked architectural elegance and reusability. By dawn on Monday, the application passed its integration tests. The developers rose to the occasion after I’d collapsed, and tidied up what I’d done, though they were reproachful that some of the style and elegance had gone out of the application. By the delivery date, we were able to install it. It was a smaller, faster application than the beta they’d seen and the user-interface had a new, rather Spartan, appearance that we swore was done to conform to the latest in user-interface guidelines. (we switched to Helvetica font to look more ‘Bauhaus’ ). The client was so delighted that he forgave the new bugs that had crept in. I still have the disk that crashed, up in the attic. In IT, we have had mixed experiences from complete re-writes. Lotus 123 never really recovered from a complete rewrite from assembler into C, Borland made the mistake with Arago and Quattro Pro and Netscape’s complete rewrite of their Navigator 4 browser was a white-knuckle ride. In all cases, the decision to rewrite was a result of extreme circumstances where no other course of action seemed possible. The rewrite didn’t come out of the blue. I prefer to remember the rewrite of Minix by young Linus Torvalds, or the rewrite of Bitkeeper by a slightly older Linus. The rewrite of CP/M didn’t do too badly either, did it? Come to think of it, the guy who decided to rewrite the windowing system of the Xerox Star never regretted the decision. I’ll agree that one should often resist calls for a rewrite. One of the worst habits of the more inexperienced programmer is to denigrate whatever code he or she inherits, and then call loudly for a complete rewrite. They are buoyed up by the mistaken belief that they can do better. This, however, is a different psychological phenomenon, more related to the idea of some motorcyclists that they are operating on infinite lives, or the occasional squaddies that if they charge the machine-guns determinedly enough all will be well. Grim experience brings out the humility in any experienced programmer. I’m referring to quite different circumstances here. Where a team knows the requirements perfectly, are of one mind on methodology and coding standards, and they already have a solution, then what is wrong with considering a complete rewrite? Rewrites are so painful in the early stages, until that point where one realises the payoff, that even I quail at the thought. One needs a natural disaster to push one over the edge. The trouble is that source-control systems, and disaster recovery systems, are just too good nowadays. If I were to lose this draft of this very blog post, I know I’d rewrite it much better. However, if you read this, you’ll know I didn’t have the nerve to delete it and start again. There was a time that one prayed that unreliable hardware would deliver you from an unmaintainable mess of a codebase, but now technology has made us almost entirely immune to such a merciful act of God. An old friend of mine with long experience in the software industry has long had the idea of the ‘source-control wet-work’, where one hires a malicious hacker in some wild eastern country to hack into one’s own source control system to destroy all trace of the source to an application. Alas, backup systems are just too good to make this any more than a pipedream. Somehow, it would be difficult to promote the idea. As an alternative, could one construct a source control system that, on doing all the code-quality metrics, would systematically destroy all trace of source code that failed the quality test? Alas, I can’t see many managers buying into the idea. In reading the full story of the near-loss of Toy Story 2, it set me thinking. It turned out that the lucky restoration of the code wasn’t the happy ending one first imagined it to be, because they eventually came to the conclusion that the plot was fundamentally flawed and it all had to be rewritten anyway. Was this an early case of the ‘source-control wet-job’?’ It is very hard nowadays to do a rapid U-turn in a development project because we are far too prone to cling to our existing source-code.

Read the article
Cannot get official CentOS 5.4 BIND package to start

- by Brian Cline

Yesterday I installed CentOS 5.4 on one of my servers, and it appears that the official BIND/named package has trouble starting for reasons I cannot deduce. Here is what happens: [root@hal init.d]# service named start Starting named: Error in named configuration: /etc/named.conf:57: open: named.root.hints: permission denied [FAILED] The line in question, with the directory option for context: // further up in the file: directory "/var/named"; // line 57: include "named.root.hints"; Like you, my first reaction was to check permissions on /var/named/named.root.hints, /var/named, and /var to make sure the named user would be able to read it. Here are the permissions at each level: drwxr-xr-x 19 root root 4096 Nov 3 02:05 var drwxr-x--- 5 root named 4096 Nov 3 02:36 named -rw-r--r-- 1 named named 524 Mar 29 2006 named.root.hints Everything appears to be fine permission-wise. The same error occurs if the /var/named directory is writable by the named user. I've even temporarily allowed the named user to log in via bash, su'ed from root to named, and checked that I was, in fact, able to cat /var/named/named.root.hints successfully. (Yes, don't worry: I changed the shell back to nologin). My last endeavor showed that BIND is able to run under the named user account and start up just fine, if done so manually: [root@hal ~]# named -u named -g 03-Nov-2009 16:31:02.021 starting BIND 9.3.6-P1-RedHat-9.3.6-4.P1.el5 -u named -g 03-Nov-2009 16:31:02.021 adjusted limit on open files from 1024 to 1048576 03-Nov-2009 16:31:02.021 found 2 CPUs, using 2 worker threads 03-Nov-2009 16:31:02.021 using up to 4096 sockets 03-Nov-2009 16:31:02.028 loading configuration from '/etc/named.conf' 03-Nov-2009 16:31:02.030 using default UDP/IPv4 port range: [1024, 65535] 03-Nov-2009 16:31:02.031 using default UDP/IPv6 port range: [1024, 65535] 03-Nov-2009 16:31:02.034 listening on IPv4 interface lo, 127.0.0.1#53 03-Nov-2009 16:31:02.034 listening on IPv4 interface eth0, 10.0.0.5#53 03-Nov-2009 16:31:02.034 listening on IPv4 interface eth1, ww.xx.yy.zz#53 03-Nov-2009 16:31:02.040 command channel listening on 127.0.0.1#953 03-Nov-2009 16:31:02.040 command channel listening on ::1#953 03-Nov-2009 16:31:02.040 ignoring config file logging statement due to -g option 03-Nov-2009 16:31:02.041 zone 0.in-addr.arpa/IN/localhost_resolver: loaded serial 42 03-Nov-2009 16:31:02.042 zone 0.0.127.in-addr.arpa/IN/localhost_resolver: loaded serial 1997022700 03-Nov-2009 16:31:02.042 zone 255.in-addr.arpa/IN/localhost_resolver: loaded serial 42 03-Nov-2009 16:31:02.042 zone 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN/localhost_resolver: loaded serial 1997022700 03-Nov-2009 16:31:02.043 zone localdomain/IN/localhost_resolver: loaded serial 42 03-Nov-2009 16:31:02.043 zone localhost/IN/localhost_resolver: loaded serial 42 03-Nov-2009 16:31:02.043 zone x.y.z.in-addr.arpa/IN/internal: loaded serial 1 03-Nov-2009 16:31:02.044 zone x.y.z/IN/internal: loaded serial 2 03-Nov-2009 16:31:02.045 running What type and size of firearm should I use to resolve this? I'd prefer something with automatic ammunition, and, at worst, it should be able to fit on my shoulder. Of course I am open to suggestions.

Read the article
Expectations + Rewards = Innovation

- by D'Arcy Lussier

“Innovation” is a heavy word. We regard those that embrace it as “Innovators”. We describe organizations as being “Innovative”. We hold those associated with the word in high regard, even though its dictionary definition is very simple: Introducing something new. What our culture has done is wrapped Innovation in white robes and a gold crown. Innovation is rarely just introducing something new. Innovations and innovators are typically associated with other terms: groundbreaking, genius, industry-changing, creative, leading. Being a true innovator and creating innovations are a big deal, and something companies try to strive for…or at least say they strive for. There’s huge value in being recognized as an innovator in an industry, since the idea is that innovation equates to increased profitability. IBM ran an ad a few years back that showed what their view of innovation is: “The point of innovation is to make actual money.” If the money aspect makes you feel uneasy, consider it another way: the point of innovation is to <insert payoff here>. Companies that innovate will be more successful. Non-profits that innovate can better serve their target clients. Governments that innovate can better provide services to their citizens. True innovation is not easy to come by though. As with anything in business, how well an organization will innovate is reliant on the employees it retains, the expectations placed on those employees, and the rewards available to them. In a previous blog post I talked about one formula: Right Employees + Happy Employees = Productive Employees I want to introduce a new one, that builds upon the previous one: Expectations + Rewards = Innovation The level of innovation your organization will realize is directly associated with the expectations you place on your staff and the rewards you make available to them. Expectations We may feel uncomfortable with the idea of placing expectations on our staff, mainly because expectation has somewhat of a negative or cold connotation to it: “I expect you to act this way or else!” The problem is in the or-else part…we focus on the negative aspects of failing to meet expectations instead of looking at the positive side. “I expect you to act this way because it will produce <insert benefit here>”. Expectations should not be set to punish but instead be set to ensure quality. At a recent conference I spoke with some Microsoft employees who told me that you have five years from starting with the company to reach a “Senior” level. If you don’t, then you’re let go. The expectation Microsoft placed on their staff is that they should be working towards improving themselves, taking more responsibility, and thus ensure that there is a constant level of quality in the workforce. Rewards Let me be clear: a paycheck is not a reward. A paycheck is simply the employer’s responsibility in the employee/employer relationship. A paycheck will never be the key motivator to drive innovation. Offering employees something over and above their required compensation can spur them to greater performance and achievement. Working in the food service industry, this tactic was used again and again: whoever has the highest sales over lunch will receive a free lunch/gift certificate/entry into a draw/etc. There was something to strive for, to try beyond the baseline of what our serving jobs were. It was through this that innovative sales techniques would be tried and honed, with key servers being top sellers time and time again. At a code camp I spoke at, I was amazed to see that all the employees from one company receive $100 Visa gift cards as a thank you for taking time to speak. Again, offering something over and above that can give that extra push for employees. Rewards work. But what about the fairness angle? In the restaurant example I gave, there were servers that would never win the competition. They just weren’t good enough at selling and never seemed to get better. So should those that did work at performing better and produce more sales for the restaurant not get rewarded because those who weren’t working at performing better might get upset? Of course not! Organizations succeed because of their top performers and those that strive to join their ranks. The Expectation/Reward Graph While the Expectations + Rewards = Innovation formula may seem like a simple mathematics formula, there’s much more going under the hood. In fact there are three different outcomes that could occur based on what you put in as values for Expectations and Rewards. Consider the graph below and the descriptions that follow: Disgruntled – High Expectation, Low Reward I worked at a company where the mantra was “Company First, Because We Pay You”. Even today I still hear stories of how this sentiment continues to be perpetuated: They provide you a paycheck and a means to live, therefore you should always put them as your top priority. Of course, this is a huge imbalance in the expectation/reward equation. Why would anyone willingly meet high expectations of availability, workload, deadlines, etc. when there is no reward other than a paycheck to show for it? Remember: paychecks are not rewards! Instead, you see employees be disgruntled which not only affects the level of production but also the level of quality within an organization. It also means that you see higher turnover. Complacent – Low Expectation, Low Reward Complacency is a systemic problem that typically exists throughout all levels of an organization. With no real expectations or rewards, nobody needs to excel. In fact, those that do try to innovate, improve, or introduce new things into the organization might be shunned or pushed out by the rest of the staff who are just doing things the same way they’ve always done it. The bigger issue for the organization with low/low values is that at best they’ll never grow beyond their current size (and may shrink actually), and at worst will cease to exist. Entitled – Low Expectation, High Reward It’s one thing to say you have the best people and reward them as such, but its another thing to actually have the best people and reward them as such. Organizations with Entitled employees are the former: their organization provides them with all types of comforts, benefits, and perks. But there’s no requirement before the rewards are dolled out, and there’s no short-list of who receives the rewards. Everyone in the company is treated the same and is given equal share of the spoils. Entitlement is actually almost identical with Complacency with one notable difference: just try to introduce higher expectations into an entitled organization! Entitled employees have been spoiled for so long that they can’t fathom having rewards taken from them, or having to achieve specific levels of performance before attaining them. Those running the organization also buy in to the Entitled sentiment, feeling that they must persist the same level of comforts to appease their staff…even though the quality of the employee pool may be suspect. Innovative – High Expectation, High Reward Finally we have the Innovative organization which places high expectations but also provides high rewards. This organization gets it: if you truly want the best employees you need to apply equal doses of pressure and praise. Realize that I’m not suggesting crazy overtime or un-realistic working conditions. I do not agree with the “Glengary-Glenross” method of encouragement. But as anyone who follows sports can tell you, the teams that win are the ones where the coaches push their players to be their best; to achieve new levels of performance that they didn’t know they could receive. And the result for the players is more money, fame, and opportunity. It’s in this environment that organizations can focus on innovation – true innovation that builds the business and allows everyone involved to truly benefit. In Closing Organizations love to use the word “Innovation” and its derivatives, but very few actually do innovate. For many, the term has just become another marketing buzzword to lump in with all the other business terms that get overused. But for those organizations that truly get the value of innovation, they will be the ones surging forward while other companies simply fade into the background. And they will be the organizations that expect more from their employees, and give them their just rewards.

Read the article
Why We Should Learn to Stop Worrying and Love Millennials

- by HCM-Oracle

By Christine Mellon Much is said and written about the new generations of employees entering our workforce, as though they are a strange specimen, a mysterious life form to be “figured out,” accommodated and engaged – at a safe distance, of course. At its worst, this talk takes a critical and disapproving tone, with baby boomer employees adamantly refusing to validate this new breed of worker, let alone determine how to help them succeed and achieve their potential. The irony of our baby-boomer resentments and suspicions is that they belie the fact that we created the very vision that younger employees are striving to achieve. From our frustrations with empty careers that did not fulfill us, from our opposition to “the man,” from our sharp memories of our parents’ toiling for 30 years just for the right to retire, from the simple desire not to live our lives in a state of invisibility, came the seeds of hope for something better. One characteristic of Millennial workers that grew from these seeds is the desire to experience as much as possible. They are the “Experiential Employee”, with a passion for growing in diverse ways and expanding personal and professional horizons. Rather than rooting themselves in a single company for a career, or even in a single career path, these employees are committed to building a broad portfolio of experiences and capabilities that will enable them to make a difference and to leave a mark of significance in the world. How much richer is the organization that nurtures and leverages this inclination? Our curmudgeonly ways must be surrendered and our focus redirected toward building the next generation of talent ecosystems, if we are to optimize what future generations have to offer. Accelerating Professional Development In spite of our Boomer grumblings about Millennials’ “unrealistic” expectations, the truth is that we have a well-matched set of circumstances. We have executives-in-waiting who want to learn quickly and a concurrent, urgent need to ramp up their development time, based on anticipated high levels of retirement in the next 10+ years. Since we need to rapidly skill up these heirs to the corporate kingdom, isn’t it a fortunate coincidence that they are hungry to learn, develop and move fluidly throughout our organizations?? So our challenge now is to efficiently operationalize the wisdom we have acquired about effective learning and development. We have already evolved from classroom-based models to diverse instructional methods. The next step is to find the best approaches to help younger employees learn quickly and apply new learnings in an impactful way. Creating temporary or even permanent functional partnerships among Millennial employees is one way to maximize outcomes. This might take the form of 2 or more employees owning aspects of what once fell under a single role. While one might argue this would mean duplication of resources, it could be a short term cost while employees come up to speed. And the potential benefits would be numerous: leveraging and validating the inherent sense of community of new generations, creating cross-functional skills with broad applicability, yielding additional perspectives and approaches to traditional work outcomes, and accelerating the performance curve for incumbents through Cooperative Learning (Johnson, D. and Johnson R., 1989, 1999). This well-researched teaching strategy, where students support each other in the absorption and application of new information, has been shown to deliver faster, more efficient learning, and greater retention. Alternately, perhaps short term contracts with exiting retirees, or former retirees, to help facilitate the development of following generations may have merit. Again, a short term cost, certainly. However, the gains realized in shortening the learning curve, and strengthening engagement are substantial and lasting. Ultimately, there needs to be creative thinking applied for each organization on how to accelerate the capabilities of our future leaders in unique ways that mesh with current culture. The manner in which performance is evaluated must finally shift as well. Employees will need to be assessed on how well they have developed key skills and capabilities vs. end-to-end mastery of functional positions they have no interest in keeping for an entire career. As we become more comfortable in placing greater and greater weight on competencies vs. tasks, we will realize increased organizational agility via this new generation of workers, which will be further enhanced by their natural flexibility and appetite for change. Revisiting Succession For many years, organizations have failed to deliver desired succession planning outcomes. According to CEB’s 2013 research, only 28% of current leaders were pre-identified in a succession plan. These disappointing results, along with the entrance of the experiential, Millennial employee into the workforce, may just provide the needed impetus for HR to reinvent succession processes. We have recognized that the best professional development efforts are not always linear, and the time has come to fully adopt this philosophy in regard to succession as well. Paths to specific organizational roles will not look the same for newer generations who seek out unique learning opportunities, without consideration of a singular career destination. Rather than charting particular jobs as precursors for key positions, the experiences and skills behind what makes an incumbent successful must become essential in succession mapping. And the multitude of ways in which those experiences and skills may be acquired must be factored into the process, along with the individual employee’s level of learning agility. While this may seem daunting, it is necessary and long overdue. We have talked about the criticality of competency-based succession, however, we have not lived up to our own rhetoric. Many Boomers have experienced the same frustration in our careers; knowing we are capable of shining in a particular role, but being denied the opportunity due to how our career history lined up, on paper, with documented job requirements. These requirements usually emphasized past jobs/titles and specific tasks, versus capabilities, drive and willingness (let alone determination) to learn new things. How satisfying would it be for us to leave a legacy where such narrow thinking no longer applies and potential is amplified? Realizing Diversity Another bloom from the seeds we Boomers have tried to plant over the past decades is a completely evolved view of diversity. Millennial employees assume a diverse workforce, and are startled by anything less. Their social tolerance, nurtured by wide and diverse networks, is unprecedented. College graduates expect a similar landscape in the “real world” to what they experienced throughout their lives. They appreciate and seek out divergent points of view and experiences without needing any persuasion. The face of our U.S. workforce will likely see dramatic change as Millennials apply their fresh take on hiring and building strong teams, with an inherent sense of inclusion. This wonderful aspect of the Millennial wave should be celebrated and strongly encouraged, as it is the fulfillment of our own aspirations. Future Perfect The Experiential Employee is operating more as a free agent than a long term player, and their commitment will essentially last as long as meaningful organizational culture and personal/professional opportunities keep their interest. As Boomers, we have laid the foundation for this new, spirited employment attitude, and we should take pride in knowing that. Generations to come will challenge organizations to excel in how they identify, manage and nurture talent. Let’s support and revel in the future that we’ve helped invent, rather than lament what we think has been lost. After all, the future is always connected to the past. And as so eloquently phrased by Antoine Lavoisier, French nobleman, chemist and politico: “Nothing is Lost, Nothing is Created, and Everything is Transformed.” Christine has over 25 years of diverse HR experience. She has held HR consulting and corporate roles, including CHRO positions for Echostar in Denver, a 6,000+ employee global engineering firm, and Aepona, a startup software firm, successfully acquired by Intel. Christine is a resource to Oracle clients, to assist in Human Capital Management strategy development and implementation, compensation practices, talent development initiatives, employee engagement, global HR management, and integrated HR systems and processes that support the full employee lifecycle.

Read the article
Visiting the Fire Station in Coromandel

Hm, I just tried to remember how we actually came up with this cool idea... but it's already too blurred and it doesn't really matter after all. Anyway, if I remember correctly (IIRC), it happened during one of the Linux meetups at Mugg & Bean, Bagatelle where Ajay and I brought our children along and we had a brief conversation about how cool it would be to check out one of the fire stations here in Mauritius. We both thought that it would be a great experience and adventure for the little ones. An idea takes shape And there we go, down the usual routine these... having an idea, checking out the options and discussing who's doing what. Except this time, it was all up to Ajay, and he did a fantastic job. End of August, he told me that he got in touch with one of his friends which actually works as a fire fighter at the station in Coromandel and that there could be an option to come and visit them (soon). A couple of days later - Confirmed! Be there, and in time... What time? Anyway, doesn't really matter... Everything was settled and arranged. I asked the kids on Friday afternoon if they might be interested to see the fire engines and what a fire fighter is doing. Of course, they were all in! Getting up early on Sunday morning isn't really a regular exercise for all of us but everything went smooth and after a short breakfast it was time to leave. Where are we going? Are we there yet? Now, we are in Bambous. Why do you go this way? The kids were so much into it. Absolutely amazing to see their excitement. Are we there yet? Well, we went through the sugar cane fields towards Chebel and then down into the industrial zone at Coromandel. Honestly, I had a clue where the fire station is located but having Google Maps in reach that shouldn't be a problem in case that we might get lost. But my worries were washed away when our children guided us... "There! Over there are the fire engines! We have to turn left, dad." - No comment, the kids were right! As we were there a little bit too early, we parked the car and the kids started to explore the area and outskirts of the fire station. Some minutes later, as if we had placed an order a unit of two cars had to go out for an alarm and the kids could witness them leaving as closely as possible. Sirens on and wow!!! Ladder truck L32 - MAN truck with Rosenbauer built-up and equipment by Metz Taking the tour Ajay arrived shortly after that and guided us finally inside the station to meet with his pal. The three guys were absolutely well-prepared and showed us around in the hall, explaining that there two units out at the moment. But the ladder truck (with max. 32m expandable height) was still around we all got a great insight into the technique and equipment on the vehicle. It was amazing to see all three kids listening to Mambo as give some figures about the truck and how the fire fighters are actually it. The children and 'our' fire fighters of the day had great fun with the various fire engines Absolutely fantastic that the children were allowed to experience this - we had so much fun! Ajay's son brought two of his toy fire engines along, shared them with ours, and they all played very well together. As a parent it was really amazing to see them at such an ease. Enough theory Shortly afterwards the ladder truck was moved outside, got stabilised and ready to go for 'real-life' exercising. With the additional equipment of safety helmets, security belts and so on, we all got a first-hand impression about how it could be as a fire-fighter. Actually, I was totally amazed by the curiousity and excitement of my BWE. She was really into it and asked lots of interesting questions - in general but also technical. And while our fighters were busy with Ajay and family, I gave her some more details and explanations about the truck, the expandable ladder, the safety cage at the top and other equipment available. Safety first! No exceptions and always be prepared for the worst case... Also, the equipped has been checked prior to excuse - This is your life saver... Hooked up and ready to go... ...of course not too high. This is just a demonstration - and 32 meters above ground isn't for everyone. Well, after that it was me that had the asking looks on me, and I finally revealed to the local fire fighters that I was in the auxiliary fire brigade, more precisely in the hazard department, for more than 10 years. So not a professional fire fighter but at least a passionate and educated one as them. Inside the station Our fire fighters really took their time to explain their daily job to kids, provided them access to operation seat on the ladder truck and how the truck cabin is actually equipped with the different radios and so on. It was really a great time. Later on we had a brief tour through the building itself, and again all of our questions were answered. We had great fun and started to joke about bits and pieces. For me it was also very interesting to see the comparison between the fire station here in Mauritius and the ones I have been to back in Germany. Amazing to see them completely captivated in the play - the children had lots of fun! Also, that there are currently ten fire stations all over the island, plus two additional but private ones at the airport and at the harbour. The newest one is actually down in Black River on the west coast because the time from Quatre Bornes takes too long to have any chance of an effective alarm at all. IMHO, a very good decision as time is the most important factor in getting fire incidents under control. After all it was great experience for all of us, especially for the children to see and understand that their toy trucks are only copies of the real thing and that the job of a (professional) fire fighter is very important in our society. Don't forget that those guys run into the danger zone while you're trying to get away from it as much as possible. Another unit just came back from a grass fire - and shortly after they went out again. No time to rest, too much to do! Mauritian Fire Fighters now and (maybe) in the future... Thank you! It was an honour to be around! Thank you to Ajay for organising and arranging this Sunday morning event, and of course of Big Thank You to the three guys that took some time off to have us at the Fire Station in Coromandel and guide us through their daily job! And remember to call 115 in case of emergencies!

Read the article
Learnings from trying to write better software: Loud errors from the very start

- by theo.spears

Microsoft made a very small number of backwards incompatible changes between .NET 1.1 and 2.0, because they wanted to make it as easy and safe as possible to port applications to the new runtime. (Here’s a list.) However, one thing they did change was what happens when a background thread fails with an unhanded exception - in .NET 1.1 nothing happened, the thread terminated, and the application continued oblivious. Try the same trick in .NET 2.0 and the entire application, including all threads, will rudely terminate. There are three reasons for this. Firstly if a background thread has crashed, it may have left the entire application in an inconsistent state, in a way that will affect other threads. It’s better to terminate the entire application than continue and have the application perform actions based on a broken state, for example take customer orders, or write corrupt files to disk. Secondly, during software development, it is far better for errors to be loud and obtrusive. Even if you have unit tests and integration tests (and you should), a key part of ensuring software works properly is to actually try using it, both through systematic testing and through the casual use all software gets by its developers during use. Subtle errors are easy to miss if you are not actually doing real work using the application, loud errors are obvious. Thirdly, and most importantly, even if catching and swallowing exceptions indiscriminately doesn't cause any problems in your application, the presence of unexpected exceptions shows you do not fully understand the behavior of your code. The currently released version of your application may be absolutely correct. However, because your mental model of the behavior is wrong, any future change you make to the program could and probably will introduce critical errors. This applies to more than just exceptions causing threads to exit, any unexpected state should make the application blow up in an un-ignorable way. The worst thing you can do is silently swallow errors and continue. And let's be clear, writing to a log file does not count as blowing up in an un-ignorable way. This is all simple as long as the call stack only contains your code, but when your functions start to be called by third party or .NET framework code, it's surprisingly easy for exceptions to start vanishing. Let's look at two examples. 1. Windows forms drag drop events Usually if you throw an exception from a winforms event handler it will bring up the "application has crashed" dialog with abort and continue options. This is a good default behavior - the error is big and loud, but it is possible for the user to ignore the error and hopefully save their data, if somehow this bug makes it past testing. However drag and drop are different - throw an exception from one of these and it will just be silently swallowed with no explanation. By the way, it's not just drag and drop events. Timer events do it too. You can research how exceptions are treated in different handlers and code appropriately, but the safest and most user friendly approach is to always catch exceptions in your event handlers and show your own error message. I'll talk about one good approach to handling these exceptions at the end of this post. 2. SSMS integration for SQL Tab Magic A while back wrote an SSMS add-in called SQL Tab Magic (learn more about the process here). It works by listening to certain SSMS events and remembering what documents are opened and closed. I deployed it internally and it was used for a few months by a number of people without problems, so I was reasonably confident in its quality. Before releasing I made a few cleanups, including introducing error reporting. Bam. A few days later I was looking at over 1,000 error reports in my inbox. In turns out I wasn't handling table designers properly. The exceptions were there, but again SSMS was helpfully swallowing them all for me, so I was blissfully unaware. Had I made my errors loud from the start, I would have noticed these issues long before and fixed them. Handling exceptions Now you are systematically catching exceptions throughout your application, you need to do something with them. I've tried 3 options: log them, alert the user, and automatically send them home. There are a few good options for logging in .NET. The most widespread is Apache log4net, which provides a very capable and configurable logging framework. There is also NLog which has a compatible interface, with a greater emphasis on fluent rather than XML configuration. Alerting the user serves two purposes. Firstly it means they understand their action has failed to they don't just assume it worked (Silent file copy failure is a problem if you then delete the originals) or that they should keep waiting for a background task to complete. Secondly, it means the users can report the bug to your support team, and then you can fix it. This means the message you show the user should contain the information you need as a developer to identify and fix it. And the user will probably just send you a screenshot of the dialog, so it shouldn't be hidden by scroll bars. This leads us to the third option, automatically sending error reports home. By automatic I mean with minimal effort on the part of the user, rather than doing it silently behind their backs. The advantage of this is you can send back far more detailed and precise information than you can expect a user to include in an email, and by making it easier to report errors, you make it more likely users will do so. We do this using a great tool called SmartAssembly (full disclosure: this is a product made by Red Gate). It captures complete stack traces including the values of all local variables and then allows the user to send all this information back with a single click. We also capture log files to help understand what lead up to the error. We then use the free SmartAssembly Sync for Jira to dedupe these reports and raise them as bugs in our bug tracking system. The combined effect of loud errors during development and then automatic error reporting once software is deployed allows us to find and fix more bugs, correct misunderstandings on how our software works, and overall is a key piece in delivering higher quality software. However it is no substitute for having motivated cunning testers in the building - and we're looking to hire more of those too. If you found this post interesting you should follow me on twitter.

Read the article
It was a figure of speech!

- by Ratman21

Yesterday I posted the following as attention getter / advertisement (as well as my feelings). In the groups, (I am in) on the social networking site, LinkedIn and boy did I get responses. I am fighting mad about (a figure of speech, really) not having a job! Look just because I am over 55 and have gray hair. It does not mean, my brain is dead or I can no longer trouble shoot a router or circuit or LAN issue. Or that I can do “IT” work at all. And I could prove this if; some one would give me at job. Come on try me for 90 days at min. wage. I know you will end up keeping me (hope fully at normal pay) around. Is any one hearing me…come on take up the challenge! This was the responses I got. I hear you. We just need to retrain and get our skills up to speed is all. That is what I am doing. I have not given up. Just got to stay on top of the game. Experience is on our side if we have the credentials and we are reasonable about our salaries this should not be an issue. Already on it, going back to school and have got three certifications (CompTIA A+, Security+ and Network+. I am now studying for my CISCO CCNA certification. As to my salary, I am willing to work at very reasonable rate. You need to re-brand yourself like a product, market and sell yourself. You need to smarten up, look and feel a million dollars, re-energize yourself, regain your confidents. Either start your own business, or re-write your CV so it stands out from the rest, get the template off the internet. Contact every recruitment agent in your town, state, country and overseas, and on the web. Apply to every job you think you could do, you may not get it but you will make a contact for your network, which may lead to a job at the end of the tunnel. Get in touch with everyone you know from past jobs. Do charity work. I maintain the IT Network, stage electrical and the Telecom equipment in my church, Again already on it. I have email the world is seems with my resume and cover letters. So far, I have rewritten or had it rewrote, my resume and cover letters; over seven times so far. Re-energize? I never lost my energy level or my self-confidents in my work (now if could get some HR personal to see the same). I also volunteer at my church, I created and maintain the church web sit. I share your frustration. Sucks being over 50 and looking for work. Please don't sell yourself short at min wage because the employer will think that’s your worth. Keep trying!! I never stop trying and min wage is only for 90 days. If some one takes up the challenge. Some post asked if I am keeping up technology. Do you keep up with the latest technology and can speak the language fluidly? Yep to that and as to speaking it also a yep! I am a geek you know. I heard from others over the 50 year mark and younger too. I'm with you! I keep getting told that I don't have enough experience because I just recently completed a Masters level course in Microsoft SQL Server, which gave me a project-intensive equivalent of between 2 and 3 years of experience. On top of that training, I have 19 years as an applications programmer and database administrator. I can normalize rings around experienced DBAs and churn out effective code with the best of them. But my 19 years is worthless as far as most recruiters and HR people are concerned because it is not the specific experience for which they're looking. HR AND RECRUITERS TAKE NOTE: Experience, whatever the language, translates across platforms and technology! By the way, I'm also over 55 and still have "got it"! I never lost it and I also can work rings round younger techs. I'm 52 and female and seem to be having the same issues. I have over 10 years experience in tech support (with a BS in CIS) and can't get hired either. Ow, I only have an AS in computer science along with my certifications. Keep the faith, I have been unemployed since August of 2008. I agree with you...I am willing to return to the beginning of my retail career and work myself back through the ranks, if someone will look past the grey and realize the knowledge I would bring to the table. I also would like some one to look past the gray. Interesting approach, volunteering to work for minimum wage for 90 days. I'm in the same situation as you, being 55 & balding w/white hair, so I know where you're coming from. I've been out of work now for a year. I'm in Michigan, where the unemployment rate is estimated to be 15% (the worst in the nation) & even though I've got 30+ years of IT experience ranging from mainframe to PC desktop support, it's difficult to even get a face-to-face interview. I had one prospective employer tell me flat out that I "didn't have the energy required for this position". Mostly I never get any feedback. All I can say is good luck & try to remain optimistic. He said WHAT! Yes remaining optimistic is key. Along with faith in God. Then there was this (for lack of better word) jerk. Give it up already. You were too old to work in high tech 10 years ago. Scratch that, 20 years ago! Try selling hot dogs in front of Fry's Electronics. At least you would get a chance to eat lunch with your previous colleagues.... You know funny thing on this person is that I checked out his profile. He is older than I am.

Read the article
KVM + Cloudmin + IpTables

- by Alex

I have a KVM virtualization on a machine. I use Ubuntu Server + Cloudmin (in order to manage virtual machine instances). On a host system I have four network interfaces: ebadmin@saturn:/var/log$ ifconfig br0 Link encap:Ethernet HWaddr 10:78:d2:ec:16:38 inet addr:192.168.0.253 Bcast:192.168.0.255 Mask:255.255.255.0 inet6 addr: fe80::1278:d2ff:feec:1638/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:589337 errors:0 dropped:0 overruns:0 frame:0 TX packets:334357 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:753652448 (753.6 MB) TX bytes:43385198 (43.3 MB) br1 Link encap:Ethernet HWaddr 6e:a4:06:39:26:60 inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0 inet6 addr: fe80::6ca4:6ff:fe39:2660/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:16995 errors:0 dropped:0 overruns:0 frame:0 TX packets:13309 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2059264 (2.0 MB) TX bytes:1763980 (1.7 MB) eth0 Link encap:Ethernet HWaddr 10:78:d2:ec:16:38 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:610558 errors:0 dropped:0 overruns:0 frame:0 TX packets:332382 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:769477564 (769.4 MB) TX bytes:44360402 (44.3 MB) Interrupt:20 Memory:fe400000-fe420000 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:239632 errors:0 dropped:0 overruns:0 frame:0 TX packets:239632 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:50738052 (50.7 MB) TX bytes:50738052 (50.7 MB) tap0 Link encap:Ethernet HWaddr 6e:a4:06:39:26:60 inet6 addr: fe80::6ca4:6ff:fe39:2660/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:17821 errors:0 dropped:0 overruns:0 frame:0 TX packets:13703 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:500 RX bytes:2370468 (2.3 MB) TX bytes:1782356 (1.7 MB) br0 is connected to a real network, br1 is used to create a private network shared between guest systems. Now I need to configure iptables for network access. First of all I allow ssh sessions on port 8022 on the host system, then I allow all connections in state RELATED, ESTABLISHED. This is working ok. I install another system as guest, it's IP address is 192.168.10.2, and now I have two problems: I want to allow the access from this host to the outside world, cannot accomplish this. I can ssh from the host. I want to be able to ssh to the guest from the outside world using 8023 port. Cannot accomplish this. Full iptables configuration is following: ebadmin@saturn:/var/log$ sudo iptables --list [sudo] password for ebadmin: Chain INPUT (policy DROP) target prot opt source destination ACCEPT all -- anywhere anywhere ACCEPT tcp -- anywhere anywhere tcp dpt:8022 ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED LOG all -- anywhere anywhere LOG level warning Chain FORWARD (policy ACCEPT) target prot opt source destination LOG all -- anywhere anywhere LOG level warning Chain OUTPUT (policy ACCEPT) target prot opt source destination LOG all -- anywhere anywhere LOG level warning ebadmin@saturn:/var/log$ sudo iptables -t nat --list Chain PREROUTING (policy ACCEPT) target prot opt source destination DNAT tcp -- anywhere anywhere tcp spt:8023 to:192.168.10.2:22 Chain INPUT (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination The worst of all is that I don't know how to interpret iptables logs. I don't see the final decision of the firewall. Need help urgently.

Read the article
Source-control 'wet-work'?

- by Phil Factor

When a design or creative work is flawed beyond remedy, it is often best to destroy it and start again. The other day, I lost the code to a long and intricate SQL batch I was working on. I’d thought it was impossible, but it happened. With all the technology around that is designed to prevent this occurring, this sort of accident has become a rare event. If it weren’t for a deranged laptop, and my distraction, the code wouldn’t have been lost this time. As always, I sighed, had a soothing cup of tea, and typed it all in again. The new code I hastily tapped in was much better: I’d held in my head the essence of how the code should work rather than the details: I now knew for certain the start point, the end, and how it should be achieved. Instantly the detritus of half-baked thoughts fell away and I was able to write logical code that performed better. Because I could work so quickly, I was able to hold the details of all the columns and variables in my head, and the dynamics of the flow of data. It was, in fact, easier and quicker to start from scratch rather than tidy up and refactor the existing code with its inevitable fumbling and half-baked ideas. What a shame that technology is now so good that developers rarely experience the cleansing shock of losing one’s code and having to rewrite it from scratch. If you’ve never accidentally lost your code, then it is worth doing it deliberately once for the experience. Creative people have, until Technology mistakenly prevented it, torn up their drafts or sketches, threw them in the bin, and started again from scratch. Leonardo’s obsessive reworking of the Mona Lisa was renowned because it was so unusual: Most artists have been utterly ruthless in destroying work that didn’t quite make it. Authors are particularly keen on writing afresh, and the results are generally positive. Lawrence of Arabia actually lost the entire 250,000 word manuscript of ‘The Seven Pillars of Wisdom’ by accidentally leaving it on a train at Reading station, before rewriting a much better version. Now, any writer or artist is seduced by technology into altering or refining their work rather than casting it dramatically in the bin or setting a light to it on a bonfire, and rewriting it from the blank page. It is easy to pick away at a flawed work, but the real creative process is far more brutal. Once, many years ago whilst running a software house that supplied commercial software to local businesses, I’d been supervising an accounting system for a farming cooperative. No packaged system met their needs, and it was all hand-cut code. For us, it represented a breakthrough as it was for a government organisation, and success would guarantee more contracts. As you’ve probably guessed, the code got mangled in a disk crash just a week before the deadline for delivery, and the many backups all proved to be entirely corrupted by a faulty tape drive. There were some fragments left on individual machines, but they were all of different versions. The developers were in despair. Strangely, I managed to re-write the bulk of a three-month project in a manic and caffeine-soaked weekend. Sure, that elegant universally-applicable input-form routine was‘nt quite so elegant, but it didn’t really need to be as we knew what forms it needed to support. Yes, the code lacked architectural elegance and reusability. By dawn on Monday, the application passed its integration tests. The developers rose to the occasion after I’d collapsed, and tidied up what I’d done, though they were reproachful that some of the style and elegance had gone out of the application. By the delivery date, we were able to install it. It was a smaller, faster application than the beta they’d seen and the user-interface had a new, rather Spartan, appearance that we swore was done to conform to the latest in user-interface guidelines. (we switched to Helvetica font to look more ‘Bauhaus’ ). The client was so delighted that he forgave the new bugs that had crept in. I still have the disk that crashed, up in the attic. In IT, we have had mixed experiences from complete re-writes. Lotus 123 never really recovered from a complete rewrite from assembler into C, Borland made the mistake with Arago and Quattro Pro and Netscape’s complete rewrite of their Navigator 4 browser was a white-knuckle ride. In all cases, the decision to rewrite was a result of extreme circumstances where no other course of action seemed possible. The rewrite didn’t come out of the blue. I prefer to remember the rewrite of Minix by young Linus Torvalds, or the rewrite of Bitkeeper by a slightly older Linus. The rewrite of CP/M didn’t do too badly either, did it? Come to think of it, the guy who decided to rewrite the windowing system of the Xerox Star never regretted the decision. I’ll agree that one should often resist calls for a rewrite. One of the worst habits of the more inexperienced programmer is to denigrate whatever code he or she inherits, and then call loudly for a complete rewrite. They are buoyed up by the mistaken belief that they can do better. This, however, is a different psychological phenomenon, more related to the idea of some motorcyclists that they are operating on infinite lives, or the occasional squaddies that if they charge the machine-guns determinedly enough all will be well. Grim experience brings out the humility in any experienced programmer. I’m referring to quite different circumstances here. Where a team knows the requirements perfectly, are of one mind on methodology and coding standards, and they already have a solution, then what is wrong with considering a complete rewrite? Rewrites are so painful in the early stages, until that point where one realises the payoff, that even I quail at the thought. One needs a natural disaster to push one over the edge. The trouble is that source-control systems, and disaster recovery systems, are just too good nowadays. If I were to lose this draft of this very blog post, I know I’d rewrite it much better. However, if you read this, you’ll know I didn’t have the nerve to delete it and start again. There was a time that one prayed that unreliable hardware would deliver you from an unmaintainable mess of a codebase, but now technology has made us almost entirely immune to such a merciful act of God. An old friend of mine with long experience in the software industry has long had the idea of the ‘source-control wet-work’, where one hires a malicious hacker in some wild eastern country to hack into one’s own source control system to destroy all trace of the source to an application. Alas, backup systems are just too good to make this any more than a pipedream. Somehow, it would be difficult to promote the idea. As an alternative, could one construct a source control system that, on doing all the code-quality metrics, would systematically destroy all trace of source code that failed the quality test? Alas, I can’t see many managers buying into the idea. In reading the full story of the near-loss of Toy Story 2, it set me thinking. It turned out that the lucky restoration of the code wasn’t the happy ending one first imagined it to be, because they eventually came to the conclusion that the plot was fundamentally flawed and it all had to be rewritten anyway. Was this an early case of the ‘source-control wet-job’?’ It is very hard nowadays to do a rapid U-turn in a development project because we are far too prone to cling to our existing source-code.

Read the article
Disaster, or Migration?

- by Rob Farley

This post is in two parts – technical and personal. And I should point out that it’s prompted in part by this month’s T-SQL Tuesday, hosted by Allen Kinsel. First, the technical: I’ve had a few conversations with people recently about migration – moving a SQL Server database from one box to another (sometimes, but not primarily, involving an upgrade). One question that tends to come up is that of downtime. Obviously there will be some period of time between the old server being available and the new one. The way that most people seem to think of migration is this: Build a new server. Stop people from using the old server. Take a backup of the old server Restore it on the new server. Reconfigure the client applications (or alternatively, configure the new server to use the same address as the old) Make the new server online. There are other things involved, such as testing, of course. But this is essentially the process that people tell me they’re planning to follow. The bit that I want to look at today (as you’ve probably guessed from my title) is the “backup and restore” section. If a SQL database is using the Simple Recovery Model, then the only restore option is the last database backup. This backup could be full or differential. The transaction log never gets backed up in the Simple Recovery Model. Instead, it truncates regularly to stay small. One that’s using the Full Recovery Model (or Bulk-Logged) won’t truncate its log – the log must be backed up regularly. This provides the benefit of having a lot more option available for restores. It’s a requirement for most systems of High Availability, because if you’re making sure that a spare box is up-and-running, ready to take over, then you have to be interested in the logs that are happening on the current box, rather than truncating them all the time. A High Availability system such as Mirroring, Replication or Log Shipping will initialise the spare machine by restoring a full database backup (and maybe a differential backup if available), and then any subsequent log backups. Once the secondary copy is close, transactions can be applied to keep the two in sync. The main aspect of any High Availability system is to have a redundant system that is ready to take over. So the similarity for migration should be obvious. If you need to move a database from one box to another, then introducing a High Availability mechanism can help. By turning on the Full Recovery Model and then taking a backup (so that the now-interesting logs have some context), logs start being kept, and are therefore available for getting the new box ready (even if it’s an upgraded version). When the migration is ready to occur, a failover can be done, letting the new server take over the responsibility of the old, just as if a disaster had happened. Except that this is a planned failover, not a disaster at all. There’s a fine line between a disaster and a migration. Failovers can be useful in patching, upgrading, maintenance, and more. Hopefully, even an unexpected disaster can be seen as just another failover, and there can be an opportunity there – perhaps to get some work done on the principal server to increase robustness. And if I’ve just set up a High Availability system for even the simplest of databases, it’s not necessarily a bad thing. :) So now the personal: It’s been an interesting time recently... June has been somewhat odd. A court case with which I was involved got resolved (through mediation). I can’t go into details, but my lawyers tell me that I’m allowed to say how I feel about it. The answer is ‘lousy’. I don’t regret pursuing it as long as I did – but in the end I had to make a decision regarding the commerciality of letting it continue, and I’m going to look forward to the days when the kind of money I spent on my lawyers is small change. Mind you, if I had a similar situation with an employer, I’d do the same again, but that doesn’t really stop me feeling frustrated about it. The following day I had to fly to country Victoria to see my grandmother, who wasn’t expected to last the weekend. She’s still around a week later as I write this, but her 92-year-old body has basically given up on her. She’s been a Christian all her life, and is looking forward to eternity. We’ll all miss her though, and it’s hard to see my family grieving. Then on Tuesday, I was driving back to the airport with my family to come home, when something really bizarre happened. We were travelling down the freeway, just pulled out to go past a truck (farm-truck sized, not a semi-trailer), when a car-sized mass of metal fell off it. It was something like an industrial air-conditioner, but from where I was sitting, it was just a mass of spinning metal, like something out of a movie (one friend described it as “holidays by Michael Bay”). Somehow, and I’m really don’t know how, the part of it nearest us bounced high enough to clear the car, and there wasn’t even a scratch. We pulled over the check, and I was just thanking God that we’d changed lanes when we had, and that we remained unharmed. I had all kinds of thoughts about what could’ve happened if we’d had something that size land on the windscreen... All this has drilled home that while I feel that I haven’t provided as well for the family as I could’ve done (like by pursuing an expensive legal case), I shouldn’t even consider that I have proper control over things. I get to live life, and make decisions based on what I feel is right at the time. But I’m not going to get everything right, and there will be things that feel like disasters, some which could’ve been in my control and some which are very much beyond my control. The case feels like something I could’ve pursued differently, a disaster that could’ve been avoided in some way. Gran dying is lousy of course. An accident on the freeway would have been awful. I need to recognise that the worst disasters are ones that I can’t affect, and that I need to look at things in context – perhaps seeing everything that happens as a migration instead. Life is never the same from one day to the next. Every event has a before and an after – sometimes it’s clearly positive, sometimes it’s not. I remember good events in my life (such as my wedding), and bad (such as the loss of my father when I was ten, or the back injury I had eight years ago). I’m not suggesting that I know how to view everything from the “God works all things for good” perspective, but I am trying to look at last week as a migration of sorts. Those things are behind me now, and the future is in God’s hands. Hopefully I’ve learned things, and will be able to live accordingly. I’ve come through this time now, and even though I’ll miss Gran, I’ll see her again one day, and the future is bright.

Read the article
laptop crashed: why?

- by sds

my linux (ubuntu 12.04) laptop crashed, and I am trying to figure out why. # last sds pts/4 :0 Tue Sep 4 10:01 still logged in sds pts/3 :0 Tue Sep 4 10:00 still logged in reboot system boot 3.2.0-29-generic Tue Sep 4 09:43 - 11:23 (01:40) sds pts/8 :0 Mon Sep 3 14:23 - crash (19:19) this seems to indicate a crash at 09:42 (= 14:23+19:19). as per another question, I looked at /var/log: auth.log: Sep 4 09:17:02 t520sds CRON[32744]: pam_unix(cron:session): session closed for user root Sep 4 09:43:17 t520sds lightdm: pam_unix(lightdm:session): session opened for user lightdm by (uid=0) no messages file syslog: Sep 4 09:24:19 t520sds kernel: [219104.819975] CPU0: Package power limit normal Sep 4 09:43:16 t520sds kernel: imklog 5.8.6, log source = /proc/kmsg started. kern.log: Sep 4 09:24:19 t520sds kernel: [219104.819969] CPU1: Package power limit normal Sep 4 09:24:19 t520sds kernel: [219104.819971] CPU2: Package power limit normal Sep 4 09:24:19 t520sds kernel: [219104.819974] CPU3: Package power limit normal Sep 4 09:24:19 t520sds kernel: [219104.819975] CPU0: Package power limit normal Sep 4 09:43:16 t520sds kernel: imklog 5.8.6, log source = /proc/kmsg started. Sep 4 09:43:16 t520sds kernel: [ 0.000000] Initializing cgroup subsys cpuset Sep 4 09:43:16 t520sds kernel: [ 0.000000] Initializing cgroup subsys cpu I had a computation running until 9:24, but the system crashed 18 minutes later! kern.log has many pages of these: Sep 4 09:43:16 t520sds kernel: [ 0.000000] total RAM covered: 8086M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 64K num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 128K num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 256K num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 512K num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 1M num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 2M num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 4M num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 8M num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 16M num_reg: 10 lose cover RAM: 38M Sep 4 09:43:16 t520sds kernel: [ 0.000000] *BAD*gran_size: 64K chunk_size: 32M num_reg: 10 lose cover RAM: -16M Sep 4 09:43:16 t520sds kernel: [ 0.000000] *BAD*gran_size: 64K chunk_size: 64M num_reg: 10 lose cover RAM: -16M Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 128M num_reg: 10 lose cover RAM: 0G Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 256M num_reg: 10 lose cover RAM: 0G Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 512M num_reg: 10 lose cover RAM: 0G Sep 4 09:43:16 t520sds kernel: [ 0.000000] gran_size: 64K chunk_size: 1G num_reg: 10 lose cover RAM: 0G Sep 4 09:43:16 t520sds kernel: [ 0.000000] *BAD*gran_size: 64K chunk_size: 2G num_reg: 10 lose cover RAM: -1G does this mean that my RAM is bad?! it also says Sep 4 09:43:16 t520sds kernel: [ 2.944123] EXT4-fs (sda1): INFO: recovery required on readonly filesystem Sep 4 09:43:16 t520sds kernel: [ 2.944126] EXT4-fs (sda1): write access will be enabled during recovery Sep 4 09:43:16 t520sds kernel: [ 3.088001] firewire_core: created device fw0: GUID f0def1ff8fbd7dff, S400 Sep 4 09:43:16 t520sds kernel: [ 8.929243] EXT4-fs (sda1): orphan cleanup on readonly fs Sep 4 09:43:16 t520sds kernel: [ 8.929249] EXT4-fs (sda1): ext4_orphan_cleanup: deleting unreferenced inode 658984 ... Sep 4 09:43:16 t520sds kernel: [ 9.343266] EXT4-fs (sda1): ext4_orphan_cleanup: deleting unreferenced inode 525343 Sep 4 09:43:16 t520sds kernel: [ 9.343270] EXT4-fs (sda1): 56 orphan inodes deleted Sep 4 09:43:16 t520sds kernel: [ 9.343271] EXT4-fs (sda1): recovery complete Sep 4 09:43:16 t520sds kernel: [ 9.645799] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null) does this mean my HD is bad? As per FaultyHardware, I tried smartctl -l selftest, which uncovered no errors: smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-30-generic] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: Seagate Momentus 7200.4 Device Model: ST9500420AS Serial Number: 5VJE81YK LU WWN Device Id: 5 000c50 0440defe3 Firmware Version: 0003LVM1 User Capacity: 500,107,862,016 bytes [500 GB] Sector Size: 512 bytes logical/physical Device is: In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: ATA-8-ACS revision 4 Local Time is: Mon Sep 10 16:40:04 2012 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED See vendor-specific Attribute list for marginal Attributes. General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 0) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 109) minutes. Conveyance self-test routine recommended polling time: ( 2) minutes. SCT capabilities: (0x103b) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 117 099 034 Pre-fail Always - 162843537 3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 571 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0 7 Seek_Error_Rate 0x000f 069 060 030 Pre-fail Always - 17210154023 9 Power_On_Hours 0x0032 095 095 000 Old_age Always - 174362787320258 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 571 184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0 188 Command_Timeout 0x0032 100 100 000 Old_age Always - 1 189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0 190 Airflow_Temperature_Cel 0x0022 061 043 045 Old_age Always In_the_past 39 (0 11 44 26) 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 84 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 20 193 Load_Cycle_Count 0x0032 099 099 000 Old_age Always - 2434 194 Temperature_Celsius 0x0022 039 057 000 Old_age Always - 39 (0 15 0 0) 195 Hardware_ECC_Recovered 0x001a 041 041 000 Old_age Always - 162843537 196 Reallocated_Event_Count 0x000f 095 095 030 Pre-fail Always - 4540 (61955, 0) 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended offline Completed without error 00% 4545 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Googling for the messages proved inconclusive, I can't even figure out whether the messages are routine or catastrophic. So, what do I do now?

Read the article
Using different SSDs types (not only SATA based) as system drive

- by Hubert Kario

Currently I have a Thinkpad X61s and want to make it both a bit faster and a bit more power efficient. For that reason I thought that adding SSD drive would make most sense. Unfortunately, because of financial reasons, buying SSD of over 200GB capacity is out of reach for me (not only it would be worth more than the rest of the laptop, but also I currently have a 500GB drive in it, so even such a drive would be kind of a downgrade for me). During preliminary testing with a cheap Transcend 4GB Class 6 (14MiB/s streaming, 9MiB/s random read) card I experienced boot times to be reduced by half so putting the OS only on it would already would be an improvement. Unfortunately, my system now is about 11GiB in size so anything less than 16GB would be constraining. In this laptop I can connect additional drives on at least 5 different ways: using SATA-ATA converter caddy in the X6 Ultrabase using internal mini PCIe slot using integrated SDHC slot using CardBus (a.k.a PCMCIA or PC Card) slot using USB Thankfully, because I use only Linux on this PC the bootability of them is irrelevant as I can put the /boot partition on internal HDD and / on any of the above mentioned Flash memories (as I already did for the SDHC test). From what I was able to research and from my own experience those options come with rather big downsides or other problems: SATA-ATA caddy It has three downsides: I have to carry the Ultrabse with me at all times (it's not really inconvenient, but those grams do add) and couldn't disconnect it when I want to disconnect the battery It makes the bay unusable for the optical drive and occasional quick access to other hard drives the only caddies I could buy have rather flaky controllers in them so putting my OS on it would hamper its stability Internal mini PCIe slot This would be an ideal solution, if only I could find real PCIe SSDs, not only devices that could talk only SATA or ATA over PCIe mechanical connection (the ones used in Dell Mini or Asus EEE). Theoretically Samsung did release such devices but I couldn't find them in retail anywhere. Integrated SDHC slot It's a nice solution with a single drawback: the fastest 16GB SDHC card on the market can only do around 35MiB/s read and 15MiB/s write while still costing like a normal 40GB SATA SSD that's 10 times faster. Not really cost-effective. CardBus (a.k.a PCMCIA or PC Card) slot Those cards are much faster than the SDHC option (there are ones that can do well over 50MiB/s read in benchmarks) and from what I could find the PCMCIA controller in my laptop does support UDMA so it should be able to deliver comparable speeds. They still cost similarly to SD cards but at least they provide streaming performance comparable to my current HDD. USB That's the worst option. Not only is it limited to 20-30MiB/s by the interface itself the drive would stick out of the laptop so it's a big no no. The question As such I think that going the "CF in a CardBus adapter" route will be the best option. My question is: did anyone try using CF cards in CardBus adapters as system drives with Linux on Thinkpad laptops? Laptops in general? What was the real-world performance? I don't have any CF cards so I can't check how well does it work with suspend/resume, or whatever it's easy to make it work in initramfs (I'm using ArchLinux and SD card was trivial — add 3 modules in single config line and rebuilding initramfs) so any tips/gotchas on this are welcome as well.

Read the article
What *exactly* gets screwed when I kill -9 or pull the power?

- by Mike

Set-Up I've been a programmer for quite some time now but I'm still a bit fuzzy on deep, internal stuff. Now. I am well aware that it's not a good idea to either: kill -9 a process (bad) spontaneously pull the power plug on a running computer or server (worse) However, sometimes you just plain have to. Sometimes a process just won't respond no matter what you do, and sometimes a computer just won't respond, no matter what you do. Let's assume a system running Apache 2, MySQL 5, PHP 5, and Python 2.6.5 through mod_wsgi. Note: I'm most interested about Mac OS X here, but an answer that pertains to any UNIX system would help me out. My Concern Each time I have to do either one of these, especially the second, I'm very worried for a period of time that something has been broken. Some file somewhere could be corrupt -- who knows which file? There are over 1,000,000 files on the computer. I'm often using OS X, so I'll run a "Verify Disk" operation through the Disk Utility. It will report no problems, but I'm still concerned about this. What if some configuration file somewhere got screwed up. Or even worse, what if a binary file somewhere is corrupt. Or a script file somewhere is corrupt now. What if some hardware is damaged? What if I don't find out about it until next month, in a critical scenario, when the corruption or damage causes a catastrophe? Or, what if valuable data is already lost? My Hope My hope is that these concerns and worries are unfounded. After all, after doing this many times before, nothing truly bad has happened yet. The worst is I've had to repair some MySQL tables, but I don't seem to have lost any data. But, if my worries are not unfounded, and real damage could happen in either situation 1 or 2, then my hope is that there is a way to detect it and prevent against it. My Question(s) Could this be because modern operating systems are designed to ensure that nothing is lost in these scenarios? Could this be because modern software is designed to ensure that nothing lost? What about modern hardware design? What measures are in place when you pull the power plug? My question is, for both of these scenarios, what exactly can go wrong, and what steps should be taken to fix it? I'm under the impression that one thing that can go wrong is some programs might not have flushed their data to the disk, so any highly recent data that was supposed to be written to the disk (say, a few seconds before the power pull) might be lost. But what about beyond that? And can this very issue of 5-second data loss screw up a system? What about corruption of random files hiding somewhere in the huge forest of files on my hard drives? What about hardware damage? What Would Help Me Most Detailed descriptions about what goes on internally when you either kill -9 a process or pull the power on the whole system. (it seems instant, but can someone slow it down for me?) Explanations of all things that could go wrong in these scenarios, along with (rough of course) probabilities (i.e., this is very unlikely, but this is likely)... Descriptions of measures in place in modern hardware, operating systems, and software, to prevent damage or corruption when these scenarios occur. (to comfort me) Instructions for what to do after a kill -9 or a power pull, beyond "verifying the disk", in order to truly make sure nothing is corrupt or damaged somewhere on the drive. Measures that can be taken to fortify a computer setup so that if something has to be killed or the power has to be pulled, any potential damage is mitigated. Thanks so much!

Read the article

Search Results

Search found 908 results on 37 pages for 'the worst shady'.

Page 34/37 | < Previous Page | 30 31 32 33 34 35 36 37 | Next Page >

- by terje

- by Swartz

- by Tom Black

- by Glister

- by James

- by xenoterracide

- by Nic

- by James Michael Hare

- by Kerrie Foy

- by Reed

- by ashwalk

- by Leopd

- by Phil Factor

- by Brian Cline

- by D'Arcy Lussier

- by HCM-Oracle

- by theo.spears

- by Ratman21

- by Alex

- by Phil Factor

- by Rob Farley

- by sds

- by Hubert Kario

- by Mike

< Previous Page | 30 31 32 33 34 35 36 37 | Next Page >