redundant - Page 29 - Developer IT

.NET Code Evolution

- by Alois Kraus

Originally posted on: http://geekswithblogs.net/akraus1/archive/2013/07/24/153504.aspxAt my day job I do look at a lot of code written by other people. Most of the code is quite good and some is even a masterpiece. And there is also code which makes you think WTF… oh it was written by me. Hm not so bad after all. There are many excuses reasons for bad code. Most often it is time pressure followed by not enough ambition (who cares) or insufficient training. Normally I do care about code quality quite a lot which makes me a (perceived) slow worker who does write many tests and refines the code quite a lot because of the design deficiencies. Most of the deficiencies I do find by putting my design under stress while checking for invariants. It does also help a lot to step into the code with a debugger (sometimes also Windbg). I do this much more often when my tests are red. That way I do get a much better understanding what my code really does and not what I think it should be doing. This time I do want to show you how code can evolve over the years with different .NET Framework versions. Once there was time where .NET 1.1 was new and many C++ programmers did switch over to get rid of not initialized pointers and memory leaks. There were also nice new data structures available such as the Hashtable which is fast lookup table with O(1) time complexity. All was good and much code was written since then. At 2005 a new version of the .NET Framework did arrive which did bring many new things like generics and new data structures. The “old” fashioned way of Hashtable were coming to an end and everyone used the new Dictionary<xx,xx> type instead which was type safe and faster because the object to type conversion (aka boxing) was no longer necessary. I think 95% of all Hashtables and dictionaries use string as key. Often it is convenient to ignore casing to make it easy to look up values which the user did enter. An often followed route is to convert the string to upper case before putting it into the Hashtable. Hashtable Table = new Hashtable(); void Add(string key, string value) { Table.Add(key.ToUpper(), value); } This is valid and working code but it has problems. First we can pass to the Hashtable a custom IEqualityComparer to do the string matching case insensitive. Second we can switch over to the now also old Dictionary type to become a little faster and we can keep the the original keys (not upper cased) in the dictionary. Dictionary<string, string> DictTable = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase); void AddDict(string key, string value) { DictTable.Add(key, value); } Many people do not user the other ctors of Dictionary because they do shy away from the overhead of writing their own comparer. They do not know that .NET has for strings already predefined comparers at hand which you can directly use. Today in the many core area we do use threads all over the place. Sometimes things break in subtle ways but most of the time it is sufficient to place a lock around the offender. Threading has become so mainstream that it may sound weird that in the year 2000 some guy got a huge incentive for the idea to reduce the time to process calibration data from 12 hours to 6 hours by using two threads on a dual core machine. Threading does make it easy to become faster at the expense of correctness. Correct and scalable multithreading can be arbitrarily hard to achieve depending on the problem you are trying to solve. Lets suppose we want to process millions of items with two threads and count the processed items processed by all threads. A typical beginners code might look like this: int Counter; void IJustLearnedToUseThreads() { var t1 = new Thread(ThreadWorkMethod); t1.Start(); var t2 = new Thread(ThreadWorkMethod); t2.Start(); t1.Join(); t2.Join(); if (Counter != 2 * Increments) throw new Exception("Hmm " + Counter + " != " + 2 * Increments); } const int Increments = 10 * 1000 * 1000; void ThreadWorkMethod() { for (int i = 0; i < Increments; i++) { Counter++; } } It does throw an exception with the message e.g. “Hmm 10.222.287 != 20.000.000” and does never finish. The code does fail because the assumption that Counter++ is an atomic operation is wrong. The ++ operator is just a shortcut for Counter = Counter + 1 This does involve reading the counter from a memory location into the CPU, incrementing value on the CPU and writing the new value back to the memory location. When we do look at the generated assembly code we will see only inc dword ptr [ecx+10h] which is only one instruction. Yes it is one instruction but it is not atomic. All modern CPUs have several layers of caches (L1,L2,L3) which try to hide the fact how slow actual main memory accesses are. Since cache is just another word for redundant copy it can happen that one CPU does read a value from main memory into the cache, modifies it and write it back to the main memory. The problem is that at least the L1 cache is not shared between CPUs so it can happen that one CPU does make changes to values which did change in meantime in the main memory. From the exception you can see we did increment the value 20 million times but half of the changes were lost because we did overwrite the already changed value from the other thread. This is a very common case and people do learn to protect their data with proper locking. void Intermediate() { var time = Stopwatch.StartNew(); Action acc = ThreadWorkMethod_Intermediate; var ar1 = acc.BeginInvoke(null, null); var ar2 = acc.BeginInvoke(null, null); ar1.AsyncWaitHandle.WaitOne(); ar2.AsyncWaitHandle.WaitOne(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Intermediate did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Intermediate() { for (int i = 0; i < Increments; i++) { lock (this) { Counter++; } } } This is better and does use the .NET Threadpool to get rid of manual thread management. It does give the expected result but it can result in deadlocks because you do lock on this. This is in general a bad idea since it can lead to deadlocks when other threads use your class instance as lock object. It is therefore recommended to create a private object as lock object to ensure that nobody else can lock your lock object. When you read more about threading you will read about lock free algorithms. They are nice and can improve performance quite a lot but you need to pay close attention to the CLR memory model. It does make quite weak guarantees in general but it can still work because your CPU architecture does give you more invariants than the CLR memory model. For a simple counter there is an easy lock free alternative present with the Interlocked class in .NET. As a general rule you should not try to write lock free algos since most likely you will fail to get it right on all CPU architectures. void Experienced() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); t1.Wait(); t2.Wait(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Experienced did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Experienced() { for (int i = 0; i < Increments; i++) { Interlocked.Increment(ref Counter); } } Since time does move forward we do not use threads explicitly anymore but the much nicer Task abstraction which was introduced with .NET 4 at 2010. It is educational to look at the generated assembly code. The Interlocked.Increment method must be called which does wondrous things right? Lets see: lock inc dword ptr [eax] The first thing to note that there is no method call at all. Why? Because the JIT compiler does know very well about CPU intrinsic functions. Atomic operations which do lock the memory bus to prevent other processors to read stale values are such things. Second: This is the same increment call prefixed with a lock instruction. The only reason for the existence of the Interlocked class is that the JIT compiler can compile it to the matching CPU intrinsic functions which can not only increment by one but can also do an add, exchange and a combined compare and exchange operation. But be warned that the correct usage of its methods can be tricky. If you try to be clever and look a the generated IL code and try to reason about its efficiency you will fail. Only the generated machine code counts. Is this the best code we can write? Perhaps. It is nice and clean. But can we make it any faster? Lets see how good we are doing currently. Level Time in s IJustLearnedToUseThreads Flawed Code Intermediate 1,5 (lock) Experienced 0,3 (Interlocked.Increment) Master 0,1 (1,0 for int[2]) That lock free thing is really a nice thing. But if you read more about CPU cache, cache coherency, false sharing you can do even better. int[] Counters = new int[12]; // Cache line size is 64 bytes on my machine with an 8 way associative cache try for yourself e.g. 64 on more modern CPUs void Master() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Master, 0); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Master, Counters.Length - 1); t1.Wait(); t2.Wait(); Counter = Counters[0] + Counters[Counters.Length - 1]; if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Master did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Master(object number) { int index = (int) number; for (int i = 0; i < Increments; i++) { Counters[index]++; } } The key insight here is to use for each core its own value. But if you simply use simply an integer array of two items, one for each core and add the items at the end you will be much slower than the lock free version (factor 3). Each CPU core has its own cache line size which is something in the range of 16-256 bytes. When you do access a value from one location the CPU does not only fetch one value from main memory but a complete cache line (e.g. 16 bytes). This means that you do not pay for the next 15 bytes when you access them. This can lead to dramatic performance improvements and non obvious code which is faster although it does have many more memory reads than another algorithm. So what have we done here? We have started with correct code but it was lacking knowledge how to use the .NET Base Class Libraries optimally. Then we did try to get fancy and used threads for the first time and failed. Our next try was better but it still had non obvious issues (lock object exposed to the outside). Knowledge has increased further and we have found a lock free version of our counter which is a nice and clean way which is a perfectly valid solution. The last example is only here to show you how you can get most out of threading by paying close attention to your used data structures and CPU cache coherency. Although we are working in a virtual execution environment in a high level language with automatic memory management it does pay off to know the details down to the assembly level. Only if you continue to learn and to dig deeper you can come up with solutions no one else was even considering. I have studied particle physics which does help at the digging deeper part. Have you ever tried to solve Quantum Chromodynamics equations? Compared to that the rest must be easy ;-). Although I am no longer working in the Science field I take pride in discovering non obvious things. This can be a very hard to find bug or a new way to restructure data to make something 10 times faster. Now I need to get some sleep ….

Read the article

help with fixing fwts errors log

- by jasmines

Here is an extract of results.log: MTRR validation. Test 1 of 3: Validate the kernel MTRR IOMEM setup. FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xc0000000 to 0xdfffffff (PCI Bus 0000:00) has incorrect attribute Write-Combining. FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xfee01000 to 0xffffffff (PCI Bus 0000:00) has incorrect attribute Write-Protect. ==================================================================================================== Test 1 of 1: Kernel log error check. Kernel message: [ 0.208079] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored ADVICE: This is not exactly a failure mode but a warning from the kernel. The _OSI() method has implemented a match to the 'Linux' query in the DSDT and this is redundant because the ACPI driver matches onto the Windows _OSI strings by default. FAILED [HIGH] KlogACPIErrorMethodExecutionParse: Test 1, HIGH Kernel message: [ 3.512783] ACPI Error : Method parse/execution failed [\_SB_.PCI0.GFX0._DOD] (Node f7425858), AE_AML_PACKAGE_LIMIT (20110623/psparse-536) ADVICE: This is a bug picked up by the kernel, but as yet, the firmware test suite has no diagnostic advice for this particular problem. Found 1 unique errors in kernel log. ==================================================================================================== Check if system is using latest microcode. ---------------------------------------------------------------------------------------------------- Cannot read microcode file /usr/share/misc/intel-microcode.dat. Aborted test, initialisation failed. ==================================================================================================== MSR register tests. FAILED [MEDIUM] MSRCPUsInconsistent: Test 1, MSR SYSENTER_ESP (0x175) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0xffffffffffffffff). MSR CPU 0 -> 0xf7bb9c40 vs CPU 1 -> 0xf7bc7c40 FAILED [MEDIUM] MSRCPUsInconsistent: Test 1, MSR MISC_ENABLE (0x1a0) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0x400c51889). MSR CPU 0 -> 0x850088 vs CPU 1 -> 0x850089 ==================================================================================================== Checks firmware has set PCI Express MaxReadReq to a higher value on non-motherboard devices. ---------------------------------------------------------------------------------------------------- Test 1 of 1: Check firmware settings MaxReadReq for PCI Express devices. MaxReadReq for pci://00:00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03) is low (128) [Audio device]. MaxReadReq for pci://00:02:00.0 Network controller: Intel Corporation PRO/Wireless 5100 AGN [Shiloh] Network Connection is low (128) [Network controller]. FAILED [LOW] LowMaxReadReq: Test 1, 2 devices have low MaxReadReq settings. Firmware may have configured these too low. ADVICE: The MaxReadRequest size is set too low and will affect performance. It will provide excellent bus sharing at the cost of bus data transfer rates. Although not a critical issue, it may be worth considering setting the MaxReadRequest size to 256 or 512 to increase throughput on the PCI Express bus. Some drivers (for example the Brocade Fibre Channel driver) allow one to override the firmware settings. Where possible, this BIOS configuration setting is worth increasing it a little more for better performance at a small reduction of bus sharing. ==================================================================================================== PCIe ASPM check. ---------------------------------------------------------------------------------------------------- Test 1 of 2: PCIe ASPM ACPI test. PCIE ASPM is not controlled by Linux kernel. ADVICE: BIOS reports that Linux kernel should not modify ASPM settings that BIOS configured. It can be intentional because hardware vendors identified some capability bugs between the motherboard and the add-on cards. Test 2 of 2: PCIe ASPM registers test. WARNING: Test 2, RP 00h:1Ch.01h L0s not enabled. WARNING: Test 2, RP 00h:1Ch.01h L1 not enabled. WARNING: Test 2, Device 02h:00h.00h L0s not enabled. WARNING: Test 2, Device 02h:00h.00h L1 not enabled. PASSED: Test 2, PCIE aspm setting matched was matched. WARNING: Test 2, RP 00h:1Ch.05h L0s not enabled. WARNING: Test 2, RP 00h:1Ch.05h L1 not enabled. WARNING: Test 2, Device 85h:00h.00h L0s not enabled. WARNING: Test 2, Device 85h:00h.00h L1 not enabled. PASSED: Test 2, PCIE aspm setting matched was matched. ==================================================================================================== Extract and analyse Windows Management Instrumentation (WMI). Test 1 of 2: Check Windows Management Instrumentation in DSDT Found WMI Method WMAA with GUID: 5FB7F034-2C63-45E9-BE91-3D44E2C707E4, Instance 0x01 Found WMI Event, Notifier ID: 0x80, GUID: 95F24279-4D7B-4334-9387-ACCDC67EF61C, Instance 0x01 PASSED: Test 1, GUID 95F24279-4D7B-4334-9387-ACCDC67EF61C is handled by driver hp-wmi (Vendor: HP). Found WMI Event, Notifier ID: 0xa0, GUID: 2B814318-4BE8-4707-9D84-A190A859B5D0, Instance 0x01 FAILED [MEDIUM] WMIUnknownGUID: Test 1, GUID 2B814318-4BE8-4707-9D84-A190A859B5D0 is unknown to the kernel, a driver may need to be implemented for this GUID. ADVICE: A WMI driver probably needs to be written for this event. It can checked for using: wmi_has_guid("2B814318-4BE8-4707-9D84-A190A859B5D0"). One can install a notify handler using wmi_install_notify_handler("2B814318-4BE8-4707-9D84-A190A859B5D0", handler, NULL). http://lwn.net/Articles/391230 describes how to write an appropriate driver. Found WMI Object, Object ID AB, GUID: 05901221-D566-11D1-B2F0-00A0C9062910, Instance 0x01, Flags: 00 Found WMI Method WMBA with GUID: 1F4C91EB-DC5C-460B-951D-C7CB9B4B8D5E, Instance 0x01 Found WMI Object, Object ID BC, GUID: 2D114B49-2DFB-4130-B8FE-4A3C09E75133, Instance 0x7f, Flags: 00 Found WMI Object, Object ID BD, GUID: 988D08E3-68F4-4C35-AF3E-6A1B8106F83C, Instance 0x19, Flags: 00 Found WMI Object, Object ID BE, GUID: 14EA9746-CE1F-4098-A0E0-7045CB4DA745, Instance 0x01, Flags: 00 Found WMI Object, Object ID BF, GUID: 322F2028-0F84-4901-988E-015176049E2D, Instance 0x01, Flags: 00 Found WMI Object, Object ID BG, GUID: 8232DE3D-663D-4327-A8F4-E293ADB9BF05, Instance 0x01, Flags: 00 Found WMI Object, Object ID BH, GUID: 8F1F6436-9F42-42C8-BADC-0E9424F20C9A, Instance 0x00, Flags: 00 Found WMI Object, Object ID BI, GUID: 8F1F6435-9F42-42C8-BADC-0E9424F20C9A, Instance 0x00, Flags: 00 Found WMI Method WMAC with GUID: 7391A661-223A-47DB-A77A-7BE84C60822D, Instance 0x01 Found WMI Object, Object ID BJ, GUID: DF4E63B6-3BBC-4858-9737-C74F82F821F3, Instance 0x05, Flags: 00 ==================================================================================================== Disassemble DSDT to check for _OSI("Linux"). ---------------------------------------------------------------------------------------------------- Test 1 of 1: Disassemble DSDT to check for _OSI("Linux"). This is not strictly a failure mode, it just alerts one that this has been defined in the DSDT and probably should be avoided since the Linux ACPI driver matches onto the Windows _OSI strings { If (_OSI ("Linux")) { Store (0x03E8, OSYS) } If (_OSI ("Windows 2001")) { Store (0x07D1, OSYS) } If (_OSI ("Windows 2001 SP1")) { Store (0x07D1, OSYS) } If (_OSI ("Windows 2001 SP2")) { Store (0x07D2, OSYS) } If (_OSI ("Windows 2006")) { Store (0x07D6, OSYS) } If (LAnd (MPEN, LEqual (OSYS, 0x07D1))) { TRAP (0x01, 0x48) } TRAP (0x03, 0x35) } WARNING: Test 1, DSDT implements a deprecated _OSI("Linux") test. ==================================================================================================== 0 passed, 0 failed, 1 warnings, 0 aborted, 0 skipped, 0 info only. ==================================================================================================== ACPI DSDT Method Semantic Tests. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP Failed to install global event handler. Test 22 of 93: Check _PSR (Power Source). ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 22, Detected an infinite loop when evaluating method '\_SB_.AC__._PSR'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 22, \_SB_.AC__._PSR correctly acquired and released locks 16 times. Test 35 of 93: Check _TMP (Thermal Zone Current Temp). ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.DTSZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.DTSZ._TMP correctly acquired and released locks 14 times. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.CPUZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.CPUZ._TMP correctly acquired and released locks 10 times. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 35, Detected an infinite loop when evaluating method '\_TZ_.SKNZ._TMP'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. PASSED: Test 35, \_TZ_.SKNZ._TMP correctly acquired and released locks 10 times. PASSED: Test 35, _TMP correctly returned sane looking value 0x00000b4c (289.2 degrees K) PASSED: Test 35, \_TZ_.BATZ._TMP correctly acquired and released locks 9 times. PASSED: Test 35, _TMP correctly returned sane looking value 0x00000aac (273.2 degrees K) PASSED: Test 35, \_TZ_.FDTZ._TMP correctly acquired and released locks 7 times. Test 46 of 93: Check _DIS (Disable). FAILED [MEDIUM] MethodShouldReturnNothing: Test 46, \_SB_.PCI0.LPCB.SIO_.COM1._DIS returned values, but was expected to return nothing. Object returned: INTEGER: 0x00000000 ADVICE: This probably won't cause any errors, but it should be fixed as the AML code is not conforming to the expected behaviour as described in the ACPI specification. FAILED [MEDIUM] MethodShouldReturnNothing: Test 46, \_SB_.PCI0.LPCB.SIO_.LPT0._DIS returned values, but was expected to return nothing. Object returned: INTEGER: 0x00000000 ADVICE: This probably won't cause any errors, but it should be fixed as the AML code is not conforming to the expected behaviour as described in the ACPI specification. Test 61 of 93: Check _WAK (System Wake). Test _WAK(1) System Wake, State S1. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(2) System Wake, State S2. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(3) System Wake, State S3. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(4) System Wake, State S4. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test _WAK(5) System Wake, State S5. ACPICA Exception AE_AML_INFINITE_LOOP during execution of method COMP WARNING: Test 61, Detected an infinite loop when evaluating method '\_WAK'. ADVICE: This may occur because we are emulating the execution in this test environment and cannot handshake with the embedded controller or jump to the BIOS via SMIs. However, the fact that AML code spins forever means that lockup conditions are not being checked for in the AML bytecode. Test 87 of 93: Check _BCL (Query List of Brightness Control Levels Supported). Package has 2 elements: 00: INTEGER: 0x00000000 01: INTEGER: 0x00000000 FAILED [MEDIUM] Method_BCLElementCount: Test 87, Method _BCL should return a package of more than 2 integers, got just 2. Test 88 of 93: Check _BCM (Set Brightness Level). ACPICA Exception AE_AML_PACKAGE_LIMIT during execution of method _BCM FAILED [CRITICAL] AEAMLPackgeLimit: Test 88, Detected error 'Package limit' when evaluating '\_SB_.PCI0.GFX0.DD02._BCM'. ==================================================================================================== ACPI table settings sanity checks. ---------------------------------------------------------------------------------------------------- Test 1 of 1: Check ACPI tables. PASSED: Test 1, Table APIC passed. Table ECDT not present to check. FAILED [MEDIUM] FADT32And64BothDefined: Test 1, FADT 32 bit FIRMWARE_CONTROL is non-zero, and X_FIRMWARE_CONTROL is also non-zero. Section 5.2.9 of the ACPI specification states that if the FIRMWARE_CONTROL is non-zero then X_FIRMWARE_CONTROL must be set to zero. ADVICE: The FADT FIRMWARE_CTRL is a 32 bit pointer that points to the physical memory address of the Firmware ACPI Control Structure (FACS). There is also an extended 64 bit version of this, the X_FIRMWARE_CTRL pointer that also can point to the FACS. Section 5.2.9 of the ACPI specification states that if the X_FIRMWARE_CTRL field contains a non zero value then the FIRMWARE_CTRL field *must* be zero. This error is also detected by the Linux kernel. If FIRMWARE_CTRL and X_FIRMWARE_CTRL are defined, then the kernel just uses the 64 bit version of the pointer. PASSED: Test 1, Table HPET passed. PASSED: Test 1, Table MCFG passed. PASSED: Test 1, Table RSDT passed. PASSED: Test 1, Table RSDP passed. Table SBST not present to check. PASSED: Test 1, Table XSDT passed. ==================================================================================================== Re-assemble DSDT and find syntax errors and warnings. ---------------------------------------------------------------------------------------------------- Test 1 of 2: Disassemble and reassemble DSDT FAILED [HIGH] AMLAssemblerError4043: Test 1, Assembler error in line 2261 Line | AML source ---------------------------------------------------------------------------------------------------- 02258| 0x00000000, // Range Minimum 02259| 0xFEDFFFFF, // Range Maximum 02260| 0x00000000, // Translation Offset 02261| 0x00000000, // Length | ^ | error 4043: Invalid combination of Length and Min/Max fixed flags 02262| ,, _Y0E, AddressRangeMemory, TypeStatic) 02263| DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 02264| 0x00000000, // Granularity ==================================================================================================== ADVICE: (for error #4043): This occurs if the length is zero and just one of the resource MIF/MAF flags are set, or the length is non-zero and resource MIF/MAF flags are both set. These are illegal combinations and need to be fixed. See section 6.4.3.5 Address Space Resource Descriptors of version 4.0a of the ACPI specification for more details. FAILED [HIGH] AMLAssemblerError4050: Test 1, Assembler error in line 2268 Line | AML source ---------------------------------------------------------------------------------------------------- 02265| 0xFEE01000, // Range Minimum 02266| 0xFFFFFFFF, // Range Maximum 02267| 0x00000000, // Translation Offset 02268| 0x011FEFFF, // Length | ^ | error 4050: Length is not equal to fixed Min/Max window 02269| ,, , AddressRangeMemory, TypeStatic) 02270| }) 02271| Method (_CRS, 0, Serialized) ==================================================================================================== ADVICE: (for error #4050): The minimum address is greater than the maximum address. This is illegal. FAILED [HIGH] AMLAssemblerError1104: Test 1, Assembler error in line 8885 Line | AML source ---------------------------------------------------------------------------------------------------- 08882| Method (_DIS, 0, NotSerialized) 08883| { 08884| DSOD (0x02) 08885| Return (0x00) | ^ | warning level 0 1104: Reserved method should not return a value (_DIS) 08886| } 08887| 08888| Method (_SRS, 1, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 1, Assembler error in line 9195 Line | AML source ---------------------------------------------------------------------------------------------------- 09192| Method (_DIS, 0, NotSerialized) 09193| { 09194| DSOD (0x01) 09195| Return (0x00) | ^ | warning level 0 1104: Reserved method should not return a value (_DIS) 09196| } 09197| 09198| Method (_SRS, 1, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1127: Test 1, Assembler error in line 9242 Line | AML source ---------------------------------------------------------------------------------------------------- 09239| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y21._MAX, MAX2) 09240| CreateByteField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y21._LEN, LEN2) 09241| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y22._INT, IRQ0) 09242| CreateWordField (CRES, \_SB.PCI0.LPCB.SIO.LPT0._CRS._Y23._DMA, DMA0) | ^ | warning level 0 1127: ResourceTag smaller than Field (Tag: 8 bits, Field: 16 bits) 09243| If (RLPD) 09244| { 09245| Store (0x00, Local0) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1128: Test 1, Assembler error in line 18682 Line | AML source ---------------------------------------------------------------------------------------------------- 18679| Store (0x01, Index (DerefOf (Index (Local0, 0x02)), 0x01)) 18680| If (And (WDPE, 0x40)) 18681| { 18682| Wait (\_SB.BEVT, 0x10) | ^ | warning level 0 1128: Result is not used, possible operator timeout will be missed 18683| } 18684| 18685| Store (BRID, Index (DerefOf (Index (Local0, 0x02)), 0x02)) ==================================================================================================== ADVICE: (for warning level 0 #1128): The operation can possibly timeout, and hence the return value indicates an timeout error. However, because the return value is not checked this very probably indicates that the code is buggy. A possible scenario is that a mutex times out and the code attempts to access data in a critical region when it should not. This will lead to undefined behaviour. This should be fixed. Table DSDT (0) reassembly: Found 2 errors, 4 warnings. Test 2 of 2: Disassemble and reassemble SSDT PASSED: Test 2, SSDT (0) reassembly, Found 0 errors, 0 warnings. FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 60 Line | AML source ---------------------------------------------------------------------------------------------------- 00057| { 00058| Store (CPDC (Arg0), Local0) 00059| GCAP (Local0) 00060| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00061| } 00062| 00063| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 174 Line | AML source ---------------------------------------------------------------------------------------------------- 00171| { 00172| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00173| GCAP (Local0) 00174| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00175| } 00176| 00177| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 244 Line | AML source ---------------------------------------------------------------------------------------------------- 00241| { 00242| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00243| GCAP (Local0) 00244| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00245| } 00246| 00247| Method (_OSC, 4, NotSerialized) ==================================================================================================== FAILED [HIGH] AMLAssemblerError1104: Test 2, Assembler error in line 290 Line | AML source ---------------------------------------------------------------------------------------------------- 00287| { 00288| Store (\_PR.CPU0.CPDC (Arg0), Local0) 00289| GCAP (Local0) 00290| Return (Local0) | ^ | warning level 0 1104: Reserved method should not return a value (_PDC) 00291| } 00292| 00293| Method (_OSC, 4, NotSerialized) ==================================================================================================== Table SSDT (1) reassembly: Found 0 errors, 4 warnings. PASSED: Test 2, SSDT (2) reassembly, Found 0 errors, 0 warnings. PASSED: Test 2, SSDT (3) reassembly, Found 0 errors, 0 warnings. ==================================================================================================== 3 passed, 10 failed, 0 warnings, 0 aborted, 0 skipped, 0 info only. ==================================================================================================== Critical failures: 1 method test, at 1 log line: 1449: Detected error 'Package limit' when evaluating '\_SB_.PCI0.GFX0.DD02._BCM'. High failures: 11 klog test, at 1 log line: 121: HIGH Kernel message: [ 3.512783] ACPI Error: Method parse/execution failed [\_SB_.PCI0.GFX0._DOD] (Node f7425858), AE_AML_PACKAGE_LIMIT (20110623/psparse-536) syntaxcheck test, at 1 log line: 1668: Assembler error in line 2261 syntaxcheck test, at 1 log line: 1687: Assembler error in line 2268 syntaxcheck test, at 1 log line: 1703: Assembler error in line 8885 syntaxcheck test, at 1 log line: 1716: Assembler error in line 9195 syntaxcheck test, at 1 log line: 1729: Assembler error in line 9242 syntaxcheck test, at 1 log line: 1742: Assembler error in line 18682 syntaxcheck test, at 1 log line: 1766: Assembler error in line 60 syntaxcheck test, at 1 log line: 1779: Assembler error in line 174 syntaxcheck test, at 1 log line: 1792: Assembler error in line 244 syntaxcheck test, at 1 log line: 1805: Assembler error in line 290 Medium failures: 9 mtrr test, at 1 log line: 76: Memory range 0xc0000000 to 0xdfffffff (PCI Bus 0000:00) has incorrect attribute Write-Combining. mtrr test, at 1 log line: 78: Memory range 0xfee01000 to 0xffffffff (PCI Bus 0000:00) has incorrect attribute Write-Protect. msr test, at 1 log line: 165: MSR SYSENTER_ESP (0x175) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0xffffffffffffffff). msr test, at 1 log line: 173: MSR MISC_ENABLE (0x1a0) has 1 inconsistent values across 2 CPUs for (shift: 0 mask: 0x400c51889). wmi test, at 1 log line: 528: GUID 2B814318-4BE8-4707-9D84-A190A859B5D0 is unknown to the kernel, a driver may need to be implemented for this GUID. method test, at 1 log line: 1002: \_SB_.PCI0.LPCB.SIO_.COM1._DIS returned values, but was expected to return nothing. method test, at 1 log line: 1011: \_SB_.PCI0.LPCB.SIO_.LPT0._DIS returned values, but was expected to return nothing. method test, at 1 log line: 1443: Method _BCL should return a package of more than 2 integers, got just 2. acpitables test, at 1 log line: 1643: FADT 32 bit FIRMWARE_CONTROL is non-zero, and X_FIRMWARE_CONTROL is also non-zero. Se

Read the article

value types in the vm

- by john.rose

value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases. Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both. We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types. These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im, /*Complex y:*/ double y_re, double y_im) { /*Complex z = x.minus(y):*/ double z_re = x_re - y_re, z_im = x_im - y_im; /*return z.abs():*/ return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable { // immutable component structure: public final double re, im; private Complex(double re, double im) { this.re = re; this.im = im; } // interoperability methods: public String toString() { return "Complex("+re+","+im+")"; } public List<Double> asList() { return Arrays.asList(re, im); } public boolean equals(Complex c) { return re == c.re && im == c.im; } public boolean equals(@ValueSafe Object x) { return x instanceof Complex && equals((Complex) x); } public int hashCode() { return 31*Double.valueOf(re).hashCode() + Double.valueOf(im).hashCode(); } // factory methods: public static Complex valueOf(double re, double im) { return new Complex(re, im); } public Complex changeRe(double re2) { return valueOf(re2, im); } public Complex changeIm(double im2) { return valueOf(re, im2); } public static Complex cast(@ValueSafe Object x) { return x == null ? ZERO : (Complex) x; } // utility methods and constants: public Complex plus(Complex c) { return new Complex(re+c.re, im+c.im); } public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); } public double abs() { return Math.sqrt(re*re + im*im); } public static final Complex PI = valueOf(Math.PI, 0.0); public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts. The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final. (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations. A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations. No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type. If a value-safe class is marked final, it is in fact a value type. All other value-safe classes must be abstract. The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex. Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable { // All methods listed here must get redefined. // Definitions must be value-safe, which means // they may depend on component values only. List<? extends Object> asList(); int hashCode(); boolean equals(@ValueSafe Object c); String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system. It could appear as an element type or parameter bound, for facilities which are designed to work on value types only. More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods. (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.) Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation. In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation. Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0); //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe") Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object(); // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi); // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi); // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi); // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero); // error: reference comparison on value type boolean z2 = (pi == null); // error: reference comparison on value type boolean z3 = (pi == obj2); // error: reference comparison on value type synchronized (pi) { } // error: synch of value, unpredictable result synchronized (obj2) { } // unpredictable result Complex qq = pi; qq = null; // possible NPE; warning: “null-unsafe" qq = (Complex) obj; // warning: “null-unsafe" qq = Complex.cast(obj); // OK @SuppressWarnings("null-unsafe") Complex empty = null; // possible NPE qq = empty; // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations. Fields and variables of value types can be split into their unboxed components. Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules? Why not detect programs automagically and perform unboxing transparently? The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced. Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal. In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur. This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer. Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules. There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme. Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions. One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type. That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances. Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe. Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around. Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed. This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods. This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors. The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type. Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands. When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not. If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible. Proxies must be managed carefully, and this can be expensive. On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces. (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.) Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems. They require the deepest transformations, relative to today’s JVM. There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics. It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form. Such value-safe arrays would not be convertible to Object[] arrays. Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex. I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code. On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly. Likewise, there are rules for providing default access modifiers for interface members. Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types. Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType { // implicitly final public double re, im; // implicitly public final //implicit methods are defined elementwise from te fields: // toString, asList, equals(2), hashCode, valueOf, cast //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner. The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer. A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point. The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage? Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number. Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”). What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple { double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly. The supertype is an abstract class which has suitable shared declarations. The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it. The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types. (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language. (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.) At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists. Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection. That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types. Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API. The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components that appear as parameters, return types, and array elements. This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations. A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile. Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type. It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack. If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction? Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot. But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes. I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value. Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType { protected final long header; protected private final BigInteger digits; public DecimalValue valueOf(int value, int scale) { assert(scale >= 0); return new DecimalValue(((long)value << 32) + scale, null); } public DecimalValue valueOf(long value, int scale) { if (value == (int) value) return valueOf((int)value, scale); return new DecimalValue(-scale, new BigInteger(value)); } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case. It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues. (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.) Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits. For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types. This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant. An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes. More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable, Unsynchronizable, Interchangeable { … public class Complex implements ValueType { // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types. This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable. It is likely that code which violates value-safety for wrapper types exists but is uncommon. It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code. The design presented above obscures that distinction. As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations. Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed. For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi; // convert to boxed myList.add(boxPi); complex z = myList.get(0); // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM. Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type. Something like this which can also accommodate value type components seems worthwhile. On the other hand, does it make sense for value types to contain short arrays? And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure. These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types. In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids. No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now. Back to work!

Read the article

Authoritative sources about Database vs. Flatfile decision

- by FastAl

<tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

Read the article

How do I prove I should put a table of values in source code instead of a database table?

- by FastAl

<tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

Read the article

You Might Be a DBA

- by BuckWoody

With all apologies to Jeff Foxworthy, I was up late Friday night on a holiday weekend (which translated into T-SQL becomes “Maintenance Window”) and I got bored in between the two or three minutes I had between clicks. So I started a “Twitter” meme – and it just took off. I haven’t cleaned these up much, but here, in author order as of Saturday the 29th of May is the list “You might be a DBA” from around the Twitterverse: buckwoody Your two main enemies are developers and SAN admins #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You always plan an exit strategy, even when entering a McDonald's #youmightbeaDBA buckwoody You can't explain to your family what you really do for a living #youmightbeaDBA buckwoody You have at least one set of scripts you won't share #youmightbeaDBA buckwoody You have an opinion on the best code-beautifier #youmightbeaDBA buckwoody You have children older than the rest of your team #youmightbeaDBA buckwoody You and the Oracle DBA would kill each other, but you'll happily fight off a developer together first #youmightbeaDBA buckwoody You've threatened to quit if they give anyone the sa password on production #youmightbeaDBA buckwoody You've sent a vendor suggestions on improving their database design or code (and been ignored) #youmightbeaDBA buckwoody You've sent a vendor suggestions on improving their database design or code (and been ignored) #youmightbeaDBA buckwoody You have an opinion on the best code-beautifier #youmightbeaDBA buckwoody You have at least one set of scripts you won't share #youmightbeaDBA buckwoody You refer to co-workers as "carbon-units" #youmightbeaDBA buckwoody Being paranoid is on your resume at the top #youmightbeaDBA buckwoody Everyone comes to your cube to find the MSDN DVD's #youmightbeaDBA buckwoody You always plan an exit strategy, even when entering a McDonald's #youmightbeaDBA buckwoody You've worn down developers to get your way by explaining normalization levels #youmightbeaDBA buckwoody You refer to clothes as "Data Abstractions" #youmightbeaDBA buckwoody Users pester you to be able to put data in a database, then they pester you to take it out and put it in Excel #youmightbeaDBA buckwoody Others try to de-duplicate data, you try to copy it to more than three locations #youmightbeaDBA buckwoody You have at least one DLT tape in the trunk of your car #youmightbeaDBA buckwoody You use twitter and facebook to talk with colleagues because there's no one else in your company that does what you do #youmightbeaDBA buckwoody Your spouse knows what "ETL" means #youmightbeaDBA buckwoody You've referred to yourself as the "Data Janitor" #youmightbeaDBA buckwoody You don't have positive connotations of the word "upgrade" #youmightbeaDBA buckwoody You get your coffee before you check your servers, because you know you won't get any if you don't #youmightbeaDBA buckwoody You always come to work through the back door so no one hijacks you on the way to your cube #youmightbeaDBA buckwoody You check your server logs before you check your e-mail in the morning so you can reply "Yeah, I already fixed that." #youmightbeaDBA buckwoody You have more conference badges than clean socks #youmightbeaDBA buckwoody Your coffee mug says "It depends" #youmightbeaDBA buckwoody You can convince a boss that you need 16GB of RAM in your laptop #youmightbeaDBA buckwoody You've used ebay to find production equipment #youmightbeaDBA buckwoody You pad all project timelines by 2X, and you still miss them #youmightbeaDBA buckwoody You know when your company is acquiring another even before the CFO #youmightbeaDBA buckwoody You pad all project timelines by 2X, and you still miss them #youmightbeaDBA buckwoody You call aspirin "work vitamins" #youmightbeaDBA buckwoody You get the same amount of sleep even after you have a child #youmightbeaDBA buckwoody You obsess about performance metrics from over one year ago #youmightbeaDBA buckwoody The first thing you buy after the database software is aftermarket tools to manage the database software #youmightbeaDBA buckwoody You've tried to convince someone else to become a DBA #youmightbeaDBA buckwoody You use twitter and facebook to talk with colleagues because there's no one else in your company that does what you do #youmightbeaDBA buckwoody You only know other DBA's by their Tweet Handle #youmightbeaDBA buckwoody You've explained the difference between 32 and 64-bit to more than one manager in terms they can understand, using puppets #youmightbeaDBA buckwoody Your two main enemies are developers and SAN admins #youmightbeaDBA buckwoody You've driven to the Datacenter to install SQL Server because "you don't trust those NOC admins" #youmightbeaDBA buckwoody You pay more for faster Internet connections than cable at home so you don't have to drive in #youmightbeaDBA buckwoody You call texting a "queuing system" #youmightbeaDBA buckwoody You know that if someone can read Perl, they manage an Oracle system #youmightbeaDBA buckwoody You have an e-mail rule for backup notifications #youmightbeaDBA buckwoody Your food pyramid includes coffee, salt and fat #youmightbeaDBA buckwoody You wish everything had a graphical query plan #youmightbeaDBA buckwoody You refactor your e-mails #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You would pay money for a license plate that has the letters S-Q-L together #youmightbeaDBA buckwoody You have actually considered making a RAID array from thumb drives #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody You've written blog posts on technology you've never actually implemented in production #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody @MidnightDBA Click the #youmightbeaDBA tag. I've had WAY too much coffee today. buckwoody There is no other position that is 1-deep except you and the CEO #youmightbeaDBA buckwoody When you watch "The Office" you call it "OJT" #youmightbeaDBA buckwoody You would pay money for a license plate that has the letters S-Q-L together #youmightbeaDBA buckwoody Your blog would make a "best practices" or "worst practices" book #youmightbeaDBA buckwoody You have actually considered making a RAID array from thumb drives #youmightbeaDBA buckwoody The first thing you install on your netbook is SSMS #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody Your watch is set to UTC because it's just easier #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You think data can be represented as something OTHER than XML #youmightbeaDBA buckwoody You tell people that you made a database query go faster, and expect them to be happy for you #youmightbeaDBA buckwoody You take the word "NoSQL" as a personal attack #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody * == bad #youmightbeaDBA buckwoody * == bad #youmightbeaDBA buckwoody There are just as many females in your technical field as males #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You think that something OTHER than the database might be the performance bottleneck #youmightbeaDBA buckwoody You refer to time as a "Clustered Index" #youmightbeaDBA buckwoody You know why "user" refers to both business people and crack addicts #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You can't explain to your family what you really do for a living #youmightbeaDBA buckwoody You tell people that you made a database query go faster, and expect them to be happy for you #youmightbeaDBA buckwoody You think a millisecond is a really long time #youmightbeaDBA buckwoody You're sitting and typing #youmightbeaDBA when you could be outside #youmightbeaDBA buckwoody You can't wait for a technical conference so you can wear a kilt - and you're not Scottish #youmightbeaDBA buckwoody You know that "DBA" stands for "Default Blame Acceptor" #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You know what "the truth, thole truth and nothing but the truth, so help me Codd" means #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You can't talk fast enough to get a concept out of your head so you tweet it instead #youmightbeaDBA buckwoody You cry when someone doesn't use a WHERE clause #youmightbeaDBA buckwoody You think data can be represented as something OTHER than XML #youmightbeaDBA buckwoody You think "Set theory" is not an verb but a noun #youmightbeaDBA buckwoody You try to convince random strangers to vote on your Connect item #youmightbeaDBA buckwoody You think 3 hours of contiguous sleep is a good thing #youmightbeaDBA or #youmightbeamother buckwoody You don't like Oracle, and not just because of what she did to Neo #youmightbeaDBA buckwoody You know when to say "sequel" and "s-q-l" #youmightbeaDBA buckwoody You know where the data is #youmightbeaDBA buckwoody You refer to your children as "Fully Redundant Mirrors" #youmightbeaDBA buckwoody Holiday == "Maintenance Window" #youmightbeaDBA buckwoody Your laptop is more powerful than the servers in most companies - including your own #youmightbeaDBA buckwoody You capitalize SELECTed words #youmightbeaDBA buckwoody You take the word "NoSQL" as a personal attack #youmightbeaDBA buckwoody You know why "user" refers to both business people and crack addicts #youmightbeaDBA buckwoody You cringe in public when the word "upgrade" is used in a sentence #youmightbeaDBA buckwoody Holiday == "Maintenance Window" #youmightbeaDBA buckwoody All Data Is MetaData means something to you #youmightbeaDBA buckwoody You've never seen the driveway to your house in the daylight #youmightbeaDBA buckwoody You think that something OTHER than the database might be the performance bottleneck #youmightbeaDBA buckwoody Most of your bloodstream is composed of caffeine #youmightbeaDBA buckwoody Your task list is labeled "CRUD Matrix" #youmightbeaDBA buckwoody You call your wife/husband a "Linked Server" #youmightbeaDBA anonythemouse When someone tells you they are going to take a dump and you wonder of which database then #youmightbeaDBA anonythemouse When it's 11pm on a holiday weekend and you are working #youmightbeaDBA anonythemouse When you sit down at a table and look for it's primary key #youmightbeaDBA anonythemouse When getting milk from the fridge you check the expiry date is > getdate() #youmightbeaDBA blakmk when you wake up dreaming about sql #youmightbeaDBA CharlesGarver You think a @buckwoody bobblehead would be a cool thing to have on the dashboard of your car #youmightbeaDBA CharlesGarver Your friends don't understand why you think there's a difference between single and double quotes #youmightbeaDBA CharlesGarver Even the newest employees know your name from all the downtime notices you've sent out #youmightbeaDBA CharlesGarver You sometimes feel anxious and think "I should test restoring those backups" and then the feeling passes #youmightbeadba CharlesGarver You know what a co-worker means when they ask "how is your squirrel server?" #youmightbeadba CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver You're willing to move someone's job up in priority for a box of #voodoodonuts #youmightbeaDBA CharlesGarver Each person in your company seems to think you work for THEM #youmightbeaDBA CharlesGarver You have a Love/Hate relationship going on with #Microsoft #youmightbeaDBA CharlesGarver People ask you to troubleshoot their Access program #youmightbeaDBA CharlesGarver The first words you hear in the morning are 'your voicemail box is full' #youmightbeaDBA CharlesGarver The thought of disrupting 500 people's work so you can do something doesn't phase you #youmightbeaDBA CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver Your home computer is backed up in 3 different places #youmightbeaDBA CharlesGarver Your wardrobe for work includes pajamas #youmightbeaDBA CharlesGarver Someone tells you to look in the INDEX and you look puzzled before finally going to the back of the book. #youmightbeaDBA chuckboycejr If you have ever set up a SQLAgent job to email your mobile phone to serve as an alarm clock #youmightbeaDBA chuckboycejr If you'd rather meet Itzik than Jay Z #youmightbeaDBA chuckboycejr If you'd rather meet Itzik than Jay Z #youmightbeaDBA chuckboycejr If you'd wrestle a SysAdmin to the ground to implement #DPA best practices as per @aspiringgeek #youmightbeaDBA databaseguy I need to be up in 7 hours, so I'm off to bed! I'll have to read the rest of @buckwoody's #youmightbeaDBA posts in the AM. (g'night Buck!) databaseguy When people ask you about your house, the first thing you describe is the network. #youmightbeaDBA databaseguy The last thing you say at the office each day is, "is anybody else here? I'm shutting off the lights!" #youmightbeaDBA databaseguy Your blood pressure rises when you read application specs drafted by marketing. #youmightbeaDBA databaseguy A good day at work is one when nobody pays you no mind. #youmightbeaDBA databaseguy You care about latches and wait states. #youmightbeaDBA databaseguy You have worked over 200 hours on a performance tuning project that required no application changes at all. #youmightbeaDBA databaseguy The late-night security guard knows the names of your spouse and kids. #youmightbeaDBA databaseguy You have had vigorous debates about whether it should be pronounced "sequel" or "ess-queue-ell". #youmightbeaDBA databaseguy You have VPN and RDP software installed on your phone ... just in case. #youmightbeaDBA databaseguy You have edited a data file by hand, just to see what would happen. #youmightbeaDBA databaseguy You decorate your office walls with database catalog posters. #youmightbeaDBA databaseguy You've built programs that access data just to keep other developers from asking you to run queries all the time. #youmightbeaDBA databaseguy When you watch movies like The Matrix, you find yourself calculating the fasibility of storing all that data. #youmightbeaDBA databaseguy You have tried to convince someone to spend money on an SSD storage array. #youmightbeaDBA databaseguy When CPU is spiked on a server, you want to gather forensic evidence. #youmightbeaDBA databaseguy You have to remind developers not to push code to production without checking if the database is ready. #youmightbeaDBA databaseguy Nobody cares what you wear to work, as long as the thing keeps running. #youmightbeaDBA databaseguy Telepathy is a job requirement when working with app dev teams. #youmightbeaDBA databaseguy You read database statistics for the educational value. #youmightbeaDBA databaseguy And your boss freely admits this to anyone within earshot. #youmightbeaDBA databaseguy Your boss cannot explain or understand what you do. #youmightbeaDBA databaseguy You envision ERDs when you see a GUI. #youmightbeaDBA databaseguy You say things like "applications come and go, but data lasts forever." #youmightbeaDBA databaseguy You have memorized the names of several of the AdventureWorks employees. #youmightbeaDBA databaseguy You know what MAXDOP setting you can get away with for a big query based on current server load. #youmightbeaDBA databaseguy And you immediately recognize the recursion in my last tweet. #youmightbeaDBA databaseguy You find 50 simultaneous tweets from @buckwoody about #youmightbeaDBA :O) DBAishness You have "funny stories" about the times your developers accidentally deleted the T-log in their test environment. #youmightbeaDBA DBAishness Planning to slice and dice your MDW data with PowerPivot makes you giggle like a schoolgirl. #youmightbeaDBA donalddotfarmer You think @buckwoody lives in the "real world." #youmightbeaDBA jamach09 @buckwoody #youmightbeaDBA Why go outside when you can sit in the nice cool server room? jamach09 If you refer to procreation as "Replication", #youmightbeaDBA. jamach09 If you think ORM is a four-letter word, #youmightbeaDBA JamesMarsh If you have ever preached the value of Source Code Control, #YouMightBeADBA jethrocarr @venzann You store your shopping list in a ACID compliant DB #youmightbeaDBA joe_positive @buckwoody thought it stood for "Don't Bother Asking" #youmightbeaDBA joe_positive when you check your IT Events Calendar before making weekend plans #youmightbeaDBA LadyRuna You cringe whenever someone calls Excel a database #youmightbeaDBA LadyRuna When the waiter says he'll be your server today, you ask how many terabytes he is #youmightbeaDBA LadyRuna you always call the asterisk a "Star" #youmightbeaDBA LadyRuna You walk into a server room, say "Nice RACK!" and everyone there knows you're talking about server rack... #youmightbeaDBA LadyRuna You receive more messages from servers than from friends #youmightbeaDBA LadyRuna hmmm... #youmightbeaDBA if your recipe for gumbo is "SELECT * FROM Refrigerator" markjholmes @SQLSoldier Heh. #youmightbeaDBA if you correct other DBAs' spelling of @PaulRandal markjholmes #youmightbeaDBA if you actually test RAID5 vs RAID10 on your SAN because when it comes to configuration, "it depends." markjholmes #youmightbeaDBA if you have at least 3 definitions of the word "cluster" MarlonRibunal 3 Words: @BrentO, snicker, & Access #youmightbeaDBA MarlonRibunal @onpnt @mikeSQL my appeal was a couple of mins late. Enjoying #youmightbeaDBA MarlonRibunal @mikeSQL @onpnt pls, don't mention bacon #youmightbeaDBA merv @buckwoody You HATE 3-way joins #youmightbeaDBA MidnightDBA If you're up at midnight Tweeting about SQL #youmightbeaDBA MidnightDBA @buckwoody I'd noticed that. :) #youmightbeaDBA mikeSQL when people talk about "their type" you're thinking varchar, bigint, binary, etc #youmightbeadba mikeSQL people ask you to go to lunch , but you can't go because you're attending #SQLlunch #youmightbeadba mikeSQL you laugh for hours at all of the #sqlmoviequotes ....things in which a normal individual would scratch their head at. #youmightbeadba mikeSQL you laugh for hours at all of the #sqlmoviequotes ....things in which a normal individual would scratch their head at. #youmightbeadba mrdenny If you think that @buckwoody's demo using PowerPivot to analyze index usage data from DMVs is awesome then #youmightbeaDBA mrdenny You wish @PaulRandal still worked at Microsoft so that they would make a bobble head of him #youmightbeadba mrdenny When it's 11pm on a holiday weekend, and your posting stupid jokes on Twitter then #youmightbeadba mrdenny If you go out with friends and wonder why no one's wearing a kilt then #YouMightBeADBA mrdenny You can't do basic math, but you know off the top of your head how many CALs $14,412 can buy you. #YoumightbeaDBA mrdenny If you've ever setup a SQL Job to email you to get you out of a regularly scheduled meeting #YouMightBeADBA. mrdenny You throw up in your mouth a little when ever you here the word "Access". Even if it doesn't relate to a MS product. #YouMightBeADBA msdtjones You spend more time listening to @buckwoody than your wife #youmightbeaDBA NFDotCom You perform "hail deltas" on a regular basis. #YouMightBeADBA NoelMcKinney If you tell your wife you want to go to Columbus Ohio for your wedding anniversary so you can attend #sqlsat42 then #youmightbeaDBA NoelMcKinney You read a union is on strike and wonder if it's a UNION ALL #youmightbeaDBA NoelMcKinney You read a union is on strike and wonder if it's a UNION ALL #youmightbeaDBA NoelMcKinney Someone asks you to throw another log on the fire and you tell them not to worry about it because Autogrowth is turned on #youmightbeaDBA Nuurdygirl Even if you have a girlfriend...its possible #youmightbeadba. Yeah-i said its possible! Nuurdygirl When your girlfriend has to lean around the laptop to kiss you goodnight #youmightbeadba Old_Man_Fish If you worry about how big your package is and how long it takes to finish #youmightbeaDBA Old_Man_Fish If you no longer wonder if someone is in trouble or died if you are getting calls at 2AM #youmightbeaDBA Old_Man_Fish If, when you hear the word ACCESS with no connotation you blood pressure jumps 50 points, #youmightbeaDBA onpnt When you hear the word inject you immediately get concerned if your databases are OK #youmightbeaDBA onpnt Your servers haven't been rebooted in a year #youmightbeaDBA onpnt You know why it's funny when @PaulRandal has the word, "Sheep" in a tweet #youmightbeaDBA onpnt You have read BOL without actually having a problem to figure out #youmightbeaDBA onpnt You can type "SELECT columns FROM tables" without typos but tipen ni Banglish ares a messis #youmightbeaDBA onpnt DR strategies doesn't include the word, RAID in them #youmightbeaDBA onpnt you can move a SQL Server instance to a new server without the users ever knowing #youmightbeaDBA onpnt You have made an SSIS package that is more than one step #youmightbeaDBA onpnt You have the balls to say no to your boss when they ask for the sa password #youmightbeaDBA onpnt you google to trouble shoot a problem and end up at your own blog (and it fixes it) #youmightbeaDBA onpnt You talk your wife into moving the family vacation a week earlier so you can attend the areas local SSUG meeting #youmightbeaDBA onpnt you can explain to a nontechnical person what a deadlock is #youmightbeaDBA onpnt You hope a girl asks you what your collation is #youmightbeaDBA onpnt you make jokes that include the words shrink, truncate and 1205. And you are the only one that laughs at them #youmightbeaDBA onpnt You rate your ability to stay awake to work longer on blogs, twitter, forums and your day to day job with the 5 9's goal #youmightbeaDBA onpnt you have major surgery and beg the doctor to release you back to work 5 days later because you miss your servers #youmightbeaDBA #TrueStory onpnt You do have backups and you know how to use them #youmightbeaDBA onpnt It's the network #youmightbeaDBA onpnt When the developers get to work your mood changes rapidly #youmightbeaDBA onpnt When someone says, "PASS", you first think of karaoke #youmightbeaDBA onpnt Recruiters try to get you to call them *just* because they think you'll give them @BrentO contact info #youmightbeaDBA onpnt You chuckle every time you go to grab the "CLR" Calcium, Lime and Rust Remover to clean something #youmightbeaDBA onpnt @MarlonRibunal @mikeSQL Sorry man, it was already in motion ;-) #youmightbeaDBA onpnt When you have an "I love bacon" sticker on your laptop. #youmightbeaDBA http://twitpic.com/1ry671 onpnt You sing SELECT statements in the shower #youmightbeaDBA onpnt When you see a chicken it doesn't remind you of food. It reminds you of a guy named Jorge #youmightbeaDBA onpnt At time, SQL is your mistress #youmightbeaDBA onpnt Your wife wonders if SQL is the code name of your mistress at times #youmightbeaDBA onpnt it's Friday and you are on twitter thinking really hard about what would be funny for hash tag #youmightbeaDBA onpnt You organize your wife's "decorative"pillows on the bed in a B-Tree structure #youmightbeaDBA PaulWhiteNZ If you: SELECT TOP (1) milk FROM fridge WHERE use_by_date >= GET_DATE() ORDER BY use_by_date ASC #YouMightBeaDBA RonDBA #youmightbeaDBA if you read @buckwoody's and @BrentO's blogs. ryaneastabrook @buckwoody omg, you have to stand up a website with these on them, they are awesome #youmightbeaDBA soulvy @StrateSQL @LadyRuna Or a "Splat" #youmightbeaDBA speedracer You can still fall asleep after three cups of coffee #youmightbeaDBA speedracer You retweet @buckwoody on a Friday night #youmightbeaDBA speedracer You can still fall asleep after three cups of coffee #youmightbeaDBA speedracer Developers make you twitch #youmightbeaDBA sqlagentman You know what X/1024*8 is. #YouMightBeADBA SqlAsylum Your still in the office at 5:00 on memorial day weekend. #youmightbeadba :) SQLBob Whenever someone you know gets pregnant you bring up INNER JOINs or SQL Injection attacks... #youmightbeaDBA SQLChicken You know one or more SQL folks in the community with an animal in their username #youmightbeaDBA SQLChicken You've used one or more car analogies to explain how a database works #youmightbeaDBA SQLChicken “@sqljoe: #youmightbeaDBA if you applied to attend #sqlu and requested @SQLChicken to pull strings for you” lmao nice! SQLChicken When talking about SSIS your discussions break down into various jokes about packages #youmightbeaDBA SQLChicken Just SEEING the code for cursors makes you break out in hives #youmightbeaDBA SQLChicken Just SEEING the code for cursors makes you break out in hives #youmightbeaDBA SQLCraftsman You coined the phrase "Magic SAN Dust" because calling a vendor's marketing claims BS is not acceptable in a meeting. #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA SQLCraftsman You really own a "Stick of Much Developer Whacking" #YouMightBeADBA SQLCraftsman You coined the phrase "Magic SAN Dust" because calling a vendor's marketing claims BS is not acceptable in a meeting. #YouMightBeADBA SQLCraftsman Default Blame Acceptor #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA SQLCraftsman Default Blame Acceptor #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA sqljoe #youmightbeaDBA if you wished your wife knew T-sql. USE ShoppingList SELECT NecessaryItems from Supermarket WHERE Category<> ("junk food") sqljoe #youmightbeaDBA if the first thing you kiss when you wake up is your mobile for not waking you up in the middle of the night sqljoe #youmightbeaDBA if your wife has a "Do Not Fly" family vacation list of her own including your laptop and mobile sqljoe #youmightbeaDBA if you have researched for DBA Anonymous groups and attended a #SSUG willing to drop your database (vice) sqljoe #youmightbeaDBA if your only maintenance windows are staff meetings sqljoe #youmightbeaDBA if you think of yourself as "The One" in The Matrix "balancing the equation" from The Architect's (developers) poor coding sqljoe #youmightbeaDBA if you think @PaulRandal should have played the Oracle in The Matrix sqljoe #youmightbeaDBA if home CD & Movie collection is stored in secured containers,in logical order & naming convention,and with a backup copy sqljoe #youmightbeaDBA if you applied to attend #sqlu and requested @SQLChicken to pull strings for you sqljoe #youmightbeaDBA if you have tried to TiVo @MidnightDBA broadcasts sqljoe #youmightbeaDBA if your #sql user group feels like #AA meetings sqljoe #youmightbeaDBA if you thought of bringing your #sql books to #sqlsaturday and #sqlpass for autographs sqljoe #youmightbeaDBA if #sqlpass feels like the #oscars sqljoe #youmightbeaDBA if you are proud of your small package SQLLawman #youmightbeaDBA when you hear MDX and Acura is not first thought that comes to mind. sqlrunner If your wife double checks that there isn't a SQLSat within 200 miles of your vacation destination #youmightbeaDBA sqlrunner When you're on a conference call and your wife thinks your speaking in a foreign language #youmightbeaDBA sqlrunner When you're on a conference call and your wife thinks your speaking in a foreign language #youmightbeaDBA sqlrunner You treat the word 'access' as a verb, not a noun #youmightbeaDBA sqlrunner If you are happy with sub-second performance #youmightbeaDBA sqlrunner When you know the names of the NOC people AND their families #youmightbeadba sqlrunner When you know the names of the NOC people AND their families #youmightbeadba sqlrunner Your company set's up international phone coverage for your cruise #youmightbeaDBA sqlsamson @buckwoody if your manager asks you for data and you respond with "there's a script for that" #youmightbeadba sqlsamson @buckwoody If you receive more messages from your server then your spouse #youmightbeadba SQLSoldier You've spent all night Valentines Day upgrading the SQL Servers and forgot to tell your wife you'd be working late. #youmightbeadba SQLSoldier You're flattered when someone calls you a geek. #youmightbeadba SQLSoldier @llangit @mrdenny it's 11pm on a holiday weekend, & your reading stupid jokes on Twitter then #youmightbeadba SQLSoldier Your manager borrows lunch money from you because your salary is 30% higher than his. #youmightbeaDBA SQLSoldier You think "intellisense" is a double negative because it's not intelligent nor makes sense. #youmightbeaDBA SQLSoldier 75% of the emails you receive at home have the phrase "now following you on Twitter!" in the subject line. #youmightbeaDBA SQLSoldier You petition Ken Burns to remake Office Space because it should have been 18 hours long. #youmightbeaDBA SQLSoldier You select a candidate for a Jr DBA position because his resume said he's willing to get your coffee. #youmightbeaDBA SQLSoldier Somebody misquotes @PaulRandall and you call him on your cell to verify. #youmightbeaDBA SQLSoldier You wish the elevator in your building was slower because it's the last time you'll be left alone all day. #youmightbeaDBA SQLSoldier The developers sacrifice small animals before giving you their code for review. #youmightbeaDBA SQLSoldier Developers bring you coffee and a BLT when you review their code. #youmightbeaDBA #IWish SQLSoldier You can get out of any family get-together by saying you have to work and nobody questions it. #youmightbeaDBA SQLSoldier You've requested a HP Superdome for you "test" box. #youmightbeaDBA SQLSoldier Your leave work early because your internet connection to the data center is better at home #youmightbeaDBA SQLSoldier The new CEO asks you to justify your salary, so you go on vacation for 2 weeks. And he never questions you again. #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier A dev. asks if you've heard about some great new feature in SQL and you show the 16 blog posts you wrote on it ... last year #youmightbeaDBA SQLSoldier Your dev team is still testing SQL 2008 and you're already planning for SQL 11. #youmightbeaDBA #TrueStory SQLSoldier The new CEO asks you to justify your salary, so you go on vacation for 2 weeks. And he never questions you again. #youmightbeaDBA SQLSoldier Your dev team is still testing SQL 2008 and you're already planning for SQL 11. #youmightbeaDBA SQLSoldier You use a cell phone service coverage map to plan your next vacation. #youmightbeaDBA SQLSoldier You come in to work at 7 AM because it gives you at least 3 hours without any developers around. #youmightbeaDBA SQLSoldier You figure out a way to make take your wife on a cruise and deduct it as a business expense. #youmightbeaDBA #sqlcruise SQLSoldier You name your cat SQLDog because the name @SQLCat was already taken. #youmightbeaDBA SQLSoldier You rate your blog posts based on the number of retweets you get. #youmightbeaDBA SQLSoldier You disable random logins just to mess with people. #youmightbeaDBA SQLSoldier You fall for the pickup line, "Hey baby, what's your collation?" #youmightbeaDBA SQLSoldier You can blame an outage on anyone in the company because you're the only one that knows how to find out what really happened #youmightbeaDBA SQLSoldier You can blame an outage on anyone in the company because you're the only one that knows how to find out what really happened #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier Your leave work early because your internet connection to the data center is better at home #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier Your think the 4 food groups are coffee, bacon, fast food, and Mountain Dew. #youmightbeaDBA SQLSoldier You tell someone your job title and they ask "What?" You describe it and they ask "What?". So you say "computer geek". #youmightbeaDBA SQLSoldier The #1 referrer to your blog is Twitter.com. #youmightbeaDBA SQLSoldier Your idea of a good time on a Saturday involves free training. #youmightbeaDBA #sqlsat43 SQLSoldier You write a book that all of your co-workers have and none have read it. #youmightbeaDBA SQLSoldier You write a book that sells a couple thousand copies and is heralded a best seller. #youmightbeaDBA SQLSoldier No matter how sick you are, you go to work if it's time to pass the pager on to the next guy. #youmightbeaDBA #TrueStory SQLSoldier You go out on the town, and strangers walk up to you and say, "Hey you're that SQL guy" #youmightbeaDBA #TrueStory SQLSoldier Your wife asks you to fix something, and you request a downtime window. #youmightbeaDBA SQLSoldier Your wife asks when you'll be home, and you tell her that you wish you knew. #youmightbeaDBA SQLSoldier Your best pickup line, "Hey baby, what's your collation?" #youmightbeaDBA SQLSoldier Your wife asks when you'll be home, and you tell her that you wish you knew. #youmightbeaDBA SQLSoldier You know that @BuckWoody is not someone's porno name. #youmightbeaDBA SQLSoldier You list TSQL as your native language on the 2010 census. #youmightbeaDBA SQLSoldier Starbucks' stock price drops every time you go on vacation. #youmightbeaDBA SQLSoldier You're happy when the web master says that the website is down. #youmightbeaDBA SQLSoldier You know that @BuckWoody is not someone's porno name. #youmightbeaDBA SQLSoldier You get mad when someone calls your car a "heap" because you've always considered it to be a "clustered index". #youmightbeaDBA SQLSoldier Your blog has more hits than your company's website. #youmightbeaDBA SQLSoldier You systematically remove the asterisk key from all keyboards in the company except yours. #youmightbeaDBA SQLSoldier When asked if you recycle, you reply that you run sp_cycle_errorlog every night at midnight #youmightbeaDBA SQLSoldier You wouldn't allow someone named @AdamMachanic to work on your car. #youmightbeaDBA SQLSoldier You switch offices every 3 days to avoid developers #youmightbeaDBA SQLSoldier PSS has your number on speed dial. #youmightbeaDBA SQLSoldier You frown when you they tell Neo that he's going to the Oracle #youmightbeaDBA swhaley you regretted saying "This shouldn't effect production" #youmightbeaDBA swhaley you regretted saying "This shouldn't effect production" #youmightbeaDBA Tarwn A pleasurable saturday means spending the day learning more about what you already do the rest of the week #youmightbeaDBA ...oh, wait... thelostforum For great justice; all our base are belong to YOU !! #youmightbeadba thelostforum @SQLSoldier: You need a witness to use a mirror #youmightbeaDBA ;) TimCost you capitalize key words. always. everywhere. you can't help it, usually don't even notice. #youmightbeaDBA Toshana Your the only one in your company not impressed with the developers new application. #youmightbeaDBA venzann Coming soon from a (respected) book publisher - @buckwoody's #youmightbeaDBA venzann He's on a role tonight. @buckwoody is summing up my life with his #youmightbeaDBA tweets... venzann I love the #youmightbeaDBA tag. Found at least 6 new DBAs to follow.. venzann He's on a role tonight. @buckwoody is summing up my life with his #youmightbeaDBA tweets... venzann You use #sqlhelp as a primary resource during troubleshooting #youmightbeaDBA venzann You insist on stricter password security for your sql servers than you implement on your own laptop #youmightbeaDBA WesBrownSQL @buckwoody you are up so late the only tweets you see are from @buckwoody #youmightbeaDBA WesBrownSQL @SQLSoldier you are upgrading all your 2005 prod servers to 2008 R2 on a three day weekend... #youmightbeaDBA zippy1981 #youmightbeaDBA if everytime you do something with #mongodb you think of the Vulcan proverb "only Nixon could go to China." Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

Bulk inserting best way to about it? + Helping me understand fully what I found so far

- by chobo2

Hi So I saw this post here and read it and it seems like bulk copy might be the way to go. http://stackoverflow.com/questions/682015/whats-the-best-way-to-bulk-database-inserts-from-c I still have some questions and want to know how things actually work. So I found 2 tutorials. http://www.codeproject.com/KB/cs/MultipleInsertsIn1dbTrip.aspx#_Toc196622241 http://www.codeproject.com/KB/linq/BulkOperations_LinqToSQL.aspx First way uses 2 ado.net 2.0 features. BulkInsert and BulkCopy. the second one uses linq to sql and OpenXML. This sort of appeals to me as I am using linq to sql already and prefer it over ado.net. However as one person pointed out in the posts what he just going around the issue at the cost of performance( nothing wrong with that in my opinion) First I will talk about the 2 ways in the first tutorial I am using VS2010 Express, .net 4.0, MVC 2.0, SQl Server 2005 Is ado.net 2.0 the most current version? Based on the technology I am using, is there some updates to what I am going to show that would improve it somehow? Is there any thing that these tutorial left out that I should know about? BulkInsert I am using this table for all the examples. CREATE TABLE [dbo].[TBL_TEST_TEST] ( ID INT IDENTITY(1,1) PRIMARY KEY, [NAME] [varchar](50) ) SP Code USE [Test] GO /****** Object: StoredProcedure [dbo].[sp_BatchInsert] Script Date: 05/19/2010 15:12:47 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER PROCEDURE [dbo].[sp_BatchInsert] (@Name VARCHAR(50) ) AS BEGIN INSERT INTO TBL_TEST_TEST VALUES (@Name); END C# Code /// <summary> /// Another ado.net 2.0 way that uses a stored procedure to do a bulk insert. /// Seems slower then "BatchBulkCopy" way and it crashes when you try to insert 500,000 records in one go. /// http://www.codeproject.com/KB/cs/MultipleInsertsIn1dbTrip.aspx#_Toc196622241 /// </summary> private static void BatchInsert() { // Get the DataTable with Rows State as RowState.Added DataTable dtInsertRows = GetDataTable(); SqlConnection connection = new SqlConnection(connectionString); SqlCommand command = new SqlCommand("sp_BatchInsert", connection); command.CommandType = CommandType.StoredProcedure; command.UpdatedRowSource = UpdateRowSource.None; // Set the Parameter with appropriate Source Column Name command.Parameters.Add("@Name", SqlDbType.VarChar, 50, dtInsertRows.Columns[0].ColumnName); SqlDataAdapter adpt = new SqlDataAdapter(); adpt.InsertCommand = command; // Specify the number of records to be Inserted/Updated in one go. Default is 1. adpt.UpdateBatchSize = 1000; connection.Open(); int recordsInserted = adpt.Update(dtInsertRows); connection.Close(); } So first thing is the batch size. Why would you set a batch size to anything but the number of records you are sending? Like I am sending 500,000 records so I did a Batch size of 500,000. Next why does it crash when I do this? If I set it to 1000 for batch size it works just fine. System.Data.SqlClient.SqlException was unhandled Message="A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.)" Source=".Net SqlClient Data Provider" ErrorCode=-2146232060 Class=20 LineNumber=0 Number=233 Server="" State=0 StackTrace: at System.Data.Common.DbDataAdapter.UpdatedRowStatusErrors(RowUpdatedEventArgs rowUpdatedEvent, BatchCommandInfo[] batchCommands, Int32 commandCount) at System.Data.Common.DbDataAdapter.UpdatedRowStatus(RowUpdatedEventArgs rowUpdatedEvent, BatchCommandInfo[] batchCommands, Int32 commandCount) at System.Data.Common.DbDataAdapter.Update(DataRow[] dataRows, DataTableMapping tableMapping) at System.Data.Common.DbDataAdapter.UpdateFromDataTable(DataTable dataTable, DataTableMapping tableMapping) at System.Data.Common.DbDataAdapter.Update(DataTable dataTable) at TestIQueryable.Program.BatchInsert() in C:\Users\a\Downloads\TestIQueryable\TestIQueryable\TestIQueryable\Program.cs:line 124 at TestIQueryable.Program.Main(String[] args) in C:\Users\a\Downloads\TestIQueryable\TestIQueryable\TestIQueryable\Program.cs:line 16 InnerException: Time it took to insert 500,000 records with insert batch size of 1000 took "2 mins and 54 seconds" Of course this is no official time I sat there with a stop watch( I am sure there are better ways but was too lazy to look what they where) So I find that kinda slow compared to all my other ones(expect the linq to sql insert one) and I am not really sure why. Next I looked at bulkcopy /// <summary> /// An ado.net 2.0 way to mass insert records. This seems to be the fastest. /// http://www.codeproject.com/KB/cs/MultipleInsertsIn1dbTrip.aspx#_Toc196622241 /// </summary> private static void BatchBulkCopy() { // Get the DataTable DataTable dtInsertRows = GetDataTable(); using (SqlBulkCopy sbc = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.KeepIdentity)) { sbc.DestinationTableName = "TBL_TEST_TEST"; // Number of records to be processed in one go sbc.BatchSize = 500000; // Map the Source Column from DataTabel to the Destination Columns in SQL Server 2005 Person Table // sbc.ColumnMappings.Add("ID", "ID"); sbc.ColumnMappings.Add("NAME", "NAME"); // Number of records after which client has to be notified about its status sbc.NotifyAfter = dtInsertRows.Rows.Count; // Event that gets fired when NotifyAfter number of records are processed. sbc.SqlRowsCopied += new SqlRowsCopiedEventHandler(sbc_SqlRowsCopied); // Finally write to server sbc.WriteToServer(dtInsertRows); sbc.Close(); } } This one seemed to go really fast and did not even need a SP( can you use SP with bulk copy? If you can would it be better?) BatchCopy had no problem with a 500,000 batch size.So again why make it smaller then the number of records you want to send? I found that with BatchCopy and 500,000 batch size it took only 5 seconds to complete. I then tried with a batch size of 1,000 and it only took 8 seconds. So much faster then the bulkinsert one above. Now I tried the other tutorial. USE [Test] GO /****** Object: StoredProcedure [dbo].[spTEST_InsertXMLTEST_TEST] Script Date: 05/19/2010 15:39:03 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER PROCEDURE [dbo].[spTEST_InsertXMLTEST_TEST](@UpdatedProdData nText) AS DECLARE @hDoc int exec sp_xml_preparedocument @hDoc OUTPUT,@UpdatedProdData INSERT INTO TBL_TEST_TEST(NAME) SELECT XMLProdTable.NAME FROM OPENXML(@hDoc, 'ArrayOfTBL_TEST_TEST/TBL_TEST_TEST', 2) WITH ( ID Int, NAME varchar(100) ) XMLProdTable EXEC sp_xml_removedocument @hDoc C# code. /// <summary> /// This is using linq to sql to make the table objects. /// It is then serailzed to to an xml document and sent to a stored proedure /// that then does a bulk insert(I think with OpenXML) /// http://www.codeproject.com/KB/linq/BulkOperations_LinqToSQL.aspx /// </summary> private static void LinqInsertXMLBatch() { using (TestDataContext db = new TestDataContext()) { TBL_TEST_TEST[] testRecords = new TBL_TEST_TEST[500000]; for (int count = 0; count < 500000; count++) { TBL_TEST_TEST testRecord = new TBL_TEST_TEST(); testRecord.NAME = "Name : " + count; testRecords[count] = testRecord; } StringBuilder sBuilder = new StringBuilder(); System.IO.StringWriter sWriter = new System.IO.StringWriter(sBuilder); XmlSerializer serializer = new XmlSerializer(typeof(TBL_TEST_TEST[])); serializer.Serialize(sWriter, testRecords); db.insertTestData(sBuilder.ToString()); } } So I like this because I get to use objects even though it is kinda redundant. I don't get how the SP works. Like I don't get the whole thing. I don't know if OPENXML has some batch insert under the hood but I do not even know how to take this example SP and change it to fit my tables since like I said I don't know what is going on. I also don't know what would happen if the object you have more tables in it. Like say I have a ProductName table what has a relationship to a Product table or something like that. In linq to sql you could get the product name object and make changes to the Product table in that same object. So I am not sure how to take that into account. I am not sure if I would have to do separate inserts or what. The time was pretty good for 500,000 records it took 52 seconds The last way of course was just using linq to do it all and it was pretty bad. /// <summary> /// This is using linq to sql to to insert lots of records. /// This way is slow as it uses no mass insert. /// Only tried to insert 50,000 records as I did not want to sit around till it did 500,000 records. /// http://www.codeproject.com/KB/linq/BulkOperations_LinqToSQL.aspx /// </summary> private static void LinqInsertAll() { using (TestDataContext db = new TestDataContext()) { db.CommandTimeout = 600; for (int count = 0; count < 50000; count++) { TBL_TEST_TEST testRecord = new TBL_TEST_TEST(); testRecord.NAME = "Name : " + count; db.TBL_TEST_TESTs.InsertOnSubmit(testRecord); } db.SubmitChanges(); } } I did only 50,000 records and that took over a minute to do. So I really narrowed it done to the linq to sql bulk insert way or bulk copy. I am just not sure how to do it when you have relationship for either way. I am not sure how they both stand up when doing updates instead of inserts as I have not gotten around to try it yet. I don't think I will ever need to insert/update more than 50,000 records at one type but at the same time I know I will have to do validation on records before inserting so that will slow it down and that sort of makes linq to sql nicer as your got objects especially if your first parsing data from a xml file before you insert into the database. Full C# code using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml.Serialization; using System.Data; using System.Data.SqlClient; namespace TestIQueryable { class Program { private static string connectionString = ""; static void Main(string[] args) { BatchInsert(); Console.WriteLine("done"); } /// <summary> /// This is using linq to sql to to insert lots of records. /// This way is slow as it uses no mass insert. /// Only tried to insert 50,000 records as I did not want to sit around till it did 500,000 records. /// http://www.codeproject.com/KB/linq/BulkOperations_LinqToSQL.aspx /// </summary> private static void LinqInsertAll() { using (TestDataContext db = new TestDataContext()) { db.CommandTimeout = 600; for (int count = 0; count < 50000; count++) { TBL_TEST_TEST testRecord = new TBL_TEST_TEST(); testRecord.NAME = "Name : " + count; db.TBL_TEST_TESTs.InsertOnSubmit(testRecord); } db.SubmitChanges(); } } /// <summary> /// This is using linq to sql to make the table objects. /// It is then serailzed to to an xml document and sent to a stored proedure /// that then does a bulk insert(I think with OpenXML) /// http://www.codeproject.com/KB/linq/BulkOperations_LinqToSQL.aspx /// </summary> private static void LinqInsertXMLBatch() { using (TestDataContext db = new TestDataContext()) { TBL_TEST_TEST[] testRecords = new TBL_TEST_TEST[500000]; for (int count = 0; count < 500000; count++) { TBL_TEST_TEST testRecord = new TBL_TEST_TEST(); testRecord.NAME = "Name : " + count; testRecords[count] = testRecord; } StringBuilder sBuilder = new StringBuilder(); System.IO.StringWriter sWriter = new System.IO.StringWriter(sBuilder); XmlSerializer serializer = new XmlSerializer(typeof(TBL_TEST_TEST[])); serializer.Serialize(sWriter, testRecords); db.insertTestData(sBuilder.ToString()); } } /// <summary> /// An ado.net 2.0 way to mass insert records. This seems to be the fastest. /// http://www.codeproject.com/KB/cs/MultipleInsertsIn1dbTrip.aspx#_Toc196622241 /// </summary> private static void BatchBulkCopy() { // Get the DataTable DataTable dtInsertRows = GetDataTable(); using (SqlBulkCopy sbc = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.KeepIdentity)) { sbc.DestinationTableName = "TBL_TEST_TEST"; // Number of records to be processed in one go sbc.BatchSize = 500000; // Map the Source Column from DataTabel to the Destination Columns in SQL Server 2005 Person Table // sbc.ColumnMappings.Add("ID", "ID"); sbc.ColumnMappings.Add("NAME", "NAME"); // Number of records after which client has to be notified about its status sbc.NotifyAfter = dtInsertRows.Rows.Count; // Event that gets fired when NotifyAfter number of records are processed. sbc.SqlRowsCopied += new SqlRowsCopiedEventHandler(sbc_SqlRowsCopied); // Finally write to server sbc.WriteToServer(dtInsertRows); sbc.Close(); } } /// <summary> /// Another ado.net 2.0 way that uses a stored procedure to do a bulk insert. /// Seems slower then "BatchBulkCopy" way and it crashes when you try to insert 500,000 records in one go. /// http://www.codeproject.com/KB/cs/MultipleInsertsIn1dbTrip.aspx#_Toc196622241 /// </summary> private static void BatchInsert() { // Get the DataTable with Rows State as RowState.Added DataTable dtInsertRows = GetDataTable(); SqlConnection connection = new SqlConnection(connectionString); SqlCommand command = new SqlCommand("sp_BatchInsert", connection); command.CommandType = CommandType.StoredProcedure; command.UpdatedRowSource = UpdateRowSource.None; // Set the Parameter with appropriate Source Column Name command.Parameters.Add("@Name", SqlDbType.VarChar, 50, dtInsertRows.Columns[0].ColumnName); SqlDataAdapter adpt = new SqlDataAdapter(); adpt.InsertCommand = command; // Specify the number of records to be Inserted/Updated in one go. Default is 1. adpt.UpdateBatchSize = 500000; connection.Open(); int recordsInserted = adpt.Update(dtInsertRows); connection.Close(); } private static DataTable GetDataTable() { // You First need a DataTable and have all the insert values in it DataTable dtInsertRows = new DataTable(); dtInsertRows.Columns.Add("NAME"); for (int i = 0; i < 500000; i++) { DataRow drInsertRow = dtInsertRows.NewRow(); string name = "Name : " + i; drInsertRow["NAME"] = name; dtInsertRows.Rows.Add(drInsertRow); } return dtInsertRows; } static void sbc_SqlRowsCopied(object sender, SqlRowsCopiedEventArgs e) { Console.WriteLine("Number of records affected : " + e.RowsCopied.ToString()); } } }

Developer IT