Search Results

Search found 4026 results on 162 pages for 'programmer calculator'.

Page 135/162 | < Previous Page | 131 132 133 134 135 136 137 138 139 140 141 142 | Next Page >

How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Algorithm to Find the Aggregate Mass of "Granola Bar"-Like Structures?

- by Stuart Robbins

I'm a planetary science researcher and one project I'm working on is N-body simulations of Saturn's rings. The goal of this particular study is to watch as particles clump together under their own self-gravity and measure the aggregate mass of the clumps versus the mean velocity of all particles in the cell. We're trying to figure out if this can explain some observations made by the Cassini spacecraft during the Saturnian summer solstice when large structures were seen casting shadows on the nearly edge-on rings. Below is a screenshot of what any given timestep looks like. (Each particle is 2 m in diameter and the simulation cell itself is around 700 m across.) The code I'm using already spits out the mean velocity at every timestep. What I need to do is figure out a way to determine the mass of particles in the clumps and NOT the stray particles between them. I know every particle's position, mass, size, etc., but I don't know easily that, say, particles 30,000-40,000 along with 102,000-105,000 make up one strand that to the human eye is obvious. So, the algorithm I need to write would need to be a code with as few user-entered parameters as possible (for replicability and objectivity) that would go through all the particle positions, figure out what particles belong to clumps, and then calculate the mass. It would be great if it could do it for "each" clump/strand as opposed to everything over the cell, but I don't think I actually need it to separate them out. The only thing I was thinking of was doing some sort of N2 distance calculation where I'd calculate the distance between every particle and if, say, the closest 100 particles were within a certain distance, then that particle would be considered part of a cluster. But that seems pretty sloppy and I was hoping that you CS folks and programmers might know of a more elegant solution? Edited with My Solution: What I did was to take a sort of nearest-neighbor / cluster approach and do the quick-n-dirty N2 implementation first. So, take every particle, calculate distance to all other particles, and the threshold for in a cluster or not was whether there were N particles within d distance (two parameters that have to be set a priori, unfortunately, but as was said by some responses/comments, I wasn't going to get away with not having some of those). I then sped it up by not sorting distances but simply doing an order N search and increment a counter for the particles within d, and that sped stuff up by a factor of 6. Then I added a "stupid programmer's tree" (because I know next to nothing about tree codes). I divide up the simulation cell into a set number of grids (best results when grid size ˜7 d) where the main grid lines up with the cell, one grid is offset by half in x and y, and the other two are offset by 1/4 in ±x and ±y. The code then divides particles into the grids, then each particle N only has to have distances calculated to the other particles in that cell. Theoretically, if this were a real tree, I should get order N*log(N) as opposed to N2 speeds. I got somewhere between the two, where for a 50,000-particle sub-set I got a 17x increase in speed, and for a 150,000-particle cell, I got a 38x increase in speed. 12 seconds for the first, 53 seconds for the second, 460 seconds for a 500,000-particle cell. Those are comparable speeds to how long the code takes to run the simulation 1 timestep forward, so that's reasonable at this point. Oh -- and it's fully threaded, so it'll take as many processors as I can throw at it.

Read the article
Extend Your Applications Your Way: Oracle OpenWorld Live Poll Results

- by Applications User Experience

Lydia Naylor, Oracle Applications User Experience Manager At OpenWorld 2012, I attended one of our team’s very exciting sessions: “Extend Your Applications, Your Way”. It was clear that customers were engaged by the topics presented. Not only did we see many heads enthusiastically nodding in agreement during the presentation, and witness a large crowd surround our speakers Killian Evers, Kristin Desmond and Greg Nerpouni afterwards, but we can prove it…with data! Figure 1. Killian Evers, Kristin Desmond, and Greg Nerpouni of Oracle at the OOW 2012 session. At the beginning of our OOW 2012 journey, Greg Nerpouni, Fusion HCM Principal Product Manager, told me he really wanted to get feedback from the audience on our extensibility direction. Initially, we were thinking of doing a group activity at the OOW UX labs events that we hold every year, but Greg was adamant- he wanted “real-time” feedback. So, after a little tinkering, we came up with a way to use an online survey tool, a simple QR code (Quick Response code: a matrix barcode that can include information like URLs and can be read by mobile device cameras), and the audience’s mobile devices to do just that. Figure 2. Actual QR Code for survey Prior to the session, we developed a short survey in Vovici (an online survey tool), with questions to gather feedback on certain points in the presentation, as well as demographic data from our participants. We used Vovici’s feature to generate a mobile HTML version of the survey. At the session, attendees accessed the survey by simply scanning a QR code or typing in a TinyURL (a shorthand web address that is easily accessible through mobile devices). Killian, Kristin and Greg paused at certain points during the session and asked participants to answer a few survey questions about what they just presented. Figure 3. Session survey deployed on a mobile phone The nice thing about Vovici’s survey tool is that you can see the data real-time as participants are responding to questions - so we knew during the session that not only was our direction on track but we were hitting the mark and fulfilling Greg’s request. We planned on showing the live polling results to the audience at the end of the presentation but it ran just a little over time, and we were gently nudged out of the room by the session attendants. We’ve included a quick summary below and this link to the full results for your enjoyment. Figure 4. Most important extensions to Fusion Applications So what did participants think of our direction for extensibility? A total of 94% agreed that it was an improvement upon their current process. The vast majority, 80%, concurred that the extensibility model accounts for the major roles involved: end user, business systems analyst and programmer. Attendees suggested a few supporting roles such as systems administrator, data architect and integrator. Customers and partners in the audience verified that Oracle‘s Fusion Composers allow them to make changes in the most common areas they need to: user interface, business processes, reporting and analytics. Integrations were also suggested. All top 10 things customers can do on a page rated highly in importance, with all but two getting an average rating above 4.4 on a 5 point scale. The kinds of layout changes our composers allow customers to make align well with customers’ needs. The most common were adding columns to a table (94%) and resizing regions and drag and drop content (both selected by 88% of participants). We want to thank the attendees of the session for allowing us another great opportunity to gather valuable feedback from our customers! If you didn’t have a chance to attend the session, we will provide a link to the OOW presentation when it becomes available.

Read the article
SQLAuthority News – History of the Database – 5 Years of Blogging at SQLAuthority

- by pinaldave

Don’t miss the Contest:Participate in 5th Anniversary Contest Today is this blog’s birthday, and I want to do a fun, informative blog post. Five years ago this day I started this blog. Intention – my personal web blog. I wrote this blog for me and still today whatever I learn I share here. I don’t want to wander too far off topic, though, so I will write about two of my favorite things – history and databases. And what better way to cover these two topics than to talk about the history of databases. If you want to be technical, databases as we know them today only date back to the late 1960’s and early 1970’s, when computers began to keep records and store memories. But the idea of memory storage didn’t just appear 40 years ago – there was a history behind wanting to keep these records. In fact, the written word originated as a way to keep records – ancient man didn’t decide they suddenly wanted to read novels, they needed a way to keep track of the harvest, of their flocks, and of the tributes paid to the local lord. And that is how writing and the database began. You could consider the cave paintings from 17,0000 years ago at Lascaux, France, or the clay token from the ancient Sumerians in 8,000 BC to be the first instances of record keeping – and thus databases. If you prefer, you can consider the advent of written language to be the first database. Many historians believe the first written language appeared in the 37th century BC, with Egyptian hieroglyphics. The ancient Sumerians, not to be outdone, also created their own written language within a few hundred years. Databases could be more closely described as collections of information, in which case the Sumerians win the prize for the first archive. A collection of 20,000 stone tablets was unearthed in 1964 near the modern day city Tell Mardikh, in Syria. This ancient database is from 2,500 BC, and appears to be a sort of law library where apprentice-scribes copied important documents. Further archaeological digs hope to uncover the palace library, and thus an even larger database. Of course, the most famous ancient database would have to be the Royal Library of Alexandria, the great collection of records and wisdom in ancient Egypt. It was created by Ptolemy I, and existed from 300 BC through 30 AD, when Julius Caesar effectively erased the hard drives when he accidentally set fire to it. As any programmer knows who has forgotten to hit “save” or has experienced a sudden power outage, thousands of hours of work was lost in a single instant. Databases existed in very similar conditions up until recently. Cuneiform tablets gave way to papyrus, which led to vellum, and eventually modern paper and the printing press. Someday the databases we rely on so much today will become another chapter in the history of record keeping. Who knows what the databases of tomorrow will look like! Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Database, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
The busy developers guide to the Kinect SDK Beta

- by mbcrump

The Kinect is awesome. From day one, I’ve said this thing has got potential. After playing with several open-source Kinect projects, I am please to announce that Microsoft has released the official SDK beta on 6/16/2011. I’ve created this quick start guide to get you up to speed in no time flat. Let’s begin: What is it? The Kinect for Windows SDK beta is a starter kit for applications developers that includes APIs, sample code, and drivers. This SDK enables the academic research and enthusiast communities to create rich experiences by using Microsoft Xbox 360 Kinect sensor technology on computers running Windows 7. (defined by Microsoft) Links worth checking out: Download Kinect for Windows SDK beta – You can either download a 32 or 64 bit SDK depending on your OS. Readme for Kinect for Windows SDK Beta from Microsoft Research Programming Guide: Getting Started with the Kinect for Windows SDK Beta Code Walkthroughs of the samples that ship with the Kinect for Windows SDK beta (Found in \Samples Folder) Coding4Fun Kinect Toolkit – Lots of extension methods and controls for WPF and WinForms. Kinect Mouse Cursor – Use your hands to control things like a mouse created by Brian Peek. Kinect Paint – Basically MS Paint but use your hands! Kinect for Windows SDK Quickstarts Installing and Using the Kinect Sensor Getting it installed: After downloading the Kinect SDK Beta, double click the installer to get the ball rolling. Hit the next button a few times and it should complete installing. Once you have everything installed then simply plug in your Kinect device into the USB Port on your computer and hopefully you will get the following screen: Once installed, you are going to want to check out the following folders: C:\Program Files (x86)\Microsoft Research KinectSDK – This contains the actual Kinect Sample Executables along with the documentation as a CHM file. Also check out the C:\Users\Public\Documents\Microsoft Research KinectSDK Samples directory: The main thing to note here is that these folders contain the source code to the applications where you can compile/build them yourself. Audio NUI DEMO Time Let’s get started with some demos. Navigate to the C:\Program Files (x86)\Microsoft Research KinectSDK folder and double click on ShapeGame.exe. Next up is SkeletalViewer.exe (image taken from http://www.i-programmer.info/news/91-hardware/2619-microsoft-launch-kinect-sdk-beta.html as I could not get a good image using SnagIt) At this point, you will have to download Kinect Mouse Cursor – This is really cool because you can use your hands to control the mouse cursor. I actually used this to resize itself. Last up is Kinect Paint – This is very cool, just make sure you read the instructions! MS Paint on steroids! A few tips for getting started building Kinect Applications. It appears WPF is the way to go with building Kinect Applications. You must also use a version of Visual Studio 2010. Your going to need to reference Microsoft.Research.Kinect.dll when building a Kinect Application. Right click on References and then goto Browse and navigate to C:\Program Files (x86)\Microsoft Research KinectSDK and select Microsoft.Research.Kinect.dll. You are going to want to make sure your project has the Platform target set to x86. The Coding4Fun Kinect Toolkit really makes things easier with extension methods and controls. Just note that this is for WinForms or WPF. Conclusion It looks like we have a lot of fun in store with the Kinect SDK. I’m very excited about the release and have already been thinking about all the applications that I can begin building. It seems that development will be easier now that we have an official SDK and the great work from Coding4Fun. Please subscribe to my blog or follow me on twitter for more information about Kinect, Silverlight and other great technology. Subscribe to my feed

Read the article
Guessing Excel Data Types

- by AjarnMark

Note to Self HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel: TypeGuessRows = 0 means scan everything. Note to Others About 10 years ago I stumbled across this bit of information just when I needed it and it saved my project. Then for some reason, a few years later when it would have been nice, but not critical, for some reason I could not find it again anywhere. Well, now I have stumbled across it again, and to preserve my future self from nightmares and sudden baldness due to pulling my hair out, I have decided to blog it in the hopes that I can find it again this way. Here’s the story… When you query data from an Excel spreadsheet, such as with old-fashioned DTS packages in SQL 2000 (my first reference) or simply with an OLEDB Data Adapter from ASP.NET (recent task) and if you are using the Microsoft Jet 4.0 driver (newer ones may deal with this differently) then you can get funny results where the query reports back that a cell value is null even when you know it contains data. What happens is that Excel doesn’t really have data types. While you can format information in cells to appear like certain data types (e.g. Date, Time, Decimal, Text, etc.) that is not really defining the cell as being of a certain type like we think of when working with databases. But, presumably, to make things more convenient for the user (programmer) when you issue a query against Excel, the query processor tries to guess what type of data is contained in each column and returns it in an appropriate manner. This is all well and good IF your data is consistent in every row and matches what the processor guessed. And, for efficiency’s sake, when the query processor is trying to figure out each column’s data type, it does so by analyzing only the first 8 rows of data (default setting). Now here’s the problem, suppose that your spreadsheet contains information about clothing, and one of the columns is Size. Now suppose that in the first 8 rows, all of your sizes look like 32, 34, 18, 10, and so on, using numbers, but then, somewhere after the 8th row, you have some rows with sizes like S, M, L, XL. What happens is that by examining only the first 8 rows, the query processor inferred that the column contained numerical data, and then when it hits the non-numerical data in later rows, it comes back blank. Major bummer, and a real pain to track down if you don’t know that Excel is doing this, because you study the spreadsheet and say, “the data is RIGHT THERE! WHY doesn’t the query see it?!?!” And the hair-pulling begins. So, what’s a developer to do? One option is to go to the registry setting noted above and change the DWORD value of TypeGuessRows from the default of 8 to 0 (zero). Setting this value to zero will force Jet to scan every row in the spreadsheet before making its determination as to what type of data the column contains. And that means that in the example above, it would have treated the column as a string rather than as numeric, and presto! your query now returns all of the values that you know are in there. Of course, there is a caveat… if you are querying large spreadsheets, making Jet scan every row can be quite a performance hit. You could enter a different number (more than 8) that you believe is a better sampling of rows to make the guess, but you still have the possibility that every row scanned looks alike, but that later rows are different, and that you might get blanks when there really is data there. That’s the type of gamble, I really don’t like to take with my data. Anyone with a better approach, or with experience with more recent drivers that have a better way of handling data types, please chime in!

Read the article
javascript complex recurrsion [on hold]

- by Achilles

Given Below is my data in data array. What i am doing in code below is that from that given data i have to construct json in a special format which i also gave below. //code start here var hierarchy={}; hierarchy.name="Hierarchy"; hierarchy.children=[{"name":"","children":[{"name":"","children":[]}]}]; var countryindex; var flagExist=false; var data = [ {country :"America", city:"Kansas", employe:'Jacob'}, {country :"Pakistan", city:"Lahore", employe:'tahir'}, {country :"Pakistan", city:"Islamabad", employe:'fakhar'} , {country :"Pakistan", city:"Lahore", employe:'bilal'}, {country :"India", city:"d", employe:'ali'} , {country :"Pakistan", city:"Karachi", employe:'eden'}, {country :"America", city:"Kansas", employe:'Jeen'} , {country :"India", city:"Banglore", employe:'PP'} , {country :"India", city:"Banglore", employe:'JJ'} , ]; for(var i=0;i<data.length;i++) { for(var j=0;j<hierarchy.children.length;j++) { //for checking country match if(hierarchy.children[j].name==data[i].country) { countryindex=j; flagExist=true; break; } } if(flagExist)//country match now no need to add new country just add city in it { var cityindex; var cityflag=false; //hierarchy.children[countryindex].children.push({"name":data[i].city,"children":[]}) //if(hierarchy.children[index].children!=undefined) for(var k=0;k< hierarchy.children[countryindex].children.length;k++) { //for checking city match if(hierarchy.children[countryindex].children[k].name==data[i].city) { // hierarchy.children[countryindex].children[k].children.push({"name":data[i].employe}) cityflag=true; cityindex=k; break; } } if(cityflag)//city match now add just empolye at that city index { hierarchy.children[countryindex].children[cityindex].children.push({"name":data[i].employe}); cityflag=false; } else//no city match so add new with employe also as this is new city so its emplye will be 1st { hierarchy.children[countryindex].children.push({"name":data[i].city,children:[{"name":data[i].employe}]}); //same as above //hierarchy.children[countryindex].children[length-1].children.push({"name":data[i].employe}); } flagExist=false; } else{ //no country match adding new country //with city also as this is new city of new country console.log("sparta"); hierarchy.children.push({"name":data[i].country,"children":[{"name":data[i].city,"children":[{"name":data[i].employe}]}]}); // hierarchy.children.children.push({"name":data[i].city,"children":[]}); } //console.log(hierarchy); } hierarchy.children.shift(); var j=JSON.stringify(hierarchy); //code ends here //here is the json which i seccessfully formed from the code { "name":"Hierarchy", "children":[ { "name":"America", "children":[ { "name":"Kansas", "children":[{"name":"Jacob"},{"name":"Jeen"}]}]}, { "name":"Pakistan", "children":[ { "name":"Lahore", "children": [ {"name":"tahir"},{"name":"bilal"}]}, { "name":"Islamabad", "children":[{"name":"fakhar"}]}, { "name":"Karachi", "children":[{"name":"eden"}]}]}, { "name":"India", "children": [ { "name":"d", "children": [ {"name":"ali"}]}, { "name":"Banglore", "children":[{"name":"PP"},{"name":"JJ"}]}]}]} Now the orignal problem is that currently i am solving this problem for data of array of three keys and i have to go for 3 nested loops now i want to optimize this solution so that if data array of object has more than 3 key say 5 {country :"America", state:"NewYork",city:"newYOrk",street:"elm", employe:'Jacob'}, or more than my solution will not work and i cannot decide before how many keys will come so i thought recursion may suit best here. But i am horrible in writing recurrsion and the case is also complex. Can some awesome programmer help me writing recurrsion or suggest some other solution.

Read the article
What Counts For A DBA: ESP

- by Louis Davidson

Now I don’t want to get religious here, and I’m not going to, but what I’m going to describe in this ‘What Counts for a DBA’ installment sometimes feels like magic. Often I will spend hours thinking about the solution to a design issue or coding problem, working diligently to try to come up with a solution and then finally just give up with the feeling that I’m not even qualified to be a data entry clerk, much less a data architect. At this point I often take a walk (or sometimes a nap), and then it hits me. I realize that I have the answer just sitting in my brain, ready to implement. This phenomenon is not limited to walks either; it can happen almost any time after I stop my obsession about a problem. I call this phenomena ESP (or Extra-Sensory Programming.) Another term for this could be ‘sleeping on it’, and while the idiom tends to mean to let time pass to actively think about a problem, sleeping on a problem also lets you relax and let your brain do the work. I first noticed this back in my college days when I would play video games for hours on end. We would get stuck deep in some dungeon unable to find a way out, playing for days on end until we were beaten down tired. Once we gave up and walked away, the solution would usually be there waiting for one of us before we came back to play the next day. Sometimes it would be in the form of a dream, and sometimes it would just be that the problem was now easy to solve when we started to play again. While it worked great for video games, it never occurred when I studied English Literature for hours on end, or even when I worked for the same sort of frustrating hours attempting to solve a homework problem in Calculus. I believe that the difference was that I was passionate about the video game, and certainly far less so about homework where people used the word “thou” instead of “you” or x to represent a number. This phenomenon occurs somewhat more often in my current work as a professional data programmer, because I am very passionate about SQL and love those aspects of my career choice. Every day that I get to draw a new data model to solve a customer issue, or write a complex SELECT statement to ferret out the answer to a complex data question, is a great day. I hope it is the same for any reader of this blog. But, unfortunately, while the day on a whole is great, a heck of a lot of noise is generated in work life. There are the typical project deadlines, along with the requisite project manager sitting on your shoulders shouting slogans to try to make you to go faster: Add in office politics, and the occasional family issues that permeate the mind, and you lose the ability to think deeply about any problem, not to mention occasionally forgetting your own name. These office realities coupled with a difficult SQL problem staring at you from your widescreen monitor will slowly suck the life force out of your body, making it seem impossible to solve the problem This is when the walk starts; or a nap. Maybe you hide from the madness under your desk like George Costanza hides from Steinbrenner on Seinfeld. Forget about the problem. Free your mind from the insanity of the problem and your surroundings. Then let your training and education deep in your brain take over and see if it will passively do the rest for you. If you don’t end up with a solution, the worst case scenario is that you have a bit of exercise or rest, and you won’t have heard the phrase “better is the enemy of good enough” even once…which certainly will do your brain some good. Once you stop expecting whipping your brain for information, inspiration may just strike and instead of a humdrum solution you find a solution you hadn’t even considered, almost magically. So, my beloved manager, next time you have an urgent deadline and you come across me taking a nap, creep away quietly because I’m working, doing some extra-sensory programming.

Read the article
Modernizr Rocks HTML5

- by Laila

HTML5 is a moving target. At the moment, we don't know what will be in future versions. In most circumstances, this really matters to the developer. When you're using Adobe Air, you can be reasonably sure what works, what is there, and what isn't, since you have a version of the browser built-in. With Metro, you can assume that you're going to be using at least IE 10. If, however, you are using HTML5 in a web application, then you are going to rely heavily on Feature Detection. Feature-Detection is a collection of techniques that tell you, via JavaScript, whether the current browser has this feature natively implemented or not Feature Detection isn't just there for the esoteric stuff such as Geo-location, progress bars, <canvas> support, the new <input> types, Audio, Video, web workers or storage, but is required even for semantic markup, since old browsers make a pigs ear out of rendering this. Feature detection can't rely just on reading the browser version and inferring from that what works. Instead, you must use JavaScript to check that an HTML5 feature is there before using it. The problem with relying on the user-agent is that it takes a lot of historical data to work out what version does what, and, anyway, the user-agent can be, and sometimes is, spoofed. The open-source library Modernizr is just about the most essential JavaScript library for anyone using HTML5, because it provides APIs to test for most of the CSS3 and HTML5 features before you use them, and is intelligent enough to alter semantic markup into 'legacy' 'markup using shims on page-load for old browsers. It also allows you to check what video Codecs are installed for playing video. It also provides media queries and conditional resource-loading (formerly YepNope.js.). Generally, Modernizr gives you the choice of what you do about browsers that don't support the feature that you want. Often, the best choice is graceful degradation, but the resource-loading feature allows you to dynamically load JavaScript Shims to replace the standard API for missing or defective HTML5 functionality, called 'PolyFills'. As the Modernizr site says 'Yes, not only can you use HTML5 today, but you can use it in the past, too!' The evolutionary progress of HTML5 requires a more defensive style of JavaScript programming where the programmer adopts a mindset of fearing the worst ( IE 6) rather than assuming the best, whilst exploiting as many of the new HTML features as possible for the requirements of the site or HTML application. Why would anyone want the distraction of developing their own techniques to do this when Modernizr exists to do this for you? Laila

Read the article
Subterranean IL: Compiling C# exception handlers

- by Simon Cooper

An exception handler in C# combines the IL catch and finally exception handling clauses into a single try statement: try { Console.WriteLine("Try block") // ... } catch (IOException) { Console.WriteLine("IOException catch") // ... } catch (Exception e) { Console.WriteLine("Exception catch") // ... } finally { Console.WriteLine("Finally block") // ... } How does this get compiled into IL? Initial implementation If you remember from my earlier post, finally clauses must be specified with their own .try clause. So, for the initial implementation, we take the try/catch/finally, and simply split it up into two .try clauses (I have to use label syntax for this): StartTry: ldstr "Try block" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndTry: StartIOECatch: ldstr "IOException catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndIOECatch: StartECatch: ldstr "Exception catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndECatch: StartFinally: ldstr "Finally block" call void [mscorlib]System.Console::WriteLine(string) // ... endfinally EndFinally: End: // ... .try StartTry to EndTry catch [mscorlib]System.IO.IOException handler StartIOECatch to EndIOECatch catch [mscorlib]System.Exception handler StartECatch to EndECatch .try StartTry to EndTry finally handler StartFinally to EndFinally However, the resulting program isn't verifiable, and doesn't run: [IL]: Error: Shared try has finally or fault handler. Nested try blocks What's with the verification error? Well, it's a condition of IL verification that all exception handling regions (try, catch, filter, finally, fault) of a single .try clause have to be completely contained within any outer exception region, and they can't overlap with any other exception handling clause. In other words, IL exception handling clauses must to be representable in the scoped syntax, and in this example, we're overlapping catch and finally clauses. Not only is this example not verifiable, it isn't semantically correct. The finally handler is specified round the .try. What happens if you were able to run this code, and an exception was thrown? Program execution enters top of try block, and exception is thrown within it CLR searches for an exception handler, finds catch Because control flow is leaving .try, finally block is run The catch block is run leave.s End inside the catch handler branches to End label. We're actually running the finally before the catch! What we do about it What we actually need to do is put the catch clauses inside the finally clause, as this will ensure the finally gets executed at the correct time (this time using scoped syntax): .try { .try { ldstr "Try block" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } catch [mscorlib]System.IO.IOException { ldstr "IOException catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } catch [mscorlib]System.Exception { ldstr "Exception catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } } finally { ldstr "Finally block" call void [mscorlib]System.Console::WriteLine(string) // ... endfinally } End: ret Returning from methods There is a further semantic mismatch that the C# compiler has to deal with; in C#, you are allowed to return from within an exception handling block: public int HandleMethod() { try { // ... return 0; } catch (Exception) { // ... return -1; } } However, you can't ret inside an exception handling block in IL. So the C# compiler does a leave.s to a ret outside the exception handling area, loading/storing any return value to a local variable along the way (as leave.s clears the stack): .method public instance int32 HandleMethod() { .locals init ( int32 retVal ) .try { // ... ldc.i4.0 stloc.0 leave.s End } catch [mscorlib]System.Exception { // ... ldc.i4.m1 stloc.0 leave.s End } End: ldloc.0 ret } Conclusion As you can see, the C# compiler has quite a few hoops to jump through to translate C# code into semantically-correct IL, and hides the numerous conditions on IL exception handling blocks from the C# programmer. Next up: catch-all blocks, and how the runtime deals with non-Exception exceptions.

Read the article
What Counts For a DBA: Replaceable

- by Louis Davidson

Replaceable is what every employee in every company instinctively strives not to be. Yet, if you’re an irreplaceable DBA, meaning that the company couldn’t find someone else who could do what you do, then you’re not doing a great job. A good DBA is replaceable. I imagine some of you are already reaching for the lighter fluid, about to set the comments section ablaze, but before you destroy a perfectly good Commodore 64, read on… Everyone is replaceable, ultimately. Anyone, anywhere, in any job, could be sitting at their desk reading this, blissfully unaware that this is to be their last day at work. Morbidly, you could be about to take your terminal breath. Ideally, it will be because another company suddenly offered you a truck full of money to take a new job, forcing you to bid a regretful farewell to your current employer (with barely a “so long suckers!” left wafting in the air as you zip out of the office like the Wile E Coyote wearing two pairs of rocket skates). I’ve often wondered what it would be like to be present at the meeting where your former work colleagues discuss your potential replacement. It is perhaps only at this point, as they struggle with the question “What kind of person do we need to replace old Wile?” that you would know your true worth in their eyes. Of course, this presupposes you need replacing. I’ve known one or two people whose absence we adequately compensated with a small rock, to keep their old chair from rolling down a slight incline in the floor. On another occasion, we bought a noise-making machine that frequently attracted attention its way, with unpleasant sounds, but never contributed anything worthwhile. These things never actually happened, of course, but you take my point: don’t confuse replaceable with expendable. Likewise, if the term “trained seal” comes up, someone they can teach to follow basic instructions and push buttons in the right order, then the replacement discussion is going to be over quickly. What, however, if your colleagues decide they’ll need a super-specialist to replace you. That’s a good thing, right? Well, usually, in my experience, no it is not. It often indicates that no one really knows what you do, or how. A typical example is the “senior” DBA who built a system just before 16-bit computing became all the rage and then settled into a long career managing it. Such systems are often central to the company’s operations and the DBA very skilled at what they do, but almost impossible to replace, because the system hasn’t evolved, and runs on processes and routines that others no longer understand or recognize. The only thing you really want to hear, at your replacement discussion, is that they need someone skilled at the fundamentals and adaptable. This means that the person they need understands that their goal is to be an excellent DBA, not a specialist in whatever the-heck the company does. Someone who understands the new versions of SQL Server and can adapt the company’s systems to the way things work today, who uses industry standard methods that any other qualified DBA/programmer can understand. More importantly, this person rarely wants to get “pigeon-holed” and so documents and shares the specialized knowledge and responsibilities with their teammates. Being replaceable doesn’t mean being “dime a dozen”. The company might need four people to take your place due to the depth of your skills, but still, they could find those replacements and those replacements could step right in using techniques that any decent DBA should know. It is a tough question to contemplate, but take some time to think about the sort of person that your colleagues would seek to replace you. If you think they would go looking for a “super-specialist” then consider urgently how you can diversify and share your knowledge, and start documenting all the processes you know as if today were your last day, because who knows, it just might be.

Read the article
How to prepare for a programming competition? Graphs, Stacks, Trees, oh my! [closed]

- by Simucal

Last semester I attended ACM's (Association for Computing Machinery) bi-annual programming competition at a local University. My University sent 2 teams of 3 people and we competed amongst other schools in the mid-west. We got our butts kicked. You are given a packet with about 11 problems (1 problem per page) and you have 4 hours to solve as many as you can. They'll run your program you submit against a set of data and your output must match theirs exactly. In fact, the judging is automated for the most part. In any case.. I went there fairly confident in my programming skills and I left there feeling drained and weak. It was a terribly humbling experience. In 4 hours my team of 3 people completed only one of the problems. The top team completed 4 of them and took 1st place. The problems they asked were like no problems I have ever had to answer before. I later learned that in order to solve them some of them effectively you have to use graphs/graph algorithms, trees, stacks. Some of them were simply "greedy" algo's. My question is, how can I better prepare for this semesters programming competition so I don't leave there feeling like a complete moron? What tips do you have for me to be able to answer these problems that involve graphs, trees, various "well known" algorithms? How can I easily identify the algorithm we should implement for a given problem? I have yet to take Algorithm Design in school so I just feel a little out of my element. Here are some examples of the questions asked at the competitions: ACM Problem Sets Update: Just wanted to update this since the latest competition is over. My team placed 1st for our small region (about 6-7 universities with between 1-5 teams each school) and ~15th for the midwest! So, it is a marked improvement over last years performance for sure. We also had no graduate students on our team and after reviewing the rules we found out that many teams had several! So, that would be a pretty big advantage in my own opinion. Problems this semester ranged from about 1-2 "easy" problems (ie bit manipulation, string manipulation) to hard (graph problems involving fairly complex math and network flow problems). We were able to solve 4 problems in our 5 hours. Just wanted to thank everyone for the resources they provided here, we used them for our weekly team practices and it definitely helped! Some quick tips that I have that aren't suggested below: When you are seated at your computer before the competition starts, quickly type out various data structures that you might need that you won't have access to in your languages libraries. I typed out a Graph data-structure complete with floyd-warshall and dijkstra's algorithm before the competition began. We ended up using it in our 2nd problem that we solved and this is the main reason why we solved this problem before anyone else in the midwest. We had it ready to go from the beginning. Similarly, type out the code to read in a file since this will be required for every problem. Save this answer "template" someplace so you can quickly copy/paste it to your IDE at the beginning of each problem. There are no rules on programming anything before the competition starts so get any boilerplate code out the way. We found it useful to have one person who is on permanent whiteboard duty. This is usually the person who is best at math and at working out solutions to get a head start on future problems you will be doing. One person is on permanent programming duty. Your fastest/most skilled "programmer" (most familiar with the language). This will save debugging time also. The last person has several roles between assessing the packet of problems for the next "easiest" problem, helping the person on the whiteboard work out solutions and helping the person programming work out bugs/issues. This person needs to be flexible and be able to switch between roles easily.

Read the article
Come see us at JavaU at JavaOne!

- by tmcginn

In just a little under a month, JavaOne will be in full swing (no pun intended) and thousands of Java developers will gather to hear the latest Java news, immerse themselves in Java technology and learn some new things. This year, I am fortunate enough to be able to attend, along with my Java curriculum development colleagues Matt Heimer and Mike Williams. We start our week at JavaOne teaching a one-day session at JavaU on Sunday morning. If you have never attended a training session through JavaU, you should check it out. There are some terrific sessions this year, and it might help to justify your trip to JavaOne if you can say it was for training! This year I am teaching a one day session on Java SE 7 New Features - a great session for anyone interested in the specific details of what is new in Java SE 7. Matt is teaching a one-day session on Developing Portable Java EE applications with the Enterprise JavaBeans 3.1 API and Java Persistence 2.0 API EJB, and Mike is doing a one-day session on developing Rich Client applications with Java SE 7 using Java FX 2. I asked Matt and Mike to tell me what developers can expect from their sessions. Matt: "My session will get you up to speed on everything you need to know to create portable Java EE 6 applications using EJB 3.1 and JPA 2. I am going to cover why everyone can benefit from using EJBs (and why developers should relearn them if they haven't looked at them for years). Students who attend my session will see JPA examples showcasing how to use relational databases in an enterprise applications without programming to JDBC and without writing SQL statements. EJB and JPA benefit from being paired together, so I will also show how transaction management is easier in a container. I encourage students to bring a laptop and code as they learn!" Mike: "My session covers how to develop a rich client application using Java FX 2. Starting with the basic concepts of JavaFX, students will see how a JavaFX application is built from its layout, to its controls, to its data structures. In addition, more advanced controls like charts, smart tables, and transitions will be added to the application. Finally, a quick review of JavaFX concurrency and data binding is included. Blended with the core concepts the session will include some of the latest JavaFX technology. This includes using Scene Builder to create a JavaFX UI and connecting your XML UI definition to Java code. In addition, packaging of the JavaFX application will be covered with some examples of the new native packaging features." As I mentioned, my session covers the changes in the Java for SE 7, including the language changes that were voted into Java SE 7 from Project Coin. I will also look at how you can take advantage if the the new I/O library (NIO.2) for writing applications that work with files, directories and file systems. We will also look at the changes in Asynchronous I/O that are a part of the changes in NIO/2. We will spend some time looking at the changes to the Java Virtual Machine as well, including support for dynamically typed languages (JSR-292). We will spend some time looking at the Java Concurrency enhancements (JSR-166), including the new Fork/Join framework. And we'll round out the day with a look at changes in Swing, XML and a number of smaller changes in the API's. And, if these topics aren't grabbing your interest, take a look at the other 10 sessions that range from topics on architecture to how to pass the Oracle Certified Programmer I and II exams. See you soon!

Read the article
What are some good questions (and good/bad answers) to ask at an interview to gauge the competency of the company/team?

- by Wayne M

I'm already familiar with the Joel Test, but it's been my experience that some of the questions there have the answers "massaged" to make the company seem better than it is. I've had several jobs in the past that, for instance, claimed they had a QA process and did unit testing, and what they really meant is "The programmers test the app, and test with the debugger and via trial-and-error."; they said they used SVN but they just lumped everything into one giant repository and had no concept of branching/merging or anything more complicated than updating and committing; said they can build in one step and what they really mean is it's "one step" to copy dozens of files by hand from the programmer's PC to the live server. How do you go about properly gauging a company's environment to make sure that it's a well-evolved company and not stuck on doing things a certain way because they've done it for years and they're ignorant of change? You can almost never ask to see their source code, so you're stuck trying to figure out if the interviewer's answer is accurate or BS to make the company seem good. Besides the Joel Test what are some other good questions to get the proper feel for a company, and more importantly what are some good and bad answers that could indicate a good or bad company? I mean something like (take at face value, please, it's all I could think of at short notice): Question: How does the software team apply the SOLID principles and Inversion of Control to their code? Good Answer: We adhere to SOLID wherever possible; we use TDD so it kind of forces us to write abstract, testable code. We use Ninject for our IoC container because it's fairly easy to configure - it was that or StructureMap but I find Ninject a bit more intuitive, and who doesn't like ninjas? You're not a pirate, are you? Bad Answer: Our code is pretty secure, yeah. And what's this Inversion of Control thing? I've never heard of it before. You see what I did there. The "good" answer uses facts to back it up and has a bit of "in crowd" humor; the bad answer shows complete ignorance of the question - not necessarily a bad thing if you are interviewing for a manger/director position, but a terrible answer and a huge red flag if you're interviewing as a developer and talking to a senior developer or manager! My biggest problem at the moment is being able to take a generic response and gauge whether it's the good or bad answer; more often than not it's the bad kind and I find myself frustrated almost from day one at the new job. I suppose I could name drop if I ask about specific things (e.g. "Do you write unit tests?" and if the answer is yes, ask if they use NUnit, MbUnit or something else; if they mention data access ask if they use a clean ORM like NHibernate or something more coupled like EF or Linq) but is there another way short of being resolute to actually call the interview on things (which will almost certainly result in not getting the job, but if they are skirting the question it's probably not a job I want).

Read the article
ArchBeat Link-o-Rama Top 10 for August 19-26, 2012

- by Bob Rhubart

The Top 10 most popular items shared via the OTN ArchBeat Facebook page for the week of August 19-26, 2012. Now Available: Oracle SQL Developer 3.2 (3.2.09.23) The latest release of Oracle SQl Developer includes UI enhancements, 12c database support, and bug fixes. ADF Tutorial Chapter 3: Creating a Master-Detail taskflow | Yannick Ongena Oracle ACE Yannick Ongena continues his ADF tutorial with a chapter devoted to view layer and using the data control to build pages that allow user to update reference data. GlassFish Community Event at JavaOne 2012 Don't miss out on this exclusive GlassFish Community Event on Sunday, September 30th from 11:00 a.m. – 1:00 p.m. in Moscone South. Register Now! Part of JavaOne 2012. Oracle BI 11g Book Authors – Podcast #9 | Art of Business Intelligence In this home-grown podcast, authors Christian Screen, Haroun Khan, and Adrian Ward talk about their new book, "Oracle Business Intelligence Enterprise Edition 11g: A Hands-On Tutorial," about their sessions at Oracle OpenWorld, and about their ORACLENERD t-shirts. Oracle Service Bus duplicate message check using Coherence | Jan van Zoggel "Giving the fact that every message on our ESB has an unique messageID element in the SOAP header we could store this on disk, database or in memory,"says Jan van Zoggel. "With the help of Oracle Coherence this last option, in memory, is relatively simple." Even simpler with Jan's detailed instructions. Oracle Technology Network Architect Day - Boston - Sept 12 There are easier ways to increase your IT brainpower. Skip the electrodes and register for Oracle Technology Network Architect Day in Boston, September 12, 2012. This free event includes 8 technical sessions, panel Q&A, roundtable discussions—and a free lunch. 8:00 a.m. – 5:00 p.m. at the Boston Marriott Burlington, One Burlington Mall Road, Burlington, MA 01803. Oracle BPM enable BAM | Peter Paul van de Beek "BAM enables you to make decisions based on real-time information gathered from your running processes," says Peter Paul van de Beek. "With BPMN processes you can use the standard Business Indicators that the BPM Suite offers you and use them to with BAM without much extra effort." Sample Application for Switching Application Module Data Sources | Andrejus Baranovskis A sample application and how-to guide from Oracle ACE Director and ADF expert Andrejus Baranovskis. ORCLville: Some Basic BI Thoughts "If we'd stop to consider what business intelligence really is, many of us might grow a different perspective about how we implement enterprise apps," says Oracle ACE Director Floyd Teter. "What if we implemented with an eye to what kind of information we'd like to get from our enterprise apps?" Oracle VM VirtualBox 4.1.20 released |Oracle's Virtualization Blog Oracle VM VirtualBox 4.1.20 was just released at the community and Oracle download sites, reports the Fat Bloke. This is a maintenance release containing bug fixes and stability improvements. Thought for the Day "The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures." — Frederick P. Brooks Source: SoftwareQuotes

Read the article
Ti Launchpad

- by raysmithequip

Just thought I would get a couple of notes up here for reference to anyone that is interested...it is now Feb 2011 and I have not been posting here enough to remember this blog. Back in Nov 2010 I ordered the Ti launchpad msp430, it is a little target board kit replete with a mini USB cable, two very inexpensive programmable mcu's and a couple of pin headers with a couple of led's on board, a spi connector some on board jumpers and two programmable micro switches....all for less than $5.00...INCLUDING SHIPPING!!....not bad when the ardruino's are running around 20.00 for the target board, atmega328 and cable off of eBay...I wont even mention the microchip pic right now. Naw, for $5.00 the Ti launchpad kit is about the cheapest fun around...if-uns your a geek that is... Well, the launchpad was backordered for almost two months, came like Xmas eve in fact...I had almost forgotten it!! And really, it was way late and not my idea of an Xmas present for myself. That would of been the web expressions 4 I bought a few weeks back. With all the holidays, I did not even look at it till last week, in fact I passed the wrapped board around at my local ham club meeting during points of personal privilege....some oh's and ahhs but mostly duhs...I actually ordered it to avoid downloading the huge code compressor studio 4 (CCS) that was supposed to be included on the cd. No cd. I had already downloaded IAR another programming IDE for these little micro bugs. In my spare time I toyed with IAR and the launchpad board but after about two days of playing delete the driver with windows I decided to just download CCS 4, the code limited version, and give that a shot......CCS 4, is a good rewrite from the earlier versions, it is based on Eclipse as an IDE and includes the drivers for the msp430 target board I received in the kit. Once installed I quickly configured the debugger for the target chip which was already plugged into the dip socket at the factory, msp430G2131 from he drop down list and clicked ok...I was in!! The CCS4 is full of bells and whistles compared to the IAR, which I would of preferred for the simplicity. But the code compressor studio really does have it all!!..the code limited version is free, and of all things will give you java script editor box. The whole layout in debugger mode reminds me of any modern programmer IDE...I mean sure give me Tex anytime but you simply must admire all the boxes and options included in the GUI. It was a simple matter to check the assembly code in the flash and ram memory that came preloaded for the launchpad kit. Assembly. I am right now looking for my old assembly textbooks...sure I remember how to use mov and add etc but a couple of the commands are a little more than vague anymore. Still, these little mcu's are about 50 cents each and might just work in a couple of projects I have lined up for the near future. I may document the code here. Luckily, I plan to write the code in c++ for the main project but if it has to be assembly, no prob. For reference, the program that came already on the 2131 in the kit was a temperature indicator that alternately flashed red and green leds and changed the intensity of either depending on whether the temp was rising or falling...neat. Neat enough that it might be worthwhile banging out a little GUI in windows 7 to test the new user device system calls, maybe put a temp gauge widget up on the desktop...just to keep from getting bored. If you see some assembly code on this blog, you know I was doing something with one of the many mcu's out there.....thats all for now, more to follow...a bit later, of course.

Read the article
Efficiently separating Read/Compute/Write steps for concurrent processing of entities in Entity/Component systems

- by TravisG

Setup I have an entity-component architecture where Entities can have a set of attributes (which are pure data with no behavior) and there exist systems that run the entity logic which act on that data. Essentially, in somewhat pseudo-code: Entity { id; map<id_type, Attribute> attributes; } System { update(); vector<Entity> entities; } A system that just moves along all entities at a constant rate might be MovementSystem extends System { update() { for each entity in entities position = entity.attributes["position"]; position += vec3(1,1,1); } } Essentially, I'm trying to parallelise update() as efficiently as possible. This can be done by running entire systems in parallel, or by giving each update() of one system a couple of components so different threads can execute the update of the same system, but for a different subset of entities registered with that system. Problem In reality, these systems sometimes require that entities interact(/read/write data from/to) each other, sometimes within the same system (e.g. an AI system that reads state from other entities surrounding the current processed entity), but sometimes between different systems that depend on each other (i.e. a movement system that requires data from a system that processes user input). Now, when trying to parallelize the update phases of entity/component systems, the phases in which data (components/attributes) from Entities are read and used to compute something, and the phase where the modified data is written back to entities need to be separated in order to avoid data races. Otherwise the only way (not taking into account just "critical section"ing everything) to avoid them is to serialize parts of the update process that depend on other parts. This seems ugly. To me it would seem more elegant to be able to (ideally) have all processing running in parallel, where a system may read data from all entities as it wishes, but doesn't write modifications to that data back until some later point. The fact that this is even possible is based on the assumption that modification write-backs are usually very small in complexity, and don't require much performance, whereas computations are very expensive (relatively). So the overhead added by a delayed-write phase might be evened out by more efficient updating of entities (by having threads work more % of the time instead of waiting). A concrete example of this might be a system that updates physics. The system needs to both read and write a lot of data to and from entities. Optimally, there would be a system in place where all available threads update a subset of all entities registered with the physics system. In the case of the physics system this isn't trivially possible because of race conditions. So without a workaround, we would have to find other systems to run in parallel (which don't modify the same data as the physics system), other wise the remaining threads are waiting and wasting time. However, that has disadvantages Practically, the L3 cache is pretty much always better utilized when updating a large system with multiple threads, as opposed to multiple systems at once, which all act on different sets of data. Finding and assembling other systems to run in parallel can be extremely time consuming to design well enough to optimize performance. Sometimes, it might even not be possible at all because a system just depends on data that is touched by all other systems. Solution? In my thinking, a possible solution would be a system where reading/updating and writing of data is separated, so that in one expensive phase, systems only read data and compute what they need to compute, and then in a separate, performance-wise cheap, write phase, attributes of entities that needed to be modified are finally written back to the entities. The Question How might such a system be implemented to achieve optimal performance, as well as making programmer life easier? What are the implementation details of such a system and what might have to be changed in the existing EC-architecture to accommodate this solution?

Read the article
Where should you put constants and why?

- by Tim Meyer

In our mostly large applications, we usually have a only few locations for constants: One class for GUI and internal contstants (Tab Page titles, Group Box titles, calculation factors, enumerations) One class for database tables and columns (this part is generated code) plus readable names for them (manually assigned) One class for application messages (logging, message boxes etc) The constants are usually separated into different structs in those classes. In our C++ applications, the constants are only defined in the .h file and the values are assigned in the .cpp file. One of the advantages is that all strings etc are in one central place and everybody knows where to find them when something must be changed. This is especially something project managers seem to like as people come and go and this way everybody can change such trivial things without having to dig into the application's structure. Also, you can easily change the title of similar Group Boxes / Tab Pages etc at once. Another aspect is that you can just print that class and give it to a non-programmer who can check if the captions are intuitive, and if messages to the user are too detailed or too confusing etc. However, I see certain disadvantages: Every single class is tightly coupled to the constants classes Adding/Removing/Renaming/Moving a constant requires recompilation of at least 90% of the application (Note: Changing the value doesn't, at least for C++). In one of our C++ projects with 1500 classes, this means around 7 minutes of compilation time (using precompiled headers; without them it's around 50 minutes) plus around 10 minutes of linking against certain static libraries. Building a speed optimized release through the Visual Studio Compiler takes up to 3 hours. I don't know if the huge amount of class relations is the source but it might as well be. You get driven into temporarily hard-coding strings straight into code because you want to test something very quickly and don't want to wait 15 minutes just for that test (and probably every subsequent one). Everybody knows what happens to the "I will fix that later"-thoughts. Reusing a class in another project isn't always that easy (mainly due to other tight couplings, but the constants handling doesn't make it easier.) Where would you store constants like that? Also what arguments would you bring in order to convince your project manager that there are better concepts which also comply with the advantages listed above? Feel free to give a C++-specific or independent answer. PS: I know this question is kind of subjective but I honestly don't know of any better place than this site for this kind of question. Update on this project I have news on the compile time thing: Following Caleb's and gbjbaanb's posts, I split my constants file into several other files when I had time. I also eventually split my project into several libraries which was now possible much easier. Compiling this in release mode showed that the auto-generated file which contains the database definitions (table, column names and more - more than 8000 symbols) and builds up certain hashes caused the huge compile times in release mode. Deactivating MSVC's optimizer for the library which contains the DB constants now allowed us to reduce the total compile time of your Project (several applications) in release mode from up to 8 hours to less than one hour! We have yet to find out why MSVC has such a hard time optimizing these files, but for now this change relieves a lot of pressure as we no longer have to rely on nightly builds only. That fact - and other benefits, such as less tight coupling, better reuseability etc - also showed that spending time splitting up the "constants" wasn't such a bad idea after all ;-)

Read the article
Google Fetch issue

- by Karen

When I do a Google fetch on any of my webpages the results are all the same (below). I'm not a programmer but I'm pretty sure this is not correct. Out of all the fetches I have done only one was different and the content length was 6x below and showed meta tags etc. Maybe this explains other issues I've been having with the site: a drop in indexed pages. Meta tag analyzer says I have no title tag, meta tags or description even though I do it on all pages. I had an SEO team working on the site and they were stumped by why pages were not getting indexed. So they figure it was some type of code error. Are they right? HTTP/1.1 200 OK Cache-Control: private Content-Type: text/html; charset=utf-8 Content-Encoding: gzip Vary: Accept-Encoding Server: Microsoft-IIS/7.5 X-AspNet-Version: 4.0.30319 X-Powered-By: ASP.NET Date: Thu, 11 Oct 2012 11:45:41 GMT Content-Length: 1054 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title></title> <script type="text/javascript"> function getCookie(cookieName) { if (document.cookie.length > 0) { cookieStart = document.cookie.indexOf(cookieName + "="); if (cookieStart != -1) { cookieStart = cookieStart + cookieName.length + 1; cookieEnd = document.cookie.indexOf(";", cookieStart); if (cookieEnd == -1) cookieEnd = document.cookie.length; return unescape(document.cookie.substring(cookieStart, cookieEnd)); } } return ""; } function setTimezone() { var rightNow = new Date(); var jan1 = new Date(rightNow.getFullYear(), 0, 1, 0, 0, 0, 0); // jan 1st var june1 = new Date(rightNow.getFullYear(), 6, 1, 0, 0, 0, 0); // june 1st var temp = jan1.toGMTString(); var jan2 = new Date(temp.substring(0, temp.lastIndexOf(" ") - 1)); temp = june1.toGMTString(); var june2 = new Date(temp.substring(0, temp.lastIndexOf(" ") - 1)); var std_time_offset = (jan1 - jan2) / (1000 * 60 * 60); var daylight_time_offset = (june1 - june2) / (1000 * 60 * 60); var dst; if (std_time_offset == daylight_time_offset) { dst = "0"; // daylight savings time is NOT observed } else { // positive is southern, negative is northern hemisphere var hemisphere = std_time_offset - daylight_time_offset; if (hemisphere >= 0) std_time_offset = daylight_time_offset; dst = "1"; // daylight savings time is observed } var exdate = new Date(); var expiredays = 1; exdate.setDate(exdate.getDate() + expiredays); document.cookie = "TimeZoneOffset=" + std_time_offset + ";"; document.cookie = "Dst=" + dst + ";expires=" + exdate.toUTCString(); } function checkCookie() { var timeOffset = getCookie("TimeZoneOffset"); var dst = getCookie("Dst"); if (!timeOffset || !dst) { setTimezone(); window.location.reload(); } } </script> </head> <body onload="checkCookie()"> </body> </html>

Read the article
The inevitable Hello World post!

- by brendonpage

Greetings to anyone reading this! This is my first of hopefully many posts. I would like to use this post to introduce myself and to let you know what to expect from this blog in future. Okay so a bit about myself. In case you missed the name of this blog, my name is Brendon Page! I am a Software Developer from South Africa and work for a small company who’s main focus is producing software for the kitchen cupboard industry, although from time to time we do produce custom solutions for other industries. I work in a small team of 3, including myself, and am fortunate enough to work from home! I have been involved in IT since 1996, which is when I got my first PC, and started working as a junior programmer in 2003. Outside of work I enjoy playing squash, PC Games and of course LANing with my friends. If I get any free time between all of that I will usually dedicate some of it to a personal project, these are mainly prototypes for an idea I have had or for something that could be useful at work. I was in 2 minds on whether to include a photo of myself. The reason for this was because while I was looking for a suitable photo to use, it dawned on me how much time I dedicate to pulling funny faces in photos! I also realized how little I shave, which I blame completely on working form home. So after much debate here I am, funny face, beard and all! Now that you know a bit about me lets move onto what expect from this blog. I work predominantly with Microsoft technologies so most if not all of my posts will be related to something Microsoft. Since most of my job entails Software Development you can expect a lot of posts which will deal with the .NET Framework. I am currently working on a large Silverlight project, so my first few posts will be targeted at in that direction. I will be striving to make the content of my posts as useful as possible from both an explanation and code perspective, I aim to include a working solution for every post, which I will put up on my skydrive for download. Here is what I have planned for my next few posts: Where did my session variables go? Here I will take you through the lessons I learnt the hard way about the ASP.NET session. I am not going to go into to much depth in this post, as there is already a lot of information available on it. I mainly want to cover it in an effort to keep the scope creep of my posts to a minimum, some the solutions I upload will use it and I would like to have a post that I can reference to explain why I am doing something a certain way. Uploading files through SIlverlight Again there is a lot of existing information on this topic, so I wont be going into to much depth, but I will be using the solution from this as a base for my next post. Generating and Displaying DeepZoom images dynamically in Silverlight Well the title pretty much speaks for it’s self on this one. As I mentioned I will be building off the solution that I create in my ‘Uploading files through Silverlight’ post. Securing DeepZoom images using a custom implementation of the MultiScaleTileSource In this post I will look at the privacy issue surrounding the default usage of DeepZoom images in Silverlight and how to overcome it. This makes the use of DeepZoom in privacy conscious applications more viable. Thanks to anyone who actually read this post! I look forward to producing more which will hopefully be helpful to you.

Read the article
Refactoring existing PHP Project. I need some advices

- by b0x

i have a small SAS ERP that was written some years ago using PHP. At that time, it didn't used any framework, but the code isn't a mess as i will explain more detailed in the following lines. Nowadays, the project grow and I’m now working with 3 more programmers. Often, they ask to me why we don’t migrate to a framework such Laravel. Although I'd love trying Laravel, I’m a small business and i don't have time/money to stop and spend a whole year building everything from scratch. I need to live and pay the bills. So, I've read a lot about this matter, and I decided that doing a refactoring is the best way to do it. Also, I'm not so sure that a framework will make things easy. Business goals are: Make the code easier to new hired programmers I must separate the "view", because: I want to release different versions of this product (using the same code), but under different brands and websites at the minimum cost (just changing view) Release different versions to fit mobile/tablet. Make different types of this product, seeling packages as if it were plugins. Develop custom packages for some costumers (like plugins/addon's that they can buy to put on the main application). Code goals: Introduce best pratices, standards for everyone Try to build my own MVC structure Improve validation of data/forms (today they are mixed in both ajax and classes) Create automated testing rotines, to quality assurance. My actual structure project: class\ extra\ hd\ logs\ public_html\ public_html\includes\ public_html\css|js|images\ class\ There are three types of classes. They are all “autoloaded” with something similar with PSR-0, but I don’t use namespaces. 1. class.Something.php Connects to Database using specific methods. I.e: Costumer-list(); It uses “class.Db.php”, that it’s an abstraction of mysqli on every method. 2. class.SomethingProc.php Do things that “join” things that come from “class.Something.php”. Like IF/ELSE, math operations. 3. class.SomethingHTML.php The classes with “HTML” suffix implements only static methods and HTML code only. A real life example: All the programmers need to use $cSomething ($c to class) and $arrSomething (to array). Costumer.php (view) <?php $cCosumter = new Costumer(); $arrCostumer = $cCostumer->list(); echo CostumerHTML::table($arrCostumer); ?> Extra\ Store 3rdparty projects/classes from others, such MPDF, PHPMailer, etc. Hd\ Store user’s fies outsite wwwroot dir. Logs\ Store phplogs and the system itself logs (We have a static Log::error() method, that we put in every method of every class) Public_html\ Stores the files that people use. Public_html\includes\ Store the main “config.php” file and all files that do “ajax things” ajax.Costumer.php, for example. Help is needed ;) So, as you can see we have some standards, and also for database things. But i want to write a manual of our rules. Something that i can give to any new programmer at my companie and he can go on. This is not totally a mess, but It could be better seeing the new practices. What could I do to separate this as MVC, to have multiple VIEW’s. Could you gimme some tips considering my goals? Keep im mind the different products/custom things for specific costumers without breaking the main application. URL for tutorials, books, etc. It would be nice. Thanks!

Read the article
Advice on refactoring PHP Project

- by b0x

I have a small SAS ERP that was written some years ago using PHP. At that time, it didn't use any framework, but the code isn't a mess. Nowadays, the project grows and I’m now working with 3 more programmers. Often, they ask to me why we don’t migrate to a framework such as Laravel. Although I'd love trying Laravel, I’m a small business and I don't have time nor money to stop and spend a whole year building everything from scratch. I need to live and pay the bills. So, I've read a lot about this matter, and I decided that doing a refactoring is the best way to do it. Also, I'm not so sure that a framework will make things easy. Business goals are: Make the code easier to new hired programmers Separate the "view", in order to: release different versions of this product (using the same code), but under different brands and websites at the minimum cost (just changing view) release different versions to fit mobile/tablet. Make different types of this product, selling packages as if they were plugins. Develop custom packages for some costumers (like plugins/addon's that they can buy to put on the main application). Code goals: Introduce best pratices, standards for everyone Try to build my own MVC structure Improve validation of data/forms (today they are mixed in both ajax and classes) Create automated testing routines for quality assurance. My current structure project: class\ extra\ hd\ logs\ public_html\ public_html\includes\ public_html\css|js|images\ class\ There are three types of classes. They are all “autoloaded” with something similar with PSR-0, but I don’t use namespaces. 1. class.Something.php Connects to Database using specific methods. I.e: Costumer-list(); It uses “class.Db.php”, that it’s an abstraction of mysql on every method. 2. class.SomethingProc.php Do things that “join” things that come from “class.Something.php”. Like IF/ELSE, math operations. 3. class.SomethingHTML.php The classes with “HTML” suffix implements only static methods and HTML code only. A real life example: All the programmers need to use $cSomething ($c to class) and $arrSomething (to array). Costumer.php (view) <?php $cCosumter = new Costumer(); $arrCostumer = $cCostumer->list(); echo CostumerHTML::table($arrCostumer); ?> Extra\ Store 3rdparty projects/classes from others, such MPDF, PHPMailer, etc. Hd\ Store user’s files outsite wwwroot dir. Logs\ Store phplogs and the system itself logs (We have a static Log::error() method, that we put in every method of every class) Public_html\ Stores the files that people use. Public_html\includes\ Store the main “config.php” file and all files that do “ajax things” ajax.Costumer.php, for example. Help is needed ;) So, as you can see we have some standards, and also for database things. But I want to write a manual of our rules. Something that I can give to any new programmer at my company and he can go on. This is not totally a mess, but it could be better seeing the new practices. What could I do to separate this as MVC, to have multiple views. Could you give me some tips considering my goals? Keep im mind the different products/custom things for specific costumers without breaking the main application. URL for tutorials, books, etc, would be nice.

Read the article
Move on and look elsewhere, or confront the boss?

- by Meister

Background: I have my Associates in Applied Science (Comp/Info Tech) with a strong focus in programming, and I'm taking University classes to get my Bachelors. I was recently hired at a local company to be a Software Engineer I on a team of about 8, and I've been told they're looking to hire more. This is my first job, and I was offered what I feel to be an extremely generous starting salary ($30/hr essentially + benefits and yearly bonus). What got me hired was my passion for programming and a strong set of personal projects. Problem: I had no prior experience when I interviewed, so I didn't know exactly what to ask them about the company when I was hired. I've spotted a number of warning signs and annoyances since then, such as: Four developers when I started, with everyone talking about "Ben" or "Ryan" leaving. One engineer hired thirty days before me, one hired two weeks after me. Most of the department has been hiring a large number of people since I started. Extremely limited internet access. I understand the idea from an IT point of view, but not only is Facebook blocked, but so it Youtube, Twitter, and Pandora. I've also figured out that they block all access to non-DNS websites (http://xxx.xxx.xxx.xxx/) and strangely enough Miranda-IM. Low cubicles. Which is fine because I like my immediate coworkers, but they put the developers with the customer service, customer training, and QA department in a huge open room. Noise, noise, noise, and people stop to chitchat all day long. Headphones only go so far. Several emails have been sent out by my boss since I started telling us programmers to not talk about non-work-related-things like Video Games at our cubicles, despite us only spending maybe five minutes every few hours doing so. Further digging tells me that this is because someone keeps complaining that the programmers are "slacking off". People are looking over my shoulder all day. I was in the Freenode webchat to get help with a programming issue, and within minutes I had an email from my boss (to all the developers) telling us that we should NOT be connected to any outside chat servers at work. Version control system from 2005 that we must access with IE and keep the Java 1.4 JRE installed to be able to use. I accidentally updated to Java 6 one day and spent the next two days fighting with my PC to undo this "problem". No source control, no comments on anything, no standards, no code review, no unit testing, no common sense. I literally found a problem in how they handle string resource translations that stems from the simple fact that they don't trim excess white spaces, leading to developers doing: getResource("Date: ") instead of: getResource("Date") + ": ", and I was told to just add the excess white spaces back to the database instead of dealing with the issue directly. Some of these things I'd like to try to understand, but I like having IRC open to talk in a few different rooms during the day and keep in touch with friends/family over IM. They don't break my concentration (not NEARLY as much as the lady from QA stopping by to talk about her son), but because people are looking over my shoulder all day as they walk by they complain when they see something that's not "programmer-looking work". I've been told by my boss and QA that I do good, fast work. I should be judged on my work output and quality, not what I have up on my screen for the five seconds you're walking by So, my question is, even though I'm just barely at my 90 days: How do you decide to move on from a job and looking elsewhere, or when you should start working with your boss to resolve these issues? Is it even possible to get the boss to work with me in many of these things? This is the only place I heard back from even though I sent out several resume's a day for several months, and this place does pay well for putting up with their many flaws, but I'm just starting to get so miserable working here already. Should I just put up with it?

Read the article
Any tips on getting hired as a software project manager straight out of college?

- by MHarrison

I graduated with a BS in compsci last September, and I've been trying (unsuccessfully) to find a job as a project manager ever since. I fell in love with software engineering (the formal practice behind it all, not just coding) in school, and I've dedicated the last 3-4 years of my life to learning everything I can about project management and gaining experience. I've managed several projects (with teams around 12 people) while in school, and I worked with my university's software engineering research lab. My résumé is also decent - I worked as a programmer before I went to school (I'm 27 now), and I did Google Summer of Code for 3 summers. I also have general "people management" experience via working as the photo editor for my university's newspaper for 2 years. My first problem with the job hunt is not getting enough interviews. I use careers.stackoverflow.com, which is awesome because I usually get contacted by non-HR people who know what they're talking about, but there's just not enough companies using it for me to get interviews on a regular basis. I've also tried sites like monster.com, and in a fit of desperation, I sent out no less than 60 applications to project management positions. I've gotten 3 automated rejection letters and that's it. At least careers.stackoverflow gets me a phone interview with 8/10 places I apply to. But the main (and extremely frustrating) problem is the matter of experience. I've successfully managed projects from start to finish (in my software engineering classes we had real customers come in with a real software need and we built it for them), but I've never had to deal with budgets and money (I know this is why HR people immediately turn me away). Most of these positions require 5+ years PM experience, and I've seen absurd things like 12+ years required. Interviews are also maddening. I've had so many places who absolutely loved me and I made it to the final round of interviews, and I left thinking things went extremely well and they'd consider me. However, when I check in with them a week later, they tell me "We really liked you and your qualifications are excellent, but we're hoping to find someone with more experience." The bad interviews I can understand - like the PM position that would have had me managing developers both locally and overseas - I had 3 interviews with them and the ENTIRE interview process was them asking me CS brainteasers and having me waste time on things like writing quicksort on paper or writing binary search trees. Even when I tried steering the discussion towards more relevant PM stuff, they gave me some vague generic replies and went back to the "We want to be Google/MS" crap. But when I have a GOOD interview, they say my "qualifications are excellent" but they want "more experience"...that makes me want to tear my hair out. What else can I DO? While I'm aiming for technically-involved PM positions (not just crunching budget numbers), I really don't want a straight development job because I like creating software from the very high-level vs. spending a lot of time debugging memory leaks. In fact, I can't even GET development positions that I'm qualified for because I make the mistake of telling them that my future career goals are as PM (which usually results in them saying something like "Well we already have PMs and this position isn't really set up to get you there." - which I take to mean "No, that's my job, stay away.") My apologies on the long rant, but I'm seriously hellbent on getting hired as a PM since it's both my career goal and the passion that keeps me awake at night. Any suggestions on what the heck else I can do? I'm currently writing a blog where I talk about my philosophies about software engineering, and I'm writing up specs for an iOS app which I will design, code, and show employers, but this takes an awful lot of time that I don't have.

Read the article
Move over DFS and Robocopy, here is SyncToy!

- by andywe

Ever since Windows 2000, I have always had the need to replicate data to multiple endpoints with the same content. Until DFS was introduced, the method of thinking was to either manually copy the data location by location, or to batch script it with xcopy and schedule a task. Even though this worked (and still does today), it was cumbersome, and intensive on the network, especially when dealing with larger amounts of data. Then along came robocopy, as an internal tool written by an enterprising programmer at Microsoft. We used it quite a bit, especially when we could not use DFS in the early days. It was received so well, it made it into the public realm. At least now we could have the ability to determine what files had changed and only replicate those. Well, over time there has been evolution of this ideal. DFS is obviously the Windows enterprise class service to do this, along with BrancheCache..however you don’t always need or want the power of DFS, especially when it comes to small datacenter installations, or remote offices. I have specific data sets that are on closed or restricted networks, that either have a security need for this, or are in remote countries where bandwidth is a premium. FOr this, I use the latest evolution for one off replication names Synctoy. Synctoy is from Microsoft, seemingly released in 2009, that wraps a nice GUI around setting up a paired set of folders (remember the mobile briefcase from Windows 98?), and allowing you the choice of synchronization methods. 1 way, or 2 way. Simply create a paired set of folders on the source and destination, choose your options for content, exclude any file types you don’t want to replicate, and click run. Scheduling is even easier. MS has included a wrapper for doing just this so all you enter in your task schedule in the SynToyCMD.exe, a –R as an argument, and the time schedule. No more complicated command lines or scripts. I find this especially useful when I use MS backup to back up a system volume, but only want subsets of backup information of a data share and ONLY when that dataset has changed. Not relying on full backups and incremental. An example of this is my application installation master share. I back this up with SyncToy because I do not need multiple backup copies..one copy elsewhere suffices to back it up. At home, very useful for your pictures, videos, music, ect..the backup is online and ready to access, not waiting for you to restore a backup file, and no need to institute a domain simply to have DFS.' Do note there is a risk..if you accidently delete a file and do not catch this before the next sync, then depending on your SyncToy settings, you can indeed lose that file as the destination updates..so due diligence applies. I make it a rule to sync manly one way…I use my master share for making changes, and allow the schedule to follow suit. Any real important file I lock down as read only through file permissions so it cannot be deleted unless I intervene. Check out the tool and have some fun! http://www.microsoft.com/en-us/download/details.aspx?DisplayLang=en&id=15155

Read the article

Search Results

Search found 4026 results on 162 pages for 'programmer calculator'.

Page 135/162 | < Previous Page | 131 132 133 134 135 136 137 138 139 140 141 142 | Next Page >

- by rchrd

- by Stuart Robbins

- by Applications User Experience

- by pinaldave

- by mbcrump

- by AjarnMark

- by Achilles

- by Louis Davidson

- by Laila

- by Simon Cooper

- by Louis Davidson

- by Simucal

- by tmcginn

- by Wayne M

- by Bob Rhubart

- by raysmithequip

- by TravisG

- by Tim Meyer

- by Karen

- by brendonpage

- by b0x

- by b0x

- by Meister

- by MHarrison

- by andywe

< Previous Page | 131 132 133 134 135 136 137 138 139 140 141 142 | Next Page >