Search Results

Search found 19350 results on 774 pages for 'address book'.

Page 376/774 | < Previous Page | 372 373 374 375 376 377 378 379 380 381 382 383  | Next Page >

  • Data and Secularism

    - by kaleidoscope
    Ever since we’ve been using Data we’ve been religious. Religious about the way we represent it and equally religious about the way we access it. Be it plain old SQL, DAO, ADO, ADO.Net and I am just referring to religions in MSFT world. A peek outside and I’d need a separate book to list out the Data faiths. Various application areas in networked computing are converging under the HTTP umbrella with a plausible transition to purist HTTP and in turn REST fuelled by the Web2.0 storm. It was time the Data access faiths also gave up the religious silos wrapped around our long worshipped data publishing and access methods. OData is the secular solution we have at hand today. It is an open protocol for sharing data. It can be exposed via REST. It is Open as in the Microsoft Open Specification Promise. This allows virtually everyone to build Data Services for any runtime. OData is one of the key standards for Data publishing/subscribing on Microsoft Codename Dallas. For us .Netters OData data sources can be exposed/consumed via WCF Data Services and the process is very simple, elegant and intuitive. Applications exposing OData Services Sharepoint 2010 IBM Web Sphere Microsoft SQL Azure Windows Azure Table Storage SQL Server Reporting Services   Live OData Services Netflix Open Science Data Initiative Open Government Data Initiatives Northwind database exposed as OData Service and many others Some may prefer to call it commoditization of data, unification of data access strategies or any other sweet name. I for one will stick to my secular definition. :) Technorati Tags: Sarang,OData,MOSP

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • LibreOffice - Can't drag and replace images [LibreOffice 3.5 and 3.6, Ubuntu 12.04]

    - by Anderxale
    My wife uses LibreOffice to create catalogues of a couple hundred items and then converts it to PDF. Man we love LibreOffice! Anyway I set up Windows-7 and Mint-14-Mate dual boot for her to ease into Linux. Today she was ready for Ubuntu and so I did a clean install on her machine. Everything was smooth but today when she tried to work she ran into an issue... She used to be able to download a folder full of images to use in her catalogues and then update her catalogues by replacing old images with new ones. It was so simple - Open an old catalogue, save with a new date, drag and drop replace images of items that no longer exist. The drag and drop process would scale the image and then crop it horizontally or vertically to fit. Now that I have installed Ubuntu 12.04 for her she can no longer replace images, it just does nothing... If she drags the image to the left or right of the original it appears next to the original so I know d&d works, unfortunately it does not resize or crop the image. I tried this on my laptop and my desktop... same thing! I then tried updating the LibreOffice to 3.6, no change. I then tried opening a virtual machine windows xp and Mint 14 on this computer and tried with 3.6 in those operating systems and it worked. Can anyone help? I have a lot of hope that there is an answer because Mint is based on Ubuntu/Debian and that distro can perform this task successfully! Sorry about writing a book....

    Read the article

  • Killer content for my Kindle - The Economist with no need for an iPad - yipeee!

    - by Liam Westley
    I admin it, I was jealous of someone's iPad. They were reading The Economist, for free, as they were a print subscriber. I'm a print subscriber too. However, I don't have an iPad or an iPhone, just an Android phone and a Kindle. As soon as I got the Kindle, I looked up how to get The Economist on it. £9.99 per month. Hmmm, twice as much again as a my print subscription and I wanted to maintain the print subscription. No way Amazon. Fortunately some nice person wrote similar comments on The Economist subscription for Kindle, but added a very important additional nugget of information; and there is no need, as a print subscriber you can just use the free Calibre e-book creation tool anyway. So I downloaded it, searched for The Economist online 'recipe', entered my login name and password (part of my print subscription) and off went Calibre to screen scrape every single article from the Christmas 2010 issue into a .mobi file, complete with front cover image and full indexing. It's wonderful. Truely wonderful. Every section individually indexed, with each article separated and all inline images preserved. It even feels wonderfully retro, back to the days when The Economist only used black and white images. So many thanks the guys behind Calibre and The Economist recipe creators. Finally, I have my essential Kindle content that I've been waiting for.

    Read the article

  • UX Design Principles Pluralsight course review

    - by pluginbaby
    I've just finished the "Creating User Experiences: Fundamental Design Principles" course on Pluralsight, I am glad I took it, and here is why you should. The course is held by Billy Hollis, an internationally known author and speaker focused on user experience design. It was published in May 2012, so it is quite fresh (You’ll hear some reference to XAML, even if the content is not focused on any particular technology). I think what I liked the most about this course is the fact that Billy is not just imposing design ideas and pushing them in your throat (which would be too confronting for us developers, even if he was right), he spends a fair share amount of time explaining each topics, and illustrate them with great metaphors. If you are a minimum open minded you should get great value out of this course. Billy makes you think outside the box, he encourages you to use your right side brain, and understand design principles by simply looking at what’s around us (physical objects, nature, …). During the course he refers several time to "don't make me think" a book on UX design, which is about giving confidence to users, by making it easier for them to achieve their goals when using your app. Billy thinks that every developer can participate in elaborating good design when building software, not only designers should be involved. Get away of the easy path "let's build functional stuff for now and we will hire a designer later if we have time and budget". The course is also live and interactive as the author suggests that you do some live exercises during each module. He actually makes you realize and understand by yourself the need for change. We’re in a new era of software and devices, where grids and menus aren't enough. You can’t remain satisfied by just making things possible, you need to make them easier for your users. Understanding some fundamental design principles will help. This course can definitely be followed by any developers who wants to improve user experience of software they are working on, and I definitely recommend it.

    Read the article

  • Message Passing Interface (MPI)

    So you have installed your cluster and you are done with introductory material on Windows HPC. Now you want to develop an application with the most common programming model: Message Passing Interface.The MPI programming model is a standard with implementations from many vendors. For newbies (like myself!), I have aggregated below links for getting started.Non-Microsoft MPI resources (useful even if you are not on the Windows platform)1. Message Passing Interface on wikipedia. 2. The MPI standard.3. MPICH2 - an MPI implementation.4. Tutorial on MPI by William Gropp.5. MPI patterns presented as a tutorial with sample code. 6. THE official MPI Forum (maintains the standard) including the wiki discussing the MPI future.7. Great MPI tutorial including at the end the MPI Exercise.8. C++ MPI Exercises by John Burkardt.9. Book online: MPI The Complete Reference.MS-MPI10. Windows HPC Server 2008 - Using MS-MPI whitepaper (15 page doc).11. Tracing MPI applications (27 page doc).12. Using Microsoft MPI (TechNet section).13. Windows HPC Server MPI forum (for posting questions). MPI.NET14. MPI.NET Home Page (not owned by Microsoft).15. MPI.NET Tutorial.16. HPC Development using F# using MPI.NET (38 page doc).Next time I'll post resources for the Microsoft Cluster SOA programming model - happy coding... Comments about this post welcome at the original blog.

    Read the article

  • Due to the Classes

    - by Ratman21
    Why does it seem that I am always saying sorry (or in Japanese Gomennasi)?  Well I am late again for blog as you can see. The CCNA class’s part 1 (also known as CCENT) was, well more intense than all of the certification classes before it.   The teacher was cramming as much as he could into us during the week and it was hard to come home and do much more than fall into bed (Well I was doing still doing my Job search and checking up on my web sites and groups).   But I didn’t have much left in the way of blogging (Which by the way is now in 3 different sites). Even though it was hard some times, I really liked the fact I was getting back to something like (and mean really like, in fact I like Cisco routers than some people I know). At the class, I got some software that allows me to simulate setting up and troubles shoot Lan’s or Wan’s.   When we weren’t getting facts for the test thrown at us, we were doing labs with this software. It was fun for me to be able to use the CISCO router commands and trouble shoot router issues. Even if it was just a sim. So now it is study, study, take practices tests and do the labs. I took the week end and more off after cram CCENT week but, now I am back at it.  Also I could not keep up with my Love Dare book during week of the class. No I did not stop or forget what I already learned. I just put the next dare on hold. Well the hold is off starting tomorrow and tonight I think I am going to write a new cover letter. Let’s see what else I can get done tonight. Hmm I think I will try to do a sim of my home wireless LAN and study for CCENT test in about 3 weeks.   So see you tomorrow (I hope).

    Read the article

  • Declarative programming vs. Imperative programming

    - by EpsilonVector
    I feel very comfortable with Imperative programming. I never have trouble expressing algorithmically what I want the computer to do once I figured out what is it that I want it to do. But when it comes to languages like SQL or Relational Algebra I often get stuck because my head is too used to Imperative programming. For example, suppose you have the relations band(bandName, bandCountry), venue(venueName, venueCountry), plays(bandName, venueName), and I want to write a query that says: all venueNames such that for every bandCountry there's a band from that country that plays in venue of that name. In my mind I immediately go "for each venueName iterate over all the bandCountries and for each bandCountry get the list of bands that come from it. If none of them play in venueName, go to next venueName. Else, at the end of the bandCountries iteration add venueName to the set of good venueNames". ...but you can't talk like that in SQL and I actually need to think about how to formulate this, with the intuitive Imperative solution constantly nagging in the back of my head. Did anybody else had this problem? How did you overcome this? Did you figured out a paradigm shift? Made a map from Imperative concepts to SQL concepts to translate Imperative solutions into Declarative ones? Read a good book? PS I'm not looking for a solution to the above query, I did solve it.

    Read the article

  • iPhone 4S Costs 50k In India. Heck! Rather I Buy Tata Nano Car For Twice The Money

    - by Gopinath
    Are you waiting to buy iPhone 4S in India? Stop waiting and start looking for alternatives as its going to be released in India with mind blowing price tags. A 16 GB iPhone 4S costs Rs. 44,500 + tax, 32 GB at 50,900 and the 64 GB..wait! Are you really interested to know the price? I’m not at all. Its ridiculous to spend 50,000 for a mobile phone in India. I hope majority of Indians agree with me. The Tata Nano, the world’s cheapest car, costs close to the double the price of iPhone 4S. Instead of buying an iPhone 4S for around 50K, it’s a wiser decision to buy a Tata Nano. Will the super rich of India afford to pay around 50,000 to own an iPhone 4S? I think they love to own it to show off their status but I guess they prefer to get it from US through their friends and relatives. In USA an unlocked iPhone 4S available through Apple Online Store costs just 33,500(~ 650 USD IN INR) and that is a straight away Rs. 11,000 discount. Why would the rich burn money? Airtel and Aircel has announced that the iPhone 4S is going to be available in their networks from November 25 onwards and both the operators started accepting the pre-orders. If you are really willing to burn your cash go ahead and book an iPhone 4S. This article titled,iPhone 4S Costs 50k In India. Heck! Rather I Buy Tata Nano Car For Twice The Money, was originally published at Tech Dreams. Grab our rss feed or fan us on Facebook to get updates from us.

    Read the article

  • Venezuela's Highly Inflationary Economy Means Changes to Financial Statements

    - by Theresa Hickman
    This is a bit of an esoteric topic, but given the number of U.S. Companies (particularly oil companies) that operate and have subsidiaries in Venezuela, I think it is worthy of an honorable mention. As you may or may not know, Venezuela's currency has had some changes over the years. In 2008, the Venezuelan Bolivar became the Bolivar Fuerte which dropped three zeros. So Bs.10,000 became Bs.F.10 and all their bills and coins were changed to reflect this. Then on Jan. 8, 2010, the government devalued the currency by 100%. The conversion from VEF to USD dropped from 2.15 to 4.30. (I always wanted to visit Venezuela; I guess it's time to book my vacation). The SEC recently labeled Venezuela a highly inflationary economy. This means that US companies with investments/subsidiaries in Venezuela will need to apply highly inflationary accounting rules starting on Jan. 1, 2010. In addition, companies need to make more detailed disclosures when the Venezuelan reported balances differ from the actual US dollar denominated balances. In a nut shell, if you formerly used translation, then starting Jan 1 of this year, you must now use remeasurement (or temporal method) to restate your Venezuelan entity's financial statements. See ASC topic 830, Foreign Currency Matters, which states that "[t]he financial statements of a foreign entity in a highly inflationary economy shall be remeasured as if the functional currency were the reporting currency." For you non-accountants that I haven't bored and are still reading at this point, the reason why the SEC is doing this is to ensure financial statements are presented as accurately as possible. Hyperinflationary economies have volatile currencies, such as Venezuela (it's not every day a currency devalues 100% overnight) which can distort financial statements if the local currency (Venezuelan Bolivar Fuerte) is used as the functional currency. To make financial statements more accurate, the reporting currency of the U.S. parent (US dollars) should be used as the functional currency. FASB.orgactually has a nice write-up on this.

    Read the article

  • Gain Total Control of Systems running Oracle Linux

    - by Anand Akela
    Oracle Linux is the best Linux for enterprise computing needs and Oracle Enterprise Manager enables enterprises to gain total control over systems running Oracle Linux. Linux Management functionality is available as part of Oracle Enterprise Manager 12c and is available to Oracle Linux Basic and Premier Support customers at no cost. The solution provides an integrated and cost-effective solution for complete Linux systems lifecycle management and delivers comprehensive provisioning, patching, monitoring, and administration capabilities via a single, web-based user interface thus significantly reducing the complexity and cost associated with managing Linux operating system environments. Many enterprises are transforming their IT infrastructure from multiple independent datacenters to an Infrastructure-as-a-Service (IaaS) model, in which shared pools of compute and storage are made available to end-users on a self-service basis. While providing significant improvements when implemented properly, this strategy introduces change and complexity at a time when datacenters are already understaffed and overburdened. To aid in this transformation, IT managers need the proper tools to help them provide the array of IT capabilities required throughout the organization without stretching their staff and budget to the limit. Oracle Enterprise Manager 12c offers  the advanced capabilities to enable IT departments and end-users to take advantage of many benefits and cost savings of IaaS. Oracle Enterprise Manager Ops Center 12c addresses this challenge with a converged approach that integrates systems management across the infrastructure stack, helping organizations to streamline operations, increase productivity, and reduce system downtime.  You can see the Linux management functionality in action by watching the latest integrated Linux management demo . Stay Connected with Oracle Enterprise Manager: Twitter |  Face book |  You Tube |  Linked in |  Newsletter

    Read the article

  • Use Your Chart-Drawing Skills to Win a Free Chrome Cr-48 Notebook

    - by ETC
    Today Google announced that they are partnering with a number of Chrome web application developers to distribute a number of their Chrome OS Notebooks to lucky fans. That’s when we noticed something interesting that can greatly increase your odds of getting one. Unlike Box, MOG, and Zoho, who are doing random giveaways, the LucidChart giveaway is based on a contest of skill – they are picking the best drawings using their flowchart tool and giving away Chrome Notebooks to the winners. So all you have to do is create one of the most interesting drawings / charts, and you will get your hands on one. We’ve also confirmed this with the fine people at LucidChart, who told us “any user who spends a bit of time and effort to do something creative has a good shot at winning one.” How great is the Chrome Cr-48 Notebook? What’s it all about? We wouldn’t know, since Google hasn’t given us here at How-To Geek an opportunity to use one, despite our attempts. It’s sad, since we’re huge fans of the Chrome browser, that we can’t share our Chrome notebook experiences with hundreds of thousands of daily subscribers and millions of monthly visitors. Hint. Hint. Win a Chrome Cr-48 notebook from LucidChart [LucidChart] Latest Features How-To Geek ETC How To Create Your Own Custom ASCII Art from Any Image How To Process Camera Raw Without Paying for Adobe Photoshop How Do You Block Annoying Text Message (SMS) Spam? How to Use and Master the Notoriously Difficult Pen Tool in Photoshop HTG Explains: What Are the Differences Between All Those Audio Formats? How To Use Layer Masks and Vector Masks to Remove Complex Backgrounds in Photoshop Bring Summer Back to Your Desktop with the LandscapeTheme for Chrome and Iron The Prospector – Home Dash Extension Creates a Whole New Browsing Experience in Firefox KinEmote Links Kinect to Windows Why Nobody Reads Web Site Privacy Policies [Infographic] Asian Temple in the Snow Wallpaper 10 Weird Gaming Records from the Guinness Book

    Read the article

  • DC Comics Identifies Krypton on the Star Map

    - by Jason Fitzpatrick
    This week Action Comics Superman #14 hits the stands and DC comics reveals the actual location of Kyrpton, delivered by none other than beloved astrophysicist Neil Tyson. Phil Plait at Bad Astronomy reports on the resolution of fans’ long standing curiosity about the location of Krypton: Well, that’s about to change. DC comics is releasing a new book this week – Action Comics Superman #14 – that finally reveals the answer to this stellar question. And they picked a special guest to reveal it: my old friend Neil Tyson. Actually, Neil did more than just appear in the comic: he was approached by DC to find a good star to fit the story. Red supergiants don’t work; they explode as supernovae when they are too young to have an advanced civilization rise on any orbiting planets. Red giants aren’t a great fit either; they can be old, but none is at the right distance to match the storyline. It would have to be a red dwarf: there are lots of them, they can be very old, and some are close enough to fit the plot. I won’t keep you in suspense: the star is LHS 2520, a red dwarf in the southern constellation of Corvus (at the center of the picture here). It’s an M3.5 dwarf, meaning it has about a quarter of the Sun’s mass, a third its diameter, roughly half the Sun’s temperature, and a luminosity of a mere 1% of our Sun’s. It’s only 27 light years away – very close on the scale of the galaxy – but such a dim bulb you need a telescope to see it at all (for any astronomers out there, the coordinates are RA: 12h 10m 5.77s, Dec: -15° 4m 17.9 s). 6 Ways Windows 8 Is More Secure Than Windows 7 HTG Explains: Why It’s Good That Your Computer’s RAM Is Full 10 Awesome Improvements For Desktop Users in Windows 8

    Read the article

  • Upcoming Webinar: Practical Performance Profiling presented by Jean-Philippe Gouigoux

    - by Michaela Murray
    Hot on the heels of releasing his new book, Practical Performance Profiling, I'm delighted that Jean-Philippe Gouigoux will be joining us on April 3rd to present a free webinar on optimizing .NET code performance. He gave me a sneak preview of his talk last week and there's a lot of really useful advice in there. He'll be discussing why he thinks 20% of performance problems account for 80% of lost time, before looking at some real examples of both server-side and client-side profiling, and covering a variety of best practices you can use to improve the performance of your own code. The webinar will be followed by a Q&A session where he'll be joined by Red Gate technical support engineer Chris Allen to answer any of your questions. Jean-Philippe has 10 years' experience in .NET, most recently as system architect at MGDIS, and was recently made a Microsoft MVP for his contributions to the .NET community. I'm really excited that he's found a gap between his day job and university lecturing to share his knowledge, and I hope you'll be able to join us on April 3rd - it's free but you do need to register in advance at https://www3.gotomeeting.com/register/829014934. I'll see you there!

    Read the article

  • Get to Know a Candidate (7 of 25): Will Christensen&ndash;Independent American Party

    - by Brian Lanham
    DISCLAIMER: This is not a post about “Romney” or “Obama”. This is not a post for whom I am voting. Information sourced for Wikipedia. NOTE: Wikipedia does not have a page for Christensen.  If you follow links to the party site you can find information about him. Christensen served in the United States Marine Corps and has degrees from Penn State University (my alma mater), Drexel Institute of Technology, University of Utah, and Brigham Young University (BYU) focusing on Math, Physics, and Electrical Engineering.  He has worked for IBM and BYU but for the last 35 years has run small businesses including an Internet book business as well as an Amway franchise. He has held numerous offices in various political parties including, County Campaign Chairman for Barry Goldwater in 1964, County Central Committee, Republican Party; National Committeeman, and State Chairman of the American Party; one of the Founders, and the State Chairman of the Independent American Party of Utah; Vice-Chairman, Chairman, and the Treasurer of the National Independent American Party. The Independent American Party (IAP) officially started in 1998 and began as the Utah Independent American Party. The founders claim to have been inspired by a speech given by Ezra Taft Benson, former United States Secretary of Agriculture, entitled “The Proper Role of Government”. The 15 principles for the proper role of government, taken from his speech, are held as the IAP’s basis for recruiting. Learn more about the Independent American Party on Wikipedia.

    Read the article

  • Edit the Windows Live Writer Custom Dictionary

    - by Matthew Guay
    Windows Live Writer is a great tool for writing and publishing posts to your blog, but its spell check unfortunately doesn’t include many common tech words.  Here’s how you can easily edit your custom dictionary and add your favorite words. Customize Live Writer’s Dictionary Adding an individual word to the Windows Live Writer dictionary works as you would expect.  Right-click on a word and select Add to dictionary. And changing the default spell check settings is easy too.  In the menu, click Tools, then Options, and select the Spelling tab in this dialog.  Here you can choose your dictionary language and turn on/off real-time spell checking and other settings. But there’s no obvious way to edit your custom dictionary.  Editing the custom dictionary directly is nice if you accidently add a misspelled word to your dictionary and want to remove it, or if you want to add a lot of words to the dictionary at once. Live Writer actually stores your custom dictionary entries in a plain text file located in your appdata folder.  It is saved as User.dic in the C:\Users\user_name\AppData\Roaming\Windows Live Writer\Dictionaries folder.  The easiest way to open the custom dictionary is to enter the following in the Run box or the address bar of an Explorer window: %appdata%\Windows Live Writer\Dictionaries\User.dic   This will open the User.dic file in your default text editor.  Add any new words to the custom dictionary on separate lines, and delete any misspelled words you accidently added to the dictionary.   Microsoft Office Word also stores its custom dictionary in a plain text file.  If you already have lots of custom words in it and want to import them into Live Writer, enter the following in the Run command or Explorer’s address bar to open Word’s custom dictionary.  Then copy the words, and past them into your Live Writer custom dictionary file. %AppData%\Microsoft\UProof\Custom.dic Don’t forget to save the changes when you’re done.  Note that the changes to the dictionary may not show up in Live Writer’s spell check until you restart the program.  If it’s currently running, save any posts you’re working on, exit, and then reopen, and all of your new words should be in the dictionary. Conclusion Whether you use Live Writer daily in your job or occasionally post an update to a personal blog, adding your own custom words to the dictionary can save you a lot of time and frustration in editing.  Plus, if you’ve accidently added a misspelled word to the dictionary, this is a great way to undo your mistake and make sure your spelling is up to par! Similar Articles Productive Geek Tips Backup Your Windows Live Writer SettingsTransfer or Move Your Microsoft Office Custom DictionaryFuture Date a Post in Windows Live WriterTools to Help Post Content On Your WordPress BlogInstall Windows Live Essentials In Windows 7 TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics How to Add Exceptions to the Windows Firewall

    Read the article

  • Copy wrongs and Copyright

    - by Tony Davis
    Recently, a Chinese blog website copied, wholesale and without permission, a Simple-Talk article on troubleshooting locking and blocking. Our initial reaction was exasperation and anger, tempered slightly by the fact that there was, at the top, a clear link to the original, and the book from which it was extracted. On the day the copy was posted, our original article saw a 30K spike in visits, so the site clearly has a substantial following! This made us pause for thought. Indeed, we wondered whether it might not be more profitable, and certainly more enjoyable, to notify the offender of similar content and serve a "put up" notice, rather than the usual DMCA "take down" . The DMCA request, issued to protect our and our authors' assets, is a necessary but tiresome, chore. So often, simple communication and negotiation could have averted the need for it. We are, after all, in the business of presenting knowledge, information and help to the SQL Server Community. If only they had asked! Of course, one's attitude changes according to the motivation behind the copying of content. One of the motivations seems to be pure vanity; they do it to try to enhance their CV, or their company's expertise, by pretending to expertise they don't possess. There is a class of plagiariser, however, that is doing it purely for money, getting advertising revenue by attracting hapless readers to their site. Not content with stealing content, sites can invest in services that provide 'load-testing' for websites that is so realistic that even the search engines can be fooled. Stolen content, fake visitors, swindled advertisers. Zero-tolerance is really the only way of dealing with plagiarism, and action will only be completely effective once Bing, Google, and the other search engines strike out from their listings the rogue sites that refuse to take down plagiarised content. It is, after all in everyone else's interests. Cheers, Tony.

    Read the article

  • Unused Indexes Gotcha

    - by DavidWimbush
    I'm currently looking into dropping unused indexes to reduce unnecessary overhead and I came across a very good point in the excellent SQL Server MVP Deep Dives book that I haven't seen highlighted anywhere else. I was thinking it was simply a case of dropping indexes that didn't show as being used in DMV sys.dm_db_index_usage_stats (assuming a solid representative workload had been run since the last service start). But Rob Farley points out that the DMV only shows indexes whose pages have been read or updated. An index that isn't listed in the DMV might still be useful by providing metadata to the Query Optimizer and thus streamlining query plans. For example, if you have a query like this: select  au.author_name         , count(*) as books from    books b         inner join authors au on au.author_id = b.author_id group by au.author_name If you have a unique index on authors.author_name the Query Optimizer will realise that each author_id will have a different author_name so it can produce a plan that just counts the books by author_id and then adds the author name to each row in that small table. If you delete that index the query will have to join all the books with their authors and then apply the GROUP BY - a much more expensive query. So be cautious about dropping apparently unused unique indexes.

    Read the article

  • Is it possible for beginner to learn and develop an application in rails in 4 months?

    - by Parth
    I want to develop a web application or a website using rails. My current knowledge includes 1. HTML 2. CSS 3. C 4. Java And I am currently going through 5th chapter of the well grounded rubyist book by David A. Thomas. I came to know that learning ruby is beneficial for good knowledge of rails. So currently I am going through the basics of ruby. And learning rails in parallel. I want to know if in this scenario is it practically feasible to understand rails and develop an application/website in it within the time frame of 4 months. I need to develop an application which have atleast 3 complexity (complex functionality). Any ideas of good application for rails beginners is welcomed. But the application should be large or if it is small than it should have some complexity. Time is a constraint for me. I would have to develop application for college work but rails technology is my choice as I want to learn it.

    Read the article

  • BlueTooth not working on my HP Probook 4720s

    - by mtrento
    the blue tooth on my ubuntu 11.10 does not work. When i try to ad a device it scans indefinitely and never find anything. Wireless is working perfeclty and with windows 7 it is detected. As i read somewhere , the bluetooth is not listed in the usb devices. Is it supported under ubuntu? here are the output of the various debug command i tested : hciconfig -a hci0: Type: BR/EDR Bus: USB BD Address: E0:2A:82:7A:8B:04 ACL MTU: 310:10 SCO MTU: 64:8 UP RUNNING PSCAN ISCAN RX bytes:1895 acl:0 sco:0 events:70 errors:0 TX bytes:1986 acl:0 sco:0 commands:64 errors:0 Features: 0xff 0xff 0x8f 0xfe 0x9b 0xff 0x59 0x83 Packet type: DM1 DM3 DM5 DH1 DH3 DH5 HV1 HV2 HV3 Link policy: RSWITCH HOLD SNIFF PARK Link mode: SLAVE ACCEPT Name: 'PC543host-0' Class: 0x5a0100 Service Classes: Networking, Capturing, Object Transfer, Telephony Device Class: Computer, Uncategorized HCI Version: 2.1 (0x4) Revision: 0x149c LMP Version: 2.1 (0x4) Subversion: 0x149c Manufacturer: Cambridge Silicon Radio (10) hcitool scan hcitool scan Scanning ... lsusb Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub Bus 001 Device 003: ID 04f2:b1ac Chicony Electronics Co., Ltd Bus 002 Device 003: ID 413c:3010 Dell Computer Corp. Optical Wheel Mouse Bus 002 Device 004: ID 148f:1000 Ralink Technology, Corp. lsmod | grep -i bluetooth bluetooth 166112 23 bnep,rfcomm,btusb dmesg | grep -i bluetooth [ 18.543947] Bluetooth: Core ver 2.16 [ 18.544017] Bluetooth: HCI device and connection manager initialized [ 18.544020] Bluetooth: HCI socket layer initialized [ 18.544021] Bluetooth: L2CAP socket layer initialized [ 18.545469] Bluetooth: SCO socket layer initialized [ 18.548890] Bluetooth: Generic Bluetooth USB driver ver 0.6 [ 30.204776] Bluetooth: RFCOMM TTY layer initialized [ 30.204782] Bluetooth: RFCOMM socket layer initialized [ 30.204784] Bluetooth: RFCOMM ver 1.11 [ 30.247291] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 30.247295] Bluetooth: BNEP filters: protocol multicast lspci 00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 02) 00:01.0 PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 02) 00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06) 00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05) 00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 05) 00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05) 00:1c.1 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 2 (rev 05) 00:1c.3 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 4 (rev 05) 00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 05) 00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05) 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev a5) 00:1f.0 ISA bridge: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller (rev 05) 00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05) 00:1f.6 Signal processing controller: Intel Corporation 5 Series/3400 Series Chipset Thermal Subsystem (rev 05) 01:00.0 VGA compatible controller: ATI Technologies Inc Manhattan [Mobility Radeon HD 5400 Series] 01:00.1 Audio device: ATI Technologies Inc Manhattan HDMI Audio [Mobility Radeon HD 5000 Series] 44:00.0 Network controller: Ralink corp. RT3090 Wireless 802.11n 1T/1R PCIe 45:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03) ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 02) ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 02) ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 02) ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 02) ff:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 02) ff:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 02) rfkill list 0: phy0: Wireless LAN Soft blocked: no Hard blocked: no 1: hci0: Bluetooth Soft blocked: no Hard blocked: no 2: hp-wifi: Wireless LAN Soft blocked: no Hard blocked: no 3: hp-bluetooth: Bluetooth Soft blocked: no Hard blocked: no

    Read the article

  • Apache virtual host does not work properly

    - by Jori
    I have read a lot of information all over the Internet regarding this subject, and can not figure out what I'am doing wrong. I'm trying to host two websites under different names locally under Windows 7 with Apaches Virtual Hosting functionality. This is what I have done already: In the httpd.conf file I uncommented the following line, so that the virtual host configuration file will be included in the main configuration sequence. # Virtual hosts Include conf/extra/httpd-vhosts.conf This is how I edited my httpd-vhosts.conf: # # Virtual Hosts # # If you want to maintain multiple domains/hostnames on your # machine you can setup VirtualHost containers for them. Most configurations # use only name-based virtual hosts so the server doesn't need to worry about # IP addresses. This is indicated by the asterisks in the directives below. # # Please see the documentation at # <URL:http://httpd.apache.org/docs/2.2/vhosts/> # for further details before you try to setup virtual hosts. # # You may use the command line option '-S' to verify your virtual host # configuration. # # Use name-based virtual hosting. # NameVirtualHost *:80 # # VirtualHost example: # Almost any Apache directive may go into a VirtualHost container. # The first VirtualHost section is used for all requests that do not # match a ServerName or ServerAlias in any <VirtualHost> block. # #<VirtualHost *:80> # ServerAdmin [email protected] # DocumentRoot "C:/apache/docs/dummy-host.localhost" # ServerName dummy-host.localhost # ServerAlias www.dummy-host.localhost # ErrorLog "logs/dummy-host.localhost-error.log" # CustomLog "logs/dummy-host.localhost-access.log" common #</VirtualHost> # #<VirtualHost *:80> # ServerAdmin [email protected] # DocumentRoot "C:/apache/docs/dummy-host2.localhost" # ServerName dummy-host2.localhost # ErrorLog "logs/dummy-host2.localhost-error.log" # CustomLog "logs/dummy-host2.localhost-access.log" common #</VirtualHost> <VirtualHost *:80> ServerName arterieur DocumentRoot "J:/webcontent/www20" <Directory "J:/webcontent/www20"> Order allow,deny Allow from all </Directory> </VirtualHost> As you can see I commented the Virtual Host examples out and added my own one (I did one for this example). Also am I sure that J:\webcontent\www20 exists. At last I edited the Windows host file located in: C:\Windows\System32\drivers\etc\hosts, now it looks this: # Copyright (c) 1993-2009 Microsoft Corp. # # This is a sample HOSTS file used by Microsoft TCP/IP for Windows. # # This file contains the mappings of IP addresses to host names. Each # entry should be kept on an individual line. The IP address should # be placed in the first column followed by the corresponding host name. # The IP address and the host name should be separated by at least one # space. # # Additionally, comments (such as these) may be inserted on individual # lines or following the machine name denoted by a '#' symbol. # # For example: # # 102.54.94.97 rhino.acme.com # source server # 38.25.63.10 x.acme.com # x client host # localhost name resolution is handled within DNS itself. # 127.0.0.1 localhost # ::1 localhost 127.0.0.1 arterieur Then I restarted Apache with the Apache Service Monitor, and it gave me the following fatal error: The requested operation has failed!, I tried to look at the apache/logs/error.log file but I did not log anything, I guess it only logs the errors after startup. Does anyone knows what I'am doing wrong?

    Read the article

  • F# and the rose-tinted reflection

    - by CliveT
    We're already seeing increasing use of many cores on client desktops. It is a change that has been long predicted. It is not just a change in architecture, but our notions of efficiency in a program. No longer can we focus on the asymptotic complexity of an algorithm by counting the steps that a single core processor would take to execute it. Instead we'll soon be more concerned about the scalability of the algorithm and how well we can increase the performance as we increase the number of cores. This may even lead us to throw away our most efficient algorithms, and switch to less efficient algorithms that scale better. We might even be willing to waste cycles in order to speculatively execute at the algorithm rather than the hardware level. State is the big headache in this parallel world. At the hardware level, main memory doesn't necessarily contain the definitive value corresponding to a particular address. An update to a location might still be held in a CPU's local cache and it might be some time before the value gets propagated. To get the latest value, and the notion of "latest" takes a lot of defining in this world of rapidly mutating state, the CPUs may well need to communicate to decide who has the definitive value of a particular address in order to avoid lost updates. At the user program level, this means programmers will need to lock objects before modifying them, or attempt to avoid the overhead of locking by understanding the memory models at a very deep level. I think it's this need to avoid statefulness that has led to the recent resurgence of interest in functional languages. In the 1980s, functional languages started getting traction when research was carried out into how programs in such languages could be auto-parallelised. Sadly, the impracticality of some of the languages, the overheads of communication during this parallel execution, and rapid improvements in compiler technology on stock hardware meant that the functional languages fell by the wayside. The one thing that these languages were good at was getting rid of implicit state, and this single idea seems like a solution to the problems we are going to face in the coming years. Whether these languages will catch on is hard to predict. The mindset for writing a program in a functional language is really very different from the way that object-oriented problem decomposition happens - one has to focus on the verbs instead of the nouns, which takes some getting used to. There are a number of hybrid functional/object languages that have been becoming more popular in recent times. These half-way houses make it easy to use functional ideas for some parts of the program while still allowing access to the underlying object-focused platform without a great deal of impedance mismatch. One example is F# running on the CLR which, in Visual Studio 2010, has because a first class member of the pack. Inside Visual Studio 2010, the tooling for F# has improved to the point where it is easy to set breakpoints and watch values change while debugging at the source level. In my opinion, it is the tooling support that will enable the widespread adoption of functional languages - without this support, people will put off any transition into the functional world for as long as they possibly can. Without tool support it will make it hard to learn these languages. One tool that doesn't currently support F# is Reflector. The idea of decompiling IL to a functional language is daunting, but F# is potentially so important I couldn't dismiss the idea. As I'm currently developing Reflector 6.5, I thought it wise to take four days just to see how far I could get in doing so, even if it achieved little more than to be clearer on how much was possible, and how long it might take. You can read what happened here, and of the insights it gave us on ways to improve the tool.

    Read the article

  • How much Ruby should I learn before moving to Rails?

    - by Kevin
    Just a quick question.. I can never get a definitive answer when googling this, either. Some people say you can learn Rails without knowing any Ruby, but at some point you'll run into a brick wall and wish you knew Ruby and will have to go back to learn it..and some say to learn the "basics" of Ruby before learning Rails and it will make your life that much easier.. My current knowledge is low. I'm not a beginner, but I'm not pro, either. I went through the Learn Python The Hard Way online book in about a month, but I stopped once I got to the OOP side of Python (I know booleans, elif/if/else/statements, for loops, while loops, functions) I agree with learning the "basics" of Ruby before learning Rails, but what exactly are the "basics" of Ruby? Would I need to learn the whole OOP side of Ruby before I went on to Rails? Or would I just need to learn the Ruby syntax up to where I learned Python (booleans, elif/if/else/statements, for loops, while loops, functions) before I went on to Rails? Thanks!

    Read the article

  • SQL SERVER – Difference between COUNT(DISTINCT) vs COUNT(ALL)

    - by pinaldave
    This blog post is written in response to the T-SQL Tuesday hosted by Jes Schultz Borland. Earlier today, I was presenting a 45-minute session at the Community College about “The Beginning SQL Server Database”. One of the students asked me the following question. What is the difference between COUNT(DISTINCT) vs COUNT(ALL)? I found this question from the student very interesting. He seems to have read the documentation (Book Online) and was then asking me this question. I always carry laptop which has SQL Server installed. I quickly opened it and ran the following script. After looking at the result, I think it was clear to everybody. Here is the script: SELECT COUNT([Title]) Value FROM [AdventureWorks].[Person].[Contact] GO SELECT COUNT(ALL [Title]) ALLValue FROM [AdventureWorks].[Person].[Contact] GO SELECT COUNT(DISTINCT [Title]) DistinctValue FROM [AdventureWorks].[Person].[Contact] GO The above script will give me the following results. You can clearly notice from the result set that COUNT (ALL ColumnName) is the same as COUNT(ColumnName). The reality is that the “ALL” is actually  the default option and it needs not to be specified. The ALL keyword includes all the non-NULL values. I know this is very simple and may be it does not change how we work; however looking at the whole angle, I really enjoyed the question. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology

    Read the article

  • Learning Objective-C for iPad/iPhone/iPod Development

    - by Jeff Julian
    I am learning how to write apps for the iPad/iPhone/iPod!  Why, well several reasons.  One reason, I have 5 devices in my house on the platform.  I had an iPad and iPhone, Michelle has an iPhone, and each of the kids have iPod Touches.  They are excellent devices for life management, entertainment, and learning.  I am amazed at how well the kids pick up on it and how much it effects the way they learn.  My two year old knows how to use it better than any other device we own and she is learning new words and letters so quickly. Because of this saturation at home, it would be fun to write some apps my family could use.  Some games to bring the hobby of development back into my life.  Second reason is we want to have a Geekswithblogs app for the iPhone and iPad.  We are not sure if it is purely informational (blog posts and tweets) or if members want to be able to publish from the app.  Creating a blog editor would be tough stuff, but could be just the right challenge. There are so many more reasons, but the last one that really makes me excited is that it is a new domain of development where I get excited when I think about writing apps.  That excitement level where I want to see if there are User Groups and if we are just watching TV, to break out the MBP and start working on it.  That excitement level where I could really read a development book cover to cover and not just use as a reference.  I really do like this feeling. Who knows how long this will last and I am definitely not leaving .NET.  Microsoft software will always be my main focus, but for the time, my hobby is changing and I am getting excited about development again.   Technorati Tags: Apple,iPad Development,Objective-C,New Frontiers Image: Courtesy of Apple

    Read the article

< Previous Page | 372 373 374 375 376 377 378 379 380 381 382 383  | Next Page >